
1.13.6 • Public • Published

WowYow Vision Node.js SDK

Getting Started


npm install '@wowyow/vision-node'


const Vision = require('@wowyow/vision-node');


Vision.APIKey = '';

A WowYow API key can be obtained from WowYow AI Studio.


let vision = new Vision('https://www.youtube.com/watch?v=IgiB3dyxUvc');




There are several ways to initialize processing. Processing begins as soon as the Vision class is instantiated. The full process consists of several subprocesses, depending on the file type or link: upload, frame extraction, and detection.


let vision = new Vision('https://www.youtube.com/watch?v=IgiB3dyxUvc');

Supported sites: YouTube, Dailymotion, Vimeo, Discovery, FOX, Instagram, Facebook & more.

Video Upload

let vision = new Vision('./people_dancing.mp4');

Only mp4 & webm file types are supported.


The SDK offers a range of configuration options to optimize performance or help with debugging. Configuration is passed as the second argument to the Vision constructor.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| models | Array | ['DETECT_SCENE', 'DETECT_PEOPLE'] | List of models to use. Available models: DETECT_SCENE, DETECT_PEOPLE, DETECT_CLOTHING. |
| fps | Integer | 4 | Processing FPS. Values can be between 4 and 30. This option is ignored if a stream is used. See RTSP. |
| target | String | '' | URL of the machine used for processing. |
| directConnection | Boolean | false | Whether the client should establish a direct connection to target. |
| preview | Boolean | false | If true, response data will contain preview images with a skeleton for the DETECT_PEOPLE model and index numbers for pathing. |
| logPerf | Boolean | false | If true, performance metrics will be logged to the console. |
| skipFrames | Boolean | true | Used only for streams. If true, processing skips as many frames as necessary to keep up with video playback. If false, frames are queued. |
| footage | Object | | Include to optimize detection for a specific camera type. |
| footage.lens | String | 'standard' | Type of camera lens. Options are: standard, fisheye, wide. |
| footage.mod | String | '' | Video modification applied. Options are: dewarped. |
| footage.movement | String | '' | Movement of the camera. Options are: static, handheld, pan. |
| ffmpeg | String | '' | If you experience issues with RTSP streams, pass the path to your own ffmpeg binary via this parameter. If ffmpeg is available in your system PATH, you can pass 'ffmpeg'. For more help with ffmpeg installation, visit the Official FFMPEG Download Page. |

Config Example

let vision = new Vision(e, {
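
As a rough illustration of how the documented options fit together, the sketch below merges caller overrides with the defaults listed in the options table and rejects values the table rules out. `buildConfig`, `DEFAULTS`, and `KNOWN_MODELS` are hypothetical names introduced here and are not part of the SDK; the SDK may validate options differently or not at all.

```javascript
// Defaults copied from the options table (footage has no documented default).
const DEFAULTS = Object.freeze({
  models: ['DETECT_SCENE', 'DETECT_PEOPLE'],
  fps: 4,
  target: '',
  directConnection: false,
  preview: false,
  logPerf: false,
  skipFrames: true,
  ffmpeg: ''
});

const KNOWN_MODELS = ['DETECT_SCENE', 'DETECT_PEOPLE', 'DETECT_CLOTHING'];

// Hypothetical helper (not an SDK function): merges overrides with the
// documented defaults and checks the documented fps range and model names.
function buildConfig(overrides = {}) {
  const config = { ...DEFAULTS, ...overrides };
  if (!Number.isInteger(config.fps) || config.fps < 4 || config.fps > 30) {
    throw new RangeError('fps must be an integer between 4 and 30');
  }
  const unknown = config.models.filter((m) => !KNOWN_MODELS.includes(m));
  if (unknown.length > 0) {
    throw new TypeError(`Unknown models: ${unknown.join(', ')}`);
  }
  return config;
}

const config = buildConfig({
  models: ['DETECT_PEOPLE', 'DETECT_CLOTHING'],
  fps: 10,
  preview: true,
  footage: { lens: 'fisheye', movement: 'static' }
});
console.log(config);
```

The resulting object would then be passed as the second argument when constructing Vision.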


vision.on('progress', ({ event, progress }) => {}); // events are: upload, frame-extraction & detection.
vision.on('data', (data) => {}); // See Data Schema section for details
vision.on('start', () => {});
vision.on('pause', () => {});
vision.on('resume', () => {});
vision.on('end', () => {});
vision.on('error', (err) => {});
vision.on('metadata', (metadata) => {}); // See Metadata section for details
vision.on('stream', (stream) => {}); // See Stream section for details
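
The handlers above are empty, so here is a minimal sketch of what wiring them up might look like. It assumes `vision.on` behaves like Node's `EventEmitter`, and it uses a plain `EventEmitter` as a stand-in for a Vision instance so it runs without the SDK or an API key. The emitted values (including the progress numbers) are made up for illustration.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a Vision instance (assumption: the SDK exposes the same .on API).
const vision = new EventEmitter();

const stages = {};   // latest progress value seen per stage
const frames = [];   // payloads received through the 'data' event
let finished = false;

vision.on('progress', ({ event, progress }) => { stages[event] = progress; });
vision.on('data', (data) => { frames.push(data); });
vision.on('error', (err) => { console.error('processing failed:', err.message); });
vision.on('end', () => { finished = true; });

// Simulated emissions; with the real SDK these would come from the service.
vision.emit('progress', { event: 'upload', progress: 100 });
vision.emit('progress', { event: 'frame-extraction', progress: 50 });
vision.emit('data', { DETECT_SCENE: { frame: 0, predictions: [] } });
vision.emit('end');

console.log(stages, frames.length, finished);
```

Registering an `error` handler matters with Node's `EventEmitter`, because an `error` event with no listener is thrown as an exception.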





Static Methods

// check if there is an available server
let available = Vision.available({
  target: ''
});
console.log(available); // true, false

// fetch complete algorithms
let algorithms = Vision.algorithms({
  target: ''
});
console.log(algorithms); // list of algorithms

// fetch available nodes
let nodes = Vision.nodes({
  target: ''
});
console.log(nodes); // list of nodes available to be used in algorithms

Data Schema

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| data | Object | top-level object | |
| data.{model} | Object | | |
| data.{model}.mediaId | String | WowYow Media Identifier | |
| data.{model}.frame | Integer | Frame Number | |
| data.{model}.timestamp | Decimal | Processing FPS. | Not used in RTSP |
| data.{model}.width | Integer | Frame Width | |
| data.{model}.height | Integer | Frame Height | |
| data.{model}.model | String | Model name | |
| data.{model}.predictions | Array | List of predictions generated by this model. | Check Model Schema to see schema for each model. |
| data.{model}.source | Object | | |
| data.{model}.source.base64 | String | Base64 link of frame | |
| data.{model}.source.jpeg | String | JPEG link of frame | |
| data.{model}.preview | Object | | Included only if preview config is used. |
| data.{model}.preview.base64 | String | Base64 link of frame including model overlay information | |
| data.{model}.preview.jpeg | String | JPEG link of frame including model overlay information | |
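
Because the payload is keyed by model name, a handler usually has to iterate over the models before reaching the predictions. The sketch below shows one way to do that; `summarize` is a hypothetical helper, and the sample payload is made up to match the field names in the table above.

```javascript
// Hypothetical helper: flattens one 'data' payload into per-model summaries.
function summarize(data) {
  return Object.entries(data).map(([model, result]) => ({
    model,
    frame: result.frame,
    size: `${result.width}x${result.height}`,
    predictionCount: Array.isArray(result.predictions) ? result.predictions.length : 0
  }));
}

// Made-up payload shaped like the schema above (values are illustrative only).
const sample = {
  DETECT_SCENE: {
    mediaId: 'abc', frame: 12, width: 1280, height: 720,
    model: 'DETECT_SCENE', predictions: [{ index: 4 }]
  },
  DETECT_PEOPLE: {
    mediaId: 'abc', frame: 12, width: 1280, height: 720,
    model: 'DETECT_PEOPLE', predictions: [{ index: 1 }, { index: 7 }]
  }
};

const summary = summarize(sample);
console.log(summary);
```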


Metadata

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| duration | Number | Video duration in seconds. | For an RTSP stream the value is Infinity. |
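
Since `duration` is `Infinity` for RTSP streams, code that formats or divides by the duration should check for that case first. A minimal sketch, with `describeDuration` being a hypothetical helper name:

```javascript
// Hypothetical helper: turns the metadata duration into a display string.
function describeDuration(metadata) {
  if (metadata.duration === Infinity) {
    return 'live stream';
  }
  // Round once up front so seconds never display as 60.
  const total = Math.round(metadata.duration);
  return `${Math.floor(total / 60)}m ${total % 60}s`;
}

console.log(describeDuration({ duration: 125.4 }));
console.log(describeDuration({ duration: Infinity }));
```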


Stream

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| url | String | URL of the streamable video file on the server. It can be used directly as the source of an HTML video element. | |

Model Schema


DETECT_SCENE

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| prediction.index | Integer | Scene identifier | Do not expect index to always be incremented by 1 |


DETECT_PEOPLE

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| prediction | Object | top-level object | |
| prediction.score | Decimal | | |
| prediction.index | Integer | Person identifier used for pathing | Do not expect index to always be incremented by 1 |
| prediction.keypoints | Array | List of keypoints | |
| prediction.keypoints[].part | String | Name of the body part. Options are: nose, leftEye, rightEye, leftEar, rightEar, leftShoulder, rightShoulder, leftElbow, rightElbow, leftWrist, rightWrist, leftHip, rightHip, leftKnee, rightKnee, leftAnkle, rightAnkle | |
| prediction.keypoints[].position | Object | | |
| prediction.keypoints[].position.x | Decimal | Position of keypoint on x axis | |
| prediction.keypoints[].position.y | Decimal | Position of keypoint on y axis | |
| prediction.segments | Object | Object containing each person segment | |
| prediction.segments.body | Object | | |
| prediction.segments.body.bbox | Object | Bounding Box of person body segment | |
| prediction.segments.body.bbox.x0 | Integer | Coordinate top offset | Check BBox. |
| prediction.segments.body.bbox.y0 | Integer | Coordinate left offset | Check BBox. |
| prediction.segments.body.bbox.x1 | Integer | Coordinate top offset | Check BBox. |
| prediction.segments.body.bbox.y1 | Integer | Coordinate left offset | Check BBox. |
| prediction.segments.body.preview | Object | | Included only if preview config is used. |
| prediction.segments.body.preview.base64 | Object | Base64 link of segment cut-out. | |
| prediction.segments.body.preview.jpeg | Object | JPEG link of segment cut-out. | |
| prediction.segments.face | Object | | |
| prediction.segments.face.bbox | Object | Bounding Box of person face segment | |
| prediction.segments.face.bbox.x0 | Integer | Coordinate top offset | Check BBox. |
| prediction.segments.face.bbox.y0 | Integer | Coordinate left offset | Check BBox. |
| prediction.segments.face.bbox.x1 | Integer | Coordinate top offset | Check BBox. |
| prediction.segments.face.bbox.y1 | Integer | Coordinate left offset | Check BBox. |
| prediction.segments.face.preview | Object | | Included only if preview config is used. |
| prediction.segments.face.preview.base64 | Object | Base64 link of segment cut-out. | |
| prediction.segments.face.preview.jpeg | Object | JPEG link of segment cut-out. | |
| prediction.position | String | Human position detection. Options are: sitting, standing | |
| prediction.clothing | Object | Object containing clothing | Only if DETECT_CLOTHING model is used. Schema is same as this model's main schema. |
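
To make the nesting of a people prediction more concrete, the sketch below indexes keypoints by body part and derives a box size from a bbox. Both helpers are hypothetical and not part of the SDK. The size calculation assumes `(x0, y0)` and `(x1, y1)` are opposite corners of the box, which this document does not confirm. The sample prediction is made up.

```javascript
// Hypothetical helper: index keypoints by body part for direct lookup.
function keypointsByPart(prediction) {
  const byPart = {};
  for (const { part, position } of prediction.keypoints) {
    byPart[part] = position;
  }
  return byPart;
}

// Hypothetical helper. Assumption: (x0, y0) and (x1, y1) are opposite corners.
function bboxSize({ x0, y0, x1, y1 }) {
  return { width: Math.abs(x1 - x0), height: Math.abs(y1 - y0) };
}

// Made-up prediction using the field names from the table above.
const person = {
  score: 0.9,
  index: 3,
  keypoints: [
    { part: 'nose', position: { x: 100.5, y: 40 } },
    { part: 'leftShoulder', position: { x: 80, y: 90.25 } }
  ],
  segments: { body: { bbox: { x0: 60, y0: 20, x1: 160, y1: 320 } } },
  position: 'standing'
};

const parts = keypointsByPart(person);
const bodySize = bboxSize(person.segments.body.bbox);
console.log(parts.nose, bodySize);
```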


DETECT_CLOTHING

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| prediction | Object | top-level object | |
| prediction.score | Decimal | | |
| prediction.index | Integer | Clothing identifier used for pathing | Do not expect index to always be incremented by 1 |
| prediction.type | String | Type of the clothing. Options are: Leggings, Jodhpurs, Capris, Shorts, Jeans, Joggers, Skirt, Gauchos, Culottes, Sweatshorts, Trunks, Cutoffs, Sarong, Sweatpants, Chinos, Halter, Hoodie, Henley, Parka, Cardigan, Tank, Bomber, Peacoat, Top, Poncho, Button-Down, Anorak, Sweater, Blouse, Turtleneck, Blazer, Jacket, Jersey, Tee, Flannel, Jeggings | |
| prediction.bbox | Object | Bounding Box of clothing item | |
| prediction.bbox.x0 | Integer | Coordinate top offset | Check BBox. |
| prediction.bbox.y0 | Integer | Coordinate left offset | Check BBox. |
| prediction.bbox.x1 | Integer | Coordinate top offset | Check BBox. |
| prediction.bbox.y1 | Integer | Coordinate left offset | Check BBox. |
| prediction.preview | Object | | Included only if preview config is used. |
| prediction.preview.base64 | Object | Base64 link of segment cut-out. | |
| prediction.preview.jpeg | Object | JPEG link of segment cut-out. | |
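
A common use of clothing predictions is tallying detected types while ignoring weak detections. The sketch below does that with a caller-chosen threshold. `countClothingTypes` is a hypothetical helper, and the sample assumes `score` is a value between 0 and 1, which this document does not state.

```javascript
// Hypothetical helper: counts clothing types whose score meets a threshold.
function countClothingTypes(predictions, minScore) {
  const counts = {};
  for (const { type, score } of predictions) {
    if (score >= minScore) {
      counts[type] = (counts[type] || 0) + 1;
    }
  }
  return counts;
}

// Made-up predictions (assumption: score ranges from 0 to 1).
const clothing = [
  { index: 1, type: 'Jeans', score: 0.92 },
  { index: 2, type: 'Hoodie', score: 0.81 },
  { index: 5, type: 'Jeans', score: 0.4 },
  { index: 6, type: 'Jeans', score: 0.75 }
];

const counts = countClothingTypes(clothing, 0.5);
console.log(counts);
```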


RTSP

RTSP processing differs from processing videos with a fixed duration. Several things need to be kept in mind when using it:

  1. Response data frame and timestamp use the start of processing as their reference point, both starting from 0.
  2. By default, frames are dropped if processing lags behind video playback for any reason, whether because of low processing fps or network issues.
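
The difference between dropping and queueing frames can be illustrated with a small simulation. This is a conceptual sketch only and not how the SDK is implemented: it assumes one frame arrives per tick and each processed frame occupies the processor for `cost` ticks. `simulateStream` and its parameters are hypothetical names.

```javascript
// Conceptual simulation: one frame arrives per tick, processing takes `cost` ticks.
function simulateStream(frameCount, cost, skipFrames) {
  const processed = [];
  const pending = [];
  let busyUntil = 0;
  for (let tick = 0; tick < frameCount; tick++) {
    pending.push(tick);                 // frame number `tick` arrives
    if (tick >= busyUntil) {            // processor is free
      // Dropping mode takes the newest frame; queueing mode takes the oldest.
      const frame = skipFrames ? pending.pop() : pending.shift();
      if (skipFrames) {
        pending.length = 0;             // discard the older frames
      }
      processed.push(frame);
      busyUntil = tick + cost;
    }
  }
  return { processed, backlog: pending.length };
}

const dropping = simulateStream(6, 2, true);
const queueing = simulateStream(6, 2, false);
console.log(dropping, queueing);
```

In this model, dropping keeps processing close to live playback at the cost of gaps between processed frames, while queueing processes every frame in order but lets the backlog grow for as long as processing is slower than playback.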





