VideoObjectTracking

VideoObjectTracking[video]

detects objects of interest in video and tracks them over video frames.

VideoObjectTracking[objects]

finds correspondences between the given objects and tracks them, assuming they come from video frames.

VideoObjectTracking[…->detector]

uses detector to find objects of interest in the input.

Details and Options

  • VideoObjectTracking, also known as visual object tracking, tracks unique objects in frames of a video, trying to handle occlusions if possible. Tracked objects are also known as tracklets.
  • Tracking can either detect objects in frames automatically or be performed on a precomputed set of objects.
  • The result, returned as an ObjectTrackingData object, includes times, labels and various other properties for each tracklet.
  • Possible settings for objects and their corresponding outputs are:
  • {{pos11,pos12,…},…}                      tracking points as k->posij
    {{bbox11,bbox12,…},…}                    tracking boxes as k->bboxij
    {<|label1->{bbox11,bbox12,…},…|>,…}      tracking boxes as {labeli,j}->bbox
    {lmat1,…}                                relabeling segments in label matrices lmati
    {t1->obj1,…}                             a list of times and objects
  • By default, objects are detected using ImageBoundingBoxes. Possible settings for detector include:
  • f                        a detector function that returns supported objects
    "concept"                named concept, as used in "Concept" entities
    "word"                   English word, as used in WordData
    wordspec                 word sense specification, as used in WordData
    Entity[…]                any appropriate entity
    category1|category2|…    any of the categoryi
  • Using VideoObjectTracking[{image1,image2,…}] is similar to tracking objects across frames of a video.
  • The following options can be given:
  • Method          Automatic    tracking method to use
    TargetDevice    Automatic    the target device on which to perform detection
  • The possible values for the Method option are:
  • "OCSort"           observation-centric SORT (simple, online, real-time) tracking; predicts object trajectories using Kalman estimators
    "RunningBuffer"    offline method; associates objects by comparing a buffer of frames
  • When tracking label matrices, occlusions are not handled. Label matrices can be tracked with Method->"RunningBuffer".
  • With Method->{"OCSort",subopt}, the following suboptions can be specified:
  • "IOUThreshold"          0.2    intersection over union threshold between bounding boxes
    "OcclusionThreshold"    8      number of frames for which the history of a tracklet is maintained before expiration
    "ORUHistory"            3      length of tracklet history to step back for tracklet re-update
    "OCMWeight"             0.2    observation-centric motion weight that accounts for the directionality of moving bounding boxes
  • With Method->{"RunningBuffer",subopt}, the following suboptions can be specified:
  • "MaxCentroidDistance"    Automatic    maximum distance between the centroids for adjacent frames
    "OcclusionThreshold"     8            number of frames for which the history of a tracklet is maintained before expiration
  • Additional "RunningBuffer" suboptions to specify the contribution to the cost matrix are:
  • "CentroidWeight"    0.5          centroid distance between components or bounding boxes
    "OverlapWeight"     1            overlap of components or bounding boxes
    "SizeWeight"        Automatic    size of components or bounding boxes

Examples


Basic Examples  (2)

Detect and track objects in a video:

Detect and track faces in a video:

Extract the first frame from each sub-video:

Scope  (5)

Data  (4)

Detect and track objects in a video:

Detect and track objects in a list of images:

Track a list of bounding boxes:
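
A minimal sketch of this input form, using synthetic data. It assumes that bounding boxes are given as Rectangle objects, with one list of boxes per frame; the coordinates are made up for illustration. The result should be an ObjectTrackingData object, as described in Details and Options:

```wolfram
(* Synthetic data (an assumption for illustration): two boxes per frame, *)
(* far apart from each other, each moving one unit per frame *)
boxes = Table[
   {Rectangle[{t, 0}, {t + 10, 10}],
    Rectangle[{50 - t, 30}, {60 - t, 40}]},
   {t, 0, 9}];

(* Track the precomputed boxes across the ten "frames" *)
tracked = VideoObjectTracking[boxes]
```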

Track a list of points:
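
A minimal sketch with synthetic points, assuming each frame is given as a list of {x, y} positions. Three points per frame are used so that a frame cannot be mistaken for a pair of box corners; the coordinates are arbitrary:

```wolfram
(* Synthetic data (an assumption for illustration): three points per frame, *)
(* each moving one unit per frame along its own line *)
pts = Table[{{t, 0}, {40 - t, 30}, {20, 60 + t}}, {t, 0, 9}];

(* Track the precomputed points across frames *)
VideoObjectTracking[pts]
```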

Detectors  (1)

Automatically detect objects and track them:

Specify a detector function to find objects:

Specify the category of object to detect and track:

Options  (5)

Method  (4)

"OCSort"  (3)

In OCSort, motion is predicted using Kalman estimators. Higher values for "OCMWeight" increase the cost when boxes move away from the predicted positions.

Set up a problem with two sets of moving boxes:

By default, the direction of trajectories is only loosely constrained. Notice that in the region where the blue and red boxes overlap heavily, tracking may change direction suddenly:

Increasing "OCMWeight" decreases the chance of sudden direction changes:
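
The setup can be sketched with synthetic data as follows. This is an illustration under assumptions, not the original example: the boxes are Rectangle objects on two crossing diagonal paths, and 0.9 is an arbitrary value for "OCMWeight". No particular tracking outcome is asserted here:

```wolfram
(* Two synthetic boxes per frame: one moving up-right, one moving down-right, *)
(* so that their paths cross near the middle of the sequence *)
nFrames = 20;
boxes = Table[
   {Rectangle[{2 t, 2 t}, {2 t + 10, 2 t + 10}],
    Rectangle[{2 t, 2 (nFrames - t)}, {2 t + 10, 2 (nFrames - t) + 10}]},
   {t, 0, nFrames}];

(* Default OCSort settings *)
VideoObjectTracking[boxes, Method -> "OCSort"]

(* Larger observation-centric motion weight (arbitrary illustrative value) *)
VideoObjectTracking[boxes, Method -> {"OCSort", "OCMWeight" -> 0.9}]
```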

The "IOUThreshold" suboption specifies a threshold for intersection over union between boxes in order to consider them as potentially the same object.
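
For reference, intersection over union for two axis-aligned boxes can be computed as in the sketch below. The helper name iou is hypothetical and is not part of VideoObjectTracking; it only illustrates the quantity that the threshold is compared against:

```wolfram
(* Intersection over union of two axis-aligned Rectangle objects *)
iou[Rectangle[{ax1_, ay1_}, {ax2_, ay2_}],
   Rectangle[{bx1_, by1_}, {bx2_, by2_}]] :=
  Module[{w, h, inter, union},
   (* width and height of the overlap, clipped at zero for disjoint boxes *)
   w = Max[0, Min[ax2, bx2] - Max[ax1, bx1]];
   h = Max[0, Min[ay2, by2] - Max[ay1, by1]];
   inter = w*h;
   union = (ax2 - ax1) (ay2 - ay1) + (bx2 - bx1) (by2 - by1) - inter;
   inter/union];

(* Two 10x10 boxes offset by 5 units horizontally *)
iou[Rectangle[{0, 0}, {10, 10}], Rectangle[{5, 0}, {15, 10}]]  (* 1/3 *)
```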

Set up a problem with a set of moving bounding boxes with a gap:

A higher threshold for intersection splits the object trajectory into two parts:

A lower "IOUThreshold" merges the trajectories:
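
A sketch of such a problem with synthetic data, assuming Rectangle objects as boxes. The threshold values 0.5 and 0.1 are arbitrary choices placed on either side of the raw overlap at the gap. Because the tracker may compare detections against predicted rather than observed positions, no specific outcome is asserted:

```wolfram
(* A 10x10 box advancing 2 units per frame, then skipping ahead by 7 units. *)
(* Raw IoU between consecutive boxes: 80/120 (about 0.67) for regular steps, *)
(* 30/170 (about 0.18) across the gap *)
xs = Join[Range[0, 18, 2], Range[25, 43, 2]];
boxes = {Rectangle[{#, 0}, {# + 10, 10}]} & /@ xs;

(* Higher threshold: the overlap across the gap falls below it *)
VideoObjectTracking[boxes, Method -> {"OCSort", "IOUThreshold" -> 0.5}]

(* Lower threshold: the overlap across the gap exceeds it *)
VideoObjectTracking[boxes, Method -> {"OCSort", "IOUThreshold" -> 0.1}]
```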

The "OcclusionThreshold" suboption deals with objects that disappear for some time (due to poor detection or occlusion).

Set up a problem with a moving bounding box and remove a couple of frames from the trajectory:

Without an occlusion threshold, the object is not re-associated with the tracklet once it re-emerges:

With a higher occlusion threshold defined, the trajectory is linked back:

"RunningBuffer"  (1)

The "RunningBuffer" method can typically better track objects whose trajectories have a jump due to occlusion or fast movement:
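
The suboption syntax for this method can be sketched with synthetic data as follows. The data, the use of Rectangle objects, and the value 40 for "MaxCentroidDistance" are assumptions for illustration; the value is chosen only to exceed the synthetic jump, and no particular result is asserted:

```wolfram
(* A 10x10 box advancing 2 units per frame, with a jump of 30 units *)
(* between frames 10 and 11 *)
xs = Join[Range[0, 18, 2], Range[48, 66, 2]];
boxes = {Rectangle[{#, 0}, {# + 10, 10}]} & /@ xs;

(* OCSort, for comparison *)
VideoObjectTracking[boxes, Method -> "OCSort"]

(* RunningBuffer, allowing centroids to move up to 40 units between adjacent frames *)
VideoObjectTracking[boxes,
 Method -> {"RunningBuffer", "MaxCentroidDistance" -> 40}]
```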

The "OCSort" method results in different instances of the same hummingbird:

"RunningBuffer" links the trajectories together to track the bird as one:

TargetDevice  (1)

By default, if no detection function is specified, the detection is performed on CPU:

Set the TargetDevice option to "GPU" to perform the detection on GPU:

Applications  (12)

Basic Uses  (3)

Detect and track objects in a video:

Highlight objects on the video; notice all are labeled with their detected classes:

Track the detected objects:

Highlight tracked detected objects with their corresponding indices:

Track labeled components from matrices:

Define a segmentation function that works on each frame:

Segment video frames and show components:

Track the components across frames and show tracked components:

Use the YOLO V8 network from the Wolfram Neural Net Repository to perform the detection:

Retrieve the network and its evaluation function:

Detect and track the object using the YOLO V8 network:

Highlight the tracked detected objects:

Count Objects  (3)

Count the number of detected objects in a video:

Track objects and find unique instances:

Get the final counts:

Count occurrences of a specific object:

Track objects and find unique instances:

Get the final counts:

Count the number of elephants in a video:

Extract Tracked Objects  (1)

Detect and track the contents of a video:

Extract the first of the detected labels:

Extract the sub-video corresponding to the first tracked object:

Visualize Motion Trajectories  (1)

Track pedestrians in a railway station:

Detect the bounding boxes and show them over the original video:

Track the boxes:

Plot the trajectories of the centroids of the boxes:
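
A minimal sketch of the plotting step, assuming the tracked boxes have already been collected into an association from tracklet index to a list of Rectangle objects. The variable tracks and its contents are hypothetical synthetic data, not output of VideoObjectTracking:

```wolfram
(* Hypothetical tracklets: index -> boxes over time (synthetic data) *)
tracks = <|
   1 -> Table[Rectangle[{t, t}, {t + 10, t + 10}], {t, 0, 20, 2}],
   2 -> Table[Rectangle[{40 - t, t/2}, {50 - t, t/2 + 10}], {t, 0, 20, 2}]|>;

(* The centroid of an axis-aligned box is the midpoint of its two corners *)
centroid[Rectangle[p_, q_]] := Mean[{p, q}];

(* One line per tracklet, labeled by its index *)
ListLinePlot[Map[centroid, tracks, {2}],
 PlotLegends -> Automatic, AspectRatio -> Automatic]
```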

Overlay the trajectories onto the original video:

Analyze Wildlife Videos  (3)

Track a herd of migrating elephants:

Highlight frames with the tracked elephants:

Track a herd of galloping horses:

Track a flock of sheep entering a barn:

Analyze Human Videos  (1)

Estimate age from the face of each person in a video:

Detect and track faces:

Find the tracked faces with the longest duration in the video:

Construct a time series of selected labels:

Compute facial emotions for each tracked face:

Compute median estimated age for each face:

Wolfram Research (2025), VideoObjectTracking, Wolfram Language function, https://reference.wolfram.com/language/ref/VideoObjectTracking.html.
