Acquire Image and Body Data Using Kinect V2
In Detect the Kinect V2 Devices, you see that the two sensors on
the Kinect® for Windows® device are represented by two device
IDs, one for the color sensor and one for the depth sensor. In that
example, Device 1 is the color sensor and Device 2 is the depth sensor.
This example shows how to create a videoinput
object
for the color sensor to acquire RGB images and then for the depth
sensor to acquire body data.
Create the videoinput object for the color sensor. DeviceID 1 is used for the color sensor.

vid = videoinput('kinect',1);
Note that you do not need to provide the video format as you do for a Kinect V1 device, since Kinect V2 devices use only one color format (RGB_1920x1080).

Look at the device-specific properties on the source device, which is the color sensor on the Kinect V2 camera.
src = getselectedsource(vid);
src

Display Summary for Video Source Object:

    General Settings:
      Parent = [1x1 videoinput]
      Selected = on
      SourceName = Kinect V2 Color Source
      Tag =
      Type = videosource

    Device Specific Properties:
      ExposureTime = 4000
      FrameInterval = 333333
      Gain = 1
      Gamma = 2.2
The output shows that the color sensor has a set of device-specific properties. On a Kinect V2 device these properties are read-only; the device adjusts them automatically depending on conditions. (On a Kinect V1 device, you can set them.)
Device-Specific Property (Color Sensor)    Description

ExposureTime     Indicates the exposure time in increments of 1/10,000 of a second.
FrameInterval    Indicates the frame interval in units of 1/1,000,000 of a second.
Gain             Indicates a multiplier for the RGB color values.
Gamma            Indicates the gamma measurement.

Preview the color stream by calling preview on the color sensor object you created.

preview(vid);
When you are done previewing, close the preview window.
closepreview(vid);
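If you only need a single color frame rather than a live preview, a minimal sketch (assuming the vid object created above and a connected Kinect V2 device) grabs one with getsnapshot:

```matlab
% Grab a single RGB frame from the color sensor (sketch; assumes a
% connected Kinect V2 and the vid object created above).
img = getsnapshot(vid);   % 1080x1920x3 uint8 image
imshow(img);              % display the captured frame
```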
Create the videoinput object for the depth sensor. Note that a second object is created (vid2), and DeviceID 2 is used for the depth sensor.

vid2 = videoinput('kinect', 2);
Look at the device-specific properties on the source device, which is the depth sensor on the Kinect V2 camera.
src = getselectedsource(vid2);
src

Display Summary for Video Source Object:

    General Settings:
      Parent = [1x1 videoinput]
      Selected = on
      SourceName = Kinect V2 Depth Source
      Tag =
      Type = videosource

    Device Specific Properties:
      EnableBodyTracking = off
The output shows that the depth sensor has one device-specific property associated with body tracking. This property is specific to the depth sensor.
Device-Specific Property (Depth Sensor)    Description

EnableBodyTracking    Indicates the tracking state. When set to on, body metadata is returned. The default is off.

Collect body metadata by turning on body tracking, which is off by default.
src.EnableBodyTracking = 'on';
Start the second videoinput object (the depth stream).

start(vid2);
Access body tracking data as metadata on the depth stream using getdata. The function returns:

- Frames of size 512x424 in mono13 format and uint16 data type
- Time stamps
- Metadata
% Get the data on the object.
[frame, ts, metaData] = getdata(vid2);

% Look at the metadata to see the parameters in the body data.
metaData

metaData =

11x1 struct array with fields:
    IsBodyTracked: [1x6 logical]
    BodyTrackingID: [1x6 double]
    BodyIndexFrame: [424x512 double]
    ColorJointIndices: [25x2x6 double]
    DepthJointIndices: [25x2x6 double]
    HandLeftState: [1x6 double]
    HandRightState: [1x6 double]
    HandLeftConfidence: [1x6 double]
    HandRightConfidence: [1x6 double]
    JointTrackingStates: [25x6 double]
    JointPositions: [25x3x6 double]
These metadata fields are related to tracking the bodies.
MetaData               Description

IsBodyTracked          A 1 x 6 Boolean matrix of true/false values for the tracking of each of the six possible bodies. A 1 indicates the body is tracked, and a 0 indicates it is not. See the example below.

BodyTrackingID         A 1 x 6 double that represents the tracking IDs for the bodies.

ColorJointIndices      A 25 x 2 x 6 double matrix of x- and y-coordinates for 25 joints, in pixels relative to the color image, for the six possible bodies.

DepthJointIndices      A 25 x 2 x 6 double matrix of x- and y-coordinates for 25 joints, in pixels relative to the depth image, for the six possible bodies.

BodyIndexFrame         A 424 x 512 double that indicates which pixels belong to tracked bodies and which do not. Use this metadata to acquire segmentation data.

HandLeftState          A 1 x 6 double that identifies the possible hand states for the left hands of the bodies. Values: 0 (unknown), 1 (not tracked), 2 (open), 3 (closed), 4 (lasso).

HandRightState         A 1 x 6 double that identifies the possible hand states for the right hands of the bodies. Values: 0 (unknown), 1 (not tracked), 2 (open), 3 (closed), 4 (lasso).

HandLeftConfidence     A 1 x 6 double that identifies the tracking confidence for the left hands of the bodies. Values: 0 (low), 1 (high).

HandRightConfidence    A 1 x 6 double that identifies the tracking confidence for the right hands of the bodies. Values: 0 (low), 1 (high).

JointTrackingStates    A 25 x 6 double matrix that identifies the tracking states for the joints. Values: 0 (not tracked), 1 (inferred), 2 (tracked).

JointPositions         A 25 x 3 x 6 double matrix indicating the location of each joint in 3-D space. See the Joint Positions section for a list of the 25 joint positions.

Look at any individual property by drilling into the metadata. For example, look at the IsBodyTracked property.

metaData.IsBodyTracked

ans =

     1     0     0     0     0     0
In this case the data shows that of the six possible bodies, there is one body being tracked and it is in the first position. If you have multiple bodies, this property is useful to confirm which ones are being tracked.
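As an illustration, the numeric hand-state codes described in the table above can be translated into readable labels for the tracked body. This is a sketch, not part of the toolbox API, and it assumes the metaData structure returned by the getdata call above with at least one tracked body:

```matlab
% Map the numeric hand-state codes (0-4) to descriptive labels (sketch).
stateNames = {'unknown', 'not tracked', 'open', 'closed', 'lasso'};

bodyIdx  = find(metaData.IsBodyTracked, 1);     % index of first tracked body
leftCode = metaData.HandLeftState(bodyIdx);     % numeric code, 0 through 4
leftLabel = stateNames{leftCode + 1};           % +1 for 1-based indexing
fprintf('Left hand of body %d: %s\n', bodyIdx, leftLabel);
```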
Get the joint locations for the first body using the JointPositions property. Since this is the body in position 1, the index uses 1.

metaData.JointPositions(:,:,1)

ans =

   -0.1408   -0.3257    2.1674
   -0.1408   -0.2257    2.1674
   -0.1368   -0.0098    2.2594
   -0.1324    0.1963    2.3447
   -0.3024   -0.0058    2.2574
   -0.3622   -0.3361    2.1641
   -0.3843   -0.6279    1.9877
   -0.4043   -0.6779    1.9877
    0.0301   -0.0125    2.2603
    0.2364    0.2775    2.2117
    0.3775    0.5872    2.2022
    0.4075    0.6372    2.2022
   -0.2532   -0.4392    2.0742
   -0.1869   -0.8425    1.8432
   -0.1869   -1.2941    1.8432
   -0.1969   -1.3541    1.8432
   -0.0360   -0.4436    2.0771
    0.0382   -0.8350    1.8286
    0.1096   -1.2114    1.5896
    0.1196   -1.2514    1.5896
    0.2969    1.2541    1.2432
    0.1360    0.5436    1.1771
    0.1382    0.7350    1.5286
    0.2096    1.2114    1.3896
    0.0196    1.1514    1.4896
The columns represent the X, Y, and Z coordinates in meters of the 25 points on body 1.
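To relate the joint data back to the depth image, one possible sketch overlays the DepthJointIndices of the first body on the acquired depth frame. It assumes the frame and metaData variables returned by the getdata call above:

```matlab
% Overlay the 25 joints of body 1 on the depth frame (sketch; assumes
% frame and metaData from the earlier getdata call).
imshow(frame, []);                            % scale mono13 depth data for display
hold on;
joints = metaData.DepthJointIndices(:,:,1);   % 25x2 pixel coordinates for body 1
plot(joints(:,1), joints(:,2), 'r.', 'MarkerSize', 15);
hold off;
```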
Optionally view the segmentation data as an image using the BodyIndexFrame property.

% View the segmentation data as an image.
imagesc(metaData.BodyIndexFrame);

% Set the color map to jet to color code the people detected.
colormap(jet);
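When you are finished acquiring, it is good practice to stop the running acquisition and release both devices. A typical cleanup sketch, assuming the vid and vid2 objects created earlier:

```matlab
% Stop the depth stream and release both videoinput objects (sketch).
stop(vid2);
delete(vid);
delete(vid2);
clear vid vid2;
```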
Joint Positions
When the EnableBodyTracking property is set to on, body metadata is
collected. The following list shows the order of the 25 joints
returned by the Kinect V2 adaptor in the JointPositions property.
SpineBase = 1;
SpineMid = 2;
Neck = 3;
Head = 4;
ShoulderLeft = 5;
ElbowLeft = 6;
WristLeft = 7;
HandLeft = 8;
ShoulderRight = 9;
ElbowRight = 10;
WristRight = 11;
HandRight = 12;
HipLeft = 13;
KneeLeft = 14;
AnkleLeft = 15;
FootLeft = 16;
HipRight = 17;
KneeRight = 18;
AnkleRight = 19;
FootRight = 20;
SpineShoulder = 21;
HandTipLeft = 22;
ThumbLeft = 23;
HandTipRight = 24;
ThumbRight = 25;
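For example, using the joint ordering above, the 3-D position of the head joint (index 4) for the body in position 1 can be extracted like this. This is a sketch assuming the metaData structure from the earlier getdata call:

```matlab
% Extract the head position for body 1 using the joint ordering above (sketch).
Head = 4;                                        % index from the list above
headPos = metaData.JointPositions(Head, :, 1);   % [x y z] in meters
fprintf('Head position: x=%.3f, y=%.3f, z=%.3f m\n', ...
        headPos(1), headPos(2), headPos(3));
```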