Convert Ground Truth Labeling Data for Object Re-Identification
This example shows how to convert a groundTruth object to the re-identification training data format.
Overview
Re-identification (ReID) plays a vital role in visual object tracking because it lets a tracker recover the identity of an object after temporary occlusion, or after the object leaves and re-enters the camera's field of view. Both situations complicate consistent tracking in real-world scenarios. To train a ReID network created using the reidentificationNetwork object, you must process the ground truth data so that the training data consists only of the people within the ground truth bounding boxes. These cropped images must be labeled consistently for each object. In this example, you convert a fully labeled ground truth video to the required ReID training format.
Load Ground Truth Labeling Data
To convert ground truth data into a format usable for training a ReID network, ensure that the groundTruth object has the required format. The ground truth for each object must have a rectangular region of interest (ROI) and a numeric attribute for the object ID. To learn how to label data for object tracking and generate the ground truth data, see the Automate Ground Truth Labeling for Object Tracking and Re-Identification example. In this example, the ROI is labeled as Person.
Download the video containing the ground truth data, and load the groundTruth object.
helperDownloadLabelVideo();
Downloading Pedestrian Tracking Video (90 MB)
load("groundTruth.mat","gTruth");
Convert Ground Truth for Object Re-Identification
Once the data has been fully labeled and exported from a labeler, use gatherLabelData to obtain all ROI data and the writeFrames groundTruth method to export all video frames. Then, create imageDatastore and boxLabelDatastore objects from the extracted data and combine them using combine.
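To illustrate what the combined datastore returns on each read, here is a minimal sketch that uses synthetic data rather than the example video. The folder name reidCombineDemo, the blank frames, and the box coordinates are all hypothetical, and the label table is a simplified stand-in for the output of gatherLabelData, which also carries attribute data in this example. The sketch assumes Computer Vision Toolbox is installed.

```matlab
% Write two blank frames to a temporary folder (hypothetical stand-ins
% for frames exported by writeFrames).
demoFolder = fullfile(tempdir,"reidCombineDemo");
if ~isfolder(demoFolder)
    mkdir(demoFolder)
end
files = strings(2,1);
for k = 1:2
    files(k) = fullfile(demoFolder,sprintf("frame%d.png",k));
    imwrite(zeros(120,160,3,"uint8"),files(k));
end

% One row per frame; each cell holds M-by-4 boxes in [x y width height] format.
Person = {[10 20 30 60; 50 20 30 60]; [15 25 30 60]};
boxTable = table(Person);

demoImds = imageDatastore(files);
demoBlds = boxLabelDatastore(boxTable);
demoCds = combine(demoImds,demoBlds);

% Each read returns a 1-by-3 cell array: {image, boxes, labels}.
data = read(demoCds);
frameImage = data{1,1};
frameBoxes = data{1,2};
disp(size(frameBoxes,1) + " boxes in the first frame")
```

The helper functions in this example rely on the same layout: data{1,1} holds the frame and data{1,2} holds its bounding boxes.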
Process the ground truth for training and store the input images for the network to use. Use the helperCropImagesWithGroundtruth helper function to crop all the labeled objects out of the video frames using the groundTruth object. The function also resizes the cropped images to 256-by-128 pixels and organizes them into individual folders, one per object ID, under the root directory trainingDataFolder.
trainingDataFolder = fullfile("trainingData");
imageFrameWriteLoc = fullfile("videoFrames");
dataSize = [256 128];
if ~isfolder(trainingDataFolder)
    helperCropImagesWithGroundtruth(gTruth,trainingDataFolder,imageFrameWriteLoc,dataSize);
end
Write images extracted for training to folder:
videoFrames
Writing 150 images extracted from PedestrianLabelingVideo.avi...Completed.
Cleaning up videoFrames directory.
Done.
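The crop-and-resize step applied to each bounding box can be illustrated in isolation. This is a minimal sketch on a synthetic frame, not the helper function itself: the frame contents and the box coordinates are hypothetical, and the sketch assumes that imcrop and imresize are available (Image Processing Toolbox).

```matlab
% Synthetic 480-by-640 RGB frame standing in for a video frame (assumption).
frame = zeros(480,640,3,"uint8");
frame(101:300,201:280,:) = 255;   % bright block standing in for a person

% Hypothetical bounding box in [x y width height] format.
bbox = [201 101 79 199];

% Crop the box, then resize the crop to the 256-by-128 training size.
targetSize = [256 128];
cropped = imcrop(frame,bbox);
resized = imresize(cropped,targetSize);

disp(size(cropped))
disp(size(resized))
```

Because imresize is given an explicit [rows cols] size, the crop is stretched to the target shape and its original aspect ratio is not preserved.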
Load the cropped and organized training images into an ImageDatastore object. To use all the data in trainingDataFolder, specify the IncludeSubfolders name-value argument as true. To use the corresponding folder names as the training data labels, specify the LabelSource name-value argument as "foldernames".
imds = imageDatastore(trainingDataFolder,IncludeSubfolders=true,LabelSource="foldernames");
Display a set of image frames from the training data using the montage function.
rng(0)
previewImages = cell(2,4);
for i = 1:4
    previewIdx = randi(numel(imds.Files));
    previewImages{1,i} = readimage(imds,previewIdx);
    previewImages{2,i} = imds.Labels(previewIdx);
end
montage(previewImages(1,:),Size=[1 4],ThumbnailSize=dataSize)

Display the labels for each image from left to right.
strcat("ID = ",string(previewImages(2,:)))
ans = 1×4 string
    "ID = 7"    "ID = 8"    "ID = 1"    "ID = 8"
To verify the accuracy of the labels, inspect the images stored in the corresponding ID folder in trainingDataFolder.
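One way to carry out this check programmatically is to count the images stored for each ID and then pull out every image of a single ID for visual inspection. The sketch below builds a tiny synthetic folder tree so that it runs on its own. The folder name reidLabelCheckDemo, the IDs "1" and "2", and the random images are hypothetical. To check the real data, apply the same countEachLabel and subset calls to the imds datastore created earlier.

```matlab
% Build a small stand-in for trainingDataFolder with two hypothetical IDs.
demoFolder = fullfile(tempdir,"reidLabelCheckDemo");
if isfolder(demoFolder)
    rmdir(demoFolder,"s")
end
ids = ["1" "2"];
imagesPerID = [3 2];
for k = 1:numel(ids)
    idFolder = fullfile(demoFolder,ids(k));
    mkdir(idFolder)
    for n = 1:imagesPerID(k)
        img = randi([0 255],[256 128 3],"uint8");
        imwrite(img,fullfile(idFolder,sprintf("%d_%02d.jpg",k,n)));
    end
end

demoImds = imageDatastore(demoFolder,IncludeSubfolders=true,LabelSource="foldernames");

% Number of cropped images stored for each identity.
labelCounts = countEachLabel(demoImds);
disp(labelCounts)

% Select every image of one identity for visual inspection.
idToCheck = "1";
oneID = subset(demoImds,demoImds.Labels == idToCheck);
% montage(oneID)   % uncomment to view the selected crops side by side
```

An ID with very few images, or a montage that shows more than one person, suggests a labeling error worth correcting before training.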
Next Steps
After you convert the ground truth labeling data to the required format, you can use it to train a ReID network with the trainReidentificationNetwork function. To learn how to configure, train, and evaluate a ReID network, see the Reidentify People Throughout a Video Sequence Using ReID Network example.
Supporting Functions
helperDownloadLabelVideo
Download the pedestrian labeling video.
function helperDownloadLabelVideo
videoURL = "https://ssd.mathworks.com/supportfiles/vision/data/PedestrianLabelingVideo.avi";
% Download the video only if it is not already available locally.
if ~exist("PedestrianLabelingVideo.avi","file")
    disp("Downloading Pedestrian Tracking Video (90 MB)")
    websave("PedestrianLabelingVideo.avi",videoURL);
end
end
helperCropImagesWithGroundtruth
Crop all source images in the ground truth data gTruth using its bounding box labels. Store the cropped images in organized subdirectories in dataFolder.
function helperCropImagesWithGroundtruth(gTruth,dataFolder,imageFrameWriteLoc,dataSize)
% Use gatherLabelData to obtain the label data from groundtruth.
[labelData,timestamps] = gatherLabelData(gTruth,labelType.Rectangle,SampleFactor=1);

% Write all of the video frames from the groundTruth data source.
imgFileNames = writeFrames(gTruth,imageFrameWriteLoc,timestamps);
imds = imageDatastore(imgFileNames{:});
blds = boxLabelDatastore(labelData{:});
combinedTrainingDs = combine(imds,blds);
labelData = timetable2table(gTruth.LabelData);
writeall(combinedTrainingDs,imageFrameWriteLoc,WriteFcn=@(data,info,format)helperWriteCroppedData(data,info,format,labelData,dataFolder,dataSize))

% Remove the video frame images.
fprintf(1,"\nCleaning up %s directory.\n",imageFrameWriteLoc);
rmdir(imageFrameWriteLoc,"s")
fprintf(1,"\nDone.\n");
end
helperWriteCroppedData
Crop, resize, and store image ROIs from a combined datastore.
function helperWriteCroppedData(data,info,~,labelData,dataFolder,dataSize)
num = 1;
imageIdx = info.ReadInfo{1,2}.CurrentIndex;
frame = num2str(imageIdx);
imageLabelData = struct2table(labelData{imageIdx,2}{:});
attributeIDs = imageLabelData{:,2};
for i = 1:size(data{1,2},1)
    personID = string(attributeIDs(i));
    personIDFolder = fullfile(dataFolder,personID);
    if ~isfolder(personIDFolder)
        mkdir(personIDFolder)
    end
    imgPath = fullfile(personIDFolder,strcat(frame,"_",num2str(num,'%02.f'),".jpg"));
    roi = data{1,2}(i,:);
    croppedImage = imcrop(data{1,1},roi);
    if ~isempty(croppedImage)
        resizedImg = imresize(croppedImage,dataSize);
        imwrite(resizedImg,imgPath);
        num = num + 1;
    end
end
end