Implementing KLT face tracking on live webcam

Hi all,
To work with a live webcam stream, I have managed to initiate the webcam input as an object and can run the detection and tracking OK-ish :) I'll have to work through the rest, which I should be able to do (I don't like to ask for help unless I really have to).
The one bit I do need help with: to perform further processing on each detected ROI, I need to crop the frame to the ROI, re-orientate it, and possibly scale it. The processing itself will be relative changes between frames of the mean intensities in each channel, and then some jiggery-pokery :) Again, all of that I can work through and cope with.
The demo code uses a geometric transform to display a bounding box as a polygon around the detected section of the image, which is what I require.
I am just really banging my head against a wall with the reverse transform and crop.
Any help or pointers will be appreciated.

Answer (1)

It's not clear what you need help with. You said you needed help with processing but then said you could work through it, presumably by yourself.
I know what cropping is, and to crop, use imcrop(), or simple indexing. But I don't know what a "reverse transform" means to you. What sort of reverse transforming do you want to do before the crop operation?

7 Comments

Thanks for the quick reply.
Sorry it wasn't clear what I require:
1. I am using the KLT method for face detection and tracking, but with a webcam source rather than a video file, and I can solve most problems myself; that's the best way for me to learn.
2. The example code transforms the bounding box into a polygon: imagine you rotate your head, and the box around your face (the ROI) also rotates, tracking your movement.
3. This will provide me with the required data: mean of the G channel, R channel, B channel, etc.
4. I cannot crop the image frame to the ROI no matter what I try, and I'm getting frustrated.
5. I tried imcrop with the x/y coordinates from the bounding box, but it's not an axis-aligned rectangle, so it doesn't work.
6. I just don't know where to start.
If you have a look at the example code from the link in my first post, you can see the polygon and the transformation. If anyone can suggest starting points or methods they think may work, that would be great.
Some ideas I've had but don't know how to implement: mask to the polygon, crop the frame to the mask after splitting into the RGB channels so I have binary images? Then extract the angle from the transform function used and rotate back to horizontal using imrotate(image, angle).
I've read all I can find on the web, two weeks' worth of browsing and trying, with no luck. Sorry if it's still vague, but I don't know how else to explain it.
I'm annoyed with myself that I can't figure this out, so any help will be very much appreciated.
Dan
I haven't seen the demo. If the polygon is some arbitrary polygon, not a simple rectangle, then you need to use poly2mask. You can still crop to the bounding box if you want, though. Just find the top and bottom lines (y values) and the left and right columns (x values):
mask = poly2mask(xCoordinates, yCoordinates, rows, columns);
% Mask the image.
maskedRgbImage = bsxfun(@times, rgbImage, cast(mask, class(rgbImage)));
% Crop to the bounding box of the polygon.
topLine = min(yCoordinates);
bottomLine = max(yCoordinates);
leftColumn = min(xCoordinates);
rightColumn = max(xCoordinates);
width = (rightColumn - leftColumn + 1); % Width in pixels.
height = (bottomLine - topLine + 1); % Height in pixels.
croppedImage = imcrop(maskedRgbImage, [leftColumn, topLine, width, height]);
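To connect this to the tracked polygon in the KLT demo (a sketch; it assumes bboxPolygon is the 1-by-8 vector [x1 y1 x2 y2 x3 y3 x4 y4] that the demo draws, and videoFrame is the current frame):

```matlab
% Split the interleaved polygon vector into x and y coordinate lists.
xCoordinates = bboxPolygon(1:2:end);
yCoordinates = bboxPolygon(2:2:end);
% poly2mask needs the image size in rows and columns.
[rows, columns, ~] = size(videoFrame);
mask = poly2mask(xCoordinates, yCoordinates, rows, columns);
```

The mask and crop steps above can then be applied to videoFrame unchanged.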
You are a star :)
I will give this a go later today and see what happens.
Thank you again.
Sorry for the delay, I haven't had a chance this week to check it out; I will let you know ASAP.
Thanks again, and merry Christmas!
Apologies for the delay.
I tried your code and can understand what the crop does, but I'm getting errors on the video player output ("Cannot resize input 1"):
clc;        % Clear the command window.
clear;      % Delete all variables.
close all;  % Close all figure windows except those created by imtool.
workspace;  % Bring the workspace panel to the front.
% Create a cascade detector object. This utilises the Computer Vision Toolbox.
faceDetector = vision.CascadeObjectDetector();
% Read a video frame and run the face detector. TODO: find a method of polling
% the next live frame from a webcam. bbox is the bounding box, which can be
% used as the rectangle for imcrop.
videoFileReader = vision.VideoFileReader('tilted_face.avi');
videoFrame = step(videoFileReader);
bbox = step(faceDetector, videoFrame);
% Reduce the area of the ROI to just the eyes.
eyeDetector = vision.CascadeObjectDetector('EyePairSmall');
faceImage = imcrop(videoFrame, bbox);
eyebbox = step(eyeDetector, faceImage);
% Make the eye box relative to the frame, not to the original bbox.
eyebbox(1, 1:2) = eyebbox(1, 1:2) + bbox(1, 1:2);
% Images are referenced from the top left, positive across and down.
x = eyebbox(1); y = eyebbox(2); w = eyebbox(3); h = eyebbox(4);
bboxPolygon = [x, y, x+w, y, x+w, y+2*h, x, y+2*h];
% Set the original eyebbox to the new bboxPolygon dimensions so the corner
% detection covers the same region.
eyebbox = [x-1, y-1, eyebbox(3), 2*eyebbox(4)];
% Draw the returned bounding box around the detected face. All the displays
% can be commented out, but leave them in for testing.
shapeInserter = vision.ShapeInserter('Shape', 'Polygons', 'BorderColor', 'Custom', ...
    'CustomBorderColor', [255 255 0]);
videoFrame = step(shapeInserter, videoFrame, bboxPolygon);
figure; imshow(imcrop(videoFrame, eyebbox)); title('Detected Eyes');
% Crop out the region of the image containing the face, and detect the
% feature points inside it. Take this into the algorithms; split the bbox ROI
% into RGB: rframe = ???(:,:,1); gframe = ???(:,:,2); bframe = ???(:,:,3);
cornerDetector = vision.CornerDetector('Method', ...
    'Minimum eigenvalue (Shi & Tomasi)');
points = step(cornerDetector, rgb2gray(imcrop(videoFrame, eyebbox))); % bbox changed to eyebbox
% The coordinates of the feature points are with respect to the cropped
% region. They need to be translated back into the original image
% coordinate system.
points = double(points);
points(:, 1) = points(:, 1) + double(eyebbox(1));
points(:, 2) = points(:, 2) + double(eyebbox(2));
% Display the detected points.
markerInserter = vision.MarkerInserter('Shape', 'Plus', ...
    'BorderColor', 'White');
videoFrame = step(markerInserter, videoFrame, points);
%figure, imshow(videoFrame), title('Detected features');
% Create a point tracker and enable the bidirectional error constraint to
% make it more robust in the presence of noise and clutter.
pointTracker = vision.PointTracker('MaxBidirectionalError', 6);
% Initialize the tracker with the initial point locations and the initial
% video frame.
initialize(pointTracker, double(points), rgb2gray(videoFrame));
videoInfo = info(videoFileReader);
videoPlayer1 = vision.VideoPlayer('Position', ...
    [100 100 videoInfo.VideoSize(1:2)+30]);
geometricTransformEstimator = vision.GeometricTransformEstimator( ...
    'PixelDistanceThreshold', 4, 'Transform', 'Nonreflective similarity');
% Make a copy of the points to be used for computing the geometric
% transformation between the points in the previous and the current frames.
oldPoints = double(points);
while ~isDone(videoFileReader)
    % Get the next frame.
    videoFrame = step(videoFileReader);
    % Track the points. Note that some points may be lost.
    [points, isFound] = step(pointTracker, rgb2gray(videoFrame));
    visiblePoints = points(isFound, :);
    oldInliers = oldPoints(isFound, :);
    if ~isempty(visiblePoints)
        % Estimate the geometric transformation between the old points
        % and the new points.
        [xform, geometricInlierIdx] = step(geometricTransformEstimator, ...
            double(oldInliers), double(visiblePoints));
        % Eliminate outliers.
        visiblePoints = visiblePoints(geometricInlierIdx, :);
        oldInliers = oldInliers(geometricInlierIdx, :);
        % Apply the transformation to the bounding box.
        boxPoints = [reshape(bboxPolygon, 2, 4)', ones(4, 1)];
        boxPoints = boxPoints * xform;
        bboxPolygon = reshape(boxPoints', 1, numel(boxPoints));
        % Insert a bounding box around the object being tracked.
        videoFrame = step(shapeInserter, videoFrame, bboxPolygon);
        tempX = bboxPolygon(1:2:end); % Extract the x values.
        tempY = bboxPolygon(2:2:end); % Extract the y values.
        rightColumn = max(tempX);
        leftColumn = min(tempX);
        bottomLine = max(tempY);
        topLine = min(tempY);
        width = (rightColumn - leftColumn + 1);  % Width in pixels.
        height = (bottomLine - topLine + 1);     % Height in pixels.
        % Display the tracked points.
        videoFrame = step(markerInserter, videoFrame, visiblePoints);
        % imcrop expects the rectangle as [xmin, ymin, width, height].
        videoFrame = imcrop(videoFrame, [leftColumn, topLine, width, height]);
        %figure, imshow(videoFrame), title('Detected features');
        % Reset the points.
        oldPoints = visiblePoints;
        setPoints(pointTracker, oldPoints);
    end
    % Display the annotated video frame using the video player object.
    % NB: the cropped frame changes size every iteration, which is the likely
    % cause of the player's "cannot resize input 1" error.
    step(videoPlayer1, videoFrame);
end
% Clean up.
release(videoFileReader);
release(videoPlayer1);
release(geometricTransformEstimator);
release(pointTracker);
So I am detecting eyes and drawing a bounding box around the eyes and nose; this is the area I am interested in. I need to crop each frame (of the input video / live webcam) to this box for debugging the next level of my code.
Since I am computing the geometric transformation between the points tracked from one frame to the next, is it possible to use that data and do an inverse transform on the bounding-box region of interest only? This is then sent to a video player object.
Sorry for being such a newbie. What I have to do with the data after this is the main part of the work I am doing, and I can understand that part; it is just this tracking and cropping that is causing me the headache.
Thanks for your patience; any assistance is really appreciated. Dan
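One way to use that transform data, sketched below (my assumptions: xform is the 3-by-2 nonreflective-similarity matrix applied with the row-vector convention [x y 1] * xform, as elsewhere in the script; cropRect is the [xmin ymin width height] rectangle already computed in the loop; the sign of the angle may need flipping because image y runs downwards):

```matlab
% Recover the in-plane rotation of this frame from the similarity matrix.
% Note xform is frame-to-frame, so accumulate the angle across frames if
% you need the total rotation relative to the first frame.
angleDegrees = atan2d(xform(1, 2), xform(1, 1));

% Crop to the axis-aligned bounding box of the rotated polygon first...
croppedRoi = imcrop(videoFrame, cropRect);
% ...then rotate it back to horizontal ('crop' keeps the output size fixed).
levelRoi = imrotate(croppedRoi, angleDegrees, 'bilinear', 'crop');

% Per-channel means for the relative-change processing:
meanR = mean2(levelRoi(:, :, 1));
meanG = mean2(levelRoi(:, :, 2));
meanB = mean2(levelRoi(:, :, 3));
```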
Hello Daniel, I was trying your code and it works fine. What I am stuck with: instead of the video player, I want to display the 'videoFrame' from
videoFrame = imcrop(videoFrame, [leftColumn, topLine, width, height]);
in a montage, but all the examples I came across only show how to build a montage from a set of images, not from video frames. Help me if you can.
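One way (a sketch, not tested against this thread's code): montage also accepts a 4-D H-by-W-by-3-by-N array, but every slice must be the same size, so the crops are resized to a fixed targetSize (a hypothetical value) before stacking:

```matlab
targetSize = [120 320];  % hypothetical fixed size; the crop size varies per frame
frameStack = [];         % grows into an H-by-W-by-3-by-N array

% ...inside the while loop, after the imcrop of videoFrame:
thisCrop = imresize(videoFrame, targetSize); % uniform size so montage accepts it
frameStack = cat(4, frameStack, thisCrop);

% ...after the loop, instead of (or as well as) the video player:
figure;
montage(frameStack);
title('Cropped eye region across all frames');
```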

