How to Track Multiple Objects at Once | Understanding Sensor Fusion and Tracking, Part 5
From the series: Understanding Sensor Fusion and Tracking
Brian Douglas
This video describes two common problems that arise when tracking multiple objects: data association and track maintenance. We cover a few ways to solve these issues and provide a general way to approach all multi-object tracking problems.
We cover data association algorithms like global nearest neighbor (GNN) and joint probabilistic data association (JPDA) and look at the criteria for deleting and creating tracks. We talk about gating observations so that we don’t waste computational resources. At the end of the video, we show an example of GNN and JPDA algorithms operating on two objects in close proximity.
Published: 25 Sep 2019
In this video, we’re going to take what we’ve learned about tracking a single object and expand it to tracking multiple objects all at once. At first glance, it doesn’t seem like this problem is that much harder than tracking a single object. For example, can’t we just take the tracking algorithm like the IMM from the last video and apply one to each object, and be done? Well, as you might expect based on the way I asked that question, the answer is no, not really. At least, there are some additional things we need to consider with multi-object tracking and that’s what we’re going to talk about in this video. So I hope you stick around for it. I’m Brian, and welcome to a MATLAB Tech Talk.
Now, I am going to highlight a general outline so that you get an idea of how these problems can be solved, but by no means do I want you to think that there is only way or one algorithm to get the job done. Every problem is different, because they have a different number of objects, or you have access to different measured data and information, or different expectations on how close objects will be to each other, and you’ll have different computational resources at your disposal, or different requirements on how accurate you need to be, and so on. In a way, figuring out how to tackle your particular problem is more of an art than a hard science. So the thing I hope you take away from this video is to understand the types of things you should be aware of when selecting or developing your own algorithms for multi-object tracking and not necessarily how to solve your exact problem. Okay, enough of this disclaimer. Let’s get to it.
To begin, let’s talk about what makes multi-object tracking difficult, and then we’ll circle back around and talk about how we can approach solving these issues.
All of the problems we’re going to have stem from our uncertainty. This is uncertainty in the observations or detections of the objects and uncertainty in our predictions of the paths that the objects are taking. Remember from the last video that an estimation filter like a Kalman filter works by blending an uncertain measurement with an uncertain prediction. And we looked at tracking an airplane with a radar station. We predicted where the airplane would be in the future and we corrected it with a noisy radar measurement. But instead of a single object, we have lots of objects, each with their own uncertain prediction that we need to correct with its corresponding uncertain measurement.
So already, we come to our first problem. We don’t want to correct a prediction of one object using the measurement of another. But if there’s no identifying information that comes with the detection, like an airplane tail number or some other unique signature, how do we know which object we’ve just detected? For example, if all of the airplanes we are tracking have similar radar cross sections, and this is all of the information we have access to, how can we match an arbitrary ping with the appropriate tracked object?
This isn’t too much of a problem if all of the objects are sparsely distributed and the observations are relatively reliable. It would be a simple matter to claim an observation is of the object that it’s closest to. In this case, we would just assign the measurement to the nearest object and run the estimation filter for it like we would when tracking a single object.
The tricky part, however, comes when objects are close enough to each other, or our uncertainty is great enough that a measurement could be of more than one object. Now we have some figuring out to do. This is the data association problem. We have to associate the detections with the right objects.
Okay, so that’s one problem. Another thing we need to consider is that the number of objects being tracked is not fixed. Sometimes tracks need to be created or removed based on what we observe. We may add a new track when an airplane flies into the radar range, and similarly we may delete a track when one flies out. Now this creation and deletion doesn’t just happen along the edge of the FOV of the sensors. We may have to create and delete tracks anywhere. For example, an airplane may land or take off within the radar range. Or, as another example, if a self-driving car is tracking pedestrians as it drives down a busy street, it has to be aware of the new people and track their walking direction as they come into view in front of the car and it no longer has to care about the pedestrians that it has already passed.
So we need to think about the criteria for creating and deleting object tracks. And a basic way to approach this is to add a track whenever there is a detection that doesn’t match an existing object and to delete a track if an existing object is not detected. The thing that complicates this is that sometimes sensors have false positive measurements. They detect something that isn’t actually there. And sometimes, sensors fail a few times in a row to detect an object that is actually there. So we need to be careful here so that we aren’t creating tracks prematurely, which clutters our view of what’s actually there, and we aren’t deleting tracks prematurely, which decreases the effectiveness of tracking in the first place. This is the track maintenance problem.
So that’s what we’re going to cover for the rest of this video. When tracking multiple objects, what are some ways that we can approach data association, and what some ways we can address track maintenance?
And this brings us back to the flow chart that I briefly showed at the beginning of this video. And while your particular approach might not result in a functional structure that looks exactly like this, it will provide a convenient way for me to talk about each of the sections. So let’s walk through it.
And start at observations. An observation of an object may contain measured quantities like range, range rate, or line of sight. These are quantities that represent the kinematic nature of the object. But observations could also contain measured attributes like target type, id number, and object shape. This is what I was hinting at earlier when I said you might get the tail number of an aircraft or some other identifying information with an observation. Getting a tail number is pretty unambiguous for data association, but other types of attributes might still require some interpretation. For example, you might collect micro-doppler radar fluctuations and from that be able to determine the type of aircraft you’re tracking. And the uncertainty in the observation depends on the degree of separation between the two objects. A helicopter might look completely different from a jet aircraft because of the large rotating propellers, whereas a bird and a quadcopter might have similar doppler signatures.
Other things to consider with observations is that if the tracked object is a point target, then the observation would contain at most one detection. So we have to associate one detection with one object. But if the target object is large and the sensor has sufficient resolution, like with LiDAR, for instance, there may be more than one detection per target and we need to consider this when determining how we’re going to handle associating this data. Also, if the resolution of the sensor is low, then it’s possible that two objects may exist within a single detection. In this case, both objects have been observed, so we don’t want to stop tracking either of them but they show up as only a single detection. Our track deletion algorithm will have to handle these situations.
Going forward in this video, we’re only going to be talking about the case where we expect one detection for each tracked object, and where this is handled is in the assignment step, which is next.
Assignment is the process of matching an observation to a tracked object, or if you’re into the whole brevity thing, matching it to a track. Possibly the simplest assignment algorithm to think about is the GNN, the global nearest neighbor. This simply assigns a track to the nearest observation. But the interesting thing here is that it’s not necessarily the nearest Euclidean or geometric distance, but the nearest probabilistic distance like you get with the Mahalanobis distance. Here’s why.
With a probability distribution, like we have with both our predictions and our measurements, the lowest Euclidean distance doesn’t always indicate that a prediction and a measurement are the best fit. This is because we have more confidence in our predictions and measurements in the directions with lower standard deviations. For example, in this simple image, we have a prediction for the location of two different objects and a single detection that lies between them. If we used Euclidean distance, we’d assume that the detection is of object 2 since it’s closer. But if we look at the probability distributions of the two predictions, then we can see that it’s more probable that the detection is of object 1. This is what the Mahalanobis distance does for us. It’s the distance normalized by the standard deviation.
Now, GNN works well for sparsely distributed objects, but for cluttered objects another assignment algorithm like the JPDA or joint probabilistic data association algorithm will be better. The JPDA doesn’t make a hard assignment between one observation and one track. Instead, it makes a weighted combination all of the neighboring observations, with closer observations being weighted higher than further ones. This is an improvement over GNN because if there are two observations that could be the object, the JPDA won’t fully commit to one, possibly the wrong one. So if the tracked objects are clustered near each other, and the observations are all clustered near them too, this algorithm can handle that by blending a few of them together rather than jumping around between wrong and right detections.
There are many more assignment algorithms, and you can create your own based on your tracking situation. But the bottom line is that you are trying to figure out how to associate an observation with a track.
Not all observations get assigned, and not all tracks have observations. This is where track maintenance comes in in the form of deleting and creating tracks. But as I said before, we have to be careful so we don’t do anything prematurely. Let’s start with one way to delete tracks in a conservative way. Rather than saying an object is gone as soon as we miss a single observation, we could delete tracks only a track has not been assigned to a detection at least P times during the last R updates. In this case, P and R are parameters that you can tune to your situation. So you might say delete a track if it hasn’t been detected at least four times in the last six updates.
Now, creating a track is a little trickier because you don’t know if a single unassigned observation is a new object right away, but you still have to pay attention to it so you can figure out over time if it’s worth tracking. One way to handle this is to create a tentative track—one that you maintain and pretend like it’s a real object but you don’t actually believe is a real object. You just maintain that track in the background. Once the tentative track has been detected M times in the last N updates, then you move the track to confirmed, which means you think it’s a real object. You can remove a tentative track with the same logic as removing a confirmed track. So in this way, you may have dozens of tentative tracks that you are maintaining due to false positive measurements, but are deleted before they even become confirmed.
With the tracks created and removed and with the observations assigned, we can run a set of estimation filters. This part is identical to single object tracking where we had choices like the interacting multiple model filter or the single model Kalman filter. The predicted state of each tracked object that is assigned an observation (both tentative and confirmed objects) get updated with their respective observation, and then the whole process starts anew. We get more observations, they are assigned to tracks, tracks are created and deleted, and the filters run again.
However, it would be computationally foolish to look at every single observation and consider how likely it is to be assigned to every single track. Therefore, we may choose to ignore observations outside of a defined region for each track. This is called gating and it’s a screening mechanism that determines which detections are valid candidates to look at for assignment and which should be flat out ignored. For example, with JPDA, an observation that is a far distance away from the tracked object would statistically contribute very little to the overall solution, so why spend the computational resources to even calculate this minuscule amount? Especially if you’re tracking dozens or hundreds of objects, this could be extremely wasteful. By ignoring observations outside of a specific region, outside of this gate, we can speed up the assignment process.
And so gating impacts the assignment algorithms so that they only consider the observations that are worth looking at. All right, this is where I am going to leave this video, but before I end it, I want to quickly show you an example in MATLAB that shows the results of two different multi-object tracking algorithms.
In this example, there are two objects that are moving independent of each other and we’re observing them through a tracking radar. The black triangles are the true positions of the objects and the circles are what is detected with the radar station. Notice that when the two objects get close to each other, the detections overlap and it becomes difficult to tell where the object are in this region of ambiguity. So this is the data that we have to work with, so let’s try the algorithms and see how they do.
For the first one, I’m using global nearest neighbor for data association and an IMM for the estimation filter since these objects are maneuvering. Notice that when the objects are far from each other, it’s obvious which objects and detections go together and the GNN algorithm does a pretty good job. Object 1 is assigned to track 1, and object 2 is assigned to track 2. This is what GNN is good at, matching data with tracked objects when they are sparse. But when the objects are close to each other, it doesn’t do so well. You can see that one track was deleted and a new track was added a few detections later, so the algorithm got confused as to how many objects there were at one point. In fact, it jumped from track number 2 to track number 8, which means there were five other tentative tracks that it was maintaining before confirming the sixth one as track 8.
Let’s upgrade our assignment algorithm to the joint probabilistic data association algorithm and see if this fares much better.
Well, it’s interesting. The two tracks are maintained through the whole maneuver, but in the region of ambiguity there is still a lot more error than anywhere else. But this is to be expected since the data overlaps in this region so much. It’s amazing that these algorithms can pull anything out from a situation like this. Now, there is additional cost in complexity with JPDA over GNN, so it’s worth assessing what your requirements are rather than just implementing the most accurate and complex algorithms. If the objects always were far away from each other, then I’d prefer the simple GNN approach since it works fine in those situations and is much easier to interpret and implement.
So that’s the overall gist of the types of things that we need to think about and the kinds of problems we have to solve when tracking multiple objects at once. I want to stress again that there’s no one way to accomplish this. It’s completely dependent on each unique situation. But hopefully with this general overview you’ll be in a better position to understand how to tackle your next multi-object tracking problem.
All right, if you don’t want to miss the next Tech Talk video, don’t forget to subscribe to this channel. Also, if you want to check out my channel, Control System Lectures, I cover more control theory topics there as well. Thanks for watching. I’ll see you next time.