Filtering Noisy GPS Altitude
I have GPS data from driving multiple times over a road. The data were sampled at 20 Hz, and the longitude and latitude data are very repeatable. For example, this is longitude vs. travelled distance for 10 passes over the road:

However, the altitude data contain spikes and jumps.

Here, I've forced the starting point through 0 for each pass. This shows how some of the jumps are persistent.

I've stored the data as a cell array of tables, where each table looks like this:

I have a couple of questions:
1) Is a cell array of tables the best way to store these data? Each table has the same columns, but the tables are of slightly different lengths. I could add a column for the lap number and store everything in a single table, but I'm not sure that would be better for analysis.
2) If the data only had spikes, it would be a simple matter of taking the median value, but the persistent jumps complicate things. So what is the best way to deal with those? I've tried low-pass filtering, but the jumps kept it from working. My next thought was to take the median at each distance. Is there a better way of doing this?
3) Since the data are time-sampled and the route is being driven slightly differently each time, corresponding points in space do not exactly correspond in time. So I think I really need to interpolate the data spatially and then select the best (median?) value within each spatial bin. Suggestions?
Accepted Answer
Star Strider
2025-3-26
Filtering the data might be your best option, unless you can determine the causes of the discontinuities and fix them.
There are several approaches to filtering the data. If the sampling times are regular, with a constant sampling interval, a discrete filter could be appropriate. For that, the first step would be to calculate the Fourier transform of the signal and then determine the passband from those results. (Ideally, the noise would be distinct from the low-frequency position data.)
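A minimal sketch of that first step, assuming a 20 Hz sampling rate (from the question) and a synthetic stand-in record named alt, since the actual data were not posted. It computes a one-sided amplitude spectrum so the low-frequency content can be separated visually from the noise floor:

```matlab
% Sketch: one-sided amplitude spectrum of a single altitude record.
% Assumptions: Fs = 20 Hz; 'alt' is synthetic stand-in data, not the poster's.
Fs  = 20;                          % sampling frequency (Hz)
t   = (0:1/Fs:60-1/Fs).';          % 60 s of uniformly sampled time
rng(0)                             % reproducible noise
alt = 5*sin(2*pi*0.05*t) + 0.2*randn(size(t));   % slow trend + broadband noise

L   = numel(alt);                  % record length (even here)
Y   = fft(alt - mean(alt));        % remove the mean so the DC bin does not dominate
P2  = abs(Y/L);                    % two-sided amplitude spectrum
P1  = P2(1:floor(L/2)+1);          % keep the non-negative frequencies
P1(2:end-1) = 2*P1(2:end-1);       % fold in the negative-frequency amplitude
f   = Fs*(0:floor(L/2)).'/L;       % frequency axis (Hz)

[~, iPeak] = max(P1);
peakFreq   = f(iPeak);             % dominant frequency of the record

figure
semilogx(f(2:end), P1(2:end))      % skip the DC bin on the log axis
xlabel('Frequency (Hz)')
ylabel('Amplitude')
```

A cutoff frequency can then be chosen just above the highest frequency at which the spectrum still shows signal rather than a flat noise floor.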
The Signal Processing Toolbox has several filtering functions. My choice for that would be:
signal_filt = lowpass(signal, cutoff_freq, sampling_freq, ImpulseResponse='iir');
If all your data have the same sampling times, you can filter several columns of it using one lowpass call. (A lowpass filter would appear to be appropriate here; however, several other options are available.)
Another approach would be to use a Savitzky-Golay filter (sgolayfilt). I usually use a third-degree polynomial and then adjust the ‘framelen’ value to produce the result I want.
It would help to have your data, or at least a representative sample of it.
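A minimal sketch of the Savitzky-Golay approach, assuming synthetic data and a frame length of 41 samples chosen only as a starting guess. sgolayfilt requires the Signal Processing Toolbox, and the frame length must be odd and larger than the polynomial order:

```matlab
% Sketch: Savitzky-Golay smoothing of a noisy record.
% Assumptions: synthetic data; order 3 as in the text; framelen = 41 is a guess to tune.
rng(1)
x     = linspace(0, 1, 500).';
clean = sin(2*pi*x);                       % known underlying curve
noisy = clean + 0.1*randn(size(x));        % add measurement noise

order    = 3;                              % polynomial degree
framelen = 41;                             % odd and > order; larger values smooth more
smoothed = sgolayfilt(noisy, order, framelen);

% Compare the error before and after smoothing
rmsBefore = sqrt(mean((noisy    - clean).^2));
rmsAfter  = sqrt(mean((smoothed - clean).^2));
```

Increasing framelen smooths more aggressively but starts to flatten genuine features, so it is worth comparing a few values against the raw record.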
3 comments
Star Strider
2025-3-27
Afterthought —
If you want to get a measure of the dispersion of those records, you can calculate a 95% confidence interval for them.
Assuming they are all column vectors, one approach would be —
distance = linspace(4000, 6000, 2000).'; % Create Data
altitude = sin(0.5*pi*(distance-max(distance)/2)/max(distance-4000)) .* (1:0.33:5) + distance*0.001 + randn(2000,13)*0.1 - 3.5; % Create Data
figure
plot(distance, altitude)
ylim([0 max(ylim)])
xlabel('Distance')
ylabel('Altitude')
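N = size(altitude,2); % Number of records (columns)
mean_alt = mean(altitude,2); % Mean across records at each distance
SEM = std(altitude,[],2)/sqrt(N); % Standard error of the mean
ti95 = tinv([0.025 0.975], N-1); % Two-sided 95% critical t-values (Statistics and Machine Learning Toolbox)
CI95 = mean_alt + SEM * ti95; % Lower and upper 95% confidence limits (one column each)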
figure
patch([distance; flip(distance)], [CI95(:,1); flip(CI95(:,2))], 'r', FaceAlpha=0.25, EdgeColor='none', DisplayName='95% CI')
hold on
plot(distance, mean_alt, 'r', DisplayName='\mu Altitude')
hold off
ylim([0 max(ylim)])
xlabel('Distance')
ylabel('Altitude')
legend(Location='best')
You need to use the mean for that calculation, because the standard error and the t-distribution confidence limits are defined for the mean rather than the median.
More Answers (1)
Image Analyst
2025-3-25
How big are the jumps relative to the "normal" small amount of noise you expect to see? Do you want to filter everything or just the jumps/spikes only? How long (how many samples) are the spikes along the x (Dist) axis? Just one element or multiple?
Regarding 1, a table is fine, but a regular double matrix might be more convenient, especially since all your data are numeric (no strings). Regarding 2, why do you want to smooth the data, and what happens if you don't smooth them? Regarding 3, if you want to take the median at common time points, then you will have to interpolate/resample each set of data to get them all on the same time axis.
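The resampling step could look like the following minimal sketch. It uses travelled distance rather than time as the common axis, since the question compares passes by distance. The column names Dist and Alt, the 1-unit grid spacing, and the synthetic passes are all assumptions made for illustration:

```matlab
% Sketch: resample every pass onto a shared distance grid, then take the
% median across passes at each grid point.
% Assumptions: 'passes' is a cell array of tables with hypothetical columns
% Dist and Alt; the data are synthetic, with one pass carrying a persistent jump.
rng(3)
nPass  = 5;
passes = cell(nPass, 1);
for k = 1:nPass
    n    = 900 + randi(200);               % passes differ slightly in length
    Dist = cumsum(0.5 + rand(n,1));        % strictly increasing distance
    Alt  = 0.01*Dist + 0.05*randn(n,1);    % gentle slope plus noise
    if k == 2
        Alt(Dist > 300) = Alt(Dist > 300) + 4;   % persistent jump in one pass
    end
    passes{k} = table(Dist, Alt);
end

% Common grid covering only the distance range that every pass contains
dStart = max(cellfun(@(T) T.Dist(1),   passes));
dEnd   = min(cellfun(@(T) T.Dist(end), passes));
dGrid  = (ceil(dStart):1:floor(dEnd)).';   % 1-unit spacing (assumed)

altGrid = nan(numel(dGrid), nPass);
for k = 1:nPass
    T = passes{k};
    altGrid(:,k) = interp1(T.Dist, T.Alt, dGrid, 'linear');
end

medAlt = median(altGrid, 2, 'omitnan');    % robust estimate at each distance
```

Once the passes share a grid they also fit in a plain matrix (altGrid here), and the median across columns ignores a jump as long as it affects fewer than half of the passes at that distance.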
If you have any more questions, then attach your data and code to read it in with the paperclip icon after you read this:
2 comments
Image Analyst
2025-3-27
I'd try a modified median filter. Basically you filter with a median window wide enough to get rid of the widest spikes/jumps. Then you compute the absolute difference between the median filtered signal and the original signal. Look where the jumps are to find out how much the original signal differs from the median filtered signal. Then threshold the difference to find the indexes of where the jumps are - this will create a mask. Then replace only those locations with the median filtered signal. Pseudocode:
outputSignal = originalSignal; % Make a copy
mfSignal = medfilt1(originalSignal, windowWidth); % Run it through a smoothing filter, such as a median filter.
diffSignal = abs(originalSignal - mfSignal); % See how much it differs from the original to find out where the spikes live.
mask = diffSignal > someThreshold; % Get a mask of the regions with big differences.
% Replace original data with median filtered data, but only within the mask regions:
outputSignal(mask) = mfSignal(mask);
The advantage of this is that, unlike a plain median filter, a Savitzky-Golay filter, or a low-pass filter, the "fixing" of the bad spikes occurs ONLY at the spikes. The rest of the signal is not changed at all, which it would be if you used a large enough window in those filters to smooth out the whole spike.
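A runnable sketch of the masked replacement on synthetic data. It swaps in movmedian (base MATLAB) for medfilt1 because movmedian shrinks its window at the ends of the record instead of zero-padding, which avoids false detections at the edges. The window width and threshold are assumed values to tune for real data:

```matlab
% Sketch: replace only the spike/jump samples with median-filtered values.
% Assumptions: synthetic data; movmedian is used in place of medfilt1;
% windowWidth and someThreshold are illustrative values.
rng(4)
n     = 1000;
x     = (1:n).';
clean = 10 + 0.005*x;                                    % underlying altitude
originalSignal = clean + 0.02*randn(n,1);                % small normal noise
originalSignal(200)     = originalSignal(200) + 5;       % single-sample spike
originalSignal(600:620) = originalSignal(600:620) + 3;   % short jump, 21 samples

windowWidth   = 61;    % odd, and more than twice the longest jump
someThreshold = 0.5;   % well above the noise, below the jump height

mfSignal   = movmedian(originalSignal, windowWidth);     % median-filtered copy
diffSignal = abs(originalSignal - mfSignal);             % deviation from the median
mask       = diffSignal > someThreshold;                 % true only at spikes/jumps

outputSignal       = originalSignal;                     % start from the raw data
outputSignal(mask) = mfSignal(mask);                     % patch only the masked samples
```

A median window of N samples only ignores runs shorter than about N/2, so a jump that persists for most of a pass will not be caught this way and is probably better handled by comparing passes against each other.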