Filtering Noisy GPS Altitude

I have GPS data from driving multiple times over a road. The data were sampled at 20 Hz, and the longitude and latitude data are very repeatable. For example, this is longitude vs. travelled distance for 10 passes over the road:
However, the altitude data contain spikes and jumps.
Here, I've forced the starting point through 0 for each pass. This shows how some of the jumps are persistent.
I've stored the data as a cell array of tables, where each table looks like this:
I have a couple of questions:
1) Is a cell array of tables the best way to store these data? Each table has the same columns, but the tables are of slightly different lengths. I could add a column for the lap number and store everything in a single table, but I'm not sure that would be better for analysis.
2) If the data only had spikes, it would be a simple matter of taking the median value, but the persistent jumps complicate things. So what is the best way to deal with those? I've tried low-pass filtering, but the jumps defeated it. My next thought was to take the median at each distance. Is there a better way of doing this?
3) Since the data are time-sampled and the route is driven slightly differently each time, corresponding points in space do not correspond exactly in time. So I think I really need to interpolate the data spatially and then select the best (median?) value within each spatial bin. Suggestions?

Accepted Answer

Star Strider 2025-3-26
Filtering the data might be your best option, unless you can determine the causes of the discontinuities and fix that.
There are several approaches to filtering the data. If the sampling times are regular, with a constant sampling interval, a discrete filter could be appropriate. For that, the first step would be to calculate the Fourier transform of the signal and then determine the passband from those results. (Ideally, the noise would be distinct from the low-frequency position data.)
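As a rough illustration of that first step, here is a minimal sketch of a single-sided amplitude spectrum. The 20 Hz rate comes from the question; the test signal, its 0.05 Hz component, and all variable names are made up for the example and are not the asker's data.

```matlab
% Hypothetical example signal: a slow altitude trend plus broadband noise
rng(0)                             % reproducible noise
Fs = 20;                           % sampling frequency (Hz), as stated in the question
t  = (0:1999).' / Fs;              % 100 s of regularly sampled time
signal = 5*sin(2*pi*0.05*t) + 0.1*randn(size(t));

L  = numel(signal);
Y  = fft(signal - mean(signal));   % remove the mean so the DC bin does not dominate
P2 = abs(Y / L);                   % two-sided amplitude spectrum
P1 = P2(1:floor(L/2)+1);           % keep the non-negative frequencies
P1(2:end-1) = 2*P1(2:end-1);       % fold in the negative-frequency half
f  = Fs * (0:floor(L/2)).' / L;    % frequency axis (Hz)

[~, idx] = max(P1);
peakFreq = f(idx);                 % dominant frequency of the example signal

figure
semilogy(f, P1)
xlabel('Frequency (Hz)')
ylabel('Amplitude')
```

On real data, the frequency where the spectrum flattens into the noise floor is a reasonable first guess for the lowpass cutoff.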
The Signal Processing Toolbox has several filtering functions. My choice for that would be:
signal_filt = lowpass(signal, cutoff_freq, sampling_freq, ImpulseResponse='iir');
If all your data have the same sampling times, you can filter several columns of it using one lowpass call. (A lowpass filter would appear to be appropriate here, although several other options are available.)
Another approach would be to use a Savitzky-Golay filter (sgolayfilt). I usually use a 3-degree polynomial and then adjust the ‘framelen’ value to produce the result I want.
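A minimal sketch of the Savitzky-Golay approach, assuming the Signal Processing Toolbox is installed. The test signal and the frame length of 41 are assumptions for the example; the frame length must be odd and larger than the polynomial order, and it needs tuning on real data.

```matlab
% Hypothetical noisy signal sampled at 20 Hz
rng(0)                                   % reproducible noise
Fs    = 20;
t     = (0:999).' / Fs;
clean = 2*sin(2*pi*0.1*t);               % made-up "true" altitude
noisy = clean + 0.2*randn(size(t));

order    = 3;                            % cubic polynomial, as suggested above
framelen = 41;                           % odd frame length; increase for more smoothing
smoothed = sgolayfilt(noisy, order, framelen);

figure
plot(t, noisy, t, smoothed)
xlabel('Time (s)')
ylabel('Altitude')
legend('Noisy', 'Savitzky-Golay')
```

A longer frame smooths more but also flattens genuine short features, so it is worth trying a few values and comparing against the raw trace.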
It would help to have your data, or at least a representative sample of it.
  3 Comments
Star Strider 2025-3-26
As always, my pleasure!
I’m glad that approach worked!
Star Strider 2025-3-27
Afterthought —
If you want to get a measure of the dispersion of those records, you can calculate a 95% confidence interval for them.
Assuming they are all column vectors, one approach would be —
distance = linspace(4000, 6000, 2000).'; % Create Data
altitude = sin(0.5*pi*(distance-max(distance)/2)/max(distance-4000)) .* (1:0.33:5) + distance*0.001 + randn(2000,13)*0.1 - 3.5; % Create Data
figure
plot(distance, altitude)
ylim([0 max(ylim)])
xlabel('Distance')
ylabel('Altitude')
N = size(altitude,2);
mean_alt = mean(altitude,2);
SEM = std(altitude,[],2)/sqrt(N);
ti95 = tinv([0.025 0.975], N-1);
CI95 = mean_alt + SEM * ti95;
figure
patch([distance; flip(distance)], [CI95(:,1); flip(CI95(:,2))], 'r', FaceAlpha=0.25, EdgeColor='none', DisplayName='95% CI')
hold on
plot(distance, mean_alt, 'r', DisplayName='\mu Altitude')
hold off
ylim([0 max(ylim)])
xlabel('Distance')
ylabel('Altitude')
legend(Location='best')
This calculation has to use the mean rather than the median, because the standard error and the t-based confidence interval are defined for the mean.


More Answers (1)

Image Analyst 2025-3-25
How big are the jumps relative to the "normal" small amount of noise you expect to see? Do you want to filter everything or just the jumps/spikes only? How long (how many samples) are the spikes along the x (Dist) axis? Just one element or multiple?
Regarding 1, a table is fine but a regular double matrix might be more convenient, especially since all your data are numeric (no strings). Regarding 2, why do you want to smooth the data and what happens if you don't smooth them? Regarding 3, if you want to take the median at common time points, then you will have to interpolate/resample each set of data to get them all on the same time axis.
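Since the passes in the question are compared against travelled distance, the same resampling idea can be sketched on a common distance grid instead of a time axis. This is only an illustration: the column names Dist and Alt, the 1-unit grid spacing, and the synthetic passes (including the artificial jump in pass 2) are assumptions, not the asker's data.

```matlab
% Build hypothetical passes as a cell array of tables of different lengths
rng(0)
nPasses = 5;
passes  = cell(nPasses, 1);
for k = 1:nPasses
    n    = 900 + 20*k;                        % each pass has a different length
    Dist = cumsum(0.5 + rand(n, 1));          % strictly increasing travelled distance
    Alt  = 0.01*Dist + 0.05*randn(n, 1);      % gentle slope plus noise
    if k == 2
        Alt(300:400) = Alt(300:400) + 3;      % persistent jump in one pass
    end
    passes{k} = table(Dist, Alt);
end

% Common grid restricted to the distance range covered by every pass
lo = max(cellfun(@(T) T.Dist(1),   passes));
hi = min(cellfun(@(T) T.Dist(end), passes));
distGrid = (ceil(lo):1:floor(hi)).';

% Interpolate each pass onto the grid, then take the median across passes
altOnGrid = nan(numel(distGrid), nPasses);
for k = 1:nPasses
    T = passes{k};
    altOnGrid(:, k) = interp1(T.Dist, T.Alt, distGrid, 'linear');
end
medAlt = median(altOnGrid, 2, 'omitnan');
```

Because the jump affects only one of the five passes at any given distance, the per-distance median ignores it. That stops working if more than half of the passes jump at the same location.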
If you have any more questions, then attach your data and code to read it in with the paperclip icon after you read this:
  2 Comments
Jim McIntyre 2025-3-26
The jumps vary in length and shape. Some are single points, others are sustained, and in a couple of cases (the blue and yellow traces below) there's a jump followed by a decay. They are small relative to the absolute value of the data, but large relative to the differential values and to what I want to do with them. Here is a 3D view, showing the vertical scale of 10 runs, forcing each run to start from 0.
1) My purpose in using the array of tables is to be able to refer to each column by name. Wouldn't I lose that with a double matrix?
2) What I'm trying to do is build a table of slope vs. position (or traveled distance). So, even small jumps defeat that.
3) It's really about taking a median vertical value vs. a representative grid of X-Y positions. So, I guess I could try to interpolate the data from the repeated runs to the positions recorded in a single run.
Attaching the data really isn't an option. The data you see in the plot is composed of approximately 75,000 rows.
Image Analyst 2025-3-27
I'd try a modified median filter. Basically you filter with a median window wide enough to get rid of the widest spikes/jumps. Then you compute the absolute difference between the median filtered signal and the original signal. Look where the jumps are to find out how much the original signal differs from the median filtered signal. Then threshold the difference to find the indexes of where the jumps are - this will create a mask. Then replace only those locations with the median filtered signal. Pseudocode:
outputSignal = originalSignal; % Start from a copy of the original signal.
mfSignal = medfilt1(originalSignal, windowWidth); % Median filter; windowWidth should be wider than the widest spike/jump.
diffSignal = abs(originalSignal - mfSignal); % See how much the original differs from the filtered signal to find out where the spikes live.
mask = diffSignal > someThreshold; % Logical mask of the regions with big differences.
% Replace original data with median filtered data, but only within the mask regions:
outputSignal(mask) = mfSignal(mask);
The advantage of this is that, unlike a plain median filter, a Savitzky-Golay filter, or a low-pass filter, the "fixing" of the bad spikes occurs ONLY at the spikes. The rest of the signal is not changed at all, which it would be if you used a window large enough in those filters to smooth out the whole spike.


Release

R2024a
