How to traverse table row based on sets

2 次查看(过去 30 天)
Let's say that I have a table in an excel spreadsheet that looks like this (based off of real GPS data, but for the sake of this question, all of these are just random numbers ):
|Set number| PRN | X-coordinate | y-coordinate | z-coordinate ...
| 1 | 17 | 24715 | -1470517 | 223695 |
| 1 | 17 | 24715 | -1470517 | 223695 |
...
| 2 | 17 | 24715 | -1470517 | 223695 |
| 2 | 17 | 24715 | -1470517 | 223695 |
So I know that I can read in the data using the build in MATLAB command:
gpsdata = readtable('spreadsheet.xlsx')
I also know I can loop through the table by row using the command:
for row = 1:height(gpsdata)
However, more specifically, I want to loop through all the table based off the set numbers in each row. One snag is that the number of set numbers varies. Sometimes there will be 8 rows of set 1, other times there will be 9 rows of set 2, etc.
My specific aim is to do specific operations on each of the sets. Looking at the table, normally there is a different PRN for each satellite. That means each measurement is a different satellite. The set is just a way to acknowledge that it is measuring the same thing. For example, the set 1s will have about 8 different satellites all measuring whatever is at set 1. However, each of the satellites will have a different measurement. My goal is to take all the measurements and use iterative least squares to estimate the exact measurement. However, that involves getting measurements from each of the different satellites within each row of the set.
Is there a way that I could loop through based off each of the set numbers to make sure that the ones that I am using are the same for each set? That is, what is the best way to operate on all the rows with set data 1s, then operate on all the data with set 2s, etc.?
While I could do each set manually, I feel as though there is a better, more programmatic way to approach this.
Thanks for your time and help in advanced! I really appreciate it!
  15 个评论
billyjthorton
billyjthorton 2018-10-8
编辑:billyjthorton 2018-10-8
I apologize, I made a mistake with the braces on the third command. If you look closely between the first and second command, you'll noticed that it has been shifted by one to avoid a row. I provided the exact data I'm dealing with. Look at the original question as well as a post above and you'll see an attached spreadsheet with the data. I shifted the data I was looking at by a row to avoid "Set Number", which, while not in the code, is in the data set. As for the third command, I thought I was clear on what it was doing. It was designed to run in the above dataset, without any of the strings (just grabbing the block of data that I was going to use anyways), but it's more or less just a modified version of the first two and no success on that front either. I firmly believe I should be able to do this without trying to remove the strings and just select the data that I want to use as a sub-matrix.
I apologize, I don't know how much more clear to be? I have written a more detailed explanation of exactly what I want to do. Alas, I will try again:
I would like to capture the grouping of all the data under set "1", then do a capture of all the data under set "2" and so on and so forth. Once I have other a sub array or indices, I can then take the data and perform operations. Particularly, go row by row and extract the GPS data to perform geolocation using an ILS method. Does that help answer what you are looking for?
Is there any way you'd be willing to open up the data I sent and be able to provide more specific help? I'm really trying to learn, but I'm really struggling using this data set.
jonas
jonas 2018-10-8
I'll write a more elaborate response in a bit. For now, just two comments. First:
grpstats(gpsdata(2:108,1:17), gpsdata.x1)
Wrong syntax. If you insert a table, then the grouping variable arg should be a string describing the variable name. If you want the column gpsdata.x1 as your grouping variable, then the correct syntax is:
grpstats(gpsdata(:,1:17), 'x1')
This should work as long as each of the 17 first columns contain numerical data only, and one variable is named x1.
Second, I assumed you had imported the table correctly but realized now that this is not the case. Follow Guillaumes advice. If it still does not work, then use the 'range' argument of readtable to skip the first line of the sheet. As for now, all your variables are imported as strings, as the first row in your table is a string. Note that it doesn't matter that you write:
gpsdata(2:108,1:17)
each column is still interpreted as strings.

请先登录,再进行评论。

回答(2 个)

Guillaume
Guillaume 2018-10-8
It's unclear what problem you are actually facing with Jonas' examples. As Jonas says, you can easily use any of the grouping functions that work with tables. Not having the stats toolbox, I use varfun and rowfun. Here is a rowfun example to works with your attached excel file. I use it to calculate the mean 3D distance (from origin) of the satellites in each group. Adapt as required:
%function to use with rowfun. Defined as anonymous function here. Could be an m file instead
%takes 3 column vector inputs, x,y and z coordinates. Return a scalar
func = @(x, y, z) mean(sqrt(x.^2 + y.^2 + z.^2));
%import file
opts = detectImportOptions('GPS_data.xlsx'); %works better at figuring out that the header is on row 2
satdata = readtable('GPS_data.xlsx', opts);
%apply function to each set of satellites
rowfun(func, satdata, 'GroupingVariables', 'SetNumber', 'InputVariables', {'X_coordinate', 'Y_coordinate', 'Z_coordinate'}, 'OutputVariableNames', 'distance')
  8 个评论
billyjthorton
billyjthorton 2018-10-9
编辑:billyjthorton 2018-10-9
Yes, that is correct. What is the alternative to indexed variable names in this case? Leverage each as a function of the loop variable? What would be the loop variable in this case?
Guillaume
Guillaume 2018-10-9
I'm sorry but Ew! One rule that you need to put into practice immediately: numbered variables are always the wrong approach. sv1 should be sv(1), sv{1} or sv(1, :) or something similar. And every time you're repeating more or less the same lines of code, you're doing it wrong. Computers are very good at repeating things, let them do the work.
Your script, written in proper matlab:
% Given measurements
x_true = [1132049, -4903445, 3905453, 85000]; % meters
x_init = [0, 0, 0, 0];
% SV locations
sv = [15764733, -1592675, 21244655
6057534, -17186958, 19396689
4436748, -25771174, 1546041
-9701586, -19687467, 15359118
23617496, -11899369, 1492340
14540070, -12201965, 18352632];
% x_old values
num_it = 5;
x_old = zeros(num_it, 4);
% sigma
sd=5;
% Generate Measurements
rho = sqrt(sum((sv - x_true(1:3)) .^ 2, 2)) + x_true(4) + sd*randn(1);
% Estimation
for i = 1:num_it
rho_hat = sqrt(sum((sv - x_init(1:3)) .^ 2, 2)) + x_init(4);
% partial derivatives
h = [-(sv - x_init(1:3)) ./ (rho_hat - x_init(4)), ones(size(sv, 1), 1)];
% x_init - x_hat
dy = rho - rho_hat;
%x_hat = x_init + inv(H.'*inv(R)*H)*H.'*inv(R)*(y_1 - h)
dx = inv(h'*inv(sd^2)*h)*h'*inv(sd^2)*dy;
x_init = x_init + dx.';
x_old(i, :) = x_init;
end
Also note the warnings in the editor. I'm fairly certain you shouldn't be using inv but that's outside of my domain of expertise.
What I'm missing is what that script has to do with the table. What are the inputs and outputs of that scripts and how do they match the table?

请先登录,再进行评论。


jonas
jonas 2018-10-8
编辑:jonas 2018-10-8
Method 1 - grpstats
For the sake of simplicity, let's use only the first 5 columns in this example.
%%Import data
opts = detectImportOptions('GPS_data.xlsx');
T = readtable('GPS_data.xlsx', opts);
T = T(:,1:5);
T =
109×5 table
SetNumber PRN X_coordinate Y_coordinate Z_coordinate
_________ ___ ____________ ____________ ____________
1 17 2.4716e+06 -1.4705e+07 2.237e+07
1 6 -5.2026e+06 -2.5099e+07 6.9419e+06
...
Continuing...
%%Split and apply custom function
grpstats(T,'SetNumber',@func);
function out=func(x)
out=mean(x);
end
You can write your own function, just remember that the input, x, is always a column vector containing one group and a single variable. This is of course limiting, because you cannot perform complex operations on multiple variables in your function, or at least I do not know how. For this, it is better to use another method.
Method 2 - findgroups/splitapply
Read the data just like before, but transform the table into an array.
A=table2array(T);
Now, split the dataset in groups and perform some operation with all 4 variables (excluding Set Number) at once.
%%First column is grouping set
G = findgroups(A(:,1));
out1 = splitapply(@func,A(:,2:end),G);
function out1=func(vars)
vars
%%perform some calculations here
end
You can see that I have written out the variables passed to the custom function. What is nice here is that you can process all four variables in the same function. You can pass multiple outputs back to the main script if you so desire, however each output should be scalar.
vars =
1.0e+07 *
0.0000 0.2501 -1.4700 2.2370
0.0000 -0.5196 -2.5091 0.6975
...
I've never used rowfun personally but I suppose it's very similar to the second method.
  8 个评论
billyjthorton
billyjthorton 2018-10-9
Ah, that makes sense. Thank you for your help and patience! It follow at least from the error code, but the issue seems to be in the function itself. I followed your code earlier:
function out1=func(vars)
xcoord = vars(:,2)
ycoord = vars(:,3)
zcoord = vars(:,4)
out1 = xcoord + ycoord + zcoord
end
With still the same scalar error. But isn't out1 a scalar in this case already? Additionally, right now it only provides a loop through the first group. Theoretically, should it be looping through all of them? I believe this scalar error is getting in the way, but I was trying to make sure what to expect.
jonas
jonas 2018-10-9
编辑:jonas 2018-10-9
Ugh, thats not scalar. Scalar is a single element. That is a matrix. Poor example from my end.

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Data Import from MATLAB 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by