How to search for channel name and numerical data in resulting struct after importing multiple data files?

Question

Scooby921 2019-4-4


Edited: Scooby921 2019-5-13

Questions:

1.) In avoiding using eval to dynamically name variables, how do I search a resulting struct of data and labels to link a channel name to a data column and then analyze multiple channels all having the same name?

2.) How do I properly write an if or switch case statement to deal with importing a single or multiple data files when the resulting workspace object is either a character array for a single file or a cell array for multiple files?

Background:

Currently using Matlab R2014b. I'm trying to write a script to select which data files, import / load the data, and place the data into a matrix or array or struct or whatever is most useful and appropriate for signal analysis, processing, and plotting afterward.

My data files are an export from a data acquisition tool (ATI VISION). The generated .mat file creates one cell array and one matrix. The cell array contains the text names of the data channels. This is n x 1 in size, where n is the number of channels exported. The matrix contains the numerical data and is m x n in size, where n is again the number of exported data channels and m is the number of samples.

The cell array of names has nearly zero consistency in the organization of the names (not alphabetical, not any order representing a channel number in the recording tool). The only consistency is that the nth row in the cell array shows no name "[]" but is always the "time" channel, and this always corresponds to column 1 in the data matrix. I've attached two mat files for reference. You can see in one file the first three rows are 'AngleSlipPoint2', 'AngleSlipPoint1', and 'AngleSlip', but in the other file the first three rows are 'PosLon', 'FRSpeed', and 'AccActPos'. I know my script can't be as easy as always accessing column 2 for x data and column 6 for y data. I need to search the cell array to link a name to a column in each individual data file.

In the two weeks that I've now been teaching myself how to write scripts and analyze data with Matlab I've apparently learned to do things the ill-advised way. The first obstacle I tried to address was aligning the data name in the cell array with the appropriate column in the data matrix. Because the cell array appears to be an array of text and not characters or strings I could find no other way to pull out the names, link them to data, and generate a variable than to use the frequently unrecommended eval function.

%% Extract Data
uiload
NumVars = numel(Data_Labels); % Establish number of variables to be created.
Time = Data(:,1); % Time values are always the first column of the Data matrix, so it's easy to define and create.
for k = 1:NumVars-1 % Time already created and exists as final channel, so we only need to generate variables for the remaining n-1 variables.
    eval([Data_Labels{k},'=[Data(:,k+1)]']); % Extract variable names and populate with data in workspace.
end

This worked for a single data file as it gets me a workspace full of variables and correctly populates them with the numerical data. I can integrate, derive, filter, and plot whatever I want. It fails miserably as soon as I attempt to load a second file as the next import will overwrite everything created from the first file. Hard to compare longitudinal acceleration in 2wd and 4wd when the newest import overwrites the old, and it's understandably stupid to write the script to append a 1/2/3 to the end of the name so I can have multiple instances in the workspace.

This is what I've come up with for importing multiple files. Still using the two attached files as my test files for writing the script.

%% Select files for 2WD analysis
[selected2wdFiles,pathName2wd] = uigetfile('*.mat','Select 2WD data files for analysis','MultiSelect','on');
if isequal(selected2wdFiles, 0)
   disp('No Files Selected')
   return;
end
for m = 1:length(selected2wdFiles)
  data2wd(m) = load(fullfile(pathName2wd, selected2wdFiles{m}));
end
%% Select files for 4WD analysis
[selected4wdFiles,pathName4wd] = uigetfile('*.mat','Select 4WD data files for analysis','MultiSelect','on');
if isequal(selected4wdFiles, 0)
   disp('No Files Selected')
   return;
end
for n = 1:length(selected4wdFiles)
  data4wd(n) = load(fullfile(pathName4wd, selected4wdFiles{n}));
end

This generates two structs, data2wd and data4wd, which contain the loaded cell arrays and data matrices. Unfortunately this script only works if I am selecting multiple files. If I only select one file it fails because the resulting item is a character array instead of a cell array. I haven't tried to script around that, but I suppose a switch case or if statement should work. Question #2 above...any suggestions?

The next step / steps is where I am lost. I believe I have avoided dynamically named variables, but I don't know how to go about extracting my longitudinal acceleration data from each data set. The specific channel name in the cell array of text is going to be 'AccelForward'. I know I need to search the cell array in row 1, column 2 of the struct to find the row number containing that name. This will tell me which column to access in the matrix stored in row 1, column 1 of the struct. Because it is a cell array of text the strfind command doesn't work. They aren't strings. Similarly they aren't characters either, so the related char commands don't work. Without using eval to extract things, how do I go about searching an array of text?

Once I can find the name, identify the data column, and then locate the actual data, how do I manipulate it without falling back on dynamically named workspace variables? I feel like I'm going to end up pulling these columns of data back into the workspace as AccelForward_1, AccelForward_2, etc., and then it gets more complicated and dynamic because I will have 2wd and 4wd data being compared and plotted against each other. What's the correct way to identify the data, manipulate the data, store the new data, and then access it later for plotting? Do I just keep generating more structs or arrays or matrices to stuff the data into and avoid a ridiculous workspace full of variables?

Now that I'm done writing a novel I suppose I simply don't know what I don't know and it makes it difficult to search and find answers. If anyone can put some labels on the forks in the road and send me in a useful direction I'd appreciate it. Thank you.

1 Comment

Stephen23 2019-4-5

"The only consistency is that the nth row in the cell array shows no name "[]" but is always the "time" channel, and this always corresponds to column 1 in the data matrix."

Ouch!


Answer 1

Stephen23 2019-4-5


Edited: Stephen23 2019-4-5

You are right to avoid dynamically accessing variable names (e.g. using eval, assignin, evalin, and load without an output variable). Read this to know some of the reasons why:

https://www.mathworks.com/matlabcentral/answers/304528-tutorial-why-variables-should-not-be-named-dynamically-eval

Here is one simple solution for your task, using a non-scalar structure and dynamic fieldnames:

https://www.mathworks.com/help/matlab/matlab_prog/access-multiple-elements-of-a-nonscalar-struct-array.html

https://www.mathworks.com/help/matlab/matlab_prog/generate-field-names-from-variables.html

Using structure fields makes the order of the columns in the numeric matrix totally irrelevant.

[F,P] = uigetfile('*.mat','2WD','MultiSelect','on');
if isnumeric(F)
    error('User quit')
elseif ischar(F)
    F = {F};
end
S = struct('filename',F);
for ii = 1:numel(F)
    T = load(fullfile(P,F{ii}));
    L = [{'Time'};T.Data_Labels(1:end-1)]; % fix "Time" column mismatch
    for jj = 1:numel(L)
        S(ii).(L{jj}) = T.Data(:,jj);
    end
end

The imported data is very easy to access in the structure: you only need to refer to the indices (corresponding to each file) and the fieldnames (corresponding to each data column), e.g.:

>> S(1).filename
ans =
MKZ_2WD_LevelSnowAccel.mat
>> S(1).AccelForward([1:4,end-4:end])
ans =
        -0.18
        -0.18
        -0.18
        -0.18
        ... lots of lines
        -1.92
        -1.72
         -1.2
        -2.24
         -4.1
>> S(1).Time([1:4,end-4:end])
ans =
      -5.1505
      -5.1405
      -5.1305
      -5.1205
      ... lots of lines
        23.01
        23.02
        23.03
        23.04
        23.05
>> S(2).filename
ans =
MKZ_4WD_LevelSnowAccel.mat
>> S(2).AccelForward([1:4,end-4:end])
ans =
        -0.07
        -0.07
        -0.07
        -0.07
        ... lots of lines
         0.23
         0.19
          0.3
         0.33
         0.27
>> S(2).Time([1:4,end-4:end])
ans =
      -5.3711
      -5.3611
      -5.3511
      -5.3411
      ... lots of lines
       24.069
       24.079
       24.089
       24.099
       24.109
 

You could also do something similar with tables, timetables, or by rearranging the columns of the numeric array to have the same order.
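
As a rough illustration of the last option (rearranging the columns into a common order), here is a minimal sketch using ismember. The channel names, the "wanted" list, and the numeric values are made up for the example; only the layout (time in column 1, names offset by one) is taken from the question.

```matlab
% Hypothetical label list from one file, with the unnamed time row already dropped
labels = {'PosLon'; 'FRSpeed'; 'AccelForward'};
% Synthetic data: column 1 = time, columns 2..4 follow the order of "labels"
data = [(0:4)', 10*ones(5,1), 20*ones(5,1), 30*ones(5,1)];

% Assumed common channel order that every file should be rearranged into
wanted = {'AccelForward'; 'FRSpeed'};

% idx(k) is the row of "labels" that holds wanted{k}, or 0 if it is missing
[found, idx] = ismember(wanted, labels);
assert(all(found), 'Some requested channels are missing from this file');

% Keep time first, then the requested channels (name row k is data column k+1)
common = data(:, [1; idx + 1]);
```

After this, column 2 of common is always AccelForward and column 3 is always FRSpeed, whatever order the file used.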

2 Comments

Scooby921 2019-4-5

Thank you very much. I'll play around with this today!

Scooby921 2019-5-13

Edited: Scooby921 2019-5-13

As a follow-up a month later...thank you again! Worked with it a bit and learned a good deal more about working with structs. Wound up extending the script to include calling data from these initially generated structs, deriving acceleration from velocity, appending a lost data point, creating and applying filters, and loading everything back into a new struct of filtered data.

Just in case anyone looks up this question / answer and wants to see my end-result. Added notes at the end for colleagues who might use this script and may not fully understand what I've done.

%% Initialize
close all
clear
clc
%% Select and load 2wd data files into struct
[F2,P2] = uigetfile('*.mat','Select 2WD Data Files','MultiSelect','on');
if isnumeric(F2)
    error('User quit')
elseif ischar(F2)
    F2 = {F2};
end
D2 = struct('filename',F2);
for ii = 1:numel(F2)
    Tmp2 = load(fullfile(P2,F2{ii}));
    L2 = [{'Time'};Tmp2.Data_Labels(1:end-1)]; % fix "Time" column mismatch
    for jj = 1:numel(L2)
        D2(ii).(L2{jj}) = Tmp2.Data(:,jj);
    end
end
clearvars ii jj Tmp2 L2
%% Select and load 4wd data files into struct
[F4,P4] = uigetfile('*.mat','Select 4WD Data Files','MultiSelect','on');
if isnumeric(F4)
    error('User quit')
elseif ischar(F4)
    F4 = {F4};
end
D4 = struct('filename',F4);
for kk = 1:numel(F4)
    Tmp4 = load(fullfile(P4,F4{kk}));
    L4 = [{'Time'};Tmp4.Data_Labels(1:end-1)]; % fix "Time" column mismatch
    for nn = 1:numel(L4)
        D4(kk).(L4{nn}) = Tmp4.Data(:,nn);
    end
end
clearvars kk nn Tmp4 L4
%% Define inertial sensor filter
AccelFilt = designfilt('lowpassiir', 'PassbandFrequency', 5, 'StopbandFrequency', 25, 'PassbandRipple', 1, 'StopbandAttenuation', 40, 'SampleRate', 100, 'MatchExactly', 'passband');
%% Derive wheel accelerations from wheel speeds
WhlAcc2 = struct('filename',F2,'FLAcc',zeros,'FRAcc',zeros,'RLAcc',zeros,'RRAcc',zeros);
for qq = 1:numel(F2)
    WhlAcc2(qq).FLAcc = [diff(D2(qq).FLSpeed);0];
    WhlAcc2(qq).FRAcc = [diff(D2(qq).FRSpeed);0];
    WhlAcc2(qq).RLAcc = [diff(D2(qq).RLSpeed);0];
    WhlAcc2(qq).RRAcc = [diff(D2(qq).RRSpeed);0];
end
WhlAcc4 = struct('filename',F4,'FLAcc',zeros,'FRAcc',zeros,'RLAcc',zeros,'RRAcc',zeros);
for rr = 1:numel(F4)
    WhlAcc4(rr).FLAcc = [diff(D4(rr).FLSpeed);0];
    WhlAcc4(rr).FRAcc = [diff(D4(rr).FRSpeed);0];
    WhlAcc4(rr).RLAcc = [diff(D4(rr).RLSpeed);0];
    WhlAcc4(rr).RRAcc = [diff(D4(rr).RRSpeed);0];
end
clearvars qq rr
%% Filter data and load into new struct
D2f = struct('filename',F2,'AccelxF',zeros,'AccelyF',zeros,'YawRateF',zeros,'FLAccF',zeros,'FRAccF',zeros,'RLAccF',zeros,'RRAccF',zeros);
for nn = 1:numel(F2)
    D2f(nn).AccelxF = filtfilt(AccelFilt,D2(nn).AccelForward);
    D2f(nn).AccelyF = filtfilt(AccelFilt,D2(nn).AccelLateralCorr);
    D2f(nn).YawRateF = filtfilt(AccelFilt,D2(nn).AngRateZCorr);
    D2f(nn).FLAccF = filtfilt(AccelFilt,WhlAcc2(nn).FLAcc);
    D2f(nn).FRAccF = filtfilt(AccelFilt,WhlAcc2(nn).FRAcc);
    D2f(nn).RLAccF = filtfilt(AccelFilt,WhlAcc2(nn).RLAcc);
    D2f(nn).RRAccF = filtfilt(AccelFilt,WhlAcc2(nn).RRAcc);
end
D4f = struct('filename',F4,'AccelxF',zeros,'AccelyF',zeros,'YawRateF',zeros,'FLAccF',zeros,'FRAccF',zeros,'RLAccF',zeros,'RRAccF',zeros);
for pp = 1:numel(F4)
    D4f(pp).AccelxF = filtfilt(AccelFilt,D4(pp).AccelForward);
    D4f(pp).AccelyF = filtfilt(AccelFilt,D4(pp).AccelLateralCorr);
    D4f(pp).YawRateF = filtfilt(AccelFilt,D4(pp).AngRateZCorr);
    D4f(pp).FLAccF = filtfilt(AccelFilt,WhlAcc4(pp).FLAcc);
    D4f(pp).FRAccF = filtfilt(AccelFilt,WhlAcc4(pp).FRAcc);
    D4f(pp).RLAccF = filtfilt(AccelFilt,WhlAcc4(pp).RLAcc);
    D4f(pp).RRAccF = filtfilt(AccelFilt,WhlAcc4(pp).RRAcc);
end
clearvars nn pp
%% Note
% At this point all 2wd data is loaded into a struct named "D2" 
% and all 4wd data is loaded into a struct named "D4".
% Filtered 2wd data for accelerations is loaded into "D2f"
% and filtered 4wd data for accelerations is loaded into "D4f".
% All file names are loaded into the first column of those structs.
% To confirm which data file is in each row use the following syntax:
% D2(r).filename    where 'r' is the row in question
% D4(r).filename    where 'r' is the row in question
% Numerical data can be accessed by calling the struct, row, and name 
% of desired data.
% Example: Call steering wheel angle for first 2wd data file -->
% D2(1).SteeringWheelAngle
% Example: Call front right wheel speed for fifth 4wd data file -->
% D4(5).FRSpeed
% Example: Call filtered long accel for third 2wd data file -->
% D2f(3).AccelxF
% Plotting will use the same syntax for calling a variable -->
% plot(D2(2).Time,D2(2).SteeringWheelAngle)
%
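
For anyone reusing the access pattern in the notes above, here is a small sketch of how the same syntax extends to every file at once. The struct below is a synthetic stand-in with invented file names and values, not the real D2f; only the field names 'filename' and 'AccelxF' are taken from the script.

```matlab
% Synthetic stand-in for a filtered-data struct array (two files)
D2f = struct('filename', {'a.mat', 'b.mat'}, ...
             'AccelxF',  {[0.1; 0.5; 0.3], [0.2; 0.4; 0.9]});

% One value per file: peak filtered longitudinal acceleration
peak2 = arrayfun(@(s) max(s.AccelxF), D2f);

% The same thing with a dynamic field name, so the channel can be changed in one place
chan = 'AccelxF';
for k = 1:numel(D2f)
    fprintf('%s: peak %s = %.2f\n', D2f(k).filename, chan, max(D2f(k).(chan)));
end
```

Keeping the results in an ordinary numeric vector like peak2 (one element per file) avoids creating numbered workspace variables for each file.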


Answer 2

Guillaume 2019-4-5


Considering that one of the variables is time, you may be better off storing your data in a timetable rather than a structure.

The principle would be the same, use the cell array of names to name the variables instead of fields.

I'm a bit confused about one thing. If the time is the first column of the matrix, why is it the last element of the cell array? Is the array of names reversed with regard to the data columns, or does Data_Labels(1:end-1) correspond to Data(:, 2:end)?

I'm assuming the time is in seconds:

filepath = 'MKZ_2WD_LevelSnowAccel.mat';  %obtained however you want, e.g. with uigetfile
filecontent = load(filepath);
signals = array2timetable(filecontent.Data(:, 2:end), 'RowTimes', seconds(filecontent.Data(:, 1)), 'VariableNames', filecontent.Data_Labels(1:end-1));

If you want to import multiple files, you can store each timetable in a cell array, or vertically concatenate them into one big timetable. For that, I'd add a column indicating which source file each row came from. The order of the variables in a table does not have to be the same when you vertically concatenate tables, so the mismatched ordering wouldn't be an issue.
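
A minimal sketch of that concatenation, using two tiny synthetic timetables rather than the real files. The variable names, the file names in the Source column, and the values are all invented for illustration. Timetables need R2016b or later, so this would not run on R2014b.

```matlab
% Two synthetic timetables with the same variables stored in a different order
tt1 = array2timetable([1 10; 2 20], 'RowTimes', seconds([0; 0.01]), ...
    'VariableNames', {'Speed', 'Slip'});
tt2 = array2timetable([30 3; 40 4], 'RowTimes', seconds([0; 0.01]), ...
    'VariableNames', {'Slip', 'Speed'});

% Extra column recording which (hypothetical) source file each row came from
tt1.Source = repmat({'file1.mat'}, height(tt1), 1);
tt2.Source = repmat({'file2.mat'}, height(tt2), 1);

% Vertical concatenation matches variables by name, not by position
allData = [tt1; tt2];
```

Note that vertical concatenation requires both timetables to have the same set of variable names; files with different channel lists would have to be stored separately (e.g. in a cell array) or reduced to their common channels first.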

6 Comments

Scooby921 2019-4-5

The issue with the array of names is that it's random. I did mention that somewhere in my original wall of text. Understandable if you missed it...wall of text :o. It drops time as column 1 in the data matrix and the name as a blank in the last row of the cell array. I have a theory why (see below). For every named signal in the recorder it appears to process in a random order. I will send a request to that developer to see if they can at least update the export feature to do things alphabetically.

I think time is the first column in the data matrix because I have the settings configured to export all channels on the same fixed time step, so I have an equal number of data points for every signal. The tool is creating and populating the time channel / column first to define the total number of rows of data based on the time-length of the data file and the resolution I've chosen (in this case 10ms / 100Hz). This way the tool knows how many blank cells need to be filled with "last value" for any signal recorded slower than 100Hz.

The time channel having no name and ending up last in the cell array is likely because time is not a channel specified in my recorder. It's a default feature of each piece of data and linked to the x-axis of that signal, but there is no specific channel named "time" in my plotter which is getting exported. Thus after exporting all of the other actual named signals the tool accounts for there being a time channel and drops a character into a final row just to make the number of rows in the cell array match the number of columns in the data matrix.

For what it's worth I do have the ability to export all data as Matlab structs, and this gives me name.signals.values and name.time for my x and y axes for plotting. Unfortunately I end up with things having mismatched numbers of data points, or a few signals having one more or one less data point based on when the recorder started and stopped and when that CAN signal last updated. Using the signal processing toolbox to resample data imparts undesired noise from the automatic filter that function applies.
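
One possible way to put such signals on a common time base without any filtering is a "last value" hold with interp1, which mirrors what the export tool is described as doing. This is only a sketch with made-up numbers; the variable names are hypothetical and the times are given in whole milliseconds to keep the example exact.

```matlab
% Hypothetical slow signal sampled every 20 ms
tSlow = [0; 20; 40];        % time in ms
vSlow = [1; 2; 3];

% Common 10 ms time base shared by all signals
tFast = (0:10:40)';

% 'previous' holds the most recent sample, with no smoothing or anti-alias filter
vHeld = interp1(tSlow, vSlow, tFast, 'previous');
```

Query times before the first sample or after the last one come back as NaN with this method, so signals that start late or stop early may still need trimming.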

Scooby921 2019-4-5


Misunderstood your question. Yes the columns and names do match, just offset by 1 due to the time data being column 1 yet row n in the array of names.

So the name in row 1 of the cell array is column k+1 in the data matrix. That is what I was accounting for in my first script above with the eval command. The first line accounts for not needing to process the last, nameless row of the cell array. The "k+1" in the following line accounts for the shift due to time being the first column.

for k = 1:NumVars-1 % Time already created and exists as final channel, so we only need to generate variables for the remaining n-1 variables.
    eval([Data_Labels{k},'=[Data(:,k+1)]']); % Extract variable names and populate with data in workspace.
end

Obviously the end-game is to not use eval, but however else I do this with tables, structs, matrices, or arrays, I should still be able to search the cell array to get a name, and the row number containing that name +1 identifies the column in the data matrix containing the numerical values. Looking for options to search the array inside the struct or to better import the data files for ease of access without the dynamic naming.
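
The name lookup described here can be sketched with strcmp and find, with no eval involved. The labels and numbers below are synthetic stand-ins for Data_Labels and Data, laid out as described in the thread (time in column 1, unnamed time entry in the last label row).

```matlab
% Synthetic stand-ins for the exported variables
Data_Labels = {'PosLon'; 'FRSpeed'; 'AccelForward'; []};   % last row: unnamed time channel
t = (0:4)';
Data = [t, 1*ones(5,1), 2*ones(5,1), 3*ones(5,1)];         % column 1 = time

% Search only the named rows, so every element is text
names = Data_Labels(1:end-1);
row = find(strcmp(names, 'AccelForward'));   % row of the label array holding the name

% Name row k corresponds to data column k+1 because time occupies column 1
accel = Data(:, row + 1);
```

If the name is not present, row comes back empty, so an isempty(row) check before indexing is a sensible guard.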

Scooby921 2019-4-5

Since there seem to be concerns with releases and available features...I started using R2014b because that's what we use for Simulink modeling and are stuck on that release for the moment to maintain model / s-function compatibility with customers. With my data analysis likely being a stand-alone function that I or other team members are going to run separate from model development I shouldn't have a problem upgrading to R2018b or 19a. If that opens up more options and makes life easier I'll go ahead and do that. Wasn't an initial thought or concern simply because I already had a version of Matlab loaded and working on my computer.

Guillaume 2019-4-5


the original question mentions "Currently using Matlab R2014b..."

That, I did indeed miss in the wall of text (and the fact that the Release was tagged, I should have looked at that).

Yes the columns and names do match, just offset by 1 due to the time data being column 1 yet row n in the array of names

Then, both answers account for that. The timetable or structure use the names in whichever order they come to name the matching column.

Neither timetables nor structures care about the ordering of the fields/variables when you operate on them (well, as long as you are using the names and not numeric indices), so it does not matter if they're not in the same order from file to file.

%tables work the same way as timetables
t1 = array2table(rand(10, 3), 'VariableNames', {'Speed', 'Slip', 'Pitch'})
t1.Slip   %will return the 2nd column of the table
t2 = array2table(rand(10, 3), 'VariableNames', {'Pitch', 'Speed', 'Slip'})
t2.Slip  %will return the 3rd column of the table
