advice on table storage in a struct or a mat file

Question

Stephen Devlin 2017-5-8

Direct link to this question: https://ww2.mathworks.cn/matlabcentral/answers/339279-advice-on-table-storage-in-a-struct-or-a-mat-file

Commented: Stephen Devlin 2017-6-5

Hi, I have a large script which produces a very useful results-table summary of the part I am testing. As the test is standardised, it would be useful to have a way to store all of the tables. One part may well get tested multiple times, and I still need a results table stored for each time it was tested. I would then like to access these results tables and plot a value from them, perhaps one value (element) from each results table, so I can plot how that parameter has varied over a year, etc.

Will this be easier to achieve by using a table/cell array within a MAT-file, or with a structure? I have little experience with either of these, so it makes sense to seek some general advice first.

% Example of a results table for Part001 (reduced to 2 columns of results)
% Make a cell array of dummy data
A = num2cell(ones(10,3));
A(:,1) = {'H1';'H2';'H3';'H4';'H5';'H6';'H7';'H8';'H9';'H10'};
A(:,2) = {6.6;4.5;3.3;8.5;8;6;7;7.2;7.7;6.9};
A(:,3) = {1.36;0.35;2.003;1.005;0.88;0.201;1.118;2.011;0.613;0.012};
% Convert the cell array into a dummy table with named variables
vars = {'PartID','Length','Variation'};
B = cell2table(A, 'VariableNames', vars)

I would have a table like this for many parts, and the number of these tables will increase as the year(s) go on. I would likely be interested in plotting two things. First, comparing Part001 with, say, Part087, so I would want to see the values of both on the same axes. Second, selecting only one element of Part001, say the length and variation of 'H6', from each results table I have, and plotting all of those in a single graph to show the trend over a year, etc.

I'm not sure whether it will be easier in the long run to use datasets, structs or a MAT-file for this, or maybe a 3D table. Let me know if this makes sense; I'm happy to explain further.

2 Comments

Peter Perkins 2017-5-9

Stephen, I think you're gonna need to provide a short, clear, and specific example.

Stephen Devlin 2017-5-11

Hi Peter, Thanks, have edited the original post to give further details and an example of a cutdown version of the results table.


Answer 1

MRS 2017-6-3

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/339279-advice-on-table-storage-in-a-struct-or-a-mat-file#answer_269448

Edited: MRS 2017-6-3

I am not sure how "big" each test gets, but I have been handling this differently, and a friend of mine has been attempting to tackle the same problem with tables within tables. Not sure yet which is the best way.

I think about each part as a row in a table. So I set up a "database" in a table format. Each column is some information or a test for that sample/part. The samples/parts are "identical" in that they are evaluated by roughly the same tests, and we are trying to determine whether they differ (i.e. different manufacturing lots for the same product line).

Each test can produce 1 data point or "XY" data that needs further calculation. I shove each test into a custom-made structure that allows me to compare tests with each other using simple for loops. These test summaries can then go into columns of my table. I try to reduce the XY data to single points that capture some "summary" of that test. Unfortunately, some of these tests can be very complicated, so you need to study the test just as much as you have to understand the part.

Not sure if we could do this in 1 table, with tables within that table.

Example of a structure for one test....

dataVault = struct();                  % one structure per test; [] marks a placeholder
dataVault.testRunNum      = [];
dataVault.product         = [];
dataVault.lot             = [];
dataVault.productionTime  = [];
dataVault.testTime        = [];
dataVault.testSummary     = [];
dataVault.partialPressure = [];        % vector of pressures run for this test
dataVault.adsorp          = [];        % vector of measured adsorption
dataVault.calc1           = [];        % vector of some reduced data
dataVault.bigCalcs        = struct();  % nested structure for the larger calculations
dataVault.bigCalcs.xData       = [];
dataVault.bigCalcs.yDataMatrix = [];

So, I can put the testRunNum in the row for that sample, test Summary data in other columns associated with that test but when I want to look at test to test differences, I have created other files to store dataVault as a bigger structure hence...

dataVaultBig{1} = dataVault;   % stored when dataVault.testRunNum == 1
dataVaultBig{2} = dataVault;   % stored again after refilling dataVault for testRunNum == 4

etc. etc....

You can then use for loops to whip through the big structure, plotting the XY data in the structure. Depending on how "big" your "bigCalcs" yDataMatrix is, each row can be some process data with each column being time. Since some of our tests run longer, I have had to use structures, because the partial pressure table is not "fixed". Some tests are adaptive: they add time, so to speak, by shoving in a mid point when the response changes too much from one point to the next. Hence I use structures: when plotting XY data for the same thing (e.g. time and temperature), my vectors aren't always the same length...
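
A minimal sketch of the looping idea described above, assuming two dummy test runs whose vectors differ in length. The values, and the variable names run1, run4 and nPoints, are made up for illustration; only the field names come from the structure listed earlier.

```matlab
% Two dummy test runs. The second one has an extra mid point,
% so its vectors are longer than those of the first run.
run1.testRunNum      = 1;
run1.partialPressure = [0.1 0.2 0.4];
run1.adsorp          = [1.0 1.8 3.1];

run4.testRunNum      = 4;
run4.partialPressure = [0.1 0.2 0.3 0.4];
run4.adsorp          = [1.1 1.9 2.6 3.3];

% A cell array holds structures regardless of the vector lengths inside them.
dataVaultBig = {run1, run4};

figure; hold on
nPoints = zeros(1, numel(dataVaultBig));
for k = 1:numel(dataVaultBig)
    d = dataVaultBig{k};
    % Each run is plotted against its own x vector, so lengths need not match.
    plot(d.partialPressure, d.adsorp, '-o', ...
        'DisplayName', sprintf('Run %d', d.testRunNum));
    nPoints(k) = numel(d.adsorp);
end
hold off
legend show
xlabel('Partial pressure'); ylabel('Adsorption');
```

Because each structure carries its own x and y vectors, nothing has to be padded to a common length before plotting.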

Maybe this resonates, maybe not. Unfortunately, each row of my table has something like 10 tests associated with each part, and 1 test produces image files that need to be analyzed. Other tests, as I mentioned, may run for 24 hours, 2 weeks or even months, depending on the program. Capturing a month-long "test" in 12 columns of a table as a summary sometimes only scratches the surface of all the details in that test. Some tests give me a single value, making it easy to put into the table.

Hopefully this gives you food for thought

1 Comment

Stephen Devlin 2017-6-5

Thank you MRS, I'm very grateful for both the content and the amount of time you have taken to respond; it's much appreciated.

How do you get around vectors not being the same length? I have a workaround at the moment that does what I need, but as more data comes in and the analysis becomes more about how all of the parts perform over time, I think a structure might be a better way to go, so I will pursue the structure.

Many thanks

Steve


Answer 2

Peter Perkins 2017-5-12

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/339279-advice-on-table-storage-in-a-struct-or-a-mat-file#answer_266835

First thing is probably to get rid of all those cell arrays:

PartID = {'H1';'H2'; 'H3';'H4';'H5';'H6';'H7';'H8';'H9';'H10'};
Length = [6.6;4.5;3.3;8.5;8;6;7;7.2;7.7;6.9];
Variation = [1.36;0.35;2.003;1.005;0.88;0.201;1.118;2.011;0.613;0.012];
B = table(PartID,Length,Variation)

It would be easier to answer your question with a more specific example of what you want to do. But I guess it's sort of chicken and egg, if you could write code to show what you want to do, you might not have to ask the question.

I'm not clear on whether each "part" has one and only one of these tables, or if the same part is tested multiple times. Also not sure if the example table is typical of the actual size. So to a certain extent, I'm just guessing.

You almost certainly do not want a flat structure array. You could create one table for all parts, using an indicator value to show what part each row corresponds to. If one part is tested multiple times, you'd need some kind of time stamp too. That kind of flat layout allows you to do any comparison or selection you want: one part vs. another, all H1's across all parts, only results with Length > 6, and so on. It comes at the expense of some storage inefficiency (storing 'Part001' 10 times) and of ease of access to one part's results (you do something like B(B.Part=='Part001',:), which assumes Part is stored as a categorical; with a cell array of strings you would use strcmp instead).
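
A minimal sketch of this flat layout, assuming three dummy test runs. The values, the TestDate column name and the dates are made up for illustration; Part follows the B.Part naming used above.

```matlab
% Dummy results from three test runs (values invented for illustration).
PartID = {'H5';'H6';'H7'};
names  = {'PartID','Length','Variation'};
run1 = table(PartID, [8.0;6.0;7.0], [0.880;0.201;1.118], 'VariableNames', names);
run2 = table(PartID, [8.1;6.2;7.1], [0.900;0.250;1.100], 'VariableNames', names);
run3 = table(PartID, [7.9;5.8;6.9], [0.700;0.300;1.000], 'VariableNames', names);

% Tag every row with the part it belongs to and the date it was tested.
run1.Part = repmat({'Part001'}, height(run1), 1);
run2.Part = repmat({'Part001'}, height(run2), 1);
run3.Part = repmat({'Part087'}, height(run3), 1);
run1.TestDate = repmat(datetime(2017,1,15), height(run1), 1);
run2.TestDate = repmat(datetime(2017,3,20), height(run2), 1);
run3.TestDate = repmat(datetime(2017,3,20), height(run3), 1);

% Stack everything into one tall table; categoricals make == selection possible.
allRuns = [run1; run2; run3];
allRuns.Part   = categorical(allRuns.Part);
allRuns.PartID = categorical(allRuns.PartID);

% Trend: Length of 'H6' on Part001 across all of its test dates.
h6 = allRuns(allRuns.Part == 'Part001' & allRuns.PartID == 'H6', :);
h6 = sortrows(h6, 'TestDate');
figure
plot(h6.TestDate, h6.Length, '-o')
ylabel('Length of H6 on Part001')

% Comparison: Part001 (first test date) against Part087 on the same axes.
p001 = allRuns(allRuns.Part == 'Part001' & allRuns.TestDate == datetime(2017,1,15), :);
p087 = allRuns(allRuns.Part == 'Part087', :);
figure
plot(1:height(p001), p001.Length, '-o', 1:height(p087), p087.Length, '-s')
set(gca, 'XTick', 1:height(p001), 'XTickLabel', cellstr(p001.PartID))
legend('Part001', 'Part087')
```

Both plots come from the same table with nothing more than logical row selection, which is the main attraction of the flat layout.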

An alternative might be a table with one row for each part, with maybe a timestamp for the test and some other data that are constant for that part, and then one variable that is a cell array, each cell of which contains a table just like your example. That makes it really easy to get one part's data (using row names would make it something like B.Results('Part001')) but much harder to compare across all parts. If you didn't have any "constant" data, a scalar struct with each field named like B.Part001, containing a table, would be essentially equivalent.
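
A minimal sketch of this nested layout and of the scalar-struct variant, assuming two dummy parts. The values, the dates and the names P, S, r1, r87 and h1Length are made up for illustration.

```matlab
% Two small results tables (dummy values) for two hypothetical parts.
PartID = {'H1';'H2'};
names  = {'PartID','Length','Variation'};
r1  = table(PartID, [6.6;4.5], [1.36;0.35], 'VariableNames', names);
r87 = table(PartID, [6.4;4.7], [1.10;0.40], 'VariableNames', names);

% One row per part: a timestamp plus a cell variable holding the results table.
TestDate = [datetime(2017,5,1); datetime(2017,5,8)];
Results  = {r1; r87};
P = table(TestDate, Results, 'RowNames', {'Part001','Part087'});

% Getting one part's data is easy: index by row name, then unwrap the cell.
c = P{'Part001','Results'};   % 1x1 cell containing a table
part001 = c{1};

% Comparing across parts needs a loop over the nested tables.
h1Length = zeros(height(P), 1);
for k = 1:height(P)
    t = P.Results{k};
    h1Length(k) = t.Length(strcmp(t.PartID, 'H1'));
end

% Scalar-struct variant, for when there is no per-part "constant" data.
S.Part001 = r1;
S.Part087 = r87;
```

The loop needed to collect h1Length is the cost mentioned above: each cross-part comparison has to reach into every nested table in turn.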

Hope this helps.

1 Comment

Stephen Devlin 2017-5-15

Edited: Stephen Devlin 2017-5-15

Hi Peter, Thanks for your answer, I will have a think about this today.

Each part contains 4 subparts - each subpart is the same (with some manufacturing variation) - so 4 parts lashed together comprise what I define as a 'part'.

Each part has a number of 'pads', numbered say 1-1000. So for each tested part I could have the average length and variation of the pads overall, and plot that trend, or have the average length and variation of the subparts and plot those over the year, etc.

Along with the data table I described above, there is the raw data table, which is much bigger (400,6), and the tester information (10,2). So ideally I want to have a way of incorporating all three tables in some way so that I can look at the time, machine it was tested on, fluids etc. I didn't mention this aspect before because I was only told late Friday that that was also a requirement.

I had not considered a table incorporating a cell array, thank you for suggesting that, I will look into it.

Your assistance is very much appreciated, Peter. Thank you.
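
One way the three tables per test mentioned in this comment might be kept together is a table with cell variables, saved to a MAT-file. This is only a sketch under stated assumptions: the variable names (tests, Summary, RawData, Tester), the dummy contents, the shrunken sizes and the temporary file location are all hypothetical.

```matlab
% Dummy stand-ins for the three tables of one test (sizes shrunk for brevity).
summary = table({'H1';'H2'}, [6.6;4.5], [1.36;0.35], ...
    'VariableNames', {'PartID','Length','Variation'});
rawData = array2table(zeros(4,6));                    % the real one is 400x6
tester  = table({'Machine';'Fluid'}, {'TesterA';'FluidX'}, ...
    'VariableNames', {'Field','Value'});              % the real one is 10x2

% One row per test; each cell variable holds a whole table for that test.
Part     = categorical({'Part001'});
TestDate = datetime(2017,5,15);
tests = table(Part, TestDate, {summary}, {rawData}, {tester}, ...
    'VariableNames', {'Part','TestDate','Summary','RawData','Tester'});

% A later test of the same part is appended as another row.
newRow = tests(1,:);
newRow.TestDate = datetime(2017,6,5);
tests = [tests; newRow];

% Store everything in a MAT-file and read it back.
matFile = [tempname '.mat'];
save(matFile, 'tests');
loaded = load(matFile);
secondSummary = loaded.tests.Summary{2};   % results table of the second test
delete(matFile);
```

Here the MAT-file is only the storage container; the choice between the flat and nested table layouts is independent of it.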
