Comparison of data in two tables when certain variables match

20 次查看(过去 30 天)
I have two tables of unequal sizes containing similar types of data with key differences in how the data was produced (one tables contains raw results values, the other table contains results that have been produced using different filters).
Table one (filtered_table) is as such (tables were imported from excel):
Table two (unfiltered_table) is similar but doesn't have any filter number or frequency data. In this way, the filtered_table is much larger than unfiltered_table due (I have 3 different filter numbers and frequencies, so 9 variations for each subject, and 4 ROIs per subject). It is also worth it to note that I have 9 different subjects, whose number IDs are non-sequential (I have subjects 5, 6, 7, 10, 11, 12, 13, 15, 17).
I would like to find the difference between the unfiltered results and each filtered result such that I end up with a comparison table wherein the 'avg_disp_y' is actually the difference between 'avg_disp_y' for that subject/frequency/ROI/filter number and the unfiltered 'avg_disp_y' for that subject/ROI.
I'm struggling with determining the best way to do this - I'm sure I could create a new unfiltered data table wherein the unfiltered data is just repeated as many times as necessary such that the tables are exactly the same size allowing me to just subject all the table values outright, but I can't imagine this is actually the best way to do this and I would like to have a better understanding of how to work with complex data like this going forwards (I have around 10 variables besides avg_disp_y, and while right now I only have 9 subjects, later I'll have 60 and making a large duplicate table by hand can get unwieldy and is prone to errors).
If someone could help me with efficiently querying data from two tables simultaineously to produce a third table of values, that would be very helpful.
Thank you in advance!
  4 个评论
dpb
dpb 2022-7-29
You'll have a lot better likelihood somebody will actually play at solving the problem if you give sample datasets for folks to use -- it's a lot of effort to try to create test data that may/may not actually represent the problem space even if done.
Gwendolyn Williams
Gwendolyn Williams 2022-7-29
That's a great point!
I've attached here sample tables of both as .mat files!
I realize it might be too much for a question here. I couldn't find any similar questions regarding this type of indexing and querying through tables unfortunately.

请先登录,再进行评论。

回答(1 个)

Ayush Modi
Ayush Modi 2023-9-22
Hi Gwendolyn,
I understand you want to know an efficient way to compare the values of “avg_disp_y” for a specific “Subject” and “ROI” column from “unfiltered_table” with every combination of Subject, ROI, Frequency and Filter values from “filtered_table”.
You can achieve this by following steps mentioned below:
  1. The optimized approach would be using “join” function on the variable subset (Subject and ROI), and joining the two tables.
joined_table = join(filtered_table, unfiltered_table, 'Keys',{'Subject_ID','ROI'});
This will give you a new table where the values of both the tables will be compared and if any combination of “Subject” and “ROI” column is available in both the tables, it will reflect in the “joined_table”.
2. After that, you can perform a subtraction between the two columns of “avg_disp_y” and add another column to store the difference value back to the table.
Please refer to the documentation for more information on “join” function:
https://www.mathworks.com/help/matlab/ref/table.join.html
Best regards,
Ayush

类别

Help CenterFile Exchange 中查找有关 Tables 的更多信息

标签

产品


版本

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by