Problem with averaging two cell arrays
Dear experts,
I have the following problem: I have one cell array with 4420 cells, and every cell contains 90 double values (one time series with 90 points per cell). I need to take cells 1 to 85 (each cell represents a brain region) and average them with cells 2211 to 2295 (the same 85 regions of the same brain, but from a different run, so with different values). In other words, I need to average two time series of the same brain region from two different runs. After that I need to take cells 86 to 170 and average them with cells 2296 to 2380, and so on, until I reach the end of the 4420-cell array (I have 26 different brains for run1 and the same 26 brains for run2: 26 x 85 x 2 = 4420). Below I post the code that generates this cell array (AllResult).
%% Root path
pathroot = 'C:\Temporal_series';
%% First-level folders (one per exam)
MyExamDir = [30852 22061 20769 21734 21735 21977 20856 21976 20086 30697 30630 19993 30018 28832 19725 22440 28333 22439 22587 22586 21403 30944 21405 30943 22337 30948];
% Convert to strings, which are easier to treat as folder names.
MyStringDir = cellfun(@num2str,num2cell(MyExamDir),'UniformOutput',false);
% Initialize the output data (which will contain all the results).
% Each file is assumed to contain 90 values; each series is stored as a 1x90 row in its own cell.
AllResult = {};
%% Loop over every exam folder
for i = 1:length(MyExamDir)
    %% Get all "*gz.txt" files in the run1 folder
    CurrentDir = fullfile(pathroot,MyStringDir{i},'run1');
    AllFile = dir(fullfile(CurrentDir,'*gz.txt'));
    % Loop over each file
    for j = 1:size(AllFile,1)
        % Current file
        CurrentFile = fullfile(CurrentDir,AllFile(j).name);
        % Try to open it read-only (the file is never written to)
        [fid, errormsg] = fopen(CurrentFile, 'r');
        if fid == -1
            warning('failed to open %s due to %s', CurrentFile, errormsg);
        else
            A1 = fscanf(fid,'%f', [90 1]);
            A1 = A1';
            AllResult{end+1,1} = A1;
            fclose(fid);
        end
    end
    %% Same operation for run2
    CurrentDir = fullfile(pathroot,MyStringDir{i},'run2');
    AllFile = dir(fullfile(CurrentDir,'*gz.txt'));
    % Loop over each file
    for j = 1:size(AllFile,1)
        % Current file
        CurrentFile = fullfile(CurrentDir,AllFile(j).name);
        % Try to open it read-only
        [fid, errormsg] = fopen(CurrentFile, 'r');
        if fid == -1
            warning('failed to open %s due to %s', CurrentFile, errormsg);
        else
            A1 = fscanf(fid,'%f', [90 1]);
            A1 = A1';
            AllResult{end+1,1} = A1;
            fclose(fid);
        end
    end
end
Could someone help me? Thanks in advance for your attention.
Lorenzo
Answers (1)
Guillaume
2014-11-21
There isn't any reason to store your data in a cell array to start with, if it contains matrices of identical size.
Anyway, is this what you want:
regioncount = 85;
braincount = 26;
allbrainsregionsruns = cell2mat(AllResult); %convert cell to matrix, each row is a region
endrun1 = braincount * regioncount; %last row of run 1
brainsregionsrun1 = allbrainsregionsruns(1:endrun1, :); %all brains and regions of first run
brainsregionsrun2 = allbrainsregionsruns(endrun1+1:end, :); %all brains and regions of second run
brainregaverage = mean(cat(3, brainsregionsrun1, brainsregionsrun2), 3); %mean of both runs, taken along the 3rd dimension
brainregaverage is an (85x26) x 90 matrix, where the first 85 rows are the two-run averages of the first brain, the next 85 rows those of the second brain, etc.
You could then divide that into a cell array of brains with:
brainaverage = mat2cell(brainregaverage, ones(1, braincount) * regioncount, size(brainregaverage, 2));
Each cell of brainaverage is the 85 x 90 matrix of one brain.
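To make the mat2cell step concrete, here is a minimal sketch on toy data. The sizes (3 brains, 2 regions, 4 time points) and the names toyaverage and toybrains are assumptions chosen for illustration, not values from the thread.

```matlab
% Toy sizes (assumptions for illustration only).
braincount = 3;
regioncount = 2;
npoints = 4;
% Stand-in for brainregaverage: row k holds the value k in every column.
toyaverage = repmat((1:braincount*regioncount)', 1, npoints);
% Split the rows into one block per brain; each cell is regioncount x npoints.
toybrains = mat2cell(toyaverage, ones(1, braincount) * regioncount, npoints);
disp(size(toybrains));
disp(toybrains{2});
```

The second argument of mat2cell lists the height of each row block and must sum to the number of rows, which is why the row count has to be exactly braincount * regioncount.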
22 Comments
Lorenzo
2014-11-21
Guillaume
2014-11-21
What are the sizes of the matrices then?
Lorenzo
2014-11-22
When reporting an error, copy the entire error message, including the part that shows the line on which it occurs.
Since you didn't do that, I assumed the error was on the last line, the one with cat, rather than the one with cell2mat.
Anyway, if cell2mat does not work, it's because AllResult is not as you have stated. "in every cell I have one time series with 90 points" is not the case.
What are the values of sizeelem and diffcell when you do the following:
sizeelem = size(AllResult{1})
diffcell = find(cellfun(@(e) ~isequal(sizeelem, size(e)), AllResult))
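As a rough illustration of what this check returns, the sketch below runs it on a toy cell array in which one cell is a sample short. The array toy and the extra lengths variable are assumptions for illustration.

```matlab
% Toy cell array (assumption): five row vectors, with cell 3 one sample short.
toy = {rand(1,4); rand(1,4); rand(1,3); rand(1,4); rand(1,4)};
sizeelem = size(toy{1});
% Indices of all cells whose size differs from that of the first cell.
diffcell = find(cellfun(@(e) ~isequal(sizeelem, size(e)), toy));
% Length of every series, to see how far off the odd cells are.
lengths = cellfun(@numel, toy);
disp(diffcell);
disp(lengths');
```

Note that the check compares everything against the first cell, so if the first cell itself is the odd one out, nearly every index will be reported.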
Lorenzo
2014-11-22
Guillaume
2014-11-22
Have a look at the cells whose indices are in diffcell. These are all cells that do not have 90 double values.
Since there are 85 of them, which is your region count, it looks like one of your brain runs isn't right.
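One way to see which brain and run the flagged cells belong to is to convert the linear indices back into (run, brain, region) numbers. The sketch below assumes the layout described in the question (all 26 x 85 cells of run 1 first, then all of run 2) and uses a made-up diffcell; the names runidx, brainidx, regionidx and affected are hypothetical.

```matlab
regioncount = 85;
braincount = 26;
% Hypothetical diffcell (assumption): the 85 cells of brain 3 in run 2.
diffcell = (braincount*regioncount + 2*regioncount + (1:regioncount))';
% Run number: 1 for the first braincount*regioncount cells, 2 for the rest.
runidx = ceil(diffcell / (braincount * regioncount));
% Position of each cell within its own run.
rowinrun = diffcell - (runidx - 1) * braincount * regioncount;
% Brain and region numbers within the run.
brainidx = ceil(rowinrun / regioncount);
regionidx = rowinrun - (brainidx - 1) * regioncount;
% One row per distinct (run, brain) pair that contains odd-sized cells.
affected = unique([runidx brainidx], 'rows');
disp(affected);
```

If the cell array is actually ordered differently (for example run 1 and run 2 interleaved per brain), the arithmetic has to be adapted to that order.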
Lorenzo
2014-11-22
Lorenzo
2014-11-22
Guillaume
2014-11-22
Since only one of the runs is missing a value, how do you expect to average it with the 90 values of the other run?
The simplest thing would be for you to work out the position of the missing value and insert a NaN there:
missingcolumn = ???;
for c = diffcell(:)' %iterate over a row vector so that c is one index at a time
    AllResult{c} = [AllResult{c}(1:missingcolumn-1) NaN AllResult{c}(missingcolumn:end)];
end
end
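Keep in mind that a plain mean propagates the inserted NaN, so the average at that time point is NaN as well. The sketch below shows this on toy data and also one possible way to average over only the values that are present; the data, the position missingcolumn = 3 and all variable names are assumptions, and whether a single-run value is an acceptable "average" is a judgement call for the analysis.

```matlab
% Toy data (assumption): run 1 has 5 samples, run 2 lacks its 3rd sample.
run1 = [1 2 3 4 5];
run2short = [10 20 40 50];
missingcolumn = 3;   % hypothetical position of the missing sample
% Insert a NaN placeholder so both runs have the same length.
run2 = [run2short(1:missingcolumn-1) NaN run2short(missingcolumn:end)];
stacked = [run1; run2];
% A plain mean yields NaN in the padded column.
plainavg = mean(stacked, 1);
% Alternative: average only over the values that are present.
valid = ~isnan(stacked);
stacked(~valid) = 0;
nanavg = sum(stacked, 1) ./ sum(valid, 1);
disp(plainavg);
disp(nanavg);
```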
Lorenzo
2014-11-22
Lorenzo
2014-11-24
Guillaume
2014-11-24
There's still one or more cells in your array that is not the same size as the others. So, again:
sizeelem = size(AllResult{1})
diffcell = find(cellfun(@(e) ~isequal(sizeelem, size(e)), AllResult))
will tell you which one(s).
Guillaume
2014-11-24
Note that my code is fairly simple. It converts your cell array into a matrix (which is why all cells need to be the same size), splits that matrix in two (one half for each run), rejoins the halves along the third dimension, and then calculates the average along that third dimension.
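The steps just described can be sketched on toy data as follows. The sizes (2 brains, 3 regions, 4 time points), the constant values and the names toycells, stacked and toyavg are assumptions for illustration.

```matlab
% Toy sizes (assumptions for illustration only).
braincount = 2;
regioncount = 3;
npoints = 4;
nrows = braincount * regioncount;
% Toy cell array laid out as in the question: all of run 1, then all of run 2.
run1 = ones(nrows, npoints);
run2 = 3 * ones(nrows, npoints);
toycells = num2cell([run1; run2], 2);   % one 1 x npoints row per cell
% Step 1: cell array to matrix (requires equal-sized cells).
allrows = cell2mat(toycells);
% Step 2: split in two, one half per run.
endrun1 = braincount * regioncount;
% Step 3: rejoin the halves along the third dimension.
stacked = cat(3, allrows(1:endrun1, :), allrows(endrun1+1:end, :));
% Step 4: average along that third dimension.
toyavg = mean(stacked, 3);
disp(size(stacked));
disp(toyavg);
```

Passing the dimension explicitly to mean matters here: without it, mean works along the first non-singleton dimension, which would average over regions rather than over runs.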
Lorenzo
2014-11-24
Guillaume
2014-11-24
The problem is that your storage structure doesn't reflect your data structure. You would have been better off with a cell array of brains, each of these cell arrays a cell array of regions.
I think the simplest thing to do now is to add a dummy value to your shorter sequences so they're all the same size:
shorterbrain = ? %index of shorter brain
regioncount = 85;
braincount = 26;
startrun1 = (shorterbrain-1) * regioncount + 1;
startrun2 = (shorterbrain+braincount-1) * regioncount + 1;
for row = [startrun1:startrun1+regioncount-1 startrun2:startrun2+regioncount-1]
    AllResult{row} = [AllResult{row} NaN];
end
You can then apply my initial algorithm.
The alternative is to use loops.
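A minimal sketch of that loop-based alternative, which averages each run-1 cell with its run-2 counterpart and therefore tolerates brains whose series have a different length, as long as both runs of a given region match. The toy sizes, the toy data and the names toycells and avgcells are assumptions; the layout (all of run 1, then all of run 2) is the one described in the question.

```matlab
% Toy sizes (assumptions for illustration only).
braincount = 2;
regioncount = 3;
nrows = braincount * regioncount;
% Build toy data: brain 1 has 4 samples per series, brain 2 only 3 (in both runs).
toycells = cell(2 * nrows, 1);
for k = 1:2*nrows
    brain = ceil((mod(k - 1, nrows) + 1) / regioncount);
    len = 4 - (brain == 2);
    toycells{k} = k * ones(1, len);
end
% Average cell k of run 1 with cell k + nrows of run 2.
avgcells = cell(nrows, 1);
for k = 1:nrows
    a = toycells{k};
    b = toycells{k + nrows};
    if numel(a) ~= numel(b)
        error('Cells %d and %d have different lengths.', k, k + nrows);
    end
    avgcells{k} = (a + b) / 2;
end
disp(avgcells{1});
disp(avgcells{nrows});
```

No padding with NaN is needed in this version, because no matrix of uniform width is ever built.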
Lorenzo
2014-11-24
Guillaume
2014-11-24
At the end of my original post, I convert the matrix of averages back to a cell array where each cell is a brain. At that point you just remove the extra element from the relevant brain:
brainaverage = mat2cell(brainregaverage, ones(1, braincount) * regioncount, size(brainregaverage, 2));
brainaverage{shorterbrain} = brainaverage{shorterbrain}(:, 1:end-1);
Lorenzo
2014-11-25
Guillaume
2014-11-25
The whole purpose of cell arrays is to be able to store matrices of different sizes, so yes it will work. That's just what my last answer did. Why didn't you try it?
Lorenzo
2014-11-25
Stephen23
2014-11-25
+1 to Guillaume for pointing out that "The problem is that your storage structure doesn't reflect your data structure. You would have been better off with a cell array of brains, each of these cell arrays a cell array of regions."
Good design of the data structures goes a long way to helping solve many data-processing problems...
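As a rough illustration of such a structure, the sketch below stores one cell per brain, each holding one cell per region, with the two runs of a region kept together as the rows of a small matrix. This particular nesting, the toy sizes and the names brains and regionavg are assumptions, one possible reading of the suggestion rather than code from the thread.

```matlab
% Toy sizes (assumptions for illustration only).
braincount = 2;
regioncount = 3;
brains = cell(braincount, 1);
for b = 1:braincount
    npoints = 5 - b;   % brain 1: 4 samples, brain 2: 3 samples
    brains{b} = cell(regioncount, 1);
    for r = 1:regioncount
        % Rows are runs, columns are time points (toy constant values).
        brains{b}{r} = [ones(1, npoints); 3 * ones(1, npoints)];
    end
end
% Average the two runs of every region of every brain.
regionavg = cellfun(@(regions) ...
    cellfun(@(runs) mean(runs, 1), regions, 'UniformOutput', false), ...
    brains, 'UniformOutput', false);
disp(regionavg{1}{1});
disp(regionavg{2}{1});
```

With this layout, the pairing of runs is part of the data structure itself, so no index arithmetic is needed and brains with a different number of time points cause no difficulty.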
Guillaume
2014-12-5
Hello Lorenzo,
This is not how this forum works. You ask a question, and when the question is answered, you accept the answer so that the answerer gets reputation points.
If you then have another question, as is the case here, you start a new question. It gives the new question more visibility and gives other people a chance to answer and get the reputation points.