how to select particular rows from a matrix

3 次查看(过去 30 天)
Dear all,
I have a panel data set(countries, years) of dimension 20000 by 200. My goal is
1) to choose those countries that correspond to a specific period (1970-2013) AND
2) for which period there are no NaN values Is there a way to do that? Here is a sample matrix Many thanks in advance.
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
};

采纳的回答

Image Analyst
Image Analyst 2017-3-6
Try converting it to a table to make it easier:
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ 42] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
};
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
Echoed to the command window, you can see the steps:
t =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ ___ ___ ___ ___
'Albania' 1919 0 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1920 1 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1921 0 0 0 0 0 0 1 0 NaN 1 0 0
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
'Albania' 1923 0 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1924 1 0 1 0 0 0 2 1 NaN 1 2 0
'Albania' 1933 0 0 0 0 0 0 0 0 NaN 1 0 0
yearsInRange =
7×1 logical array
0
0
1
1
1
0
0
nonNanRows =
7×1 logical array
0
0
0
1
0
0
0
newTable =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ __ ___ ___ ___
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
>>
  3 个评论
Image Analyst
Image Analyst 2017-3-7
Sure it will. Here is proof:
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ 42] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Canada' [1922] [1] [0] [1] [0] [0] [ 0] [2] [1] [ 11] [ 1] [ 2] [ 0]
'USA' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ 12] [ 1] [ 0] [ 0]
'Germany' [1922] [1] [0] [1] [0] [0] [ 0] [2] [1] [ 13] [ 1] [ 2] [ 0]
'Belgium' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ 14] [ 1] [ 0] [ 0]
};
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
And look at all the countries it found:
newTable =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ __ ___ ___ ___
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
'Canada' 1922 1 0 1 0 0 0 2 1 11 1 2 0
'USA' 1923 0 0 0 0 0 0 0 0 12 1 0 0
'Germany' 1922 1 0 1 0 0 0 2 1 13 1 2 0
'Belgium' 1923 0 0 0 0 0 0 0 0 14 1 0 0
so explain why you said that - why you think it's getting countries outside the range or not including countries inside the year range. For example you can give me a cell array that proves what you said.

请先登录,再进行评论。

更多回答(1 个)

Rik
Rik 2017-3-6
You can use ismember to select the country, then you can convert those rows (excluding the first column) to a matrix with cell2mat. With logical indexing you can select the correct range of years, and with isnan you can find out the rows with NaN values.
I think you should be able to figure out what to do with this pointer. If you have trouble writing the code, don't hesitate to reply.
  2 个评论
ektor
ektor 2017-3-6
Dear Rik,thank you for this. I am not sure how to choose the years. For example, I am interested in the 1945-2015 time period. However, if I set, for example
x>=1945&x<=2015
I might get counties with time period 1989-2009. But I want countries over 1945-2015. My logical indexing is wrong. Any suggestion on that?
Rik
Rik 2017-3-7
I would say that your data is not complete then, because there is nothing wrong with your logical indexing.
I also like to add that the solution from Image Analyst is more elegant and will be easier to understand if you come back to your code after a year. You do need to make sure that the all machines you run your code on have 2013b or later, because older versions don't have the table data structure.

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Resizing and Reshaping Matrices 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by