Read numeric data with csvread
Hello,
I got a csv-file that looks like this.
* text here
* more text...
1,20,3,4
2,30,4,5
* text again
3,4,6,7
*text
And so it goes on.
How do I read the csv-file and get only the numeric data? Every line that starts with a "*" followed by text should be discarded.
Thank you.
Accepted Answer
doc textscan % NB: optional 'commentstyle' parameter
8 Comments
Okay. I created TestFile.csv with the data and text as in my question.
Now my code is:
fileID=fopen('TestFile.csv')
N=4
cdata=textscan(fileID,'%f %f %f %f', ...
N,'CollectOutput',1,'CommentStyle','*')
I get:
cdata =
[1x4 double]
I can't figure out how to get the data from each column in "cdata".
Thank you.
For these cases where there's no need for a cell array at all I wrap textscan in cell2mat as--
cdata=cell2mat(textscan(fileID,'%f %f %f %f', ...
N,'CollectOutput',1,'CommentStyle','*'));
In general you dereference a cell array with the "curlies" as
cdata{:}
for the full contents or "nested indexing" of
cdata{1}(r,c)
for a given array element.
See the doc on cell arrays for the fuller details.
But the short story here is that there's no need for the cell array, and it's unfortunate there's not a way to tell textscan to forego the needless creation of one when it isn't needed.
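To make the difference between the two kinds of indexing concrete, here is a minimal sketch. The cell C is built by hand to stand in for what textscan returns with 'CollectOutput',1 (an assumption made so the snippet runs without a file); the variable names are only illustrative.

```matlab
% Stand-in for a textscan result: a 1x1 cell holding one numeric matrix.
% The values are the numeric rows of the sample file in the question.
C = {[1 20 3 4; 2 30 4 5; 3 4 6 7]};

M   = C{1};        % curly braces return the contents: a 3x4 double
col = C{1}(:,2);   % nested indexing: second column of that matrix
x   = C{1}(2,3);   % a single element: row 2, column 3
sub = C(1);        % parentheses return a 1x1 cell, not the matrix
```

Wrapping the call in cell2mat, as above, simply does the C{1} step for you when the cell holds a single numeric block.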
Thank you! My cdata looks like below when I use cell2mat:
cdata =
1 NaN NaN NaN
"1" is from row 1 and column 1 in my TestFile.csv. I thought that it could be a bad csv-file, but I tried opening other files too and they give the same result.
Am I using the wrong formatSpec?
Dunno...you don't show what you did in context...w/ the sample file copied into a text file here the example worked fine. NaN indicates a conversion of something not recognizable as a number so perhaps there's an embedded hidden character in the file or somesuch???
Okay. There should not be any hidden characters in the file. That is confirmed.
This is my script:
---
fileID=fopen('TestFile.csv')
N=4
cdata=cell2mat(textscan(fileID,'%f %f %f %f',N,'CollectOutput',1,'CommentStyle','*'))
---
And this is the result from Matlab:
---
fileID = 8
N = 4
cdata = 1 NaN NaN NaN
---
And you have the exact same thing and it works for you? That is strange.
Thanks anyway!
Ayup...
>> type test.csv
* text here
* more text...
1,20,3,4
2,30,4,5
* text again
3,4,6,7
*text
>> fid=fopen('test.csv');
>> cell2mat(textscan(fid,repmat('%f',1,4),'delimiter',',', ...
'commentstyle','*', ...
'collectoutput',1))
ans =
1 20 3 4
2 30 4 5
3 4 6 7
>>
ADDENDUM
Oh, I see it isn't the exact same thing; you don't need/want the repeat count specifier. That tells it to apply the format string N times, but your file isn't consistent so it breaks when it finds a non-numeric form. It would possibly work that way if 'commentstyle' were to force the whole file to be processed, the comment lines removed, then that file processed, but textscan works sequentially, not globally, simply skipping a line beginning with the comment character when it finds one and trying to convert the next line.
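For anyone who wants to reproduce this end to end, the following is a self-contained sketch of the working call. It writes the sample data to a temporary file first (the file name comes from tempname and is purely an assumption for the demo), then reads it back without a repeat count. Note that this call also passes 'Delimiter',',' which the earlier script did not; with the default whitespace delimiter, the commas are likely what stopped the conversion after the first value.

```matlab
% Write the sample data from the question to a temporary file.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '* text here\n* more text...\n1,20,3,4\n2,30,4,5\n* text again\n3,4,6,7\n*text\n');
fclose(fid);

% Read it back: no repeat count, comma delimiter, '*' marks a comment line.
fid = fopen(fname, 'r');
data = cell2mat(textscan(fid, repmat('%f', 1, 4), ...
                         'Delimiter', ',', ...
                         'CommentStyle', '*', ...
                         'CollectOutput', 1));
fclose(fid);      % close the file so the handle is not leaked
delete(fname);    % remove the temporary file
```

data should then hold the three numeric rows, as in the session shown above.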
Thank you for your help! It works fine now. So if I had five columns instead of four I would write "1,5" in repmat. Now I get how it works.
Ayup; it's the silly way C implemented its format strings, ignoring the long-existing pattern used in Fortran wherein there can be a repeat specifier. Just to show they were smarter, the implementers reversed the order of the width field and the conversion type, so there's now no way to write a repeat count unambiguously. In Fortran FORMAT it would be 4F8.0; in Matlab, which uses C i/o libraries, one has to use repmat to double up or write them all explicitly. On the newsgroup I am working with a guy at this instant with a 159-column file...writing %f 159 separate times is rather painful, as his initial plea noted, until one either has the "a-ha!" moment one's self or somebody shows you the trick (S Lord pointed it out to me years ago; I had never thought of repmat for strings for the purpose despite complaining for years). At one time I wrote a mex file that accepted Fortran FORMAT strings and used the Fortran i/o and passed the values back. Unfortunately I lost the source in the retirement move and haven't had the gumption to re-invent it since.
OK, enough geezer stories/griping... :)
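As a small illustration of the repmat trick for building the format string, here is a sketch for the five-column case asked about above. The variable names and the one-line sample string are assumptions for the demo; textscan is given a character vector instead of a file identifier so the snippet needs no file.

```matlab
nCols = 5;                          % number of numeric columns (illustrative)
fmt   = repmat('%f', 1, nCols);     % same as typing '%f%f%f%f%f'

% Try the format on one comma-separated line passed as text.
C   = textscan('1,2,3,4,5', fmt, 'Delimiter', ',', 'CollectOutput', 1);
row = C{1};                         % 1x5 double
```

For the 159-column file mentioned above, the same call with nCols = 159 builds the whole format string in one go.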