Importing a tab-delimited text file

Hi,
I am trying to read a text file "A" using textscan. I know that the file is tab delimited, with an unknown number of columns and 34000 rows. Some columns are numbers, others are strings. I used textscan('A.txt','%s'), and what I get is a 1x1 cell called data. That cell holds a 34000x1 array in which each row contains all the data of the corresponding row in the original file (i.e. not separated into columns, and with no spaces separating the values). Any help on how to import the file is appreciated.
  1 Comment
Walter Roberson 2011-7-30
textscan('A.txt','%s') is going to give you the cell array {'A.txt'} -- if the first argument to textscan() is a string, then the string itself is considered to be the input to be scanned.
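The difference between scanning text and scanning a file can be seen in a minimal sketch. The temporary file name and its three-column contents are made up for illustration and stand in for A.txt:

```matlab
% Case 1: the first argument is text, so textscan parses the characters themselves.
c = textscan('A.txt', '%s');
disp(c{1})                       % the name 'A.txt', not the file contents

% Case 2: open a file and pass the file identifier instead.
fname = [tempname '.txt'];       % hypothetical stand-in for A.txt
fid = fopen(fname, 'wt');
fprintf(fid, '5 T 104.61\n5 N 7.2\n');
fclose(fid);

fid = fopen(fname, 'rt');
d = textscan(fid, '%f %s %f');   % one conversion specifier per column
fclose(fid);
delete(fname);

disp(d{1})                       % numeric column
disp(d{2})                       % cell array of strings
```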


Answers (5)

Fangjun Jiang 2011-7-28
Try a=importdata('A.txt') to see what you get. Many times, it will give you well-formatted data.
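As a rough sketch of that suggestion: importdata chooses its return type from the file contents, so it helps to inspect the result before using it. The temporary file below is a made-up, numeric-only stand-in for A.txt, and passing '\t' as the delimiter is an assumption based on the question saying the file is tab delimited:

```matlab
% Hypothetical sample file standing in for A.txt (tab-delimited, numeric only).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '1\t2\t3\n4\t5\t6\n');
fclose(fid);

A = importdata(fname, '\t');     % second argument states the column delimiter

% Inspect what came back before relying on it.
if isstruct(A)
    disp(fieldnames(A));         % mixed files usually come back as a struct
else
    disp(class(A));
    disp(size(A));
end
delete(fname);
```

For files that mix text and numbers, as in the question, the result may be split across several fields, so checking the fields first avoids surprises.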
  4 Comments
Zeenat Islam 2017-8-9
importdata ALL THE WAY! Worked like a charm for me
Walter Roberson 2017-8-9
Note that in more recent versions, importdata by default now returns text as string objects instead of as cell arrays of character vectors. We are seeing people getting caught by that.



Walter Roberson 2011-7-30
What is the delimiting character? Can any of the strings contain the delimiting character, and if so then how is it indicated that that delimiter is a part of the string rather than marking the end of the string?
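The answer matters because textscan can be told the delimiter explicitly through its 'Delimiter' option. A minimal sketch, assuming a tab-delimited file as in the thread title; the file name and its three columns are invented for illustration:

```matlab
% Hypothetical tab-delimited sample file.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '5\tABN\t104.61\n7\tXYZ\t7.226596\n');
fclose(fid);

fid = fopen(fname, 'rt');
% State the delimiter instead of relying on the default whitespace handling.
c = textscan(fid, '%f %s %f', 'Delimiter', '\t');
fclose(fid);
delete(fname);

disp(c{2})                       % the string column
```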
  2 Comments
Danielle Leblanc 2011-7-30
The delimiting character is ' ', i.e. columns are separated by a space.
Danielle Leblanc 2011-7-30
Also, the columns are a mix of strings and numerics. I tried textscan('A.txt','%q') and textscan('A.txt','%c'), but I obtained the same output as with textscan('A.txt','%s').



Danielle Leblanc 2011-7-30
Hi again,
Sorry to bother you with this problem, but I am a MATLAB beginner. I opened one of the tmt text files with Excel. The data has 33 columns, and I am listing them transposed below because each has a different format (col1, col2, etc. is the column number, and what follows it is the data in that column):
col1 5
col2 6154
col3 T
col4 ABN.GG
col5 ABN
col6 00077T
col7 AA2
col8 N
col9 N
col10 4000
col11 A
col12 104.61
col13 +
col14 7.226596
col15 A
col16 20090109
col17 10:21:00
col18 0
col19 @
col20 A
col21 Y
col22 7
col23 104.61
col24 +
col25 7.226596
col26 104.61
col27 +
col28 7.226596
col29 104.61
col30 +
col31 7.226596
col32 11636
col33 20090109
Excel recognized the file correctly. I have 400 of these text files, so importing them from text to Excel and then to MATLAB would take a lot of time. How can I import this text file into MATLAB directly?
  1 Comment
Fangjun Jiang 2011-7-30
It's hard to read your data because of the formatting. Can you paste 3 or 4 lines of your text file here and apply the code format to them?



Walter Roberson 2011-7-30
You continue to have the same problem that I warned about earlier: textscan() with a string as its first argument reads the string, not a file denoted by the string. The older textread() routine expected a filename as the first argument, but textscan() never does.
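% Open the file first: textscan() needs a file identifier, not a file name.
fid = fopen('A.txt','rt');
% One conversion specifier per column (33 in total): %f and %d numeric, %s string.
data = textscan(fid, '%f %f %s %s %s %s %s %s %s %f %s %f %s %f %s %f %s %f %s %s %s %d %f %s %f %f %s %f %f %s %f %f %f');
fclose(fid);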
The spaces within the quoted string are not important and can be left in or removed as desired.
I coded this in such a way that the dates such as 20090109 are read as numbers, but the time such as 10:21:00 is read as a string. textscan() is not able to directly read formatted times as times.
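If real date and time values are needed afterwards, the two columns can be combined after reading. This is only a sketch: the variables dates and times are made-up stand-ins for data{16} and data{17}, and it assumes a MATLAB release that has the datetime type:

```matlab
% Stand-ins for data{16} (dates read as numbers) and data{17} (times read as strings).
dates = [20090109; 20090112];
times = {'10:21:00'; '15:04:30'};

% Split yyyymmdd numbers into year, month, and day with integer arithmetic.
y  = floor(dates / 10000);
mo = floor(mod(dates, 10000) / 100);
d  = mod(dates, 100);

% Parse each hh:mm:ss string into a row of three numbers.
hms = cellfun(@(s) sscanf(s, '%d:%d:%d').', times, 'UniformOutput', false);
hms = vertcat(hms{:});

% Combine everything into one datetime value per row.
t = datetime(y, mo, d, hms(:,1), hms(:,2), hms(:,3));
disp(t)
```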
The output, data, will be a cell array, containing one column vector per column of input, so for example data{2} would be a column vector of floating point numbers corresponding to column 2, one entry per line of input.
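For the 400 files mentioned in the question, the same call can be wrapped in a loop. The sketch below is an illustration under assumptions: it builds a temporary folder with two small files (using the sample row posted in this thread) so that it runs on its own, and it assumes every file shares the 33-column layout. The folder and file names are hypothetical.

```matlab
% Build a small hypothetical test folder so the sketch is self-contained.
folder = tempname;
mkdir(folder);
sample = ['5 6154 T ABN.GG ABN 00077T AA2 N N 4000 A 104.61 + 7.226596 A ' ...
          '20090109 10:21:00 0 @ A Y 7 104.61 + 7.226596 104.61 + 7.226596 ' ...
          '104.61 + 7.226596 11636 20090109'];
for k = 1:2
    fid = fopen(fullfile(folder, sprintf('file%d.txt', k)), 'wt');
    fprintf(fid, '%s\n%s\n', sample, sample);
    fclose(fid);
end

% Same 33-specifier format as above, split over two lines for readability.
fmt = ['%f %f %s %s %s %s %s %s %s %f %s %f %s %f %s %f %s %f ' ...
       '%s %s %s %d %f %s %f %f %s %f %f %s %f %f %f'];

files = dir(fullfile(folder, '*.txt'));
allData = cell(numel(files), 1);       % one textscan result per file

for k = 1:numel(files)
    fid = fopen(fullfile(folder, files(k).name), 'rt');
    if fid == -1
        warning('Could not open %s', files(k).name);
        continue
    end
    allData{k} = textscan(fid, fmt);
    fclose(fid);
end

rmdir(folder, 's');                    % remove the temporary test folder
```

With real data, folder would point at the directory that holds the files, and allData{k}{2} would be column 2 of the k-th file.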
  3 Comments
Danielle Leblanc 2011-7-30
I tried it on other files as well. The output is different, but the content of each cell of the cell array never exceeds 2 rows of data, although I have 35000 rows per file. For example, I am getting:
[95;2722]
[102.680000000000;9730]
<2x1 cell>
<2x1 cell>
<2x1 cell>
<2x1 cell>
<2x1 cell>
<2x1 cell>
<2x1 cell>
[NaN;9000]
<2x1 cell>
[NaN;102.050000000000]
<2x1 cell>
[NaN;8.80793700000000]
<2x1 cell>
[NaN;11]
<2x1 cell>
[NaN;0]
<2x1 cell>
<2x1 cell>
<2x1 cell>
0
NaN
''
NaN
NaN
''
NaN
NaN
''
NaN
NaN
NaN
Fangjun Jiang 2011-7-30
You need to check the consistency of your text file. As long as the data format is consistent, the code Walter provided should return the correct result. I constructed three lines of text using your data. Here is the result:
A.txt
5 6154 T ABN.GG ABN 00077T AA2 N N 4000 A 104.61 + 7.226596 A 20090109 10:21:00 0 @ A Y 7 104.61 + 7.226596 104.61 + 7.226596 104.61 + 7.226596 11636 20090109
5 6154 T ABN.GG ABN 00077T AA2 N N 4000 A 104.61 + 7.226596 A 20090109 10:21:00 0 @ A Y 7 104.61 + 7.226596 104.61 + 7.226596 104.61 + 7.226596 11636 20090109
5 6154 T ABN.GG ABN 00077T AA2 N N 4000 A 104.61 + 7.226596 A 20090109 10:21:00 0 @ A Y 7 104.61 + 7.226596 104.61 + 7.226596 104.61 + 7.226596 11636 20090109
data 1x33 5640 cell
>> data{1}
ans =
5
5
5
>> data{33}
ans =
20090109
20090109
20090109
>> data{18}
ans =
0
0
0
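One way to check a file for consistency is to count the fields on every line and report the lines that deviate. This is a sketch under assumptions: the temporary file, its three-column layout, and the variable names are made up, and fields are assumed to be separated by whitespace. For the real files, expected would be 33.

```matlab
% Hypothetical sample file whose third line is missing a field.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '5 6154 T\n5 6154 T\n5 T\n');
fclose(fid);

expected = 3;                    % number of columns every line should have
bad = [];                        % line numbers with a different field count
lineNo = 0;

fid = fopen(fname, 'rt');
tline = fgetl(fid);
while ischar(tline)              % fgetl returns -1 at end of file
    lineNo = lineNo + 1;
    fields = regexp(strtrim(tline), '\s+', 'split');
    if numel(fields) ~= expected
        bad(end+1) = lineNo;     %#ok<AGROW>
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

disp(bad)                        % line numbers worth inspecting by hand
```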



Stephan Koehler 2011-9-7
I wrote a routine for importing TSV files generated by Excel. Look at http://www.mathworks.com/matlabcentral/fileexchange/32782
