Problem importing large text file (.txt) of data

Hey everybody,
I have a space-delimited text file with about 30,000 rows that looks like this:
Ping: 37.639 ms
Download: 34.35 Mbit/s
Upload: 5.59 Mbit/s
Start: Tue May 31 08:18:01 2016
10.|-- zrh04s08-in-x07.1e102.net 0.0% 10 23.5 26.6 22.9 42.8 5.9
9.|-- fra02s15-in-x0d.1e102.net 10.0% 10 91.4 105.4 27.3 184.9 66.2
Ping: 56.94 ms
Download: 28.81 Mbit/s
Upload: 5.66 Mbit/s
Start: Tue May 31 08:20:01 2016
9.|-- zrh04s08-in-x07.1e102.net 0.0% 10 25.1 23.4 19.3 32.2 3.6
9.|-- fra02s15-in-x0d.1e102.net 0.0% 10 20.9 21.9 19.7 24.2 1.1
.
.
.
With the "Import Data" option in MATLAB it is easy to extract the data I want. The problem is that as soon as I import more than 10,000 cells at once, the time column (format HH:mm:ss), for example, changes from this (<10,000 cells):
'NaT'
'08:16:01'
'NaT'
'NaT'
'NaT'
'NaT'
'NaT'
'08:18:01'
'NaT'
'NaT'
'NaT'
'NaT'
'NaT'
'08:20:01'
'NaT'
'NaT'
'NaT'
to this (>10,000 cells):
'NaT'
'13-Jun-2016'
'NaT'
'NaT'
'NaT'
'NaT'
'NaT'
'13-Jun-2016'
'NaT'
'NaT'
'NaT'
'NaT'
'NaT'
'13-Jun-2016'
'NaT'
'NaT'
'NaT'
There is no error message whatsoever. I know I could just import the data in blocks, but since I have more data to analyze, that would be a lot of unnecessary work, and I would really like to understand what the problem is rather than work around it. Thanks a lot! (By the way, I am pretty new to MATLAB.)
3 comments
Ritesh Naik, 2016-6-15 (edited 2016-6-15)
Hi Marcus,
As the documentation on importing dates and times explains, MATLAB interprets them as text unless you specify that they should be interpreted as date and time information.
Hence, 'textscan' would be a good choice for reading formatted data from a text file, because it gives you more control over the formats. If you generate a script from the 'Import' dialog, you will see that it also uses 'textscan' to read the formatted data from the file.
Refer to the 'textscan' documentation for more details.
Since you also mentioned reading in blocks, the documentation on reading a file in blocks covers that approach.
However, it is strange that you saw one format for the first 10,000 rows and a different format beyond that. After importing the data with 'Import Data', did you try changing the format of the columns to check whether you see the same behavior?
-Ritesh
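As a rough sketch of the 'textscan' approach described above: read every line as text, then convert only the "Start:" lines with an explicit input format. The sample text is copied from the question; the variable names, the 'InputFormat' string, and the 'en_US' locale are assumptions based on how those sample lines look. For a real file you would pass a file identifier from fopen() to textscan instead of the character vector.

```matlab
% Sample lines taken from the question; a real script would use
% fid = fopen(...) and textscan(fid, ...) instead of this text.
txt = sprintf('%s\n', ...
    'Ping: 37.639 ms', ...
    'Download: 34.35 Mbit/s', ...
    'Upload: 5.59 Mbit/s', ...
    'Start: Tue May 31 08:18:01 2016', ...
    'Ping: 56.94 ms', ...
    'Start: Tue May 31 08:20:01 2016');

% Read one text field per line.
raw = textscan(txt, '%s', 'Delimiter', '\n');
lines = raw{1};

% Keep only the "Start:" lines and strip the label.
isStart = strncmp(lines, 'Start:', 6);
stamps = strtrim(regexprep(lines(isStart), '^Start:\s*', ''));

% Convert with an explicit input format (assumed from the sample lines).
t = datetime(stamps, 'InputFormat', 'eee MMM dd HH:mm:ss yyyy', ...
    'Locale', 'en_US');

% The display format is independent of the stored value; show only the time.
t.Format = 'HH:mm:ss';
disp(t)
```

Because the full timestamp is parsed, the date is kept in the variable even though only the time is displayed, and switching t.Format later changes the display without re-importing.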
per isakson, 2016-6-16 (edited 2016-6-16)
  • I guess '13-Jun-2016' is the date on which you ran the test; it doesn't come from the text file.
  • Have you inspected the end of the file in an editor? I guess the time strings differ after row 10,000.

Answers (2)

Shameer Parmar, 2016-6-16
Hello Marcus,
Try reading the file line by line with fopen() and fgetl() instead of importdata():
data = {};                          % one cell per line of the file
fid = fopen('ascii_file.txt', 'r');
if fid == -1
    error('Could not open ascii_file.txt');
end
tline = fgetl(fid);
while ischar(tline)                 % fgetl returns numeric -1 at end of file
    data{end+1,1} = tline;          % empty lines are stored as ''
    tline = fgetl(fid);
end
fclose(fid);
Then use the cell array 'data' for further processing.
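One possible next step, as a sketch only: pull the numeric Ping, Download and Upload values out of 'data' with regular expressions. The sample cell array below stands in for the result of the loop above, and the struct 'vals' and the key list are hypothetical names chosen for illustration.

```matlab
% Stand-in for the cell array produced by the fgetl loop (lines from the question).
data = {'Ping: 37.639 ms'; 'Download: 34.35 Mbit/s'; 'Upload: 5.59 Mbit/s'; ...
        'Start: Tue May 31 08:18:01 2016'; ...
        'Ping: 56.94 ms'; 'Download: 28.81 Mbit/s'; 'Upload: 5.66 Mbit/s'; ...
        'Start: Tue May 31 08:20:01 2016'};

keys = {'Ping', 'Download', 'Upload'};
vals = struct();
for k = 1:numel(keys)
    % Capture the number that follows "<key>:" at the start of a line.
    tok = regexp(data, ['^' keys{k} ':\s*([\d.]+)'], 'tokens', 'once');
    hit = ~cellfun(@isempty, tok);          % lines that matched this key
    vals.(keys{k}) = cellfun(@(c) str2double(c{1}), tok(hit));
end

disp(vals.Ping)       % ping values in ms, one per measurement block
disp(vals.Download)   % download rates in Mbit/s
```

Each field of 'vals' is a numeric column vector with one entry per measurement block, so the three vectors line up row by row as long as every block contains all three lines.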

Dr. Oscar Gaete, 2018-9-16
Same problem here. With the Import Data GUI, the datetime values get corrupted when importing >10,000. Solution: in the GUI, generate a function instead of importing the data directly, then call that function from the Command Window or from a script. That worked for me in R2016b. Cheers
