Read non-tabular text file using datastore or mapreduce
How can I read a non-tabular text file using datastore or mapreduce? Thanks.
Answers (2)
yashwanth annapureddy
2014-11-19
0 votes
Hi Darek,
As you might already know, datastore is primarily meant for large tabular data sets. Non-tabular or blocked reading is something we also want to look at in the near future.
However, you should be able to adjust TextscanFormats to read the data you care about, just as you would with textscan for non-tabular data. For example, you can set TextscanFormats to '%q' and read the file into a table with a single variable that holds the block of text for further processing.
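For comparison, here is a minimal sketch of the plain textscan approach that the '%q' suggestion builds on. It is only an illustration: the temporary file and its contents are made up so the snippet runs on its own, and the variable names are hypothetical.

```matlab
% Write a small sample file so the sketch is self-contained
% (the contents are invented for illustration only).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'header: run 1\nfree-form note, with a comma\n42 43 44\n');
fclose(fid);

% Read each line as a single string. '%q' behaves like '%s' but strips
% enclosing double quotes. Using the newline as the only delimiter and
% an empty Whitespace setting keeps every line intact.
fid = fopen(fname, 'r');
C = textscan(fid, '%q', 'Delimiter', '\n', 'Whitespace', '');
fclose(fid);

textLines = C{1};   % cell array of strings, one entry per line
delete(fname);      % clean up the temporary file
```

Each entry of textLines can then be parsed further, for example with sscanf or regexp, depending on the structure of the file.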
I could probably help more if you provided a sample text file.
Hope this helps.
Darek
2014-11-19
0 votes
3 comments
yashwanth annapureddy
2014-11-21
How about:
numRows = 10000; % placeholder value (an assumption); pick one that suits your memory limits
ds = datastore('file.txt', 'ReadVariableNames', false, 'Delimiter', '', 'RowsPerRead', numRows);
while hasdata(ds)
    t = read(ds); % post-process the data here
end
You should be able to tune RowsPerRead so that you do not run out of memory, unless I am missing something.
Thanks.
Darek
2014-11-24
yashwanth annapureddy
2014-12-04
If by records you mean lines in the file, datastore is probably not the right tool for counting the number of records in a file. Lower-level or even system-level functions, like the ones mentioned in this thread, might be more useful for this kind of analysis.
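As one possible illustration of the lower-level route, the sketch below counts newline characters with fread in fixed-size chunks, so the whole file never has to fit in memory. The linked thread is not reproduced here, so this is not necessarily the method it describes; the sample file, the chunk size, and the variable names are assumptions.

```matlab
% Write a small sample file so the sketch is self-contained
% (the contents are invented for illustration only).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'a\nb\nc\n');
fclose(fid);

% Count newline bytes chunk by chunk. chunkSize is an assumed value;
% tune it to the memory you have available.
chunkSize = 1e6;    % bytes per read
numLines = 0;
fid = fopen(fname, 'r');
while true
    chunk = fread(fid, chunkSize, '*uint8');
    if isempty(chunk)
        break;      % end of file reached
    end
    numLines = numLines + sum(chunk == 10);   % 10 is the newline byte
end
fclose(fid);
delete(fname);      % clean up the temporary file
```

Note that this counts newline characters, so a final line without a trailing newline would not be included in the total.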