Filter data from text file whilst importing
Hi, I have a large CSV file, around 1.5 million rows, that I would like to filter whilst importing. I'm not sure I have enough memory to import the complete file as a table and then filter it in MATLAB, and since this will be running unattended over the long weekend I want something that will definitely work. I currently have the following code, which imports everything. I only want to import rows where the 9th variable is equal to 1. Can I adapt this code to do that?
readfrom = 'test.csv';
fileID = fopen(readfrom);
keep = textscan(fileID,'%s%d%f%d%f%f%d%d%d%d%f%f%f%f%f%f%f%f%f\r', ...
'HeaderLines',1,'delimiter',',');
fclose(fileID);
Many thanks
Answers (1)
dpb
2016-11-10
The only way I see to do this is either to read line by line, parsing each line to decide whether to keep it, or, on the same idea, to read in blocks of (say) 10,000 lines and do the selection block by block.
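The block-by-block idea could be sketched roughly as below. This is only a minimal sketch, not a tested solution for the actual file. The function name importFilteredRows and the blockSize argument are made up for illustration. The format string is the one from the question with the trailing '\r' dropped, on the assumption that textscan's default end-of-line detection copes with the file's line endings. It also assumes a single header line and well-formed rows throughout.

```matlab
function keep = importFilteredRows(readfrom, blockSize)
% Hypothetical helper: read a CSV in blocks of blockSize rows and keep
% only the rows whose 9th field equals 1.
% Format from the question, minus the trailing '\r' (assumption).
fmt = '%s%d%f%d%f%f%d%d%d%d%f%f%f%f%f%f%f%f%f';

fileID = fopen(readfrom, 'r');
if fileID == -1
    error('Could not open %s', readfrom);
end
closeFile = onCleanup(@() fclose(fileID));  % close the file even on error

fgetl(fileID);      % skip the single header line
keep = {};          % stays empty if the file has no data rows
while ~feof(fileID)
    % Each call resumes where the previous one stopped.
    block = textscan(fileID, fmt, blockSize, 'Delimiter', ',');
    n = min(cellfun(@numel, block));   % number of complete rows read
    if n == 0
        break       % nothing parsed: stop rather than loop forever
    end
    rows = find(block{9}(1:n) == 1);   % rows to keep in this block
    kept = cellfun(@(c) c(rows), block, 'UniformOutput', false);
    if isempty(keep)
        keep = kept;
    else
        keep = cellfun(@(a, b) [a; b], keep, kept, 'UniformOutput', false);
    end
end
end
```

Called as keep = importFilteredRows('test.csv', 10000), this should return a 1-by-19 cell array laid out like the textscan output in the question, but holding only the matching rows. Only one block plus the rows kept so far are in memory at a time, so the block size can be tuned to whatever the machine tolerates.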
Not knowing anything about the file format itself, it's possible one could preprocess the file before reading it into MATLAB, using a batch editor, regular expressions, or the like. Of course, depending on just how big the file is (1M lines isn't necessarily out of hand, depending on the length of each record), you might also be able to read it as a character/cellstr array and do the cleanup in memory as text before conversion. There are too many possibilities depending on unknown details to say unequivocally what the best approach would be. But the first two will work; they are just time-consuming.
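If the whole file does fit in memory as text, the "clean up as character before conversion" idea might look something like the following. Again this is a sketch under assumptions: the function name importFilteredText is hypothetical, the file is assumed to have one header line and no quoted fields containing commas, and the 9th field is assumed to be written as a bare 1 (not 1.0 or "1").

```matlab
function keep = importFilteredText(readfrom)
% Hypothetical helper: read the file as text, keep only the lines whose
% 9th comma-separated field is 1, then convert just those lines.
% Format from the question, minus the trailing '\r' (assumption).
fmt = '%s%d%f%d%f%f%d%d%d%d%f%f%f%f%f%f%f%f%f';

txt   = fileread(readfrom);                 % whole file as one char vector
lines = regexp(txt, '\r?\n', 'split');      % split on LF or CRLF
lines = lines(2:end);                       % drop the header line
lines = lines(~cellfun(@isempty, lines));   % drop blank lines

% Eight "field," groups, then a 9th field that is exactly 1.
pat = '^([^,]*,){8}\s*1\s*(,|$)';
hit = ~cellfun(@isempty, regexp(lines, pat, 'once'));

% Convert only the surviving lines; textscan also accepts text input.
kept = strjoin(lines(hit), sprintf('\n'));
keep = textscan(kept, fmt, 'Delimiter', ',');
end
```

This trades memory for simplicity: the raw text and the per-line cell array both have to fit at once, so for a file of this size the block-by-block route is the safer choice for an unattended run.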