Error in using datastore function to read big data
I am using the datastore function along with hasdata() to read big data: a file with 7 columns by 46 million rows. When I read just one column it works fine, but when I try to read all seven columns it gives an error: "Error using matlab.io.datastore.TabularTextDatastore/readdata. Mismatch between file and format character vector."
I can work in chunks: read the first chunk, perform calculations, write to a file, and take the next one. I would appreciate a solution to this problem.
Answers (1)
Jeremy Hughes
2017-3-24
Hi Eric,
I'm guessing datastore is running into an issue parsing the file. This often happens when there is a non-number entry in a '%f' column.
Without seeing the contents of the file, it's hard to offer a solution, but to diagnose the issue, try setting elements of TextscanFormats from '%f' to '%q'. The data will read in as character vectors, but you should be able to find the non-numeric data that is causing the problem.
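A rough sketch of that diagnostic pass, in case it helps. The file name 'bigdata.csv' and the ReadSize value are placeholders I made up, and it assumes every column is supposed to be numeric. It reads each chunk as text and prints the entries that do not convert to a number:

```matlab
ds = datastore('bigdata.csv');            % hypothetical file name
% Read every column as text instead of '%f' so parsing does not fail.
ds.TextscanFormats = repmat({'%q'}, 1, numel(ds.VariableNames));
ds.ReadSize = 100000;                     % rows per chunk (arbitrary choice)

while hasdata(ds)
    t = read(ds);
    for k = 1:width(t)
        vals = strtrim(t{:, k});          % cell array of character vectors
        % Entries that are not numbers (a literal "NaN" is not flagged).
        bad = isnan(str2double(vals)) & ~strcmpi(vals, 'NaN');
        if any(bad)
            fprintf('Non-numeric entries in %s:\n', t.Properties.VariableNames{k});
            disp(unique(vals(bad)));
        end
    end
end
```

Empty fields will also show up as non-numeric here, which is usually what you want to know about.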
If there is an entry like "NA" you can use the TreatAsMissing property to import those as NaN. If the data isn't really numeric, you can just import the column as characters.
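Something along these lines, assuming the offending entries turn out to be "NA" and that column 2 is the one that is really text. Both are guesses on my part, as is the file name:

```matlab
% 'NA' entries in numeric columns are imported as NaN.
ds = datastore('bigdata.csv', 'TreatAsMissing', 'NA');   % hypothetical file name

% If a column is not really numeric, import it as characters instead.
ds.TextscanFormats{2} = '%q';             % hypothetical column index

name  = ds.VariableNames{1};              % a numeric column, for illustration
total = 0;
count = 0;
while hasdata(ds)
    t = read(ds);                         % one chunk as a table
    x = t.(name);
    total = total + sum(x, 'omitnan');    % skip the missing values
    count = count + nnz(~isnan(x));
end
fprintf('Mean of %s over %d values: %g\n', name, count, total / count);
```

The running mean is only a stand-in for whatever per-chunk calculation you need.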
Less frequently, entries carry a suffix, like "123mm". If the suffix is common to all the entries, you can change the format to '%f mm' to strip the literal suffix. However, if any rows lack the suffix, datastore will fail; in that case, read the column with '%q' and post-process the data.
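For the mixed case, the post-processing could look roughly like this. The column index, the "mm" suffix, and the file name are assumptions for illustration:

```matlab
ds  = datastore('bigdata.csv');           % hypothetical file name
col = 3;                                  % hypothetical index of the suffixed column
ds.TextscanFormats{col} = '%q';           % read that column as text
name = ds.VariableNames{col};

while hasdata(ds)
    t = read(ds);
    raw = strtrim(t.(name));              % cell array of character vectors
    % Drop a trailing "mm" where present, then convert to double.
    t.(name) = str2double(regexprep(raw, '\s*mm$', ''));
    % ... per-chunk calculations on t go here ...
end
```

Rows with and without the suffix both convert, and anything that still is not a number becomes NaN.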