Is it possible to improve fread/fwrite performance and further speed up loading/writing of binary data?
I am working on a project that requires reading and writing large amounts of binary data from disk (tens to hundreds of GB). I've managed to optimize the code to the point where I/O takes up a rather large chunk of the total run time, and it is realistically the only thing I can still improve (all the other non-trivial parts of the code are either double-vectorized as matrix-matrix multiplications or use bsxfun when matrix multiplications aren't possible).
I've managed to set up the code so that the entire block of data is read/written as one contiguous block using a single call to fread/fwrite. From what I've seen looking this issue up on Google, this seems to be the best possible situation for fread/fwrite. However, I know that it isn't utilizing the full capabilities of the hardware. I've run tests with other programs and know that the disk is capable of using the full SATA3 bandwidth (for example, with "dd" I get speeds of ~550-575 MB/s). fread/fwrite, however, give me a pretty constant speed of ~110 MB/s, which is ~5x slower than what the disk should be capable of.
I suspect this is a CPU bottleneck: fread/fwrite are both single-threaded, and the CPU I'm using is an older (Sandy Bridge) 16-core Xeon, which has pretty good multi-threaded performance but is a bit lacking in single-threaded performance. This is just a guess, though.
Is there any way to further speed up the read/write process?
Are there any other functions (either built-in or publicly available m-file/mex functions) that might be faster than fread/fwrite?
Could I possibly parallelize it and have multiple threads access the file at the same time (with specified byte ranges to avoid accessing the same part of the same file simultaneously)?
It was also suggested that I might be able to get around this by using datastores and tall arrays. If I go this route, would the execution time for the rest of the code be affected?
The machine I am using has a ton of memory (allowing me to load the whole dataset and analyse it directly in memory very efficiently), so if the main computations take a performance hit from using tall arrays (or some other form of memory-mapping procedure), then I highly doubt that the overall execution time would be lower. I will actually test this at some point to confirm, though.
Answers (1)
Yair Altman
2017-7-31
You might try some of the suggestions mentioned here:
- http://undocumentedmatlab.com/blog/improving-fwrite-performance
- http://undocumentedmatlab.com/blog/explicit-multi-threading-in-matlab-part1
- http://undocumentedmatlab.com/blog/explicit-multi-threading-in-matlab-part2
- http://undocumentedmatlab.com/blog/explicit-multi-threading-in-matlab-part3
- http://undocumentedmatlab.com/blog/explicit-multi-threading-in-matlab-part4
"In general, when doing binary file I/O, touch the file as few times as possible. Reading or writing the entire file in one go is fastest. Make use of fread's (and fwrite's) skip parameter when possible. When performance is truly critical, read the data into memory as a uint8 array and parse it yourself using typecast, rather than relying on fread to do the parsing for you. It can be complicated and tricky and error-prone, but it's tons faster."
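As a rough illustration of the two techniques named in the quote above, here is a minimal sketch. The file name, the payload, and the record layout (a flat run of native-endian doubles) are assumptions made up for this example and are not taken from the original question. It only shows that the uint8 + typecast route and the skip parameter return the expected values; whether either is actually faster on a given machine would need to be timed, for example with tic/toc.

```matlab
% Sketch only: the file name and data layout below are assumptions for illustration.
fname = fullfile(tempdir, 'typecast_demo.bin');   % hypothetical scratch file
data  = rand(1, 1e6);                             % sample payload: 1e6 doubles

% Write the whole payload with a single fwrite call.
fid = fopen(fname, 'w');
fwrite(fid, data, 'double');
fclose(fid);

% Baseline: let fread parse the bytes into doubles.
fid = fopen(fname, 'r');
viaFread = fread(fid, [1 Inf], 'double=>double');
fclose(fid);

% Alternative: read the raw bytes as uint8, then reinterpret them in memory.
% typecast uses the native byte order, which matches the default fopen format.
fid = fopen(fname, 'r');
raw = fread(fid, [1 Inf], '*uint8');
fclose(fid);
viaTypecast = typecast(raw, 'double');

% skip parameter: read one double, skip the next 8 bytes, and repeat,
% which yields every other value of the stored sequence.
fid = fopen(fname, 'r');
everyOther = fread(fid, [1 Inf], 'double', 8);
fclose(fid);

delete(fname);                                    % clean up the scratch file
```

For a real file with mixed field types, the same idea would apply per field: slice the uint8 buffer by byte offset and call typecast on each slice, which is where the "tricky and error-prone" part of the quote comes in.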
Additional I/O performance tips can be found in chapter 11 of my book "Accelerating Matlab Performance" (CRC Press, 2014).