Compute mean and diff faster
20 views (last 30 days)
Hello everyone. I am working on an FD code and need to do a lot of averaging on very large 2D/3D matrices. I want to do the following, taking 1D as an example.
A vector A=[a,b,c,d,e,f], and I want the average of each pair of neighboring values, so I do A=(A(2:end)+A(1:end-1))/2. I also use diff a lot, e.g. diff(A,1). The same applies in the 2D and 3D cases. But this becomes very slow when I am dealing with a very large matrix, say 1000*1000*1000. Is there a faster way to do this?
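For reference, a minimal sketch of both operations in 1D, plus one possible 3D generalization (the choice of the third dimension as the averaging/difference direction here is only for illustration):

A1 = rand(1, 6);                               % 1D example vector
avg1 = (A1(2:end) + A1(1:end-1)) / 2;          % midpoint averages, length 5
d1   = diff(A1, 1);                            % first differences, length 5

A3 = rand(100, 100, 100);                      % smaller 3D stand-in
avg3 = (A3(:,:,2:end) + A3(:,:,1:end-1)) / 2;  % average along dim 3
d3   = diff(A3, 1, 3);                         % first difference along dim 3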
Thank you very much.
0 Comments
Accepted Answer
John D'Errico
2017-12-28
Edited: John D'Errico
2017-12-28
Get more memory. Huge problems require more memory, or a cup of coffee. Sit down, relax, take out that old copy of War and Peace, and read away.
If your matrix is 1000x1000x1000, then it has 1e9 elements, each of which requires 8 bytes of RAM. So that matrix alone takes roughly 8 gigabytes of RAM to store.
Now, when you compute some operation on your array, like diff or the local average between consecutive elements, this creates a NEW array that is almost the same size. So a new array is formed that also requires 8 gigabytes of RAM.
Every copy of that array forces MATLAB to allocate 8 more gigabytes of RAM. How many gigs of RAM does your computer have? For example, mine is now just a bit old, so it has only 8 gigs in total.
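A quick way to see this for yourself (sizes reduced from the 1000^3 case so it runs comfortably on a typical machine):

A = zeros(500, 500, 500);                   % 1.25e8 doubles, about 1 GB
B = (A(:,:,2:end) + A(:,:,1:end-1)) / 2;    % a second, nearly equal-sized array
whos A B                                    % the Bytes column shows each copy's footprint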
What does MATLAB do when it runs out of RAM? It starts swapping things around, using virtual memory. That gets SLOW, real fast, even if it can find the disk space to do so.
So if you want your computations to be faster, you need more memory.
A poor alternative might be to use singles, instead of doubles. Create your matrix as a single array, and it will now require 4 gigabytes of RAM. It is still gonna be a memory hog, but a slightly leaner one. The cost of course is a loss of precision in your computations.
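A minimal sketch of that idea: create the array as single from the start, rather than converting from double (which would briefly need both copies in memory). Note this still allocates about 8 GB in total for the two arrays below, so scale the sizes down if your machine is tight on RAM.

A = zeros(1000, 1000, 1000, 'single');    % ~4 GB instead of ~8 GB for double
B = (A(:,:,2:end) + A(:,:,1:end-1)) / 2;  % result stays single, another ~4 GB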
2 Comments
John D'Errico
2017-12-28
Two copies of an 8 GB array require 16 GB. Don't forget that MATLAB itself consumes some RAM. So no matter how the computation is done, you will start having problems as soon as you start to push that limit. If your computations can tolerate the use of single, go in that direction.
If you have an actual disk drive that uses a spinning platter, the best thing you can do is replace it with an SSD. That can hugely increase your VM access speed, limited by the bus speed between memory and your drive. SSDs are not that expensive. Mine was well worth the money spent to keep my computer hopping along happily for a few more years.
More Answers (2)
Benjamin Kraus
2017-12-28
It may be time to start looking into the Parallel Computing Toolbox or some of the newer big data capabilities in MATLAB, such as tall arrays.
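As a rough illustration of the tall-array workflow (the file pattern and variable name below are hypothetical, not from the original post), operations on a tall array are deferred until gather forces evaluation:

ds = datastore('data*.csv');        % datastore over files too big for memory
t  = tall(ds);                      % tall table backed by the datastore
colMean = mean(t.Var1, 'omitnan');  % deferred computation on one column
result  = gather(colMean);          % evaluate and bring the result into memory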
0 Comments
David Santos
2019-8-20
I would recommend putting all your data into one big .mat file and accessing it with matfile (which doesn't load all the data into memory, just the parts you ask for), then processing it in chunks, preferably by columns.
This way you control the amount of data you bring into memory and can process very large matrices (> 1 TB).
Tall arrays are fine if you don't need to access all of the data, but once the number of accesses grows they become slower than matfile.
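A sketch of that chunked pattern (the file name, variable name, and chunk size are assumptions; the MAT-file must be saved in -v7.3 format for partial loading to work):

m = matfile('bigdata.mat');             % does not load the variable yet
[nRows, nCols] = size(m, 'A');          % query the size without loading
chunkSize = 10000;                      % columns per chunk
colMeans = zeros(1, nCols);
for c = 1:chunkSize:nCols
    cols  = c : min(c + chunkSize - 1, nCols);
    chunk = m.A(:, cols);               % load only these columns
    colMeans(cols) = mean(chunk, 1);    % process the chunk in memory
end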
All the best
0 Comments