When using "inv()" in a thread-based parallel environment, can multiple threads be applied to a single instance of "inv()"?

I have 48 threads and 256 GB of RAM available to invert as large an impedance matrix as possible, which will be painful. I do need the mobility matrix to then find the impedance matrix of the unforced partition by inverting only that partition. I do not want to miss an opportunity to relieve the pain. The documentation says that the "inv()" command is fully compatible with a thread-based environment; however, I can only take this to mean that "inv()" may be used on individual threads, and not necessarily that multiple threads can be applied to a single instance of "inv()". If they cannot, I will probably implement a hierarchical blocked matrix approach.

Accepted Answer

The inv function is multithreaded. You don't need to do anything to make use of multithreading here.
Here are more details:
When the documentation says that a function can be used in a thread-based environment, it means that you can run it on a thread-based worker. That is, you could do something like
parfor i = 1:10
    A = somefunctionToBuildA();
    result{i} = inv(A);
end
It's about inverting many matrices in parallel rather than using multiple threads to invert one matrix.
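A slightly fuller sketch of that pattern, assuming Parallel Computing Toolbox is installed and the release supports parpool("Threads") (R2020a or later). The matrix builder, the count of 10, and the size 500 are placeholders for illustration, not part of the original question.

```matlab
% Sketch: invert several independent matrices on thread-based workers.
% Assumption: Parallel Computing Toolbox is available.
if isempty(gcp('nocreate'))
    parpool("Threads");          % start a thread-based pool if no pool is running
end

numMatrices = 10;                % illustrative count
n = 500;                         % illustrative matrix size
result = cell(1, numMatrices);   % one slot per inverse

parfor i = 1:numMatrices
    A = rand(n) + n*eye(n);      % placeholder for the real matrix builder
    result{i} = inv(A);          % each iteration inverts its own matrix
end
```

If a pool of another type is already running, the parfor loop uses that pool instead of starting a thread-based one.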
The parallelism you are interested in here is different. You want to think about MATLAB's built-in multithreading (see "MATLAB Multicore - MATLAB & Simulink"). Many functions are implicitly multithreaded, and inv is probably one of them. There is no definitive list of such functions, but there are ways to find out.
By default, MATLAB will use a number of threads equal to the number of cores on your machine. You can check with the maxNumCompThreads function. On my machine:
>> maxNumCompThreads()
ans =
8
Let's do an experiment to see if inv is multithreaded.
>> a = rand(10000);
>> tic;b = inv(a);toc
Elapsed time is 10.543774 seconds.
I now tell MATLAB to use 1 thread and repeat the calculation:
>> maxNumCompThreads(1);
>> tic;b = inv(a);toc
Elapsed time is 34.646851 seconds.
That's more than 3x slower than when I used 8 threads. So, I conclude that inv is making use of multithreading.
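Note that the experiment above leaves the session limited to one thread. A minimal sketch of the same comparison that puts the original setting back afterwards; the size 2000 is an assumption chosen so it runs quickly, and the variable names are illustrative.

```matlab
% Sketch: time inv with the current thread limit and with one thread,
% then restore the original setting.
a = rand(2000);                        % smaller than 10000 so it runs quickly
defaultThreads = maxNumCompThreads;    % current maximum number of threads

tic; b = inv(a); tMulti = toc;         % timing with the current thread limit

maxNumCompThreads(1);                  % limit MATLAB to one computational thread
tic; b = inv(a); tSingle = toc;        % timing with a single thread

maxNumCompThreads(defaultThreads);     % put the original setting back

fprintf('%d threads: %.3f s, 1 thread: %.3f s, ratio %.2f\n', ...
    defaultThreads, tMulti, tSingle, tSingle/tMulti);
```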

5 Comments

Your answer saves me much trial and error in learning to work with a threaded environment. The next step will be to see how to most efficiently manage time and memory while using different sets of threads to invert different sets of blocks in parallel. It could result in something that seems like fractal inversion. I gather this is something best done in a thread-based environment and may become the basis for a subsequent question.
Multiple threads don't always give you a huge benefit. At some point, they just start to get in each other's way. I've got 16 cores on my machine. I'll use a slightly larger array to push the limits as far as I am willing to wait.
a = rand(15000);
maxNumCompThreads(16);
tic;b = inv(a);toc
Elapsed time is 9.623250 seconds.
maxNumCompThreads(8);
tic;b = inv(a);toc
Elapsed time is 9.154560 seconds.
maxNumCompThreads(4);
tic;b = inv(a);toc
Elapsed time is 10.507131 seconds.
maxNumCompThreads(2);
tic;b = inv(a);toc
Elapsed time is 12.691411 seconds.
maxNumCompThreads(1);
tic;b = inv(a);toc
Elapsed time is 25.064948 seconds.
The difference between 8 and 16 is pretty much down in the noise. So I pulled out the more reliable timeit.
maxNumCompThreads(16);
timeit(@() inv(a))
ans =
9.409088200625
I was also watching my CPU activity. And since I had some other stuff in the background that was grabbing around 1.5 cores, it was only able to get 14 cores or so.
But, with 8 cores, how well could it do?
maxNumCompThreads(8);
timeit(@() inv(a))
ans =
9.207302200625
That seems intriguing to me. I was watching the CPU monitor again, and it was getting a full 8-10 cores the entire time.
maxNumCompThreads(4);
timeit(@() inv(a))
ans =
10.576316825625
With 4 cores, now it was using between 4 and 6 cores. Again, it looks like MATLAB uses a spare extra core or so on the side itself.
The point, though, is that you don't always get the full benefit you might hope for from having multiple cores work on a single inv call. Again, those extra cores just step on each other's feet too often.
Note that if you change the threading profile, you can configure multiple cores per thread (at the expense of reducing the maximum number of simultaneous threads). There are some cases where using a few cores per thread, with several threads, is faster than using only one core per thread and more threads.
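The diminishing returns can be put into numbers with a small sweep. This is a minimal sketch, not the script used for the timings above; the matrix size 3000 and the list of thread counts are assumptions, and the variable names are illustrative.

```matlab
% Sketch: measure how inv scales with the number of computational threads.
n = 3000;                                  % illustrative size; increase for a real test
a = rand(n);
original = maxNumCompThreads;              % remember the current setting
threadCounts = unique([1 2 4 8 original]); % sorted, so the first entry is 1
threadCounts = threadCounts(threadCounts <= original);  % stay within the current max

times = zeros(size(threadCounts));
for k = 1:numel(threadCounts)
    maxNumCompThreads(threadCounts(k));    % set the thread limit for this run
    times(k) = timeit(@() inv(a));         % timeit repeats the call and takes a robust estimate
end
maxNumCompThreads(original);               % restore the setting

speedup = times(1) ./ times;               % relative to the single-thread run
efficiency = speedup ./ threadCounts;      % 1.0 would be perfect scaling
disp(table(threadCounts(:), times(:), speedup(:), efficiency(:), ...
    'VariableNames', {'Threads', 'Seconds', 'Speedup', 'Efficiency'}));
```

Applied to the timings reported above (about 25.1 s on one thread and about 9.4 s on 16), the speedup is roughly 2.7 and the per-thread efficiency roughly 0.17.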
Yes. If you were doing many computations on many different matrices, you would probably get most benefit from assigning blocks of perhaps at most 2 cores to each sub-problem. But even there, 2 cores is only twice as fast as one core on a single matrix. Therefore, you would arguably do best to process each sub-problem on individual cores, using a parfor loop. This would seem simplest, and might give the best overall throughput.
@John D'Errico, regarding "With 4 cores, now it was using between 4 and 6 cores. Again, it looks like MATLAB uses a spare extra core or so on the side itself."
My guess is that the OS isn't affinitizing the cores to a specific 4, but instead hopping those 4 around to 6 (but never more than 4 at a time?).

Release: R2024a
