Parallel is slower than sequential?

Question


I am new to the Parallel Computing Toolbox and have a number of questions. I implemented a parallel Jacobi algorithm, and it turned out to be slower than the sequential version with the same precision threshold parameters. I tried several parallel approaches, and none was fast enough. So I tried some simpler code, such as the example below:

    tic;
    ticBytes(gcp);
    n = 500;
    n_mat = 50;
    C = cell(1, n_mat);
    parfor i = 1:n_mat
        A = rand(n);
        B = rand(n);
        C{i} = A * B;
    end
    tocBytes(gcp);
    toc

It is slower than the same code with 'for' in place of 'parfor'. The results were, respectively:

             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________

    1              16016                  5.2018e+07       
    2              18152                  4.8021e+07       
    Total          34168                  1.0004e+08

Elapsed time is 1.590726 seconds.

for the parallel version,

and: Elapsed time is 0.674556 seconds.

for the sequential version.

What am I doing wrong? I also don't really understand what sliced variables are. Furthermore, I noticed that using cell arrays instead of numeric arrays inside parfor doesn't trigger the overhead warning, so I have tended to prefer them, but the code usually still runs faster with numeric arrays.


Answer 1

Edric Ellis 2018-4-19


There are a couple of reasons why your parfor loop is slower than the equivalent for loop. Firstly, there's the data transfer overhead: you're transferring a fair amount of data from the workers back to the client (about 100 MB in total, according to your tocBytes output). This data has to be serialized on the worker (basically like calling save on the data, but without using a file), sent to the client, and then deserialized (the equivalent of load).
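As a rough check of the transfer volume, the sketch below computes the size of the results from the question's n and n_mat and then shows one way to return less data. It assumes that only a scalar summary of each product is needed afterwards; the Frobenius norm and the variable name s are hypothetical choices for illustration, not part of the original code.

```matlab
% Size of the results sent back to the client (a double takes 8 bytes).
n = 500;
n_mat = 50;
bytesPerMatrix = n * n * 8;               % one n-by-n double matrix
expectedBytes  = n_mat * bytesPerMatrix;  % all 50 result matrices
fprintf('Expected result payload: %.4e bytes\n', expectedBytes);

% Assumption: only a scalar per product is needed on the client.
% s is a sliced output variable, so each iteration returns 8 bytes
% instead of a full 2 MB matrix.
s = zeros(1, n_mat);
parfor i = 1:n_mat
    A = rand(n);
    B = rand(n);
    s(i) = norm(A * B, 'fro');  % hypothetical summary of the product
end
```

The computed payload of 1e8 bytes agrees with the BytesReceivedFromWorkers total in the question. If the full matrices really are needed on the client, this reduction does not apply and the transfer cost remains.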

Secondly, and probably most importantly in this case, if you're using only the local cluster type, then unfortunately this particular loop is pretty much guaranteed to be slower with parfor than with for. That's because the for loop version is already efficiently multi-threaded through mtimes (the function behind A * B); essentially, it's already taking full advantage of all the cores on your computer. The workers in a parfor loop default to running in single-threaded mode, to avoid overloading your computer, so each individual call to mtimes is slower on a worker.
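One way to see the effect of multi-threading on the client is to time the plain for loop twice, once with the default thread count and once restricted to a single computational thread with maxNumCompThreads. This is only a sketch: the variable names are illustrative, and the timings depend on the machine, so no expected numbers are given.

```matlab
n = 500;
n_mat = 50;
C = cell(1, n_mat);

% Baseline: plain for loop with the default number of computational threads.
tic;
for i = 1:n_mat
    C{i} = rand(n) * rand(n);
end
tMulti = toc;

% Restrict the client to one thread, which is how workers run by default.
oldThreads = maxNumCompThreads(1);   % returns the previous setting
tic;
for i = 1:n_mat
    C{i} = rand(n) * rand(n);
end
tSingle = toc;
maxNumCompThreads(oldThreads);       % restore the original setting

fprintf('for loop, %d thread(s): %.3f s\n', oldThreads, tMulti);
fprintf('for loop, 1 thread:     %.3f s\n', tSingle);
```

On a multi-core machine the single-threaded run is usually the slower of the two. That single-threaded time is the more realistic per-worker baseline against which to judge the parfor version.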
