why is this Matlab Code faster than the C++ code below? I want to understand what Matlab internally does better and faster than C++
13 次查看(过去 30 天)
显示 更早的评论
why is this Matlab Code
function primes = sieve_era2(N)
% sieve of Erathostenes without upper bound of search space (could theoretically run forever)
if nargin == 0
N = Inf;
end
primes.number(1) = 2;
primes.counter(1) = primes.number(1);
k = 2;
k1 = 2;
tic;
while k <= N
k = k + 1; % check next k if it is prime
if mod(k,100000) == 0
fprintf("numbers checked: %i, number of primes found: %i, largest prime found: %i, time: %.2f seconds \n", k, k1, primes.number(end), toc);
end
primes.counter = primes.counter - 1; % all counters reduced by 1
if min(primes.counter) == 0
primes.counter((primes.counter == 0)) = primes.number((primes.counter == 0));
continue; % current numer is not a prime
end
k1 = k1 + 1; % no counter was reduced to zero --> current number is a new prime
primes.number(k1-1) = k;
primes.counter(k1-1) = primes.number(k1-1);
end
end
faster than this C++ code:
// sieve_era2.cpp : sieve of Erathostenes without upper bound of search space (could theoretically run forever)
#include <iostream>
#include <vector>
#include <algorithm>
#include <time.h>
#include <chrono>
using namespace std;
using namespace std::chrono;
struct primes
{
std::vector<int> number {2};
std::vector<int> counter {2};
} p;
primes sieve_era2(int N)
{
int k = 2;
bool prime{ true };
while (k <= N)
{
k = k + 1;
for (int j = 0; j <= p.counter.size()-1; j++)
{
p.counter[j] = p.counter[j] - 1;
if (p.counter[j] == 0)
{
p.counter[j] = p.number[j];
prime = false;
}
}
if (prime == false)
{
prime = true;
continue;
}
p.number.push_back(k);
p.counter.push_back(p.number.back());
}
return p;
}
int main()
{
primes p;
int N = 200000;
unsigned __int64 tic = duration_cast<milliseconds>(std::chrono::system_clock::now().time_since_epoch()).count();
p = sieve_era2(N);
unsigned __int64 toc = duration_cast<milliseconds>(std::chrono::system_clock::now().time_since_epoch()).count();
cout << (toc - tic) / 1000 << " Seconds " << std::endl;
system("pause");
Matlab runs 12 sec, C++ about 55 sec.
I want to understand what Matlab internally might be doing better and faster than C++
0 个评论
采纳的回答
Chris
2022-6-17
编辑:Chris
2022-6-17
I see an efficiency in primes.counter = primes.counter - 1;
Matlab uses LAPACK for matrix/vector operations, which I think should be faster than a for loop in C.
Same for the if block that follows--especially for the if block, since you're using if once per outer loop in Matlab, and many times per outer loop in C.
You could try timing those operations separately, a few thousand at a time. In Matlab, for instance:
counter = rand(10000,1);
timeit(@() counterTest(counter))
function counterTest(counter)
for idx = 1:1000
counter = counter-1;
end
end
4 个评论
Chris
2022-6-19
Interesting. You're saying 32-bit integer operations should require equal time to 64-bit integers using BLAS functions?
It appears to me that the PetscBLASInt type allows BLAS to handle both data types separately.
https://petsc.org/release/docs/manualpages/Sys/PetscBLASInt.html
更多回答(1 个)
Jan
2022-6-17
编辑:Jan
2022-6-19
Not an answer, but an improvement of the Matlab code, which run in 10.7 sec on my R2018b i5m instead of 13.0 sec of the original version for N=2e5:
function primes = sieve_era2m(N)
% sieve of Erathostenes without upper bound of search space (could theoretically run forever)
if nargin == 0
N = Inf;
end
number(1) = 2;
counter(1) = number(1);
k1 = 2;
show = 1e5;
tic;
for k = 3:N+1
if k == show
fprintf('checked: %i, primes found: %i, largest: %i, time: %.2f s\n', ...
k, k1, number(end), toc);
show = show + 1e5;
end
counter = counter - 1; % all counters reduced by 1
if all(counter) % current numer is not a prime
number(k1) = k;
counter(k1) = number(k1);
k1 = k1 + 1; % no counter was reduced to zero --> current number is a new prime
else
ncounter = ~counter;
counter(ncounter) = number(ncounter);
end
end
primes.number = number;
primes.counter = counter;
end
And with UINT32 and without output it runs in 7.7 sec:
function primes = sieve_era2i(N)
number(1) = uint32(2);
counter(1) = uint32(2);
one = uint32(1);
tic;
k1 = uint32(1);
for k = uint32(3):uint32(N)
counter = counter - one;
if all(counter)
k1 = k1 + one;
number(k1) = k;
counter(k1) = k;
else
for u = one:k1
if ~counter(u)
counter(u) = number(u);
end
end
end
end
primes.number = number;
primes.counter = counter;
end
EDITED: And a version taking 6.8 sec:
function primes = sieve_era2j(N)
piN = ceil(N / log(N));
number = zeros(1, piN, 'uint32'); % Pre-allocation
counter = zeros(1, piN, 'uint32'); % Pre-allocation
number(1) = uint32(2);
counter(1) = uint32(2);
one = uint32(1);
tic;
k1 = uint32(1);
for k = uint32(3):uint32(N)
new = 1;
for u = one:k1
counter(u) = counter(u) - one;
if counter(u) == 0
counter(u) = number(u);
new = 0;
break;
end
end
for v = u+1:k1 % Count the rest without setting [new] again
counter(v) = counter(v) - one;
if counter(v) == 0
counter(v) = number(v);
end
end
if new % New prime found:
k1 = k1 + one;
number(k1) = k;
counter(k1) = k;
end
end
primes.number = number(one:k1);
primes.counter = counter(one:k1);
end
Now to an answer: I cannot profile your C++ code. My guess is that this is the bottleneck:
p.number.push_back(k);
p.counter.push_back(p.number.back());
The iterative growing of arrays is expensive. It looks like Matlab strategies to reduce the effect is more powerful.
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Function Creation 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!