How to speed up the code to check if randomly generated number in a certain range is a prime ?
    3 次查看(过去 30 天)
  
       显示 更早的评论
    
I am trying to generate random numbers between 1-48593 in a loop for k trials, and counting total number of prime numbers for each trial. The problem is if I set the trial number high the code takes forever to run. Is there any way of speed up the code ? I'm required to run the loop for k = 1:1e6
for k = 1:1e5
  r = randi([1,48593],k,1);
  c = isprime(r); % check if randomly generated numbers are prime.
  M = nnz(c); % counting nonzero elements. 
end
1 个评论
  Jan
      
      
 2022-3-24
				What is the purpose of this code? Why do you create a growing number of random values repeatedly, although they are overwritten?
Even without randomness and prime analysis, this is very inefficient:
for k = 1:1e6
   r = zeros(k, 1);
end
This create over 500 GB of data.
What is the purpose of creating the random integers? You could omit the multiples of 2, because they are no primes. You know this in advance and the chance, that a random number is even. 
回答(2 个)
  Jan
      
      
 2022-3-24
        
      编辑:Jan
      
      
 2022-3-24
  
      Avoid to call isprime repeatedly. If the numbers are limited, check only once, if they are prime:
m = false(1, 48593);
m(primes(48593)) = true;
for k = 1:1e4
  r = randi([1, 48593], k, 1);
  c = m(r);
  M = nnz(c);
end
This is much faster, but even here you have an exponentially growing load.
Which problem do you want to solve?
  John D'Errico
      
      
 2022-3-24
        
      编辑:John D'Errico
      
      
 2022-3-24
  
      The simple answer is to generate the primes in that range, ALL of them, and then look to see which of your randomly generated elements are in that list. This completely avoids calls to isprime and should be far more effficient for that reason. I'm not at all sure why you are doing it the way you are doing it. Of course, if the range were far too high, then there would be a problem. But this is trivial:
Plist = primes(48593);
numel(Plist)
So there are only 4999 primes. (That may be why the number 48593 was chosen.) It was very fast to generate them, and you need only do so once. You might want to do something like this:
N = 1e4;
Primecount = zeros(1,N);
for k = 1:N;
  r = randi([1,48593],k,1);
  Primecount(k) = sum(ismember(r,Plist));
end
plot(1:N,Primecount,'.')
I used only N = 1e4, since this uses the online MATLAB used by Answers to execute. But perhaps this is something along the lines of what you need. Remember that isprime will be moderately slow compared to primes itself. In fact, I often enough have reason to generate the list of all primes less than 1e9, and even that takes only a few seconds. I have no idea what your goal is in this code though. I could make some wild guesses, but my wild guesses are always wrong. Maybe you actually want to compute the relative fraction of elements in that set that are prime, or something like that.
2 个评论
  John D'Errico
      
      
 2022-3-25
				
      编辑:John D'Errico
      
      
 2022-3-25
  
			I'm a bit confused as to your goal.
As long as the samples from randi are uniformly distributed among the integers in the interval [1,48593], then the anser is trivial. And randi really is uniform.
I have no idea why you want to do a massive simulation. There are 
Pmax = 48593;
Plist = primes(Pmax);
numel(Plist)
4999 primes in that interval. Again, randi is a UNIFORM generator. So if you choose a number randomly in that interval, the probability of it being prime is just
format long g
Pbinom = numel(Plist)/Pmax
At this point, the number of events is just a binomial random variable. 
So this is basic probability and statistics at this point, since you sampled uniformly on the interval from 1 to Pmax. 
If you sample k such events, then the expected number of positive events you will see is just Pbinom*k. So sample a million numbers uniformly from that interval, as long as they are truly uniform, then the expected number of events is just 1e6*Pbinom. TRY IT! 
k = 1e6;
r = randi([1,Pmax],[1,k]);
So the expected number of prime events will be:
ExpectedPrimeCount = k*Pbinom
We observed 
ActualPrimeCount = sum(ismember(r,Plist))
such events. So we were pretty close. Again, this is standard, basic prob and stats. (Which may be what you are doing this for, as some sort of class project,)
The variance around that value is just k*Pbinom*(1-Pbinom). So the standard deviation is the sqrt of the variance.
V = k*Pbinom*(1-Pbinom)
S = sqrt(V)
So in context, how close was the prediction? Remember, since the law of large numbers will apply here, the resulting count of the primes we observe will be INCREDIBLY well approximated by a Normal distribution (Gaussian for some people.) 1e6 such events is a large number in this context.
(ActualPrimeCount - ExpectedPrimeCount)/S
I would expect a number roughly between +/- 2 or so, and that is entirely reasonable.
But my point in all of this is the probability will NOT change. As long as the upper end of the interval is FIXED and immutable, the there will always be EXACTLY 4999 primes in that interval. And that means the probabiltiy is not going to change. EVER.
What I am wondering is that MAYBE you are thinking the probability changes, because the density of the primes decreases as the upper limit decreases. For example, suppose I did the same test for a larger value of Pmax? This gets into things like the prime counting function, thus the number of primes less than some number N. For example, now suppose we did this same test, but made the upper limit 
Pmax = 1e7;
numel(primes(Pmax))
So there are 664579 primes that are less then 1e7. And that means the probability of a RANDOMLY chosen number in the interval [1,1e7] is
Pbinom = numel(primes(Pmax))/Pmax
So a wee bit more than 6.6%.
As I suggested, this gets into things like the prime counting function, often called pi(N). That is not the NUMBER pi. Mathematicians ran out of variables to use for a name I guess. :)
A decent approximation to pi(N), although not truly a great one, but adequate for many purposes. is just
    pi(N) ~ N/log(N)
That is the natural log, by the way. So the number of primes that do not exceed N is APPROXIMATELY that fraction.
pi_approx = @(N) N./log(N);
pi_approx(48593)
100*(ans/4999 - 1)
Do you remember what the actual number was? 4999. Around 10% error, so not too far off. If we go farther out, that approximation is still off, but relatively less so.
pi_approx(1e7)
100*(ans/664579 - 1)
So now only around 6% low. And all of this is now getting into some pretty deep stuff, but that approximation eventually gets better and better, at least asymptotically.
We can plot how things do.
Plist = primes(48593);
np = numel(Plist);
plot(Plist,1:np,'b.')
hold on
fplot(@(N) N./log(N),'r')
fplot(@(N) N./log(N) + N./log(N).^2,'g')
ylim([0,np])
grid on
legend('Actual count of primes','N/log(N)','N./log(N) + N./log(N).^2','location','northwest')
All of the above only scratches the surface. The curve in green is a much better approximation to the expected count. Again, I still have no idea what you are doing in all of this, but there are some very neat concepts to be found as you dive deeper and deeper into the problem.
You can find extended tables of how well the first approximation works, for some VERY large numbers in the above link.
Anyway, I still don't really know where you are going with all of this. Is this some sort of basic stats project, where you are learngin about binomial random variables? Or is this some sort of project where you are looing at things like the prime counting function, so some sort of basic number theory class?
另请参阅
类别
				在 Help Center 和 File Exchange 中查找有关 Creating and Concatenating Matrices 的更多信息
			
	Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!




