How can I improve the performance of my code? Specifically with the randn function for large arrays in a Monte Carlo simulation.
I have a script that runs a Monte Carlo simulation to price call options on theoretical stocks. I am having serious performance issues on large simulations (1e7+ iterations) of ~8k elements each. I have tried three different approaches, and each has had its shortcomings. I am wondering what changes I could make to improve performance.
As this is an assignment, I am required to use the old v4 rng generator, which from a quick test is about 6-10x slower than the default generator. :(
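Side note: per the MATLAB documentation on replacing discouraged rand/randn syntaxes, the legacy call can also be expressed through the modern rng interface. A quick sketch of how I compared the two generators (assuming rng(8128,'v4') is the documented replacement for randn('seed',8128)):
rng(8128, 'v4') % documented replacement for randn('seed', 8128)
tV4 = timeit(@() randn(1e6, 1)); % seconds per 1e6 draws, legacy v4
rng(8128, 'twister')
tMT = timeit(@() randn(1e6, 1)); % seconds per 1e6 draws, default Mersenne Twister
fprintf('v4: %.4f s, twister: %.4f s per 1e6 draws\n', tV4, tMT)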
My first approach was to use a simple loop for each simulation:
disp('starting')
tic%start time
S0=100;
K=100;
sig=0.2;
r=0.05;
T=1/4;
delta=0;
N=8640;
DELTAt=T/N;
term1=(r-sig^2/2)*DELTAt;
term2=sig*sqrt(DELTAt);
logS0 = log(S0);
NbSim=10e4;%10e6,10e8
randn('seed', 8128) %uses old v4 rng gen
%rng(8128);
ST=zeros(NbSim,1);
compStr = sprintf("Computing %.2E Simulations...This may take a while.\nThis message will close when complete",NbSim);
box = msgbox(compStr);
fprintf("Starting %.2E simulations on %G elements. Please Wait...",NbSim,N);
for i=1:NbSim
    increments = term1 + term2*randn(N,1);
    LogPaths = sum([logS0; increments]); % scalar: terminal log-price of path i
    ST(i) = exp(LogPaths);
end
disp("Simulations done");
close(box)
%a bit of calculus
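A small aside on the loop body: concatenating [logS0; increments] on every iteration allocates a fresh (N+1)-element vector just to sum it. The same terminal price follows from adding logS0 to the sum directly; a sketch:
for i = 1:NbSim
    increments = term1 + term2*randn(N,1);
    ST(i) = exp(logS0 + sum(increments)); % same result, no per-iteration concatenation
end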
It took my computer hours to compute 10e6 simulations, so I tried to make it faster by using only arrays. With my test simulation count this method proved much faster; however, I forgot how much memory large arrays take.
%same as above
ST=zeros(1,NbSim);
logS0mat = ones([1,NbSim])*logS0;
increments=term1+term2*randn(N,NbSim);
LogPaths = sum([logS0mat;increments],1);
SPaths = exp(LogPaths);
ST = SPaths(end,:);
%calculus
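Back-of-envelope on why the all-at-once version hits memory: the increments matrix alone holds N*NbSim doubles at 8 bytes each.
% Rough memory footprint of the full increments matrix
N = 8640;
NbSim = 10e6;
fprintf('increments alone: ~%.0f GB\n', N*NbSim*8 / 2^30) % ~644 GB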
Once I tried running the actual simulation sizes I realized this approach would not work. That led me to my third approach: "chunk" the data into smaller pieces and do the array math on those.
%same as above
NbSim=10e6;%10e8
chunkSize = 1e4; % depends on NbSim and N; pick a size your machine can manage (bigger = faster, but more memory)
simChunk = NbSim/chunkSize; % make sure this divides evenly
ST=zeros(1,NbSim);
increments = zeros(N,chunkSize);
logS0mat = ones([1,chunkSize])*logS0;
%start chunk sim
compStr = sprintf("Computing %.2E simulations in chunks of %.2E.\nThis may take a while (%d iterations).\nThis message will close when complete.",NbSim,chunkSize,simChunk);
box = msgbox(compStr);
for i=1:simChunk
    increments = term1 + term2*randn(N,chunkSize);
    LogPaths = sum([logS0mat; increments], 1);
    SPaths = exp(LogPaths);
    x = (i-1)*chunkSize + 1;
    ST(x : x+chunkSize-1) = SPaths;
end
disp("Simulations done");
close(box);
%calculus
This was slower than my first approach. From a little testing it seemed that the randn call was eating most of the time (about 6 s for that line alone). I was hoping to run the simulation for NbSim = 10e8, but as it stands now it will take far too much time.
Why am I not seeing the same performance gains I saw between my loop and my array method (tested for smaller NbSim)? Is there a better way?
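One idea I have not tested: spread the chunks over workers with parfor (needs the Parallel Computing Toolbox). A rough sketch; note that each worker gets its own random stream, so the required randn('seed', 8128) reproducibility is probably lost:
STchunks = cell(1, simChunk);
parfor i = 1:simChunk
    increments = term1 + term2*randn(N, chunkSize);
    STchunks{i} = exp(logS0 + sum(increments, 1)); % 1-by-chunkSize terminal prices
end
ST = [STchunks{:}];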
Many thanks :)
MATLAB Version: 9.8.0.1323502 (R2020a)
Accepted Answer
the cyclist
2020-4-4
I did not try to run your code, but did some independent testing.
I ran in chunks of 1e6, which empirically seemed about optimal. (Running without chunks was about 20% slower, and, as you say, hits memory limitations.)
I find that MATLAB consistently generates about 1e6 values in 0.03 seconds (when seeded as you are required to).
You are hoping to generate roughly 1e4 * 1e8 = 1e12 values, so that would take about 3e4 seconds, or roughly 8 hours.
I don't think there's any way around the fact that you are trying to generate a boatload of random numbers.
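The per-draw rate is easy to check on your own machine; a minimal timing sketch under the same v4 seeding:
rng(8128, 'v4') % same legacy generator the assignment requires
t = timeit(@() randn(1e6, 1)); % seconds per 1e6 draws
totalDraws = 8640 * 1e8; % roughly 1e12 values for the full run
fprintf('%.3f s per 1e6 draws -> ~%.1f hours total\n', t, t*totalDraws/1e6/3600)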