Techniques of common subexpression elimination, disorganization in MATLAB, etc..
13 次查看(过去 30 天)
显示 更早的评论
Hello friends,
Yesterday I posed a question about performance gain in matlab by using techniques of common subexpression elimination (CSE). CSE, loosely seaking, refers to reducing computational burden by not repeating an algebraic operation in an algebraic expression. Apparently, finding an optimal CSE seems to be an open problem in science (correct me if I am wrong. I wish I am wrong). My question was "Does matlab have any capability to perform CSE in a systematic way prior to doing the calculations?" If you want to evaluate simple expressions like a^2+2*a*b+b^2 then the optimal CSE is to first evaluate t1=a+b and then t2=t1^2 which is just 2 arithmatic operatrions instead of 6 operations a naive man would do. When expressions get much longer it becomes very difficult to perform CSE ourselves and clearly there is a need for an advanced algorithm which does this for us.
After struggling a lot with this problem I noticed that matlab does have some, perhaps elementary, capabilities to do CSE. The problem is that it does not do this automatically (which I do not understand why???). To me this seems a bug in matlab. I explain this with a concrete example.
I am attaching 3 files now. EZ.txt is a text file showing you a long expression. EZ.m is an mfile which is directly calculated by applying the command 'matlabFunction'. EZ1.m is the same function handle I created by typing
matlabFunction(EZ, 'File', 'EZ1', 'Vars', [Dm0, Dm1, Dm2, Ds0, Ds1, Ds2], 'Outputs', 'EZ');
Now, I do the following performance test:
clc;
n=1000;N=10000;
Dm0=rand(1,n);Dm1=rand(1,n);Dm2=rand(1,n);Ds0=rand(1,n);Ds1=rand(1,n);Ds2=rand(1,n);
tic;
for i=1:N
A = EZ(Dm0,Dm1,Dm2,Ds0,Ds1,Ds2);
end
t1=toc;
tic;
for i=1:N
B = EZ1(Dm0,Dm1,Dm2,Ds0,Ds1,Ds2);
end
t2=toc;
t1=t1/N;t2=t2/N;
vpa([t1 t2 t1/t2])
[0.0033211724199999998960453062579745, 0.00038401423999999997590040767825315, 8.6485657927685188894884049659595]
So, on average the mfile EZ1 is 8 times faster than EZ. I have expressions which are 20-30 times longer. When I repeat this test to another expression which is just 2-3 times longer I get 10-15 times performance gain! Now, my questions in bellow:
1) Isn't this a bug in matlab? Especially for people like me (who are just mathematicians with rather low knowledge of computer science) who
are not an expert in matlab. I do expect matlab to always implement CSE no matter how I make the function handle. I simply showed you a rather stupid and crazy example where people suffer from low performance and do not know how to fix it while the problem can be easily fixed if matlab developers would create a better 'communication' between naive users, like me, and matlab by simply displaying a message like "You can get better performance gain if you ..."
2) I feel that matlab should, or might, have 'better' capabilities to translate long expressions into nicer expressions (in terms of performance gain) using CSE techniques that I might not be aware of. Let me know if you are aware of any.
Thanks for your patience!
Best,
Babak
回答(2 个)
sai charan sampara
2024-1-17
Hello Babak,
Common Subexpression Elimination (CSE) is a property of a compiler that aims to identify and eliminate redundant computations within a program, and it is typically implemented as part of the optimization phase in a compiler.
While common subexpression elimination (CSE) offers significant benefits in terms of reducing redundant computations and improving performance, it is not used by most of the compilers in any programming language for several reasons. CSE involves analysing the program to identify common subexpressions and determining where they can be safely eliminated. This analysis is not straight forward and cannot be generalized, especially in the context of optimizing larger codebases. CSE may lead to increased memory usage, especially when storing common subexpressions in memory for reuse. This trade-off between computation and memory usage needs to be carefully considered. Also, when working with floating point numbers or large integers, CSE can give different results from expected values.
0 个评论
Steven Lord
2024-1-17
If you want to evaluate simple expressions like a^2+2*a*b+b^2 then the optimal CSE is to first evaluate t1=a+b and then t2=t1^2 which is just 2 arithmatic operatrions instead of 6 operations a naive man would do.
"optimal" in what sense?
[By the way, your vpa call doesn't do anything useful. It looks like you may think it gives you more accurate timings, but it doesn't. Those expressions have already been computed by the toc function by the time you call vpa.]
And if you're going to perform timing tests, don't write your code as a script file. Generally speaking MATLAB can better optimize code in a function file.
Are you sure that the result of your t2 matches the result of your original expression in general? You're assuming multiplication is commutative; is that a valid assumption for the work you're doing? It's not a valid assumption in MATLAB if the quantities you're dealing with are non-scalar as you can see from the example below, where a and b have a non-zero commutator.
a = [1 2; 3 4];
b = [5 6; 7 8];
commutator = a*b-b*a
E1 = a^2+2*a*b+b^2
E2a = a+b;
E2 = E2a^2
E1-E2
To me this seems a bug in matlab.
In my opinion, it is not a bug. Asking for Symbolic Math Toolbox to perform more common subexpression elimination for symbolic operations sounds like a reasonable enhancement request (particularly if there are particular patterns that come up frequently in the symbolic expressions with which you're working.) But you haven't proven for your simple example that the rewriting you asked for gives the same answer numerically (because it doesn't in general)!
In your larger example, what is A-B?
0 个评论
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Performance and Memory 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!