Converting script to function causing significant slowdown?
I have an inherited project (roughly 5k lines) where the previous author has set everything up as several scripts that all execute in the base workspace, example structure:
- Main
- - Script 2
- - Script 3
- -- Script 3b
etc.
The general application flow is to load in parameters from Excel (currently that is used as a 'front end'), run the calculations, then export the data (both to Excel and other formats). The entire processing time varies with the input parameters, but in general takes ~80 seconds to complete on my machine.
This application eventually needs to be compiled. I'm able to run it as a deployed application and the processing time is approximately the same, no issue there.
However, I need to call the compiled application directly from Excel and pass it parameters while doing so. To accomplish this I simply converted the main.m script into a function, e.g.:
function main(input1, input2)
When I do this, however, the application takes significantly longer to run in every configuration I try (deployed, called from Excel, native Matlab/non-deployed): from ~80 seconds to 10+ minutes.
My question is, am I doing anything fundamentally wrong here just converting main.m to a function like I did? If so, is there a better way to pass parameters to a deployed application? (my ultimate goal) Or does anyone know why changing just my main.m into a function from a script would cause a huge slowdown?
Thanks in advance for any insight..
Win 10, Matlab 2013b (currently for compatibility issues with this project)
21 comments
Rik
2018-5-9
That slowdown sounds odd. In my experience small functions sped up the process, compared to a single big script. Although I have to admit the only project I actually timed it on was smaller than 500 lines of actual code (so not counting comments and function help). There we managed to increase the fps from single digits to 15-25.
I don't have any solutions/answers/suggestions, so I would be interested to see if others do.
John D'Errico
2018-5-9
If you execute the function in MATLAB, not by calling it from excel, does it take the same 80 seconds, or does that also take 10 minutes? I'm trying to resolve whether this is a problem with the interface with Excel, with the compiled code, or with the conversion from a script into a function. In the last case, that seems to make no sense at all, from what little you have described. So I want to pin down exactly where the extra time comes in.
I would suggest to do some monitoring, trying to look to see exactly where the time is spent.
Greg
2018-5-10
Can you run the profiler with main as both script and function, then share any results that look significant?
profile on
main; % or main(input1,input2) for function mode
profile off
profile viewer
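For sharing results as text rather than screenshots of the viewer, something like the following sketch could list the most expensive entries. It assumes the struct returned by profile('info') with its FunctionTable field; the two workload lines are only a stand-in and would be replaced by the call to main.

```matlab
profile on
A = magic(200);                 % stand-in workload; replace these two lines
B = A * A';                     % with main or main(input1, input2)
profile off

stats = profile('info');        % struct holding the collected statistics
tbl   = stats.FunctionTable;    % one entry per profiled function or script

% Sort entries by total time, slowest first, and print the top few.
[~, order] = sort([tbl.TotalTime], 'descend');
nShow = min(10, numel(order));
for i = 1:nShow
    f = tbl(order(i));
    fprintf('%-40s calls: %8d  total: %8.3f s\n', ...
            f.FunctionName, f.NumCalls, f.TotalTime);
end
```

Running this once in script mode and once in function mode and comparing the two listings should show whether the extra time is concentrated in a few entries or spread across all of them.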
Nick Heinz
2018-5-10
The slowdown occurs when run from Matlab in function mode as well. So it seems it's the conversion from script to function that is causing it, not the Excel link or compiling.
I'm about to do some more testing on it, I'll update as I get some answers from the profiler, etc.
Nick Heinz
2018-5-10
So after going over the profiler results I'm still pretty stumped. There's no 'child' script or function that jumps out as the culprit, it just seems that every single calculation takes significantly longer.
The "setup" portion of the application takes approximately the same time (still slower, but at least same order of magnitude), but the calculations are slower by ~15-20x.
I finally ran through the whole program in "function mode" and actual timing is 22 min to complete as opposed to the 10 min guess I made in the initial post. I've double checked that the same number of iterations and function calls are being run between the two modes, and they are (the calculations portion is an iterative solver, while loop until some tolerance is met).
To cause this ~15x slowdown all I have to do is comment or un-comment the line:
function main(input1, input2)
@Greg - there was barely any noticeable difference in the clearing time, 0.20s vs 0.23s.
So at this point I'm still pretty lost. Does anyone have any suggestions as to how I can pass parameters (really just the Excel file path, and row number for the setup information), without converting to a function?
I'd even entertain crazy things like writing the parameters to a text file in a fixed location from vba and reading them in inside the script... I'd just prefer not to get to this level of hack/ work-around if possible.
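A minimal sketch of that text-file idea, seen from the MATLAB side, might look like the following. The file name main_params.txt, its location in tempdir, and the two-line layout (Excel path, then row number) are all assumptions made for illustration; the first few lines only simulate what the VBA side would write.

```matlab
% Hypothetical parameter file; name, location and layout are assumptions.
% Line 1 = Excel file path, line 2 = row number of the setup information.
paramFile = fullfile(tempdir, 'main_params.txt');

% Demonstration only: create the file that VBA would normally write.
fid = fopen(paramFile, 'w');
fprintf(fid, '%s\n%d\n', 'C:\data\inputs.xlsx', 12);
fclose(fid);

% At the top of the main script: read the parameters if the file exists.
if exist(paramFile, 'file') == 2
    fid = fopen(paramFile, 'r');
    excelPath = fgetl(fid);                 % first line, kept as text
    rowNumber = str2double(fgetl(fid));     % second line, converted to a number
    fclose(fid);
    delete(paramFile);                      % avoid reusing stale parameters
else
    error('main:noParams', 'Parameter file not found: %s', paramFile);
end
```

Because main stays a script, excelPath and rowNumber end up in the base workspace like every other variable, so the rest of the code does not need to change.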
Philip Borghesani
2018-5-10
A couple of more things to check or pieces of information that might be helpful:
- Can you try it on a recent version of MATLAB? The timing may be significantly different.
- Does main call the other scripts multiple times? In a loop? If so then you may be much better off turning those into functions too or inlining them in main.
Nick Heinz
2018-5-10
Edited: Nick Heinz
2018-5-10
Ok, so I've tried it on several versions of Matlab. Unfortunately due to a necessity for 3rd party 32bit COM libraries during the export part of my application I'm stuck with 2015b or older (I've started looking into work-arounds, but haven't tried any yet, suggestions??).
2015b in both script and function mode is just as slow (15x slowdown). It doesn't seem to make a huge difference which 'mode' I run in when using 2015b, though function mode is marginally slower.
Just for fun I went and installed 2018a. This actually runs faster in script mode than in 2013b script-mode, but like I mentioned above, I can't export to the proprietary data format I need to without some 32bit COM workaround.
I then ran 2018a in function mode, and it was over 2x slower than in script mode. So it seems the issue is still around, whether it is my code or just a difference in how Matlab treats the two types of files.
@Philip - Yes, main calls other scripts (and some functions along the way) multiple times (in several big nasty while loops). I inherited this code and the project is the largest I've worked on, and I unfortunately just don't have the time right now to completely refactor the whole thing, though I do hope to do this at some point.
This testing got me wondering, though, whether Matlab for some reason allows more or fewer system resources for a function than for a script. Does anyone know? I'm still not really sure why keeping all variables in the 'base' workspace is faster in every tested case than keeping all variables inside a 'main' function, or even why there is a difference. Is there some behind-the-scenes variable passing going on even if everything is inside the main function?
At this point I'm under a deadline and am looking for a quick work-around until I can go back and fix it properly later (full refactor/ restructure).
I have a couple of thoughts. One would be if anyone knows how to put variables into a deployed application's workspace, like you would with the Matlab automation server and
Matlab.PutWorkspaceData("var1", "base", var1)
called from VBA, for instance. This would definitely be more elegant than my hack idea of creating a text file in VBA, checking in my script whether that file exists, and then reading the parameters from it, but honestly that's likely my next try.
Greg
2018-5-10
This is a fun one; I wish I had the code to explore.
I'm not familiar with calling MATLAB from external sources, but isn't there some mechanism for evaluating the .m file? If so, can't that same mechanism create 2 variables before calling the .m file?
% Instantiate MATLAB from VBA
% Use whatever is the correct syntax from the VBA side...
input1 = 7;
input2 = 8;
% As long as it's a script, it should have access to the
% "input1" and "input2" variables.
myScript;
Nick Heinz
2018-5-10
Yeah sorry, would love to put the code out there, but it's proprietary.
Thanks Greg. I do know how to instantiate Matlab (automation server) from vba, put in the vars input1 and input2 and then run the script. That's no issue.
However, I don't know how to do this when the application is compiled, which is my ultimate goal. Recently I've just been testing the whole script vs function thing inside of matlab for convenience.
I think it's related to the Matlab Runtime User Data Interface, but I'm not sure how to call this from VBA and I'm having trouble finding info on it. Or maybe I'm way off. Anyone know if I'm pointed in the right direction?
Greg
2018-5-11
So, since we're way deep in the weeds and grasping at straws here, what happens if you use a wrapper? Leave main as a script, then use:
function main_wrapper(input1,input2)
main
Nick Heinz
2018-5-11
@Greg - Thanks for the suggestion, unfortunately there's the same slowdown when run like that. So it appears to have some slowdown whenever it's inside of a function workspace, no matter if it's the top level or not.
I ended up implementing my text file hack/ work-around last night, it at least allows me to pass parameters to the deployed application (albeit in a very roundabout manner). I'm waiting for a response from the Mathworks on how to call the Matlab Runtime User Data Interface so that I can do it a little more elegantly, especially since it will help me for other projects.
Thanks for all the help, but it sounds like no one is sure what's causing this, which is concerning since the same issue persists from 2013b up to 2018a.
Nick Heinz
2018-5-12
So I heard back from Mathworks concerning the Matlab Runtime User Data Interface:
"If you would like to pass variables from VBA directory into MCR workspace, then, unfortunately, this is not a supported workflow. The MATLAB Runtime User Data Interface just provides API for C wrapper but not the standalone application. And for the standalone application, the MCR cannot be launch outside the standalone application."
Sounds like I can't do what I would like to do, so I will have to stick with my text file workaround.
This still doesn't answer the question why running as a function is significantly slower than running as a script, but at this point I need to move on until I can find the time to try to completely restructure the application. Unfortunately I don't have a lot of confidence now that all the work required for the restructuring will actually result in a faster run-time. Would hate to put in all that effort to have the end product be slower..
Greg
2018-5-12
Edited: Greg
2018-5-12
At the risk of stereotyping, I suspect a developer that produced 5000 lines of code in 3 separate scripts also ignored a bunch of other best practices throughout those 5000 lines. I would bet a full restructuring would result in a lot of improvements, including runtime.
Could you provide the total number of variables in the workspace and rough distribution of bytes (memory per variable) after running the code? I'm tempted to write some bogus code to see if it's purely due to the magnitude of the situation.
Nick Heinz
2018-5-12
Oh, I would totally agree, there's quite a few things I would like to change about it.
Sorry, the initial Main, Script2, Script3, Script3b was just to illustrate the structure. The actual program is more like 45 files, mostly scripts, some functions (maybe 5). In my testing configuration there are about 830 variables - a mix of struct, cell, griddedInterps, doubles, char. Here's the output of whos with all the variable names redacted (I apologize for the excessive secrecy, just trying not to get in trouble with my company).
Let me know if there's anything else I can do to help replicate the issue. I eventually do plan on restructuring, but it will likely be some time until I can get to it.
Greg
2018-5-13
In short, my first lazy attempt didn't produce any meaningful results. No change in execution between script mode and function mode.
In further detail, I wrote some code to write some bad code for me. It's hard to produce 5000 lines of meaningful fake code. I created 45 files with a normal distribution of # of lines of code to sum up to about 5000. Each script created 17 of the 765 double variables found in your allVars.mat to correct size using randi. I skipped all of the other variables for first go around. I then randomly nested calls from one script to the next and wrapped it all in a main. See my attachment if you're curious.
I may try to fill in the 5000 lines with (very slightly) more realistic code, but that will take longer.
Jan
2018-5-13
@Greg: Wow. Programming from hell. I've opened your code directly after my late and comfortable breakfast. If I have an enemy, I would print out the created code and send it by snail mail. Brrrr.
"It's hard to produce 5000 lines of meaningful fake code."
Thanks for solving this job heroically. I cannot vote up your comment.
Philip Borghesani
2018-5-14
Greg, I did not look at your code but think you should concentrate on the number and allocation/creation locations of the different variables.
Nick, if you can turn your most heavily used scripts into nested functions in main, it may run significantly faster, and it should not be too difficult given that nested functions have full access to the enclosing workspace's variables. Accessing calling-workspace variables from a script can be quite slow: the lookup time is linear in the number of variables created in the code of the calling function, but I believe not in the variables created in called scripts or evals.
Base workspace variables can be found much faster from scripts. The reason is that all script code is compiled so that it can be run from any function, and so it has no direct access to function workspace variables at compile time. We are working on optimizing this, and recent MATLAB versions are quite a bit better, but without problematic examples we are not sure what to concentrate on...
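To make the nested-function suggestion concrete, here is a small sketch under stated assumptions: main_nested_demo, accumulate, total and step are hypothetical names, and accumulate stands in for one of the heavily used scripts.

```matlab
function total = main_nested_demo(nIter)
% Hypothetical stand-in for main: the shared variables live in this workspace.
    total = 0;
    step  = 0.5;

    for k = 1:nIter
        accumulate();      % previously a script call inside the loop
    end

    function accumulate()
        % Nested function: reads and writes total and step directly,
        % with no argument passing, much like the original script did.
        total = total + step;
    end
end
```

Calling main_nested_demo(4) returns 2. One caveat worth checking before converting: a function that contains nested functions has a static workspace, so any scripts still called from it cannot create variables that are not already assigned somewhere in the function's own code.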
Greg
2018-5-15
I tried adding some slightly less trivial filler code, and adding in a couple more of the non-double variables. And I installed R2013b as well. Still no luck reproducing the issue.
Nick: can you provide any more detail on the struct and the large cell array? I'm wondering if they are key components because of the relatively large memory size. Also, in general, would you say there is much cross-script variable access? For example, does script3 use or manipulate anything created in script2?
Nick Heinz
2018-5-21
Thanks for continuing to look into this, I really appreciate it. Just getting a chance to work on this project again, the last few days have been extremely busy.
The large cell array is just used in the 'setup' / parameter loading portion of the app, and from the profiler tests I've done this portion of the app is approximately the same between the function and script configurations. Along those same lines, the structure variable is only used in the data export portion, so again shouldn't have an effect on the calculation / run speed.
Yes, the way it was coded there is an extreme amount of cross-script access. There is also a significant number of variables that unfortunately can't be pre-allocated because their size is unknown until run time. I've tried to do what I can, but with the current program structure not a whole lot can be done.
If you are still interested in creating filler code, the griddedInterps play a significant role in the iterative solver routine, and each of those lookups is highly nonlinear. I think this contributes more to the number of iterations it takes to solve than to the root problem of why script mode is faster than function mode: I have to use a very conservative iteration method, which results in lots and lots of iterations. I will frequently have ~200k iterations in some of the most nested scripts (3rd-level nesting, if that makes sense).
@Philip - thanks for the suggestion, I will definitely try that first before a full restructure. I guess that makes sense. Like I mentioned before, function mode seems to run much closer to script mode speed in 2018a, but I am unfortunately unable to use this version due to 3rd party COM compatibility (32-bit COM object).
Answers (0)