How do I pass off a Simulink application command syntax to MATLAB Job Scheduler on a local cluster?
Ewen Chan
2019-9-5
I have a MATLAB/Simulink application that I normally would call from the MATLAB command window with:
app inputfile.m
And then it would run.
If I want to pass that off to the MATLAB Job Scheduler, how would I go about doing that?
I tried following the help documentation: using the cluster profile, creating the parallel job, and then creating the task with createTask(j, F, N, ...).
I put "app inputfile.m" in as the function argument to createTask, and that didn't work.
What would be the better way for me to pass off "app inputfile.m" to MJS?
Your help is greatly appreciated.
Thank you.
Answers (4)
Edric Ellis
2019-9-6
Where do the results from running app inputfile.m go? Does this create variables directly in the base workspace?
To use app together with createTask, you'd probably need to call it like this:
createTask(job, @app, 0, {'inputfile.m'})
Alternatively, it may be simpler to use batch:
job = batch('app inputfile.m');
wait(job), load(job)
If that does work, then you can simply call batch multiple times.
An alternative might be to use parfor to run app in a parallel loop. But that depends a bit on how the outputs of app are produced...
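For reference, here is a minimal sketch of the createTask route end to end. It assumes app accepts the file name as a character-vector argument (the command syntax "app inputfile.m" is equivalent to app('inputfile.m')) and that app returns no outputs, since it writes its results to disk. Whether app actually behaves this way on a worker is an assumption.

```matlab
% Sketch only: "app" is the in-house application from the question.
c = parcluster();                            % default cluster profile
job = createJob(c);                          % independent job
createTask(job, @app, 0, {'inputfile.m'});   % one task, 0 output arguments
submit(job);
wait(job);                                   % block until the job completes

% Show any error message the task produced on the worker.
task = job.Tasks(1);
if ~isempty(task.ErrorMessage)
    disp(task.ErrorMessage);
end
```

The function handle plus a cell array of arguments is what createTask expects; passing the whole command string "app inputfile.m" as the function is what fails.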
10 comments
Ewen Chan
2019-9-6
Edric:
Thank you.
The results are written to files/disk. I don't know enough about the app itself as I recently got transferred into another group such that I am just an end user to this legacy app, so no further development work is being performed on it.
Will the batch command send the job to MJS or will it just run itself in the background?
Basically, instead of typing "app inputfile.m" in the MATLAB command window, I want MJS to run that command for me.
Thank you.
Edric Ellis
2019-9-6
The batch command runs whatever you specify on the cluster. If you don't specify a cluster object, it uses the result of calling parcluster() to use your default cluster. The command runs in the background on the cluster, and job gives you a way to get the results back. You might find it useful to read through some of this documentation: https://uk.mathworks.com/help/parallel-computing/simple-batch-processing.html
If app produces results by writing them to disk... then that does complicate matters a little. You'll need to carefully specify things for app so that it doesn't try to write to the same location multiple times simultaneously. You'll also need to ensure that it writes the files somewhere you can pick them up later (if you start off using the 'local' cluster, then things are simpler since everything is happening on your computer).
Ewen Chan
2019-9-6
So... here is what I was thinking (again, drawing from my previous experience as a CAE/FEA engineer):
We used to pack up our simulations into a zip file and then upload it to the HPC cluster and then it will unpack said zip files (which will contain all of the input files it needs for the run) and then run it, and then zip up the result files back up, so that we can download it and post-process the results.
The runs that I've got now are almost all entirely self-contained within their own directories. The output files take the name of the input file and are written to disk as inputfile.csv (for example). So if I want to perform different runs in the same directory, I change the name of the input file from inputfile.m to inputfile1.m (for example), and then the outputs are written to disk as inputfile1.csv.
I hope that helps and that makes sense.
If I don't change the name, then the app assumes that I am looking to overwrite the existing results, so that will be on me.
If I use batch, will I see the job/task in the MATLAB Job Monitor?
Yeah, I read through the simple batch processing page as well, but I couldn't make the connection between what it states there (where the syntax was showing the example of a function) to my running a Simulink app that was developed in-house.
Thanks.
Edric Ellis
2019-9-6
When you use batch, it (generally speaking) automates the process of getting your MATLAB code to the cluster, running your script, and getting the results back to you. In your case, some minor additional work might be needed to ensure your "inputfile" is definitely sent to the cluster. That's because you're specifying the name of a script that you want app to run. (In the normal course of events, batch performs a MATLAB dependency analysis on the script/function that you specify - but it can't see inside eval commands - and I presume that is essentially how app is running "inputfile").
If you can tell app explicitly where to put the output file, that might be useful.
Yes, jobs created by batch show up in the Job Monitor. I don't think there's necessarily anything specific about your use of Simulink in the batch job - but it might be necessary to attach additional files to the job so that the cluster workers can load all related models.
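A minimal sketch of attaching files explicitly, in case the dependency analysis misses them. The 'AttachedFiles' option is a documented batch parameter; the model file name legacy_model.slx is hypothetical and stands in for whatever files app actually needs.

```matlab
% Sketch only: legacy_model.slx is a hypothetical file name.
job = batch('app inputfile.m', ...
    'AttachedFiles', {'inputfile.m', 'legacy_model.slx'});

% List the files that dependency analysis attached automatically,
% to see what is already covered.
listAutoAttachedFiles(job);
```

Attached files are copied to the workers with the job, which matters most once the jobs leave the 'local' cluster and run on a machine that cannot see the client's disk.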
Ewen Chan
2019-9-6
"but it might be necessary to attach additional files to the job so that the cluster workers can load all related models."
If it is in the search path of the local MATLAB installation/instance, will I still need to do that or will batch know to go look for those additional files via the PATH that has been set for the MATLAB instance?
Also, with batch, does this mean that I will be able to submit as many jobs to the MATLAB job scheduler as there are physical cores in the system?
So again, the way that I am currently doing things is that if there are multiple runs that I need to perform, I run them manually, sequentially.
All of the runs are independent from each other and they're also all serial runs.
Will I be able to use batch to submit multiple jobs to MJS so that they will be executed in parallel by MJS?
(e.g. my system has a four core processor. Will I be able to submit four jobs to MJS and have MJS pick them all up immediately and run them simultaneously, four serial jobs running in parallel or will it just automate my manual, but serial, sequential execution process that I am currently doing?)
Also, will I need to start multiple instances of MATLAB to submit each job from each MATLAB instance or can I submit all four jobs with a single MATLAB instance?
Thank you.
Edric Ellis
2019-9-6
batch does automatically set up the MATLAB path for the workers. The reference page that I linked to earlier has the full details, see specifically the 'AutoAddClientPath' property - defaults to true.
You can submit as many batch jobs as you wish, and you can submit them all from a single MATLAB desktop session. The cluster will run as many as it can simultaneously, the others will remain queued until they can run.
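A minimal sketch of submitting several independent runs from one session. The input file names are hypothetical and follow the inputfile1.m naming described earlier in the thread.

```matlab
% Sketch only: the input file names are hypothetical.
inputFiles = {'inputfile1.m', 'inputfile2.m', 'inputfile3.m', 'inputfile4.m'};

jobs = cell(size(inputFiles));
for k = 1:numel(inputFiles)
    % Each call returns immediately; the cluster queues or runs the job.
    jobs{k} = batch(['app ' inputFiles{k}]);
end
```

Because each run uses a different input file name, each one writes a differently named output file, which avoids several jobs writing to the same location at once.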
It's not clear to me whether you are going to use MATLAB Job Scheduler (MJS) - which is part of the MATLAB Parallel Server product, or the "local" cluster, which is part of Parallel Computing Toolbox. By default, the "local" cluster runs as many jobs concurrently as you have cores on your machine. More details on the "local" cluster here https://uk.mathworks.com/help/parallel-computing/program-independent-jobs-on-a-local-cluster.html#bq5skz7-1 .
What happened when you tried to run
batch('app inputfile.m')
?
Ewen Chan
2019-9-6
Edric:
I haven't tried running it with batch just yet.
There's a problem with my latest manual run, so I am re-running it now to try and figure out what the issue is. (I am 99.9999% sure that it's not MATLAB related, but related to an error state in my input file, but I'm checking that right now.)
Initially, I'll be using my local cluster.
What I've proposed to my management is that I will hopefully be able to eventually offload the runs from my local laptop to a server that sits in the office permanently and we just submit our jobs to that as a shared resource. The goal is to alleviate some of the problems that we are encountering with running the jobs locally on our laptop (which I can't get into, nor discuss). And being that I've been told that this is a legacy app, so they're not working on actively developing it/making changes to it anymore, I told my management that I am just going to see if I can put a "wrapper" around it so that the idea is that it would be sent to MJS and processed on that and then we just download the results back (either as the entire run directory or something else) so that the app itself no longer has to be developed to enable this. (It's an automated version of me, sitting on the server, submitting jobs.)
My idea is that if I can prove/demonstrate that it works on my local laptop, that that might help convince them to get a server that sits in the office and I just essentially repeat the process, except rather than submitting to the 'local' cluster profile, it would be sending the jobs to the server instead.
This is the idea.
This is also why I am somewhat fixated on the MJS idea because I presume that eventually/ultimately, if the jobs are sent to a remote server and the server is running the app (as though I was sitting in front of it, physically and doing what I'm currently doing, but manually), then it frees my system up for me to do other stuff.
Thank you.
Ewen Chan
2019-9-9
job = batch('app inputfile.m');
wait(job), load(job)
If I submit multiple jobs to MJS/the cluster/job scheduler; is there a way for me to have it automatically pull down the results when the job is finished while I do other things in the MATLAB session?
If I issue the wait(job), load(job) commands right away, it then goes and waits for the job to finish and then gets the results from the job. But I want it to be able to do that whilst I work on other things in the meantime/while the job is running.
What would be the best way to do that?
Thank you.
Edric Ellis
2019-9-10
If you're interactively waiting for things to complete, then perhaps the Job Monitor might be the way forward - there you'll be able to see when things are complete.
Programmatically, you can either call wait with a timeout, or check job.State to see if it is 'finished'.
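A minimal sketch of both options. The 30-second timeout is an arbitrary value chosen for illustration.

```matlab
job = batch('app inputfile.m');

% Option 1: wait at most 30 seconds, then hand control back.
% wait returns true if the job reached the requested state in time.
done = wait(job, 'finished', 30);

% Option 2: check the state whenever convenient, without blocking.
if strcmp(job.State, 'finished')
    load(job);   % pull the job's workspace variables into this session
end
```

Neither option fetches results automatically in the background; the state check just makes it cheap to test whether a job is ready before calling load.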
Ewen Chan
2019-9-10
This would mean that I would have to create an M-script that will submit the job, correct?
The other question that I also have was that in the example (re: submitting a job as batch, by right-click on the M-script file), it automatically assigns the ID.
Is there a way to do that programmatically? i.e. instead of:
job_n = batch('app inputfile.m', ...)
that it will automatically assign an ID to it without having to manually specify the ID.
How would I go about doing that?
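One possible sketch, not taken from the replies in this thread: the cluster assigns each job a numeric ID on creation, so the variable name does not need to carry it. The job object exposes it as job.ID, and findJob can look the job up again later.

```matlab
c = parcluster();                     % default cluster profile
job = batch(c, 'app inputfile.m');

id = job.ID;                          % assigned by the cluster, not by the caller
fprintf('Submitted job %d\n', id);

% Later (even in a new MATLAB session), retrieve the same job by its ID.
sameJob = findJob(c, 'ID', id);
```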
Jason Ross
2019-9-5
12 comments
Ewen Chan
2019-9-5
The Simulink application that I am running is serial only. There is no parallelisation within it and it has already been demarcated as legacy so no further development is going into it.
But, I have to run it over and over again, so if I can throw the jobs over to MJS, then I can run more jobs simultaneously using multiple CPU processing cores rather than performing my series of serial runs sequentially.
Each job is self-contained from each other job, but it uses the same Simulink application.
Ewen Chan
2019-9-9
So I finally got a chance to try batch. It said that it failed because the application is looking for specific folders that are required by the app.
Is there a way to just "automate 'me'" so that I'm not having to sit in front of the console and submit the job to MJS, including knowing where to find all of the directories and files that the application needs?
When I run it in the MATLAB window, this error doesn't happen. So, again, what I'm trying to do is just automate myself so that I can submit jobs and don't have to be sitting in front of the console, babysitting the run.
Thank you.
Sorry that I can't provide more information about the app itself due to IP. So I have to just talk around it generically.
Thank you.
Jason Ross
2019-9-9
There is some automated dependency analysis that happens, but it may not detect everything. But you can probably get the rest of the way there with another couple steps. There is a specific entry here about how to find model dependencies and set them up for parallel processing.
I suspect using the above two approaches you can cover more ground than submitting and finding the next missing dependency.
Ewen Chan
2019-9-9
Jason:
Thank you.
The other question that I have is that right now, my console session has a bunch of directories that are in the search path (which I can find out when I click on 'Set Path' in the ribbon/toolbar).
However, when I type in path in a script and submit that as a batch job, it returns something different (maybe only the defaults).
Is there a way for me to pass the paths that I have from my console session to the batch job?
My less-than-intelligent way of doing it right now is to use addpath(genpath('path')); to add them back in, but I suspect that there has to be a better way of doing this.
Your help is greatly appreciated.
Thanks.
Jason Ross
2019-9-9
batch has the ability to add paths via the 'AdditionalPaths' PV pair, so you could add them that way.
Ewen Chan
2019-9-9
Jason:
Thank you.
So this is where I am a little bit confused, or maybe I just don't understand it sufficiently.
Right now, on my console, when I type in path, it will give me all of the search folders and directories that have been set/added to the path.
But when I do it with the batch job, it doesn't do that.
Is there a way for me to pass that on to the batch job? I know that 'AdditionalPaths' is a way to add it, so my question is: is there a way to pass what I currently have set in my console on to that? (i.e. can I "export" it from the console as a character vector or cell array of character vectors and then have batch "import" it from that?)
In other words, my confusion lies in how I would pass information from my "console" to the batch workers if it doesn't automatically or already take that into consideration.
Would I have to create an M-script file where I set path=path; so that it would then be passed on to the workers?
Your help is greatly appreciated.
Thank you.
Jason Ross
2019-9-9
A cell array is a fine way to pass it in:
'AdditionalPaths' — A character vector or cell array of character vectors that defines paths to be added to the MATLAB® search path of the workers before the script or function executes. The default search path might not be the same on the workers as it is on the client; the path difference could be the result of different current working folders (pwd), platforms, or network file system access. The 'AdditionalPaths' property can assure that workers are looking in the correct locations for necessary code files, data files, model files, etc.
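A minimal sketch of turning the client's current search path into a cell array for 'AdditionalPaths'. Dropping empty, duplicate, and nonexistent entries is an extra precaution added here, not something the documentation requires.

```matlab
% Split the client's search path into one folder per cell.
folders = strsplit(path, pathsep);

% Drop empty entries, duplicates, and folders that do not exist locally.
folders = folders(~cellfun(@isempty, folders));
folders = unique(folders, 'stable');
isDir = cellfun(@(f) exist(f, 'dir') == 7, folders);
folders = folders(isDir);

job = batch('app inputfile.m', 'AdditionalPaths', folders);
```

On the 'local' cluster the workers see the same file system as the client, so the existence check is meaningful; on a remote cluster the folders would also need to be reachable from the workers.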
Ewen Chan
2019-9-9
Jason:
When I set p=path;, it results in a char array of 1x77290 char.
I have no idea if that's working properly because it is still saying that there is an issue with the path.
Thank you.
Jason Ross
2019-9-9
It may be complaining because of duplicate paths or paths that don't exist? You don't mention your platform, but on Windows dealing with drive letters can be finicky, for example.
Also, is it issuing a warning or an error? In the case of a warning you might just want to ignore it for proof-of-concept work, and return to it when you have things generally working.
Ewen Chan
2019-9-9
Jason:
It was an error.
But rather than me defining an additional variable for the AdditionalPaths, I just did this:
job = batch('app inputfile.m', 'Profile', 'local', 'Pool', 0, 'AdditionalPaths', path);
and that seems to be working.
It'll take a little while for my app to run, so I'll just let it go, but it does look like that it is finally running.
I'll have to see if it will produce the outputs and also for me to be able to get them from the job, but at least it is starting the run, so this looks like it's promising progress.
Thank you.
Ewen Chan
2019-9-9
Yes, I am running this on Windows, and to make matters worse, I think that some of the paths might be a mixture of OSes.
But again, it looks like that with the batch submission syntax that I have now, it appears to be working, so it looks like that it was able to take the mixed OS syntax.
Thanks.
Ewen Chan
2019-9-9
Tangential question - my MATLAB R2015a installation is missing the mdce.bat from C:\Program Files\MATLAB\R2015a\toolbox\distcomp\bin.
That file is apparently required to start the mdce service so that I can start MJS.
How can I "backfill" that missing file?
I'm not really sure why it's missing, but it looks like that is needed to start MJS.
Once again, your help would be greatly appreciated.
Thank you.
2 comments
Jason Ross
2019-9-9
When you run the installer, make sure that MDCE is selected and it should be there. When you set up the MDCE installation it's a best practice to do an all products install so you can service any inbound request from any arbitrary user.
If you try and backfill files manually it can become a tedious nightmare very quickly -- the installer knows how to do "the right thing".
Ewen Chan
2019-9-9
Ahhh....I might not have much in the way of control over that then because that's handled by the IT administration and the installers are only downloadable by IT, and not by/for end users.
Hmmm....bummer.
Let me talk with my IT to figure out what's going on with that.
Thanks.
Ewen Chan
2019-9-10
So the application ran and I was able to get the results that were written out by the application, but at the end, when MATLAB tries to load the results from the job, this is the error message that I get:
Error using parallel.Job/load (line 33)
Error encountered while running the batch job. The error was:
The task result was too large to be stored.
Caused by:
Error using parallel.internal.cluster.FileSerializer>iSaveMat (line 281)
Data too large to be saved.
Any ideas on how I might be able to resolve this and/or get around it?
Thank you.
4 comments
Edric Ellis
2019-9-10
If you're using the 'local' cluster type, this can occur if your output data is too large to save in a MAT file using your default MAT file format. You can change your default version to 7.3 by following the instructions here https://uk.mathworks.com/help/matlab/import_export/mat-file-versions.html - that should fix things. (You'll need to change the preference before submitting the job).
This problem should not occur when using an MJS cluster, which automatically knows how to store outputs that are >= 2 GB.
Ewen Chan
2019-9-10
Edric:
Given that my application has been designated by the company as a legacy application, and therefore no further development work is being put into it, is that something that I can do as part of the job submission process, or something that happens outside of the app?
This error state does not exist nor manifest when I run the app manually, so I suspect that it has something to do with how the 'local' cluster is handling the saves, which also suggests that there should be a way that I can change how the job submission is being handled in regard to the local cluster so that this won't be an issue any more.
My theory is that because I don't encounter this issue when running the app manually but do encounter it when I use batch, it must be something in how batch handles things when submitting to the 'local' cluster profile.
Your help is greatly appreciated.
Thank you.
Edric Ellis
2019-9-10
It is indeed to do with how the 'local' cluster is saving data. But that is derived from your user-level MAT file preferences. If you change your default MAT file version to v7.3, then (unless you specify otherwise), all MAT files created by MATLAB, including those used by the 'local' cluster, will be able to store >= 2 GB. The link I provided should give you instructions as to how to do that.
The reason you don't see this when running your application locally is that the results are returned to you in RAM. Any cluster type has to store the results of a job in some persistent storage somewhere so that you can pick them up later. 'local' uses MAT files on disk for this.
Ewen Chan
2019-9-10
Edric:
Thank you.
Sorry, I had missed that.
I thought that I would have to edit my Simulink application code to enable that.
My apologies.