getAllOutputArguments only returns one result per node, not per core
I'm running a parallel job on an SGE cluster, asking for 48 workers. I've set procsPerNode = 8 inside parallelSubmitFcn.m, so the job should use 6 nodes, and indeed it does. I can see that while it's running.
The problem is that the result obtained from getAllOutputArguments contains values for only 6 entries, as though there were only 1 worker per node.
My code simply returns 'labindex', and so the result should just be the integers 1..48. Below is the parallel job object, followed by the results, after the run. As you can see, it claims to have run all 48 tasks. However, the result only contains the first 6.
What's going on?
Thanks
-Don
--------------------------------------
pjob =
Parallel Job ID 144 Information
===============================
UserName : don
State : finished
SubmitTime : Tue May 08 15:32:20 EDT 2012
StartTime : Tue May 08 15:32:21 EDT 2012
Running Duration : 0 days 0h 0m 3s
- Data Dependencies
FileDependencies : /Users/don/math/MVPA/donsPause.m
PathDependencies : {}
- Associated Task(s)
Number Pending : 0
Number Running : 0
Number Finished : 48
TaskID of errors :
- Scheduler Dependent (Parallel Job)
MaximumNumberOfWorkers : 48
MinimumNumberOfWorkers : 48
>> getAllOutputArguments(pjob)
ans =
[1] [2] [3] [4] [5] [6] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
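The pattern in that output can be checked programmatically rather than by eye. The following is a minimal sketch, not part of the original post: it rebuilds the 48-element result by hand from the values shown (the variable names `out` and `missing` are made up for illustration) and counts how many tasks returned something. On a real job, `out` would instead come from `getAllOutputArguments(pjob)`.

```matlab
% Rebuild the result shown in the post: 6 filled cells followed by 42 empty ones.
% (Hand-constructed stand-in for the value returned by getAllOutputArguments.)
out = [num2cell((1:6)'); cell(42, 1)];

% Logical mask marking the tasks whose output is empty.
missing = cellfun(@isempty, out);

fprintf('%d of %d tasks returned a value\n', nnz(~missing), numel(out));
fprintf('first empty task index: %d\n', find(missing, 1));
```

If the number of non-empty entries equals the number of nodes rather than the number of workers, that points at how workers are being launched per node rather than at the task code itself.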
Accepted Answer
More Answers (1)
Thomas
2012-5-14
I just checked my parallelSubmitFcn file for our SGE cluster. We keep
procsPerNode = 1;
SGE has a different way of thinking about nodes. I run as many as 128 processes and get the results back correctly. We have a mixture of hardware, with some generations having 8 processors per node and others having 12. Setting procsPerNode = 1; lets everything work correctly with SGE, and SGE can use whatever processors remain on each node after other applications have taken a couple of them. Your systems may vary, but this works for us and allows us to backfill jobs. :)
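To make the arithmetic behind the two settings concrete, here is a small sketch. It assumes the node count is derived as ceil(workers / procsPerNode); that formula and the helper name `nodesFor` are assumptions for illustration and are not taken from parallelSubmitFcn.m, whose contents are not shown in this thread.

```matlab
numWorkers = 48;  % number of workers requested in the question

% Hypothetical helper: assumed relationship between workers, procsPerNode and nodes.
nodesFor = @(procsPerNode) ceil(numWorkers / procsPerNode);

fprintf('procsPerNode = 8 -> %d nodes\n', nodesFor(8));
fprintf('procsPerNode = 1 -> %d nodes\n', nodesFor(1));
```

Under that assumption, procsPerNode = 8 gives 6 nodes, which matches the 6 results Don got back, while procsPerNode = 1 makes the scheduler treat each of the 48 workers as its own unit.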
4 Comments
Thomas
2012-5-14
Don, I'm not sure how you define nodes and processors in your cluster. We define a node as a physical machine taking 1U of rack space. It usually consists of two quad-core or hex-core processors, which gives 8 to 12 processors per node.