Loop through files of remote server

I'm using MATLAB to run simulations on a cluster. The jobs are submitted from a local machine, through MATLAB, to the cluster.
A sample submission script might look like this:
%% Get a handle to the job scheduler
sched = findResource();
%% Create a job
job = createJob(sched, 'FileDependencies', {'Analysis.m'});
%% Create the tasks
filelist = dir('/dir1/dir2/');
for tidx = 1:length(filelist)
    tasks(tidx) = createTask(job, @Analysis, 1, {tidx});
end
%% Submit the job
submit(job)
I'm now trying to get a list of files that live on the cluster, loop through them, and run a script, say Analysis.m, on each one.
How can I build the file list on the cluster, rather than on the local machine that schedules the job, and then pass each file to Analysis.m one at a time?

Accepted Answer

Edric Ellis 2011-10-10
Perhaps a PARFOR loop on the cluster would be simplest. Something like this:
function x = doStuff
% list the files on the cluster
d = dir( '/path/on/cluster/*.dat' );
% loop over the files - this will spread the work
% among the workers
parfor ii = 1:numel( d )
    x(ii) = someFcn( d(ii).name );
end
Then, you need to submit 'doStuff' as a matlabpool job to the cluster, something like this:
job = createMatlabPoolJob( sched, 'MaximumNumberOfWorkers', 4 );
createTask( job, @doStuff, 1, ... );
Or, in this situation, you can even use the BATCH command to submit the job:
job = batch( @doStuff, 1, {}, 'Matlabpool', 4 );
4 comments
CP 2011-10-10
OK, my attempt is below. It seems one of the workers exited, and the local MATLAB session showed the following message:
"Field reference for multiple structure elements that is followed by more reference blocks is an error"
PrevTargAnalyze.m, the function being called, does not use any structures, so any idea what is causing the message? (The output of the dir command is probably the only structure involved.)
%%%%%%%%%%%%%%%%%%%%%%%%
%%----startJobs.m-----%%
%%%%%%%%%%%%%%%%%%%%%%%%
sched = findResource();
nlabs = 4;
job = createMatlabPoolJob(sched, 'FileDependencies', {'doJobs.m','PrevTargAnalyze.m'}, 'MinimumNumberOfWorkers', nlabs, 'MaximumNumberOfWorkers', nlabs);
task = createTask(job, @doJobs, 1, {});
submit(job)
waitForState(job, 'finished')
if ~isempty(task.ErrorMessage)
    disp(task.ErrorMessage)
else
    y = getAllOutputArguments(job)
end
%%%%%%%%%%%%%%%%%%%%%%%
%%-----doJobs.m------%%
%%%%%%%%%%%%%%%%%%%%%%%
function x = doJobs
dirlist = dir('/scratch/harry/SpatialWM/delay/');
for idxd = 1:length(dirlist)
    %grab current target neuron upper and lower range from directory name
    [i j] = strread(dirlist.name(idxd), '%s %s', 'delimiter', '-');
    %Set slice time to analyze
    sliceTime = 2500;
    filedir = ['/scratch/harry/SpatialWM/delay/' dirlist.name(idxd)];
    filelist = dir([filedir '*.dat']);
    parfor idxf = 1:length(filelist)
        PrevTargAnalyze(['/scratch/harry/SpatialWM/delay/' dirlist(idxd).name '/' filelist(idxf).name], i, j, sliceTime);
    end
end
CP 2011-10-12
OK, I managed to fix everything and got this to work. The only remaining issue is that it is limited to 16 workers per job (a server setting). How can I split the work into several jobs using this method?
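One possible starting point, as a sketch only: divide the file list into groups before creating the jobs, so that each job works on its own group. The file names and the number of jobs below are made up for illustration, and only base MATLAB functions are used.

```matlab
% Hypothetical file names; on the cluster these would come from dir
names = arrayfun(@(k) sprintf('run%02d.dat', k), 1:10, 'UniformOutput', false);

nJobs = 3;   % assumed number of jobs to split the work across

% Assign file k to group mod(k-1, nJobs) + 1 (round-robin)
groupOf = mod(0:numel(names)-1, nJobs) + 1;

% Collect the file names that belong to each group
chunks = cell(1, nJobs);
for g = 1:nJobs
    chunks{g} = names(groupOf == g);
end
```

Each chunks{g} could then be handed to its own job, for example through the cell array that is the last argument of createTask in the scripts above, with doJobs changed to accept a file list. That change to doJobs is an assumption and is not confirmed anywhere in this thread.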


More Answers (1)

Walter Roberson 2011-10-10
You have dirlist.name(idxd), but that should be dirlist(idxd).name.
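To illustrate the difference, here is a small sketch that uses a hand-built struct array in place of real dir output (the folder names are made up for the example):

```matlab
% Hypothetical stand-in for the output of dir: a 1x2 struct array
dirlist = struct('name', {'10-20', '30-40'});

% Index the struct array first, then read the field
second = dirlist(2).name;    % the char vector '30-40'

% dirlist.name expands to one value per element, so indexing into that
% expansion is an error when dirlist has more than one element
try
    bad = dirlist.name(2);
    failed = false;
catch err
    failed = true;
    disp(err.message)
end
```

The exact wording of the error message depends on the MATLAB release, so the sketch only prints it.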
2 comments
CP 2011-10-10
Thanks. I then got the message: Output argument "x" not assigned. I fixed that by changing function x = doJobs to function doJobs(), and now I'm getting "Too many output arguments." This is extremely hard to debug without line numbers. Is there some way I can retrieve them? I don't want to spam the forum with every trivial error.
CP 2011-10-10
I also tried reverting to:
function x = doJobs
and adding an output to the PrevTargAnalyze function as well, then doing x = PrevTargAnalyze inside the parfor loop, and it still complains that x is not assigned =/
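A possible explanation, offered as a sketch rather than a confirmed diagnosis: a variable that is assigned as a whole inside a parfor loop (x = ...) is a temporary variable and does not exist after the loop, whereas an element indexed by the loop variable (x(idxf) = ...) is a sliced output that is kept. The file names below are made up, and numel stands in for the real analysis function.

```matlab
% Hypothetical file names; on the cluster these would come from dir
names = {'a.dat', 'bb.dat', 'ccc.dat'};

% Preallocate so that x exists even when the file list is empty
x = zeros(1, numel(names));

% x(idxf) is a sliced output variable, so it survives the parfor loop;
% a plain "x = ..." inside parfor would be a temporary variable
parfor idxf = 1:numel(names)
    x(idxf) = numel(names{idxf});   % stand-in for PrevTargAnalyze(...)
end
```

With the preallocation in place, x is assigned on every path through the function, which is what the "not assigned" message is about.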
