Segmentation Faults and Thread-Safety in Parfor Loops: Part II

I am currently running multiple repetitions of an experiment that uses MEX files and the CPLEX 12.4 API in MATLAB 2012a. Although the code for my experiment runs perfectly in an ordinary loop, I receive segmentation faults when I run it in parallel using parfor (see my former post: http://www.mathworks.com/matlabcentral/answers/35754-segmentation-faults-when-running-mex-files-in-parallel). I have recently found out that this is due to the putenv() function in the C library, which is used by one of the functions in the CPLEX 12.4 API.
After I posted about this issue on the CPLEX forums, it was suggested that I include <pthread.h> in my MEX file and wrap the non-thread-safe call between pthread_mutex_lock and pthread_mutex_unlock. I have done this by changing:
void mexFunction(...) {
CPXENVptr env = NULL;
int status = 0;
env = myNonThreadSafeFunction (&status);
//... more code
}
to
//include the pthread header and declare a pthread_mutex_t before the MEX function
#include <pthread.h>
static pthread_mutex_t myLock = PTHREAD_MUTEX_INITIALIZER;
void mexFunction(...) {
int status = 1;
CPXENVptr env = NULL;
//wrapped non-thread-safe function with mutex_lock
pthread_mutex_lock(&myLock);
env = myNonThreadSafeFunction (&status);
pthread_mutex_unlock(&myLock);
//... more code
}
Note that I also declare the pthread_mutex_t as a static variable in a standalone MEX file before the parfor loop; this is meant to ensure that it is visible to all the workers inside the parfor loop.
Despite this fix, I still receive segmentation faults and have some questions:
1. Should the approach outlined above work?
2. Some people have suggested that mxMalloc and mxFree are not thread-safe. If so, would it help to wrap calls to these in the pthread mutex as well?
3. Is there any explanation for why I receive these faults when running the experiments in parallel on a Linux cluster, but not when I run the experiments in parallel on Mac OSX 10.7?
4. Something else I can't understand: before I implemented the pthreads fix, the segmentation fault would cause one of the workers in my parfor loop to stall (the parfor loop would just "wait" to hear back from the seg-faulted worker, which would never happen). Now, the segmentation fault actually shuts down the parfor loop and I get the following error message:
Error message: The session that parfor is using has shut down
in line: 47
of file: home/software/matlab/matlab-2012a/toolbox/matlab/lang/parallel_function.m
3 Comments
MK 2012-12-21
Hi Berk, were you able to figure out a fix for this problem? I am facing a similar issue with a third-party MEX file that runs perfectly in a for loop but gives a segmentation fault in parfor. I am also unable to reproduce the segmentation fault with specific inputs that I could share with the MEX file's authors, i.e. it is unpredictable. Any update would be much appreciated!


Answers (1)

Edric Ellis 2012-4-24
Please note that 'thread safety' really shouldn't be the major concern here. When you open a MATLABPOOL, you are starting several completely separate MATLAB worker processes. In fact, the worker processes are started with the '-singleCompThread' option so that there's only 1 computational thread.
In particular, you may need to consider any setup you're doing: you need to do this afresh on each worker process. Perhaps like this:
matlabpool open ...
spmd
setupForMyMexStuff; % runs once per worker process
end
% Now use the mex stuff:
parfor ii=...
useMyMexStuff ...
end
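The same "once per worker process" idea could also be enforced from inside the C code. The following is only a minimal sketch under assumptions: do_worker_setup, ensure_worker_setup and worker_setup_calls are hypothetical names, and the setup body is a placeholder for whatever initialization the real MEX file needs. It uses a plain static flag, so it relies on each worker running a single computational thread.

```c
#include <assert.h>

/* Per-process state: every worker process gets its own copy of these. */
static int setup_done = 0;
static int setup_calls = 0;

/* Hypothetical stand-in for the real per-process setup work. */
static void do_worker_setup(void)
{
    setup_calls++;
}

/* Run the setup the first time this is called in a process; later calls
 * in the same process do nothing. Not safe for concurrent threads. */
void ensure_worker_setup(void)
{
    if (!setup_done) {
        do_worker_setup();
        setup_done = 1;
    }
    /* Invariant: the setup has run exactly once in this process. */
    assert(setup_calls == 1);
}

/* Report how many times the setup body actually ran in this process. */
int worker_setup_calls(void)
{
    return setup_calls;
}
```

Calling ensure_worker_setup() at the top of the MEX entry point would then give each worker its own initialization without relying on the caller to remember the spmd block.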
From this and your previous posts, it looks like you're reproducibly hitting the SEGV in the call to 'putenv'. Perhaps you're not setting up the environment variables in some way that your Mex file is expecting.
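To illustrate the point about separate worker processes, here is a small POSIX sketch. It is not MATLAB or CPLEX code, and parent_locks_while_child_holds and demo_lock are hypothetical names. It forks a child that locks a static mutex declared the same way as in the question, then checks whether the parent can still take "the same" mutex. Because each process has its own copy of the static variable, the two locks do not exclude each other.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Declared as in the question: a static mutex, one copy per process. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: returns 1 if the parent can take demo_lock while a
 * forked child is holding its own copy, 0 if it cannot, -1 on setup error. */
int parent_locks_while_child_holds(void)
{
    int ready[2], done[2];
    char byte = 'x';

    if (pipe(ready) != 0 || pipe(done) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child process: lock its private copy of the mutex and keep it. */
        close(ready[0]);
        close(done[1]);
        pthread_mutex_lock(&demo_lock);
        if (write(ready[1], &byte, 1) != 1)
            _exit(1);
        /* Hold the lock until the parent has finished its attempt. */
        if (read(done[0], &byte, 1) != 1)
            _exit(1);
        pthread_mutex_unlock(&demo_lock);
        _exit(0);
    }

    /* Parent process: wait until the child reports that it holds the lock. */
    close(ready[1]);
    close(done[0]);
    if (read(ready[0], &byte, 1) != 1) {
        waitpid(pid, NULL, 0);
        return -1;
    }

    /* Try to take the parent's copy of the mutex without blocking. */
    int rc = pthread_mutex_trylock(&demo_lock);
    assert(rc == 0 || rc == EBUSY);
    if (rc == 0)
        pthread_mutex_unlock(&demo_lock);

    /* Release the child, reap it, and tidy up the pipes. */
    int released = (write(done[1], &byte, 1) == 1);
    close(ready[0]);
    close(done[1]);
    waitpid(pid, NULL, 0);
    return released ? (rc == 0) : -1;
}
```

The helper returns 1 when the parent acquires its copy while the child still holds its own, which is why a static pthread_mutex_t inside a MEX file would not be expected to serialize calls across parfor workers.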
