Set up MATLAB Job Scheduler Cluster for Auto-Resizing

You can customize your MATLAB® Job Scheduler cluster to resize automatically based on demand. By default, a MATLAB Job Scheduler cluster does not have resizing enabled. This means that MATLAB Job Scheduler immediately rejects any work you submit to the cluster that requires more workers than the cluster currently has. Auto-resizing, also called auto-scaling, allows you to submit such work to the cluster and makes the number of workers in the cluster change automatically with the amount of work submitted. The cluster grows (scales up) when there is more work to do and shrinks (scales down) when there is less work to do. This allows you to use your compute resources more efficiently and can result in cost savings.

To configure your MATLAB Job Scheduler cluster to resize automatically, you need to:

  1. Set the maximum number of workers in the mjs_def file.

  2. Start a MATLAB Job Scheduler cluster.

  3. Set up an auto-resizing process.

Set Maximum Number of Workers

To make a MATLAB Job Scheduler cluster resizable, you need to define the maximum number of workers in your cluster by editing the mjs_def file as follows:

  1. Open the file mjs_def.sh (on Linux®) or mjs_def.bat (on Windows®) located at matlabroot/toolbox/parallel/bin, where matlabroot is the directory of your MATLAB installation.

  2. Uncomment one or both of the lines #MAX_LINUX_WORKERS= and #MAX_WINDOWS_WORKERS= and set them to the desired values. These variables define the maximum number of Linux and Windows workers to which you can resize the cluster, respectively.

A resizable MATLAB Job Scheduler cluster accepts jobs into the queue that require more workers than the cluster currently has, up to the limits specified in MAX_LINUX_WORKERS and MAX_WINDOWS_WORKERS. Jobs that require more workers than these limits are cancelled immediately.

Tip

In the mjs_def file, you can also specify a scheduling algorithm that works well with a resizable MATLAB Job Scheduler cluster, such as the standard scheduling algorithm. For more details, see the definition for the SCHEDULING_ALGORITHM parameter in Define MATLAB Job Scheduler Startup Parameters.

Start MATLAB Job Scheduler Cluster

To create a cluster with the options defined in the mjs_def file, start a MATLAB Job Scheduler cluster after editing and saving this file. For more information about how to install, configure, and start a MATLAB Job Scheduler cluster, see Install for MATLAB Job Scheduler with Network License Manager.

Note

To change the maximum number of Linux and Windows workers after you start the cluster, use the resize script located at matlabroot/toolbox/parallel/bin to run the resize update command. For example:

% cd matlabroot/toolbox/parallel/bin
% ./resize update -jobmanager myJobManager -maxlinuxworkers 4 -maxwindowsworkers 8

Set up Auto-Resizing Process

To make a resizable MATLAB Job Scheduler cluster change size automatically, you must set up a background process to periodically adjust the size of the cluster. The specific implementation of this background process depends on many factors, but you can follow these general recommended steps:

  1. Identify the desired size of the cluster. The desired size of a resizable MATLAB Job Scheduler cluster is reported as the total number of workers for each operating system and hence includes all busy workers and some idle workers that are already in the cluster. The desired size changes based on running jobs and jobs in the queue. Use the resize script located at matlabroot/toolbox/parallel/bin to run the resize status command:

    % cd matlabroot/toolbox/parallel/bin
    % ./resize status
    The resize status command above returns information about the resizable cluster in JSON format:
    {
      "jobManagers": [
        {
          "name": "myJobManager",
          "host": "myhostname",
          "desiredWorkers": {
            "linux": 1,
            "windows": 0
          },
          "maxWorkers": {
            "linux": 4,
            "windows": 8,
          },
          "workers": [
            {
              "name": "worker_1",
              "host": "myhostname",
              "operatingSystem": "linux",
              "state": "busy",
              "secondsIdle": 0
            },
            {
              "name": "worker_2",
              "host": "myhostname",
              "operatingSystem": "linux",
              "state": "idle",
              "secondsIdle": 60
            }
          ]
        }
      ]
    }
    Parse the JSON output to extract the desiredWorkers values that represent the desired number of Linux and Windows workers for the MATLAB Job Scheduler cluster.

  2. Compare the desired number of workers with the workers in the cluster to decide whether you need to start or stop workers. Use the workers array in the output of the resize status command to examine the workers in the cluster. To ensure that jobs in the queue eventually run, you must start enough workers to match or exceed the desired number of workers. You can optionally stop idle workers that exceed the desired number of workers.

    Note

    If workers take a long time to start in your environment, you might want to wait for excess workers to be idle for some time before stopping them. This approach can be more efficient than immediately stopping excess idle workers if they are needed again soon after they become idle. To check how long a worker has been idle, examine the secondsIdle value for the worker.

  3. Start or stop workers as necessary. To do this, use the startworker and stopworker utility scripts. To avoid interrupting any work when stopping workers, use the -onidle flag with the stopworker command.
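
The decision logic in steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not part of the product: the function name plan_resize, the min_idle_seconds grace period, and the shape of the returned plan are hypothetical, and the sketch assumes you have already captured the JSON text printed by the resize status command. It only computes what to do; actually starting and stopping workers with the startworker and stopworker scripts is left to your environment.

```python
import json

# Trimmed sample of the resize status output shown above (illustrative data only).
SAMPLE_STATUS = """
{"jobManagers": [{
  "name": "myJobManager",
  "host": "myhostname",
  "desiredWorkers": {"linux": 1, "windows": 0},
  "maxWorkers": {"linux": 4, "windows": 8},
  "workers": [
    {"name": "worker_1", "host": "myhostname", "operatingSystem": "linux",
     "state": "busy", "secondsIdle": 0},
    {"name": "worker_2", "host": "myhostname", "operatingSystem": "linux",
     "state": "idle", "secondsIdle": 60}
  ]
}]}
"""


def plan_resize(status_json, job_manager_name, min_idle_seconds=300):
    """Hypothetical helper: decide per operating system how many workers
    to start and which idle workers are candidates for stopping."""
    status = json.loads(status_json)
    # Select the job manager entry with the requested name.
    jm = next(j for j in status["jobManagers"] if j["name"] == job_manager_name)

    plan = {}
    for os_name, desired in jm["desiredWorkers"].items():
        workers = [w for w in jm["workers"] if w["operatingSystem"] == os_name]
        current = len(workers)

        # Start enough workers to reach the desired size.
        to_start = max(0, desired - current)

        # Optionally stop excess workers, but only idle ones that have been
        # idle for at least the grace period (an assumption of this sketch).
        excess = max(0, current - desired)
        idle_long_enough = sorted(
            (w for w in workers
             if w["state"] == "idle" and w["secondsIdle"] >= min_idle_seconds),
            key=lambda w: w["secondsIdle"],
            reverse=True,  # longest-idle workers are stopped first
        )
        to_stop = [w["name"] for w in idle_long_enough[:excess]]

        plan[os_name] = {"start": to_start, "stop": to_stop}
    return plan


if __name__ == "__main__":
    print(plan_resize(SAMPLE_STATUS, "myJobManager", min_idle_seconds=30))
```

With the sample data, one Linux worker is desired and two are present, so no workers need to start and worker_2 becomes a stop candidate once its secondsIdle value reaches the chosen grace period. A background process could run this calculation at a regular interval and pass the results to startworker and stopworker.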
