matlab.compiler.mlspark.SparkConf Class
Namespace: matlab.compiler.mlspark
Interface class to configure an application with Spark parameters as key-value pairs
Description
A SparkConf object stores the configuration parameters of the application being deployed to Spark™. Every application must be configured prior to deployment on a Spark cluster. The configuration parameters are passed on to a Spark cluster through a SparkContext.
Construction
conf = matlab.compiler.mlspark.SparkConf('AppName',name,'Master',url,'SparkProperties',prop) creates a SparkConf object with the specified configuration parameters.

conf = matlab.compiler.mlspark.SparkConf(___,Name,Value) creates a SparkConf object with additional configuration parameters specified by one or more Name,Value pair arguments. Name is a property name of the class and Value is the corresponding value. Name must appear inside single quotes (''). You can specify several name-value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Input Arguments
name — Name of the MATLAB® application deployed to Spark
character vector | string

Name of the application, specified as a character vector inside single quotes ('').

Example: 'AppName', 'myApp'

Data Types: char | string
url — Master URL to connect to
character vector | string

Master URL to connect to, specified as a character vector inside single quotes ('').

| URL | Description |
|---|---|
| local | Run Spark locally with one worker thread. There is no parallelism with this option. |
| local[K] | Run Spark locally with K worker threads. Ideally, set K to the number of cores on your machine. |
| local[*] | Run Spark locally with as many worker threads as logical cores on your machine. |
| yarn-client | Connect to a Hadoop® YARN cluster in client mode. The cluster location is found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable. |

Example: 'Master', 'yarn-client'

Data Types: char | string
prop — Map of key-value pairs that specify Spark configuration properties
containers.Map object

A containers.Map object containing Spark configuration properties as key-value pairs.

Note
When deploying to a local cluster using the MATLAB API for Spark, the 'SparkProperties' property name can be ignored during the construction of a SparkConf object, thereby requiring no value for prop. Alternatively, you can set prop to an empty containers.Map object as follows:

'SparkProperties',containers.Map({''},{''})

Here both the key and the value of the containers.Map object are empty char vectors.

When deploying to a Hadoop YARN cluster, set the value for prop with the appropriate Spark configuration properties as key-value pairs. The precise set of Spark configuration properties varies from one deployment scenario to another, based on the deployment cluster environment. Verify the Spark setup with a system administrator to use the appropriate configuration properties. See the following tables for commonly used Spark properties. For a full set of properties, see the latest Spark documentation.
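For a local deployment, the property map can be left empty as described in the note above. A minimal sketch (the application name here is hypothetical):

```matlab
% Minimal sketch: local deployment where no Spark properties are needed.
% The empty containers.Map satisfies the 'SparkProperties' argument.
emptyProp = containers.Map({''}, {''});
conf = matlab.compiler.mlspark.SparkConf( ...
    'AppName', 'myLocalApp', ...   % hypothetical application name
    'Master', 'local[*]', ...
    'SparkProperties', emptyProp);
```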
Running Spark on YARN

| Property Name (Key) | Default (Value) | Description |
|---|---|---|
| spark.executor.cores | 1 | The number of cores to use on each executor. For YARN and Spark standalone mode only. In Spark standalone mode, setting this parameter allows an application to run multiple executors on the same worker, provided that there are enough cores on that worker. Otherwise, only one executor per application runs on each worker. |
| spark.executor.instances | 2 | The number of executors. Note: This property is incompatible with spark.dynamicAllocation.enabled. If both are specified, dynamic allocation is turned off and the fixed number of executors given by spark.executor.instances is used. |
| spark.driver.memory | 1g | Amount of memory to use for the driver process. If you get any out-of-memory errors, increase this value. |
| spark.executor.memory | 1g | Amount of memory to use per executor process. If you get any out-of-memory errors, increase this value. |
| spark.yarn.executor.memoryOverhead | executorMemory * 0.10, with a minimum of 384 | The amount of off-heap memory (in MB) to be allocated per executor. If you get any out-of-memory errors, increase this value. |
| spark.dynamicAllocation.enabled | false | This option integrates Spark with the YARN resource management. Spark initiates as many executors as possible given the executor memory requirement and number of cores. Setting this property to true requires that the cluster be set up accordingly, and that spark.shuffle.service.enabled be set to true. |
| spark.shuffle.service.enabled | false | Enables the external shuffle service. This service preserves the shuffle files written by executors so that the executors can be safely removed. This must be enabled if spark.dynamicAllocation.enabled is true. |
MATLAB Specific Properties

| Property Name (Key) | Default (Value) | Description |
|---|---|---|
| spark.matlab.worker.debug | false | For use in standalone/interactive mode only. If set to true, a Spark deployable MATLAB application executed within the MATLAB desktop environment starts another MATLAB session as the worker and enters the debugger. Logging information is directed to log_<nbr>.txt. |
| spark.matlab.worker.reuse | true | When set to true, a Spark executor pools workers and reuses them from one stage to the next. Workers terminate when the executor under which they are running terminates. |
| spark.matlab.worker.profile | false | Only valid when using a session of MATLAB as a worker. When set to true, it turns on the MATLAB Profiler and generates a profile report that is saved to the file profworker_<split_index>_<socket>_<worker pass>.mat. |
| spark.matlab.worker.numberOfKeys | 10000 | Number of unique keys that can be held in a containers.Map object while performing *ByKey operations before map data is spilled to a file. |
| spark.matlab.executor.timeout | 600000 | Spark executor timeout in milliseconds. Not applicable when deploying tall arrays. |
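The MATLAB-specific properties above are set through the same 'SparkProperties' map as the standard Spark properties. A sketch for an interactive debugging run (the application name is hypothetical):

```matlab
% Sketch: turning on worker debugging for a standalone/interactive run.
% Property values are passed as character vectors, even booleans.
sparkProp = containers.Map( ...
    {'spark.matlab.worker.debug', 'spark.matlab.worker.reuse'}, ...
    {'true', 'true'});
conf = matlab.compiler.mlspark.SparkConf( ...
    'AppName', 'debugApp', ...       % hypothetical application name
    'Master', 'local[1]', ...
    'SparkProperties', sparkProp);
```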
Monitoring and Logging

| Property Name (Key) | Default (Value) | Description |
|---|---|---|
| spark.history.fs.logDirectory | file:/tmp/spark-events | Directory that contains application event logs to be loaded by the history server. |
| spark.eventLog.dir | file:///tmp/spark-events | Base directory in which Spark events are logged, if spark.eventLog.enabled is true. |
| spark.eventLog.enabled | false | Whether to log Spark events. This is useful for reconstructing the web UI after the application has finished. |
Data Types: char
Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
ExecutorEnv — Map of key-value pairs used to establish the executor environment
containers.Map object

Map of key-value pairs, specified as a containers.Map object.

Example: 'ExecutorEnv', containers.Map({'SPARK_JAVA_OPTS'}, {'-Djava.library.path=/my/custom/path'})
MCRRoot — Path to the MATLAB Runtime used to execute the driver application
character vector | string

Path to the MATLAB Runtime, specified as a character vector inside single quotes ('').

Example: 'MCRRoot', '/share/MATLAB/MATLAB_Runtime/v91'

Data Types: char | string
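The optional name-value arguments combine with the required ones in a single constructor call. A sketch; the custom library path and runtime path are illustrative assumptions to be replaced with the locations on your cluster:

```matlab
% Sketch combining 'ExecutorEnv' and 'MCRRoot' with the required arguments.
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
execEnv = containers.Map( ...
    {'SPARK_JAVA_OPTS'}, {'-Djava.library.path=/my/custom/path'});
conf = matlab.compiler.mlspark.SparkConf( ...
    'AppName', 'myApp', ...
    'Master', 'yarn-client', ...
    'SparkProperties', sparkProp, ...
    'ExecutorEnv', execEnv, ...
    'MCRRoot', '/share/MATLAB/MATLAB_Runtime/v91');
```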
Properties
The properties of this class are hidden.
Methods
There are no user-executable methods for this class.
Examples
Configure an Application with Spark Parameters
The SparkConf class allows you to configure an application with Spark parameters as key-value pairs.

sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
    'Master','local[1]','SparkProperties',sparkProp);
More About
SparkConf

SparkConf stores the configuration parameters of the application being deployed to Spark. Every application must be configured prior to being deployed on a Spark cluster. Some of the configuration parameters define properties of the application and some are used by Spark to allocate resources on the cluster. The configuration parameters are passed on to a Spark cluster through a SparkContext.
References
See the latest Spark documentation for more information.
Version History
Introduced in R2016b