persist

Class: matlab.compiler.mlspark.RDD
Namespace: matlab.compiler.mlspark

Set the value of an RDD’s storage level to persist across operations after it is computed

Syntax

persist(obj,storageLevel)

Description

persist(obj,storageLevel) sets the storage level of RDD object obj to storageLevel, so that the RDD is kept across operations after it is first computed. The default storage level is MEMORY_ONLY. Use the persist method to assign a storage level when obj does not have one set yet.

Input Arguments


`obj` — Input RDD
`RDD` object

An input RDD, specified as an RDD object.

`storageLevel` — New storage level to be assigned
`MEMORY_ONLY` (default) | `DISK_ONLY` | `MEMORY_AND_DISK` | `MEMORY_ONLY_2` | `DISK_ONLY_2` | `MEMORY_AND_DISK_2` | `OFF_HEAP`

New storage level to be assigned, specified as a character vector, for example 'MEMORY_AND_DISK'. Use storageLevel to assign a storage level when the RDD does not have one set yet. If you omit storageLevel, the default MEMORY_ONLY is used.

Storage Level	Description
`MEMORY_ONLY`	Store the RDD in memory. If the RDD does not fit in memory, some partitions are not cached, and are recomputed each time they are needed.
`DISK_ONLY`	Store the RDD partitions on disk.
`MEMORY_AND_DISK`	Store the RDD in memory. If it does not fit in memory, then spill to disk.
`MEMORY_ONLY_2`	Store the RDD in memory, and replicate each partition on two cluster nodes.
`DISK_ONLY_2`	Store the RDD partitions on disk, and replicate each partition on two cluster nodes.
`MEMORY_AND_DISK_2`	Store the RDD in memory. If it does not fit in memory, then spill to disk. Replicate each partition on two cluster nodes.
`OFF_HEAP`	Store the RDD in serialized format in off-heap memory. For more information, see the programming guide at https://spark.apache.org/.

Data Types: char

Examples


Persist an RDD

Call the persist method without a storage level argument to store an RDD in the memory of the executors across a cluster, using the default level MEMORY_ONLY.

%% Connect to Spark
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
                        'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

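%% Persist the RDD with the default storage level, then release it
myFile = sc.textFile('airlinesmall.csv');
myFile.persist();     % assign the default storage level (MEMORY_ONLY)
myFile.unpersist();   % release the stored RDD when it is no longer needed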
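
Persist an RDD with an Explicit Storage Level

The following is a minimal sketch, not a definitive recipe, of passing one of the storage levels from the table above to persist. It assumes a fresh MATLAB session, the same local Spark configuration and airlinesmall.csv file as the previous example, and uses the RDD count method only as a convenient action to trigger computation. The variable names n1 and n2 are illustrative.

```matlab
%% Connect to Spark (same local configuration as the previous example)
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
                        'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

%% Persist with an explicit storage level
myFile = sc.textFile('airlinesmall.csv');
myFile.persist('MEMORY_AND_DISK');   % keep in memory, spill to disk if it does not fit

% Run two actions on the same RDD. The first action computes the RDD;
% the second is expected to be able to reuse the stored partitions.
n1 = myFile.count();
n2 = myFile.count();

%% Release the stored RDD when it is no longer needed
myFile.unpersist();
```

Choosing MEMORY_AND_DISK here is only an illustration. Whether it is preferable to the default MEMORY_ONLY depends on how much executor memory is available and how expensive the RDD is to recompute.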

Version History

Introduced in R2016b