Parquet File Logging Format for Simulation Data

You can export simulation data to a Parquet file using the Record block or the Simulation Data Inspector. Parquet is an open-source file format with efficient compression and encoding of column-oriented data that can be useful for processing big data. When you export data to a Parquet file, the way the data is stored in the file depends on the type of data that you export.

You can export real or complex scalar and multidimensional data from a signal, bus, or array of buses to a Parquet file. You can also export messages to a Parquet file as double values. Exporting variable-size signals to a Parquet file is not supported.

Data Types in Parquet File

The Record block and Simulation Data Inspector support some data types that are not supported by a Parquet file. Most data types, such as double, int, or string, do not change. This table lists data types supported by the software and how these data types are represented in the Parquet file.

Simulink Data Type    Parquet File Logical Data Type
double                double
single                single
int8                  int8
int16                 int16
int32                 int32
int64                 int64
uint8                 uint8
uint16                uint16
uint32                uint32
uint64                uint64
string                string
Boolean               Boolean
half                  double
fixed point           double (fixed-point data is stored in the JSON sidecar)
enum                  int32
image                 Data type of underlying image data
datetime              double representation of epoch time

For more information about Parquet file data types, see Apache Parquet Data Type Mappings.

Data Format in Parquet File

How the data is formatted in the Parquet file depends on the type of signal being recorded. This table shows how each type of Simulink® signal is recorded in the Parquet file.

Simulink Signal Type               Parquet File Logging Format
Scalar signal                      Single column with a scalar value at each time step
Scalar signal with complex data    Single column with a 1-by-2 vector representing the real and imaginary parts of the complex value at each time step
Nonscalar signal                   Single column with sample values in the form of a vector, list of column vectors, or a nested list of column vectors for each time step
Nonscalar signal with complex data Single column containing 1-by-2 vectors representing the real and imaginary parts of each sample value nested in a vector, list of column vectors, or a nested list of column vectors at each time step
Virtual or nonvirtual bus          Separate columns for each element in the bus or bus hierarchy
Array of buses                     Separate columns for each element in the array of buses
Variable-size signal               Not supported

Single-Rate and Multirate Data

You can save data to a Parquet file using shared or individual time columns. When you save single-rate data with a shared time column, the first column in the file contains time data, followed by columns containing signal data. The Record block and the Simulation Data Inspector export data to a Parquet file using shared time columns by default.

A model that logs two signals to a Record block, with a Parquet file that contains one time column followed by two columns of signal data.

When you save multirate data using shared time columns, signals that have identical time data are grouped by shared time vectors. Time columns specify the sample times for signals to the right, up to the next time vector.

A model that logs five signals to a Record block. Three signals have a sample time of 0.5, while the other two have a sample time of 0.1. In the Parquet file, columns for the three signals with a 0.5 sample time follow the time column with time steps of 0.5. Then, columns for the two signals with a 0.1 sample time follow a separate time column with time steps of 0.1.
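The shared-time layout can be read back by walking the columns from left to right and starting a new group at each time column. The following Python sketch assumes the column order is already available as a list. The column names and the rule used to recognize a time column (names starting with `t_`) are hypothetical and are not taken from the files the software writes, so substitute a rule that matches your file.

```python
def group_by_time_column(column_names, is_time):
    """Map each time column to the signal columns that follow it.

    A signal column belongs to the nearest time column on its left,
    up to the next time column.
    """
    groups = {}
    current = None
    for name in column_names:
        if is_time(name):
            current = name
            groups[current] = []
        elif current is None:
            raise ValueError("signal column appears before any time column")
        else:
            groups[current].append(name)
    return groups


# Hypothetical layout: three signals at one rate, two signals at another.
layout = ["t_slow", "a", "b", "c", "t_fast", "d", "e"]
groups = group_by_time_column(layout, lambda name: name.startswith("t_"))
```

With this layout, `groups` pairs `t_slow` with `a`, `b`, and `c`, and `t_fast` with `d` and `e`.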

When you save data using individual time columns, the software saves data in pairs of time and signal data columns.

A model that logs two signals to a Record block, with the Parquet file using separate time columns for each signal data column.

A Parquet file requires that all columns be of equal length. When you record signals that are not of equal length to the same Parquet file, the software pads the shorter columns with NULL values so that every column has the same length.
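The effect of the equal-length requirement can be sketched in plain Python, with `None` standing in for a Parquet NULL. This is an illustration of the idea only, not the software's implementation. The column names and values are hypothetical, and the sketch assumes that the padding sits at the end of the shorter columns.

```python
def pad_columns(columns):
    """Pad every column with None so all columns match the longest one."""
    longest = max(len(values) for values in columns.values())
    return {
        name: list(values) + [None] * (longest - len(values))
        for name, values in columns.items()
    }


def drop_padding(values):
    """Remove trailing None entries that were added as padding."""
    trimmed = list(values)
    while trimmed and trimmed[-1] is None:
        trimmed.pop()
    return trimmed


# Hypothetical columns: one signal logged at 3 time steps, another at 5.
columns = {
    "time_1": [0.0, 0.5, 1.0],
    "sig_1": [1.0, 2.0, 3.0],
    "time_2": [0.0, 0.25, 0.5, 0.75, 1.0],
    "sig_2": [0.0, 0.1, 0.2, 0.3, 0.4],
}
padded = pad_columns(columns)
restored = drop_padding(padded["sig_1"])
```

When reading such a file back, removing the trailing NULL entries from each column recovers the original signal length.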

Complex Signals

The Record block and the Simulation Data Inspector export complex sample values to a Parquet file as a 1-by-2 vector, where the first element is the real part and the second element is the imaginary part of the complex value. For example, a scalar signal value of 0.3973 + 0.5960i is saved as [0.3973, 0.5960].

A model that logs complex data to a Record block, with the Parquet file storing real and imaginary parts as a two-element vector.
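After the file is read into memory, each [real, imaginary] pair can be turned back into a complex number. The following Python sketch assumes the column has already been loaded as a list of two-element lists. The helper names are hypothetical.

```python
def pair_to_complex(pair):
    """Convert a [real, imaginary] pair into a Python complex number."""
    real, imag = pair
    return complex(real, imag)


def complex_to_pair(value):
    """Convert a complex number into the [real, imaginary] storage form."""
    return [value.real, value.imag]


# Two time steps of a scalar complex signal in the stored form.
column = [[0.3973, 0.5960], [1.0, 0.0]]
decoded = [pair_to_complex(pair) for pair in column]
```

Here `decoded[0]` is the complex value 0.3973 + 0.5960i from the example above.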

Multidimensional Signal Data

Multidimensional signal data with fixed dimensions can be represented in the Record block or the Simulation Data Inspector in two ways:

  • A single signal with multidimensional sample values

  • A set of signals with scalar sample values: one signal, called a channel, for each element of the multidimensional data

For both representations, the data for each time step is stored in the Parquet file as vectors for one-dimensional arrays, a list of column vectors for two-dimensional arrays, or as a nested list of column vectors for arrays with more than two dimensions. For instance, a 2-by-3 matrix-valued signal is recorded as a column of data where each entry consists of a vector of three 2-element vectors.

A model logs a 2-by-3 matrix signal using a Record block and saves the data to a Parquet file with two columns: time and signal data. At each sample time, the signal entry groups the values of the 2-by-3 matrix into three 2-element vectors.
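A list of column vectors can be converted to a row-major nested list by transposing it. The following Python sketch assumes that one time step of a 2-by-3 signal has been loaded as a list of three 2-element lists, one per matrix column, as described above. The function name and sample values are hypothetical.

```python
def columns_to_rows(entry):
    """Transpose a list of column vectors into a list of rows."""
    return [list(row) for row in zip(*entry)]


# One time step of a 2-by-3 signal with rows [1, 2, 3] and [4, 5, 6],
# stored as three column vectors of two elements each.
entry = [[1, 4], [2, 5], [3, 6]]
matrix = columns_to_rows(entry)
```

The result `matrix` has two rows of three elements, which matches the original signal dimensions.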

To export individual channels to a Parquet file, use the signal table in the Simulation Data Inspector to select only the channels that you want to export. By default, signals with samples that contain fewer than five elements are represented as channels. To represent a multidimensional signal with five or more elements as channels, use the expand function, or click the signal dimension in the signal table and select Convert to channels. If you export only selected runs or signals and you select individual channels of an expanded multidimensional signal rather than the parent signal, each selected channel is stored in its own column in the Parquet file. For more information about multidimensional signals, see Analyze Multidimensional Signal Data.

Complex Multidimensional Signal Data

When you save multidimensional signals that contain complex data to a Parquet file, each sample element is a nested 1-by-2 vector, where the first element is the real part and the second element is the imaginary part of the complex value. For real values, the second element is 0.

A model logs a 2-by-3 matrix signal containing complex data using a Record block. The saved Parquet file has two columns: time and signal data. At each sample time, the signal entry groups the values of the 2-by-3 matrix into three 2-element vectors, and each element is represented as a pair of real and imaginary components in the form [real, imaginary].
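Decoding a complex multidimensional entry combines two steps: convert each innermost [real, imaginary] pair to a complex number, then transpose the column vectors into rows. The following Python sketch assumes one time step of a 2-by-3 complex signal has been loaded as nested lists. The function name and sample values are hypothetical.

```python
def decode_complex_matrix(entry):
    """Convert column vectors of [real, imaginary] pairs into complex rows."""
    columns = [[complex(re, im) for re, im in column] for column in entry]
    return [list(row) for row in zip(*columns)]


# Three column vectors, each holding two [real, imaginary] pairs.
# Real-valued elements carry 0 as the imaginary part.
entry = [
    [[1.0, 2.0], [3.0, 0.0]],
    [[0.0, -1.0], [4.0, 4.0]],
    [[5.0, 0.0], [6.0, 0.5]],
]
matrix = decode_complex_matrix(entry)
```

The decoded `matrix` has two rows of three complex values each.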

Buses

You can export data logged from virtual or nonvirtual buses to a Parquet file. In the Parquet file, dots in signal names specify the bus hierarchy.

A model containing a nested bus connected to a Record block. The associated Parquet file uses dot notation to specify the bus hierarchy. For example, the signal named sine is an element of nestedBus, which is an element of topBus. In the Parquet file, this signal is named topBus.nestedBus.sine_data.

You can also export arrays of buses to a Parquet file. In the Parquet file, each element in the array of buses is stored in a separate column.

A model that logs an array of two buses to a Record block. Each bus in the array of buses contains two signals named a and b. The Record block uses a combination of index and dot notation in the Parquet file. For example, the column for the signal named a in the first nonvirtual bus is labeled AOB(1).a_data.
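Column names such as topBus.nestedBus.sine_data and AOB(1).a_data can be split back into a bus path. The following Python sketch assumes that signal data columns end in `_data`, as in the two examples above, and that element names contain no dots or parentheses. The function name and the returned structure are hypothetical.

```python
import re

# A path segment is a name with an optional index, such as "AOB(1)".
_SEGMENT = re.compile(r"^(?P<name>[^()]+?)(?:\((?P<index>\d+)\))?$")


def parse_column_name(column, suffix="_data"):
    """Split a column name into (name, index) pairs along the bus hierarchy.

    The index is None for elements that are not part of an array of buses.
    """
    if column.endswith(suffix):
        column = column[: -len(suffix)]
    path = []
    for segment in column.split("."):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"unrecognized segment: {segment!r}")
        index = match.group("index")
        path.append((match.group("name"), int(index) if index else None))
    return path


nested = parse_column_name("topBus.nestedBus.sine_data")
indexed = parse_column_name("AOB(1).a_data")
```

The index in the parsed path keeps the 1-based value that appears in the column name.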

Enumerated Data

When you save enumerated data to a Parquet file, the software exports only the underlying integer data as int32 values.

For example, the MyColors class in this model defines a set of enumerated values consisting of six colors, each associated with an integer value between 0 and 5.

Logged enumerated data visualized in the Record block.

When you save the enumerated data to a Parquet file, only the underlying integer values associated with each enumerated value are saved in the file.

Model that records enumerated data. The Parquet file logs the underlying integer values associated with each enumerated value but does not log the enumerated name.
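Because the file stores only integers, recovering the enumerated names requires the enumeration definition on the reading side. The following Python sketch mirrors a six-value enumeration with values 0 through 5. The color names are hypothetical, since the file does not carry them, and must be replaced with the names from the actual MyColors definition.

```python
from enum import IntEnum


class MyColors(IntEnum):
    # Hypothetical names; only the values 0 to 5 come from the description.
    Red = 0
    Orange = 1
    Yellow = 2
    Green = 3
    Blue = 4
    Purple = 5


def decode_enum_column(values, enum_type):
    """Map stored integer values back to enumeration member names."""
    return [enum_type(value).name for value in values]


# Integer values as they would appear in the Parquet column.
stored = [0, 3, 5, 3]
names = decode_enum_column(stored, MyColors)
```

A stored value outside the defined range raises a `ValueError`, which flags a mismatch between the file and the enumeration definition.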
