read

Read member data from an ensemble datastore

Description

Use this function to read data from ensemble datastores for condition monitoring and predictive maintenance.

data = read(ensemble) reads data from a member of the ensemble datastore ensemble. The function reads the variables specified in the SelectedVariables property of the ensemble datastore and returns them in a table.

If the ensemble has not been read since its creation (or since it was last reset using reset), then read reads data from the first member of the ensemble, as determined by the software. Otherwise, read reads data from the next ensemble member. read updates the LastMemberRead property of the ensemble to identify the most recently read member. For more information about how ensemble datastores work, see Data Ensembles for Condition Monitoring and Predictive Maintenance.

[data,info] = read(ensemble) also returns information about the location from which the data is read and the size of the data.

Examples

In general, you use the read command to extract data from a simulationEnsembleDatastore object into the MATLAB® workspace. Often, your ensemble contains more variables than you need to use for a particular analysis. Use the SelectedVariables property of the simulationEnsembleDatastore object to select a subset of variables for reading.

For this example, use the following code to create a simulationEnsembleDatastore object using data previously generated by running a Simulink® model at various fault values (see generateSimulationEnsemble). The ensemble includes simulation data for five different values of a model parameter, ToothFaultGain. Because of the volume of data, the unzip operation takes a few minutes.

unzip simEnsData.zip  % extract compressed files
ensemble = simulationEnsembleDatastore(pwd,'logsout')
ensemble = 
  simulationEnsembleDatastore with properties:

           DataVariables: [5x1 string]
    IndependentVariables: [0x0 string]
      ConditionVariables: [0x0 string]
       SelectedVariables: [5x1 string]
                ReadSize: 1
              NumMembers: 5
          LastMemberRead: [0x0 string]
                   Files: [5x1 string]

The model that generated the data, TransmissionCasingSimplified, was configured such that the resulting ensemble contains variables including accelerometer data, Vibration, and tachometer data, Tacho. By default, the simulationEnsembleDatastore object designates all these variables as both data variables and selected variables, as shown in the DataVariables and SelectedVariables properties.

ensemble.DataVariables
ans = 5x1 string
    "PMSignalLogName"
    "SimulationInput"
    "SimulationMetadata"
    "Tacho"
    "Vibration"

ensemble.SelectedVariables
ans = 5x1 string
    "PMSignalLogName"
    "SimulationInput"
    "SimulationMetadata"
    "Tacho"
    "Vibration"

Suppose that for the analysis you want to do, you need only the Vibration data and the Simulink.SimulationInput object that describes the conditions under which this member data was simulated. Set ensemble.SelectedVariables to specify the variables you want to read. The read command then extracts those variables from the current ensemble member.

ensemble.SelectedVariables = ["Vibration";"SimulationInput"];
data1 = read(ensemble)
data1=1×2 table
         Vibration                SimulationInput        
    ___________________    ______________________________

    {20202x1 timetable}    {1x1 Simulink.SimulationInput}

data1.Vibration is a cell array containing one timetable that stores the simulation times and the corresponding vibration signal. You can now process this data as needed. For instance, extract the vibration data from the table and plot it.

vibdata1 = data1.Vibration{1};
plot(vibdata1.Time,vibdata1.Data)
title('Vibration - First Ensemble Member')

The next time you call read on this ensemble, the last-read member designation advances to the next member of the ensemble (see Data Ensembles for Condition Monitoring and Predictive Maintenance). Read the selected variables from the next member of the ensemble.

data2 = read(ensemble)
data2=1×2 table
         Vibration                SimulationInput        
    ___________________    ______________________________

    {20215x1 timetable}    {1x1 Simulink.SimulationInput}

To confirm that data1 and data2 contain data from different ensemble members, examine the values of the varied model parameter, ToothFaultGain. For each ensemble member, this value is stored in the Variables field of the SimulationInput variable.

data1.SimulationInput{1}.Variables
ans = 
  Variable with properties:

           Name: 'ToothFaultGain'
          Value: -2
      Workspace: 'global-workspace'
        Context: ''
    Description: ""

data2.SimulationInput{1}.Variables
ans = 
  Variable with properties:

           Name: 'ToothFaultGain'
          Value: -1.5000
      Workspace: 'global-workspace'
        Context: ''
    Description: ""

This result confirms that data1 is from the ensemble member with ToothFaultGain = –2, and data2 is from the member with ToothFaultGain = –1.5.

Create a file ensemble datastore for data stored in MATLAB files, and configure it with functions that tell the software how to read from and write to the datastore. (For more details about configuring file ensemble datastores, see File Ensemble Datastore with Measured Data.)

% Create ensemble datastore that points to datafiles in current folder
unzip fileEnsData.zip  % extract compressed files
location = pwd;
extension = '.mat';
fensemble = fileEnsembleDatastore(location,extension);

% Specify data and condition variables
fensemble.DataVariables = ["gs";"sr";"load";"rate"];
fensemble.ConditionVariables = "label";

% Configure with functions for reading and writing variable data
fensemble.ReadFcn = @readBearingData;
fensemble.WriteToMemberFcn = @writeBearingData; 

The functions tell the read and writeToLastMemberRead commands how to interact with the data files that make up the ensemble. Thus, when you call the read command, it uses readBearingData to read all the variables in fensemble.SelectedVariables. For this example, readBearingData extracts requested variables from a structure, bearing, and other variables stored in the file. It also parses the filename for the fault status of the data.
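As a rough illustration of what such a read function can look like, the following is a minimal sketch. It is not the readBearingData function used in this example. The name readBearingSketch, the assumed file layout (a structure bearing with fields such as gs, sr, and rate, plus a top-level variable load), and the file-name rule used to derive label are all assumptions made for illustration. The sketch also assumes that the datastore calls the function with a file name and the list of selected variables, and expects a one-row table in return.

```matlab
function data = readBearingSketch(filename,variables)
% Hypothetical read function for illustration only.
% Assumed inputs: filename is the path to one member file, and variables
% lists the names of the variables to return.
variables = string(variables);
contents = load(filename);          % load everything stored in the MAT-file
data = table();
for k = 1:numel(variables)
    name = char(variables(k));
    if strcmp(name,'label')
        % Assumed convention: file names starting with "Fault" hold faulty data.
        [~,fname] = fileparts(char(filename));
        if startsWith(fname,'Fault')
            value = "Faulty";
        else
            value = "Healthy";
        end
    elseif isfield(contents,name)
        value = contents.(name);            % top-level variable, such as load
    else
        value = contents.bearing.(name);    % field of the assumed bearing structure
    end
    if isscalar(value)
        data.(name) = value;        % scalars become plain table entries
    else
        data.(name) = {value};      % arrays are wrapped in a cell
    end
end
end
```

Under these assumptions, you would assign the function with fensemble.ReadFcn = @readBearingSketch.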

Specify variables to read, and read them from the first member of the ensemble.

fensemble.SelectedVariables = ["gs";"load";"label"];
data = read(fensemble)
data=1×3 table
     label            gs           load
    ________    _______________    ____

    "Faulty"    {5000x1 double}     0  

You can now process the data from the member as needed. For this example, compute the average value of the signal stored in the variable gs. Extract the data from the table returned by read.

gsdata = data.gs{1};
gsmean = mean(gsdata);

You can write the mean value gsmean back to the data file as a new variable. To do so, first expand the list of data variables in the ensemble to include a variable for the new value. Call the new variable gsMean.

fensemble.DataVariables = [fensemble.DataVariables;"gsMean"]
fensemble = 
  fileEnsembleDatastore with properties:

                 ReadFcn: @readBearingData
        WriteToMemberFcn: @writeBearingData
           DataVariables: [5x1 string]
    IndependentVariables: [0x0 string]
      ConditionVariables: "label"
       SelectedVariables: [3x1 string]
                ReadSize: 1
              NumMembers: 5
          LastMemberRead: "/tmp/Bdoc24a_2528353_1096756/tpfe6adf35/predmaint-ex34165887/FaultData_01.mat"
                   Files: [5x1 string]

Next, write the derived mean value to the file corresponding to the last-read ensemble member. (See Data Ensembles for Condition Monitoring and Predictive Maintenance.) When you call writeToLastMemberRead, it converts the data to a structure and calls fensemble.WriteToMemberFcn to write the data to the file.

writeToLastMemberRead(fensemble,'gsMean',gsmean);
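For reference, a write function can be very short. The following is a minimal sketch, not the writeBearingData function used in this example, and the name writeMemberSketch is hypothetical. It assumes that the function receives the file name and a structure whose fields are the variables to write, as described above.

```matlab
function writeMemberSketch(filename,data)
% Hypothetical write function for illustration only.
% data is assumed to be a structure, for example struct('gsMean',gsmean).
% Each field is appended to the existing MAT-file as its own variable.
save(filename,'-append','-struct','data');
end
```

Because the sketch uses the -append option, variables already stored in the member file are kept, and only the fields of data are added or overwritten.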

Calling read again advances the last-read-member indicator to the next file in the ensemble and reads the data from that file.

data = read(fensemble)
data=1×3 table
     label            gs           load
    ________    _______________    ____

    "Faulty"    {5000x1 double}     50 

You can confirm that this data is from a different member by examining the load variable in the table. Here, its value is 50, while in the previously read member, it was 0.

You can repeat the processing steps to compute and append the mean for this ensemble member. In practice, it is more useful to automate the process of reading, processing, and writing data. To do so, reset the ensemble to a state in which no data has been read. Then loop through the ensemble and perform the read, process, and write steps for each member.

reset(fensemble)
while hasdata(fensemble)
    data = read(fensemble);
    gsdata = data.gs{1};
    gsmean = mean(gsdata);
    writeToLastMemberRead(fensemble,'gsMean',gsmean);
end

The hasdata command returns false when every member of the ensemble has been read. Now, each data file in the ensemble includes the gsMean variable derived from the data gs in that file. You can use techniques like this loop to extract and process data from your ensemble files as you develop a predictive-maintenance algorithm. For an example illustrating in more detail the use of a file ensemble datastore in the algorithm-development process, see Rolling Element Bearing Fault Diagnosis. The example also shows how to use Parallel Computing Toolbox™ to speed up the processing of large data ensembles.

To confirm that the derived variable is present in the file ensemble datastore, read it from the first and second ensemble members. To do so, reset the ensemble again, and add the new variable to the selected variables. In practice, after you have computed derived values, it can be useful to read only those values without rereading the unprocessed data, which can take significant space in memory. For this example, read selected variables that include the new variable, gsMean, but do not include the unprocessed data, gs.

reset(fensemble)
fensemble.SelectedVariables = ["label";"load";"gsMean"];
data1 = read(fensemble)
data1=1×3 table
     label      load     gsMean 
    ________    ____    ________

    "Faulty"     0      -0.22648

data2 = read(fensemble)
data2=1×3 table
     label      load     gsMean 
    ________    ____    ________

    "Faulty"     50     -0.22937

To read data from multiple ensemble members in one call to the read command, use the ReadSize property of an ensemble datastore. This example uses simulationEnsembleDatastore, but you can use the same technique for fileEnsembleDatastore.

Use the following code to create a simulationEnsembleDatastore object using data previously generated by running a Simulink model at various fault values (see generateSimulationEnsemble). The ensemble includes simulation data for five different values of a model parameter, ToothFaultGain. (Because of the volume of data, the unzip operation might take a minute or two.) Specify some of the data variables to read.

unzip simEnsData.zip  % extract compressed files
ensemble = simulationEnsembleDatastore(pwd,'logsout');
ensemble.SelectedVariables = ["Vibration";"SimulationInput"];

By default, calling read on this ensemble datastore returns a single-row table containing the values of the Vibration and SimulationInput variables for the first ensemble member. Change the ReadSize property to read three members at once.

ensemble.ReadSize = 3;
data1 = read(ensemble)
data1=3×2 table
         Vibration                SimulationInput        
    ___________________    ______________________________

    {20202x1 timetable}    {1x1 Simulink.SimulationInput}
    {20215x1 timetable}    {1x1 Simulink.SimulationInput}
    {20204x1 timetable}    {1x1 Simulink.SimulationInput}

read returns a three-row table, with one row each for the first, second, and third ensemble members. read also updates the LastMemberRead property of the ensemble datastore to a string array containing the paths of the three corresponding files. Avoid setting ReadSize to a value so large that you risk running out of memory while loading the data.

If the ensemble contains three or more additional members, the next read operation returns data from the fourth, fifth, and sixth members. Because the ensemble of this example contains only five members total, the next read operation returns only two rows.

data2 = read(ensemble)
data2=2×2 table
         Vibration                SimulationInput        
    ___________________    ______________________________

    {20213x1 timetable}    {1x1 Simulink.SimulationInput}
    {20224x1 timetable}    {1x1 Simulink.SimulationInput}
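To process a whole ensemble in batches, you can combine ReadSize with the hasdata and reset commands used in the earlier examples. The following is a minimal sketch that assumes the same simEnsData.zip example data is available. The variable nRead is introduced here only to count rows. Because the last batch can contain fewer than ReadSize rows, the loop uses the height of each returned table instead of assuming three rows.

```matlab
unzip simEnsData.zip  % extract compressed files (assumes the example data is available)
ensemble = simulationEnsembleDatastore(pwd,'logsout');
ensemble.SelectedVariables = ["Vibration";"SimulationInput"];
ensemble.ReadSize = 3;

reset(ensemble)       % start from a state in which no member has been read
nRead = 0;            % running count of members read so far
while hasdata(ensemble)
    batch = read(ensemble);           % up to ReadSize rows per call
    nRead = nRead + height(batch);    % the final batch can be shorter
end
```

When the loop finishes, nRead should match ensemble.NumMembers, given the behavior described above.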

Input Arguments

Ensemble datastore to read, specified as a simulationEnsembleDatastore object or a fileEnsembleDatastore object.

In either case, read returns a table containing all the variables specified in ensemble.SelectedVariables.

Output Arguments

Selected variables from the ensemble member, returned as a table. The table variables are the selected variables, and the table data are the values read from the ensemble data. By default, read reads one ensemble member at a time and returns a single table row.

To read multiple ensemble members at one time, set the ReadSize property of ensemble to a value greater than 1. For instance, if you set ReadSize to 3, then read reads the next 3 ensemble members and returns a table with 3 rows. If fewer than ReadSize members are unread, then read returns a table with as many rows as there are remaining members. For an example, see Read Multiple Ensemble Members in One Operation. Avoid setting ReadSize to such a large value as to risk running out of memory while loading data.

Data and ensemble member information, returned as a structure with fields:

  • Size — Dimensions of the table data, returned as a vector. For instance, if your ensemble has four variables specified in ensemble.SelectedVariables and ReadSize is 1, then info.Size = [1 4].

  • FileName — Path to the data file corresponding to the accessed ensemble member, returned as a string. For example, "C:\Data\Experiment1\fault1.mat". Calling read also sets the LastMemberRead property of the ensemble to this value. If the ReadSize property of ensemble is greater than 1, this value is a string vector containing the paths to all the accessed files.
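The two-output syntax is not exercised in the examples on this page, so the following minimal sketch shows how you might inspect the info structure. It assumes the simEnsData.zip example data is available. The comments restate the behavior described above rather than showing captured output.

```matlab
unzip simEnsData.zip  % extract compressed files (assumes the example data is available)
ensemble = simulationEnsembleDatastore(pwd,'logsout');
ensemble.SelectedVariables = ["Vibration";"SimulationInput"];

[data,info] = read(ensemble);   % read one member and request the info structure
disp(info.Size)                 % described above as the dimensions of the table data
disp(info.FileName)             % described above as the path of the member just read
```

Comparing info.FileName with ensemble.LastMemberRead is a quick way to check which member a given table came from.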

Version History

Introduced in R2018a