Generate and Use Simulated Data Ensemble

This example shows how to generate a data ensemble for predictive-maintenance algorithm design by simulating a Simulink® model of a machine while varying a fault parameter. The example then illustrates some of the ways you interact with a simulation ensemble datastore. The example shows how to read data from the datastore into the MATLAB® workspace, process the data to compute derived variables, and write the new variables back to the datastore.

The model in this example is a simplified version of the gear-box model described in Using Simulink to Generate Fault Data. Load the Simulink model.

mdl = 'TransmissionCasingSimplified';
open_system(mdl)

For this example, only one fault mode is modeled. The gear-tooth fault is modeled as a disturbance in the Gear Tooth fault subsystem. The magnitude of the disturbance is controlled by the model variable ToothFaultGain, where ToothFaultGain = 0 corresponds to no gear tooth fault (healthy operation).

Generate the Ensemble of Simulated Data

To generate a simulation ensemble datastore of fault data, you use generateSimulationEnsemble to simulate the model at different values of ToothFaultGain, ranging from -2 to zero. This function simulates the model once for each entry in an array of Simulink.SimulationInput objects that you provide. Each simulation generates a separate member of the ensemble. Create such an array, and use setVariable to assign a tooth-fault gain value for each run.

toothFaultValues  = -2:0.5:0; % 5 ToothFaultGain values

for ct = numel(toothFaultValues):-1:1
    tmp = Simulink.SimulationInput(mdl);
    tmp = setVariable(tmp,'ToothFaultGain',toothFaultValues(ct));
    simin(ct) = tmp;
end
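
The loop above counts down from the last index. One likely reason is a common MATLAB idiom: the first pass assigns the highest-index element, so the array is created at its final size instead of growing on every iteration. The following minimal sketch illustrates the idiom with a plain struct array standing in for the Simulink.SimulationInput objects. The names gains and runs are hypothetical and used only for illustration.

```matlab
gains = -2:0.5:0;                  % same five values as toothFaultValues

% Counting down means the first pass assigns runs(5), which sizes the
% struct array once; later passes fill the lower indices in place.
for k = numel(gains):-1:1
    runs(k).ToothFaultGain = gains(k);
end

disp(numel(runs))                  % number of elements created
```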

For this example, the model is already configured to log certain signal values, Vibration and Tacho (see Save Signal Data Using Signal Logging (Simulink)). The generateSimulationEnsemble function further configures the model to:

  • Save logged data to files in the folder you specify

  • Use the timetable format for signal logging

  • Store each Simulink.SimulationInput object in the saved file with the corresponding logged data

Specify a location for the generated data. For this example, save the data to a folder called Data within your current folder. If all the simulations complete without error, the function returns true in the indicator output, status.

mkdir Data
location = fullfile(pwd,'Data');
[status,E] = generateSimulationEnsemble(simin,location);
[12-Feb-2024 23:24:49] Running simulations...
[12-Feb-2024 23:25:01] Completed 1 of 5 simulation runs
[12-Feb-2024 23:25:03] Completed 2 of 5 simulation runs
[12-Feb-2024 23:25:05] Completed 3 of 5 simulation runs
[12-Feb-2024 23:25:07] Completed 4 of 5 simulation runs
[12-Feb-2024 23:25:10] Completed 5 of 5 simulation runs
status = logical
   1

Inside the Data folder, examine one of the files. Each file is a MAT-file containing the following MATLAB® variables:

  • SimulationInput — The Simulink.SimulationInput object that was used to configure the model for generating the data in the file. You can use this to extract information about the conditions (such as faulty or healthy) under which this simulation was run.

  • logsout — A Dataset object containing all the data that the Simulink model is configured to log.

  • PMSignalLogName — The name of the variable that contains the logged data ('logsout' in this example). The simulationEnsembleDatastore command uses this name to parse the data in the file.

  • SimulationMetadata — Other information about the simulation that generated the data logged in the file.
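
As a rough sketch of how you might examine one of these files from the command line, the following code lists the variables in the first MAT-file it finds and displays the stored log name. It assumes the files are in the Data folder created in this example. The variable names files, firstFile, and S are illustrative only.

```matlab
location = fullfile(pwd,'Data');             % folder used in this example
files = dir(fullfile(location,'*.mat'));     % generated ensemble member files

if isempty(files)
    disp('No MAT-files found; run generateSimulationEnsemble first.')
else
    firstFile = fullfile(files(1).folder,files(1).name);
    whos('-file',firstFile)                  % list the variables stored in the file
    S = load(firstFile,'PMSignalLogName');   % load only the log-name variable
    disp(S.PMSignalLogName)
end
```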

Now you can create the simulation ensemble datastore using the generated data. The resulting simulationEnsembleDatastore object points to the generated data. The object lists the data variables in the ensemble, and by default all the variables are selected for reading.

ensemble = simulationEnsembleDatastore(location)
ensemble = 
  simulationEnsembleDatastore with properties:

           DataVariables: [4x1 string]
    IndependentVariables: [0x0 string]
      ConditionVariables: [0x0 string]
       SelectedVariables: [4x1 string]
                ReadSize: 1
              NumMembers: 5
          LastMemberRead: [0x0 string]
                   Files: [5x1 string]

ensemble.DataVariables
ans = 4x1 string

ensemble.SelectedVariables
ans = 4x1 string

Read Data from Ensemble Members

Suppose that for the analysis you want to do, you need only the Vibration data and the Simulink.SimulationInput object that describes the conditions under which each member was simulated. Set ensemble.SelectedVariables to specify the variables you want to read. The read command then extracts those variables from the first ensemble member, as determined by the software.

ensemble.SelectedVariables = ["Vibration";"SimulationInput"];
data1 = read(ensemble)
data1=1×2 table
        Vibration               SimulationInput        
    _________________    ______________________________

    {589x1 timetable}    {1x1 Simulink.SimulationInput}

data1.Vibration is a cell array containing one timetable, which stores the simulation times and the corresponding vibration signal. You can now process this data as needed. For instance, extract the vibration data from the table and plot it.

vibdata1 = data1.Vibration{1};
plot(vibdata1.Time,vibdata1.Data)
title('Vibration - First Ensemble Member')

The LastMemberRead property of the ensemble contains the file name of the most recently read member. The next time you call read on this ensemble, the software advances to the next member of the ensemble. (See Data Ensembles for Condition Monitoring and Predictive Maintenance for more information.) Read the selected variables from the next member of the ensemble.

data2 = read(ensemble)
data2=1×2 table
        Vibration               SimulationInput        
    _________________    ______________________________

    {603x1 timetable}    {1x1 Simulink.SimulationInput}

To confirm that data1 and data2 contain data from different ensemble members, examine the values of the varied model parameter, ToothFaultGain. For each ensemble member, this value is stored in the Variables property of the SimulationInput variable.

SimInput1 = data1.SimulationInput{1};
SimInput1.Variables
ans = 
  Variable with properties:

           Name: 'ToothFaultGain'
          Value: -2
      Workspace: 'global-workspace'
        Context: ''
    Description: ""

SimInput2 = data2.SimulationInput{1};
SimInput2.Variables
ans = 
  Variable with properties:

           Name: 'ToothFaultGain'
          Value: -1.5000
      Workspace: 'global-workspace'
        Context: ''
    Description: ""

This result confirms that data1 is from the ensemble member with ToothFaultGain = -2, and data2 is from the ensemble member with ToothFaultGain = -1.5.

Append Data to Ensemble Member

Suppose that you want to convert the ToothFaultGain values for each ensemble member into a binary indicator of whether or not a tooth fault is present. Suppose further that you know from your experience with the system that tooth-fault gain values less than 0.1 in magnitude are small enough to be considered healthy operation. Convert the gain value for the ensemble member you just read into an indicator that is 0 (no fault) for –0.1 < gain < 0.1, and 1 (fault) otherwise.

sT = (abs(SimInput2.Variables.Value) >= 0.1);   % true (fault) when |gain| >= 0.1
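
To see how the rule described above classifies every gain value used in this example, you can apply it to the whole vector at once. This is a minimal sketch that needs no ensemble data. The names gains, threshold, and isFault are hypothetical.

```matlab
gains = -2:0.5:0;               % ToothFaultGain values used in this example
threshold = 0.1;                % magnitudes below this count as healthy

% Logical indicator: true (fault) when |gain| >= threshold
isFault = abs(gains) >= threshold;

disp(table(gains',isFault','VariableNames',{'ToothFaultGain','ToothFault'}))
```

With these values, only the member simulated with ToothFaultGain = 0 is classified as healthy.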

To append the new tooth-fault indicator to the corresponding ensemble data, first expand the list of data variables in the ensemble.

ensemble.DataVariables = [ensemble.DataVariables;"ToothFault"];
ensemble.DataVariables
ans = 5x1 string

Then, use writeToLastMemberRead to write a value for the new variable to the last-read member of the ensemble.

writeToLastMemberRead(ensemble,'ToothFault',sT);


Batch Process Data from All Ensemble Members

In practice, you want to append the tooth-fault indicator to every member in the ensemble. To do so, reset the ensemble to its unread state, so that the next read begins at the first ensemble member. Then, loop through all the ensemble members, computing ToothFault for each member and appending it.

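reset(ensemble);

sT = false; 
while hasdata(ensemble)
    data = read(ensemble);
    SimInputVars = data.SimulationInput{1}.Variables;
    TFGain = SimInputVars.Value;
    sT = (abs(TFGain) >= 0.1);   % true when a tooth fault is present
    writeToLastMemberRead(ensemble,'ToothFault',sT);
end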

Finally, designate the new tooth-fault indicator as a condition variable in the ensemble. You can use this designation to track and refer to variables in the ensemble data that represent conditions under which the member data was generated.

ensemble.ConditionVariables = "ToothFault";
ensemble.ConditionVariables
ans = 
"ToothFault"

Now, each ensemble member contains the original unprocessed data and an additional variable indicating the fault condition under which the data was collected. In practice, you might compute and append other values derived from the raw vibration data, to identify potential condition indicators that you can use for fault detection and diagnosis. For a more detailed example that shows more ways to manipulate and analyze data stored in a simulationEnsembleDatastore object, see Using Simulink to Generate Fault Data.
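
As one illustration of a value derived from the raw vibration data, the following sketch computes the root-mean-square (RMS) level of a vibration signal. It uses a synthetic timetable in place of a real ensemble member, and it assumes the logged timetable stores its samples in a variable named Data. The names vib, vibRMS, and VibrationRMS are hypothetical, and RMS is just one possible candidate, not a recommendation from this example.

```matlab
% Synthetic stand-in for one member's Vibration timetable (assumed layout:
% row times named Time, one variable named Data).
t = seconds((0:0.01:1)');
vib = timetable(t,sin(2*pi*5*seconds(t)),'VariableNames',{'Data'});

% Candidate condition indicator: root-mean-square of the signal
vibRMS = sqrt(mean(vib.Data.^2));
disp(vibRMS)

% With the ensemble from this example in the workspace, the same value
% could be appended to the last-read member (VibrationRMS is hypothetical):
%   ensemble.DataVariables = [ensemble.DataVariables;"VibrationRMS"];
%   writeToLastMemberRead(ensemble,'VibrationRMS',vibRMS);
```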

Read Multiple Members at Once

If it is efficient or useful for the processing you want to do, you can configure the ensemble to read data from multiple members at once. To do so, use the ReadSize property. The read command uses this property to determine how many ensemble members to read at one time. For example, configure the ensemble to read two members at a time.

ensemble.ReadSize = 2;

Changing the value of ReadSize also resets the ensemble to its unread state. Thus, the next read operation reads the first two ensemble members. read returns a table with a number of rows equal to ReadSize.

ensemble.SelectedVariables = ["Vibration";"ToothFault"];
data3 = read(ensemble)
data3=2×2 table
        Vibration        ToothFault
    _________________    __________

    {589x1 timetable}      true    
    {603x1 timetable}      true    

The LastMemberRead property of the ensemble contains the file names of all ensemble members that were read in this operation.

ensemble.LastMemberRead
ans = 2x1 string

When you append data to an ensemble datastore that has ReadSize > 1, you must write to the same number of ensemble members as you read. Thus, for instance, when ReadSize = 2, supply a two-row table to writeToLastMemberRead.
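
A minimal sketch of such a write, assuming the two members just read are the ones simulated with ToothFaultGain = -2 and -1.5, might look like the following. The names gains, faultFlags, and T are hypothetical. The write itself runs only if a variable named ensemble exists in the workspace.

```matlab
gains = [-2; -1.5];                       % assumed gains of the two members read
faultFlags = abs(gains) >= 0.1;           % one indicator value per member
T = table(faultFlags,'VariableNames',{'ToothFault'});
disp(height(T))                           % row count must match ensemble.ReadSize

% Write only when the ensemble from this example is in the workspace
if exist('ensemble','var')
    writeToLastMemberRead(ensemble,T);
end
```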
