Importing Data — Supported Files and Data Types

You can import tabular data to the SimBiology Model Analyzer app or to the MATLAB® Workspace. The supported file types are Excel® files (.xls, .xlsx), text files (.csv, .txt), and SAS® XPORT files (.xpt). You can also specify that the data is in a NONMEM® formatted file, in which case the import process interprets the columns according to the NONMEM definitions. For more information, see Support for Importing NONMEM Formatted Files.

Note

If your data set contains dosing information that is infusion data, the data set must contain the rate and not an infusion duration.
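If a source data set records an infusion as an amount and a duration, the rate has to be computed before import. The following Python sketch shows that arithmetic only; it is not part of SimBiology, and the function name infusion_rate is hypothetical.

```python
def infusion_rate(amount, duration):
    """Return the infusion rate (amount per unit time) for one dose."""
    if duration <= 0:
        raise ValueError("infusion duration must be positive")
    return amount / duration


# A dose of 100 (amount units) infused over 2 (time units)
# corresponds to a rate of 50 amount units per time unit.
rate = infusion_rate(100.0, 2.0)
```

The amount and time units must match the units used elsewhere in the data file.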

Unit Conversion

Regardless of whether unit conversion is on or off, dosing in the data file must be expressed in amounts (or as amount/time for infusion rate). By default, unit conversion is off, so you must ensure that the units for the data are consistent with each other. To turn on unit conversion, see Unit Conversion for Imported Data.

Create Data File with SimBiology Definitions

If you are creating a file containing time course data that you want to import into SimBiology for fitting, create the data file with the following columns:

  • Group column — Specify text, numeric, or categorical values. For instance, you can use this column to group multiple individuals into separate groups. You can then use this grouping or categorical information for hierarchical fits. This column is optional.

  • ID column — Specify text, numeric, or categorical values. The rows in the file that have the same ID column value are for the same individual. This column is optional if the measurement data comes from just one individual.

  • Time column — Specify monotonically increasing positive values within each ID that define the time of the dose, observation measurements, and covariate measurements.

  • Zero or more dosing columns — Create one dosing column for each compartment being dosed. In each column, specify positive values representing dose amounts that are added to a species. Use NaN (not a number) to specify that no dose was applied at the specified time. In other words, specify the dose amount as NaN when an observation was recorded but no dose was applied.

  • Zero or more rate columns — Specify positive values, zero, or NaN. Zero specifies an infinite rate and NaN specifies that no rate applies. The rate column is associated with a dosing column and defines the rate at which the dose is administered. For example, if you specify an infusion dose in the Dose1 column, specify its rate in the Rate1 column.

  • Zero or more observation columns — Specify numeric values or NaN. NaN values define that no observation was recorded at the specified time. Use NaN for times when a dose was applied but no observation was recorded. You can specify one observation value at a particular time for each ID. When you have replicates, specify multiple observation values for the same time point by adding more rows with the same time value. For an example, see rows 2 and 3 in the screen shot below, where CentralConc has two measurements at time = 0.

  • Zero or more covariate columns — Specify text, numeric, or categorical values, or NaN. Each value defines the covariate value at the specified time. NaN values indicate that no covariate observation was recorded at the specified time. SimBiology supports only covariates that are not time varying. For instance, see the Sex and Age columns in the example below. For an example that shows how to use categories for fitting, see Estimate Category-Specific PK Parameters for Multiple Individuals.
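Before importing, it can help to check the Time column rule from the list above. The following Python sketch is an illustration, not SimBiology code. The function name check_times is hypothetical, and it assumes that repeated time values (replicates) and a time of 0 are acceptable, as in the sample data.

```python
import math


def check_times(rows):
    """Check (id, time) pairs given in file order.

    Returns a list of (row_number, message) tuples, one per problem found.
    """
    problems = []
    last_time = {}  # most recent valid time seen for each ID
    for row_number, (subject_id, time) in enumerate(rows, start=1):
        if math.isnan(time) or time < 0:
            problems.append((row_number, "time must be a non-negative number"))
            continue
        previous = last_time.get(subject_id)
        if previous is not None and time < previous:
            problems.append((row_number, "time decreases within ID"))
        last_time[subject_id] = time
    return problems


# ID 1 has a replicate at time 0, which is allowed; ID 2 goes backward in time.
issues = check_times([(1, 0.0), (1, 0.0), (1, 1.0), (2, 0.0), (2, 2.0), (2, 1.0)])
```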

A screen shot of a sample data file follows.

Image showing an Excel sheet with columns that correspond to group, ID, time, measured data, covariate data, dose, and dose rate.

You can download the sample Excel file from the following location: matlabroot/examples/simbio/data/sample_data_simbiology.xlsx. matlabroot is the root directory where you installed MATLAB. You can also enter matlabroot at the command line to see the file path of the root directory.
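As a rough illustration of the column layout, the following Python sketch writes a small text file in this shape. The values are invented for illustration and are not taken from the sample file. Writing the text NaN for missing entries is an assumption about how the file is read, and the helper name build_csv is hypothetical.

```python
import csv
import io

HEADER = ["Group", "ID", "Time", "CentralConc", "Dose1", "Rate1", "Sex", "Age"]
NAN = "NaN"  # marks "no dose applied" or "no observation recorded"

# Made-up rows for one individual.
ROWS = [
    [1, 1, 0, NAN, 100, 50, "F", 34],   # infusion dose, no observation
    [1, 1, 1, 4.2, NAN, NAN, "F", 34],  # observation
    [1, 1, 1, 4.5, NAN, NAN, "F", 34],  # replicate at the same time
    [1, 1, 2, 3.1, NAN, NAN, "F", 34],  # later observation
]


def build_csv(header, rows):
    """Return the table as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


text = build_csv(HEADER, ROWS)
```

Saving text to a .csv file gives a table with one dosing column, one rate column, one observation column, and two covariate columns.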

Support for Importing NONMEM Formatted Files

You can specify that the data is in a NONMEM formatted file. The following table highlights the interpretation of this data in SimBiology® software.

Column Header

Interpretation

ID

Text (character vector), numeric, or categorical values that identify the record or group. The import process assumes that contiguous rows with the same value contain data from one individual. If the data contains non-contiguous references to the same value, the import process assigns the second group encountered an indexed value derived from the ID of the first group. For example, if the ID column contains [1 1 1 2 2 2 1 1 1], the IDs assigned are 1, 2, 1_1.
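The relabeling of non-contiguous IDs can be sketched in Python as follows. This is an illustration of the rule described above, not the SimBiology implementation. The function name relabel_ids is hypothetical, and the label used for a third or later run of the same value (for example, 1_2) is an assumption.

```python
def relabel_ids(ids):
    """Give each contiguous run of an ID value its own label."""
    labels = []
    runs_seen = {}       # ID value -> number of contiguous runs seen so far
    previous = object()  # sentinel that never equals a real ID
    current_label = None
    for value in ids:
        if value != previous:
            count = runs_seen.get(value, 0)
            # The first run keeps the plain value; later runs get a suffix.
            current_label = str(value) if count == 0 else f"{value}_{count}"
            runs_seen[value] = count + 1
            previous = value
        labels.append(current_label)
    return labels


labels = relabel_ids([1, 1, 1, 2, 2, 2, 1, 1, 1])
# labels == ['1', '1', '1', '2', '2', '2', '1_1', '1_1', '1_1']
```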

TIME

Monotonically increasing positive values within each group, indicating the time of an observation or dose. Values can be numeric or text (character vector). The data file can specify clock values (2:30 as a character vector) or decimal values (6.25). The import process assigns a value of 0 to the first TIME value in the data file and assigns subsequent values relative to the first value.

The following table is an example of how the import process interprets the clock values as decimal values.

Original Clock Values    Imported Values
10:00                    0
10:30                    0.5
11                       1
12:30                    2.5

If the data file also contains a DATE column, the import process uses it with the TIME column in calculating the relative TIME values. The column cannot contain Inf.
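The clock-to-decimal interpretation for TIME can be sketched in Python as follows. This is an illustration only. It assumes that values are hours or hours:minutes, that there is no DATE column, and that times do not cross midnight. The function names are hypothetical.

```python
def clock_to_hours(value):
    """Convert '10:30' or '11' to a decimal number of hours."""
    text = str(value).strip()
    if ":" in text:
        hours, minutes = text.split(":")
        return int(hours) + int(minutes) / 60.0
    return float(text)


def relative_times(values):
    """Express each time relative to the first value, which becomes 0."""
    hours = [clock_to_hours(v) for v in values]
    return [h - hours[0] for h in hours]


imported = relative_times(["10:00", "10:30", "11", "12:30"])
# imported == [0.0, 0.5, 1.0, 2.5]
```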

DATE, DAT1, DAT2, or DAT3

Defines the day of the observation or the dose. The column can contain the month as a number (9) or a character vector (Sep). Specify date in the following formats:

  • DATE — The column can specify month/day/year or month-day-year. If you specify two numbers, the import process assumes they are month and day. You can use either / or - as a separator.

  • DAT1 — The column can specify day/month/year or day-month-year. If you specify two numbers, the import process assumes they are day and month.

  • DAT2 — The column can specify year/month/day or year-month-day. If you specify two numbers, the import process assumes they are month and day.

  • DAT3 — The column can specify year/day/month or year-day-month. If you specify two numbers, the import process assumes they are day and month.

Note

  • If you specify only one number, the import process assumes it is the day.

  • You can omit the year or specify 1, 2, 3, or 4 digits. If you specify a two-digit year, it is assumed to be in the 1900s.

  • If the data has a DAT1, DAT2, or DAT3 column, set the DateLabel property of an NMFileDef object accordingly using sbionmfiledef. Then specify the object as the second input argument when you run sbionmimport.
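The component-order rules for the four date columns can be summarized with a Python sketch. This illustrates the rules listed above and is not the import code itself. The function name parse_nonmem_date is hypothetical, and leaving year values with 1, 3, or 4 digits unchanged is an assumption.

```python
import re

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")

# Component order for three-part and two-part values, by column label.
ORDER = {
    "DATE": {3: ("month", "day", "year"), 2: ("month", "day")},
    "DAT1": {3: ("day", "month", "year"), 2: ("day", "month")},
    "DAT2": {3: ("year", "month", "day"), 2: ("month", "day")},
    "DAT3": {3: ("year", "day", "month"), 2: ("day", "month")},
}


def parse_nonmem_date(text, label="DATE"):
    """Split a date on / or - and name its parts; missing parts are None."""
    parts = re.split(r"[/-]", text.strip())
    # A single number is always interpreted as the day.
    fields = ("day",) if len(parts) == 1 else ORDER[label][len(parts)]
    result = {"year": None, "month": None, "day": None}
    for field, part in zip(fields, parts):
        if field == "month" and not part.isdigit():
            result[field] = MONTHS.index(part[:3].lower()) + 1  # 'Sep' -> 9
        elif field == "year" and len(part) == 2:
            result[field] = 1900 + int(part)  # two-digit years are 1900s
        else:
            result[field] = int(part)
    return result


example = parse_nonmem_date("15-Sep-98", "DAT1")
# example == {'year': 1998, 'month': 9, 'day': 15}
```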

DV

Numeric value of an observation. The column cannot contain Inf or –Inf.

MDV

Defines whether a row describes an observation:
  • Row contains 0 — Observation event

  • Row contains 1 — Not an observation event

EVID

Defines the type of event described for the row in the record:
  • 0 — Observation event; row contains an observed value.

  • 1 — Dose event; row describes a dose.

  • 2 — Other event; row describes some other event such as measurement of a covariate.

If a column contains values for dose but EVID is not 1, the import process ignores the value and issues a warning.

If EVID is set to 2, then only those specified row data are imported as covariate data. However, if you have an EVID column as well as one or more covariate columns, but do not specify a value of 2 anywhere in the EVID column, then SimBiology imports all the row data as covariate values.

The import process does not support EVID values 3 and 4. It ignores these values and issues a warning.
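The EVID rules can be sketched in Python as follows. This is an illustration under stated assumptions, not the import code. Rows are represented as dictionaries with an EVID key and an optional AMT key, the function name classify_rows is hypothetical, and the warning texts are invented.

```python
import math
import warnings

KINDS = {0: "observation", 1: "dose", 2: "other"}


def classify_rows(rows):
    """Sort rows into observation, dose, and other events by EVID."""
    events = {"observation": [], "dose": [], "other": []}
    for row in rows:
        evid = row["EVID"]
        if evid not in KINDS:
            # Covers the unsupported EVID values 3 and 4.
            warnings.warn(f"unsupported EVID value {evid}; row ignored")
            continue
        amt = row.get("AMT", math.nan)
        has_dose = not math.isnan(amt) and amt > 0
        if has_dose and evid != 1:
            warnings.warn("dose value on a non-dose row ignored")
            row = {**row, "AMT": math.nan}  # drop the dose, keep the row
        events[KINDS[evid]].append(row)
    return events
```

For example, a row with EVID = 0 and AMT = 5 is kept as an observation with its dose value dropped, and a row with EVID = 3 is skipped.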

CMT

Indicates which compartment is used for observation value or for dose received. The interpretation also depends on EVID:
  • Observation event (EVID = 0) — CMT column indicates which compartment was used for observation value.

  • Dose Event (EVID = 1) — CMT column indicates which compartment received the dose.

Note

SimBiology numbers compartments starting with 1, while NONMEM numbers them starting with 0. For instance, if a NONMEM data file contains doses and measurements for CMT = 0, SimBiology generates data columns named Dose1 and Response1 respectively.
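The numbering offset in the note above can be written as a short Python sketch. It assumes that the same offset of one applies to every compartment number, which the note states only for CMT = 0. The function name simbiology_column is hypothetical.

```python
def simbiology_column(cmt, evid):
    """Return the generated column name for a NONMEM CMT value."""
    index = cmt + 1  # shift by one, following the CMT = 0 example
    if evid == 1:
        return f"Dose{index}"
    if evid == 0:
        return f"Response{index}"
    raise ValueError("CMT is interpreted only for EVID 0 or 1")


dose_name = simbiology_column(0, 1)      # 'Dose1'
response_name = simbiology_column(0, 0)  # 'Response1'
```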

AMT

Positive number indicating dose. 0 or NaN specifies no dose administered. The column cannot contain Inf.

RATE

Positive number indicating rate of infusion. 0 specifies an infinite rate (equivalent to a bolus dose), and NaN specifies no rate. The column cannot contain Inf.

II

Positive number defining the time between doses.

ADDL

When the data specifies a number of identical serial doses at specific intervals (defined by II), ADDL specifies the number of doses in the series excluding the initial dose. If the data specifies II but not ADDL, then SimBiology assumes that the dosing occurs for the duration of that data record.
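The relationship between II and ADDL can be illustrated by listing the dose times that one record implies. The following Python sketch covers only the case where both II and ADDL are given. The function name expand_dose_times is hypothetical.

```python
def expand_dose_times(first_time, ii, addl):
    """Return the times of the initial dose and the ADDL additional doses."""
    if ii <= 0 or addl < 0:
        raise ValueError("II must be positive and ADDL must not be negative")
    # ADDL excludes the initial dose, so the series has addl + 1 doses.
    return [first_time + k * ii for k in range(addl + 1)]


times = expand_dose_times(0, 12, 3)
# times == [0, 12, 24, 36]
```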

Unsupported NONMEM Definitions

The import process does not support (and therefore ignores) the rows containing the following values or definitions:

  • EVID values 3 and 4

  • SS column for specifying steady state doses

  • PCMT column to define whether to compute a prediction for the row

  • CALL column for calling the ERROR or the PK subroutine

  • RATE values less than zero (such a rate is assumed to be zero)

Supported Table Column Types in SimBiology Model Analyzer

When you are importing data from a table using SimBiology Model Analyzer, the app supports the following column data types: double, char, string, cell array of character vectors, categorical, duration, logical, and datetime.

Support for Importing Multidimensional SimData to SimBiology Model Analyzer

When you import a multidimensional SimData array to the app, the app flattens the SimData array and uses a single index (linear indexing) in the corresponding datasheet. For example, if you import a 2x2x2 SimData array A, the app creates a datasheet with 8 groups (one for each SimData object), indexing from 1 to 8. The app still displays the original size of the SimData array in the Browser.
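The mapping from array subscripts to a single index can be sketched in Python. The sketch assumes MATLAB's column-major linear indexing with subscripts starting at 1. The function name linear_index is hypothetical.

```python
def linear_index(subscripts, shape):
    """Convert 1-based subscripts to a 1-based column-major linear index."""
    index = 0
    stride = 1
    for sub, extent in zip(subscripts, shape):
        if not 1 <= sub <= extent:
            raise IndexError("subscript out of range")
        index += (sub - 1) * stride
        stride *= extent  # the first dimension varies fastest
    return index + 1


first = linear_index((1, 1, 1), (2, 2, 2))  # 1
last = linear_index((2, 2, 2), (2, 2, 2))   # 8
```

Under this assumption, element (1, 2, 1) of a 2x2x2 array corresponds to group 3 in the datasheet.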
