Main Content

SQLImportOptions

Define import options for database data

Description

After you create an SQLImportOptions object, you can customize the import options for importing data from a database into MATLAB®. Import options include defining the data types and fill values for missing data.

Creation

Create an SQLImportOptions object with the databaseImportOptions function.

Properties

expand all

Flag to exclude duplicates from imported data, specified as false or true. To exclude duplicates from the data in a database table or the results of an SQL query, set the ExcludeDuplicates property to true using dot notation.

Setting this property is the equivalent of using the DISTINCT SQL statement in ANSI SQL.

Data Types: logical

Variable names, specified as a cell array of character vectors. Each character vector in the cell array indicates the name of an imported database column from an SQL query or database table.

For a table or SQL query with only one database column, the cell array contains only one character vector.

The default variable names are the names of the columns in an SQL query or database table.

Example: {'productNumber','stockNumber'}

Data Types: cell

Variable types, specified as a cell array of character vectors. Each character vector in the cell array indicates the data type of an imported database column from an SQL query or database table. Each character vector must be a valid MATLAB data type.

For a table or SQL query with only one database column, the cell array contains only one character vector.

When you create the SQLImportOptions object, the databaseImportOptions function automatically detects the data type based on the data type of a database column. This table maps the data type of a database column to the detected MATLAB data type.

Database Data TypeMATLAB Detected Data Type

TEXT

char

DATE, TIME, DATETIME, or TIMESTAMP

char

NUMERIC

double

BOOLEAN or BIT

logical

If you are using the MySQL® native interface, this table maps the data type of a database column to the detected MATLAB data type.

MySQL Data TypeMATLAB Data Type

BIT

logical

TINYINT

double

SMALLINT

double

BIGINT

double

REAL

double

DOUBLE

double

DECIMAL

double

NUMERIC

double

CHAR

string

VARCHAR

string

LONGVARCHAR

string

TIMESTAMP

datetime

DATE

datetime

TIME

duration

YEAR

double

ENUM

categorical

JSON

char

If you are using the PostgreSQL native interface, this table maps the data type of a database column to the detected MATLAB data type.

PostgreSQL Data TypeMATLAB Data Type

Boolean

logical

Smallint

double

Integer

double

Bigint

double

Decimal

double

Numeric

double

Real

double

Double Precision

double

Smallserial

double

Serial

double

Bigserial

double

Money

double

Varchar

string

Char

string

Text

string

Bytea

string

Timestamp

datetime

Timestampz

datetime

Abstime

datetime

Date

datetime

Time

duration

Timez

duration

Interval

calendarDuration

Reltime

calendarDuration

Enum

categorical

Cidr

string

Inet

string

Macaddr

string

Uuid

string

Xml

string

To update the VariableTypes property, use the setoptions function.

Example: {'int64','int32'}

Data Types: cell

Subset of variables to import, specified as a character vector, cell array of character vectors, or numeric array that contains indices. Use the SelectedVariableNames property to determine the database columns to import into the MATLAB workspace.

The values in the SelectedVariableNames property must be equal to the values in the VariableNames property or a subset of these values. By default, the SelectedVariableNames property contains all variable names specified in the VariableNames property. When the SelectedVariableNames property specifies all variable names, the sqlread, fetch, and import functions of the DatabaseDatastore object import all database columns.

Example: {'productNumber','stockNumber'}

Example: [1,2,3]

Data Types: double | char | cell

Fill value for missing data, specified as a cell array that contains one or more values. Each value can be one of these data types:

  • All integer classes

  • single

  • double

  • char

  • string scalar

  • logical

  • datetime array

  • categorical array

  • missing

When you create the SQLImportOptions object, the databaseImportOptions function automatically detects the fill value for missing data based on the data type of the database column. This table maps the data type of a database column to the detected MATLAB fill value.

Database Data TypeMATLAB Detected Fill Value

TEXT

''

DATE, TIME, DATETIME, or TIMESTAMP

''

NUMERIC

NaN

BOOLEAN or BIT

false

If you are using the MySQL native interface, this table maps the data type of a database column to the detected MATLAB data type.

MySQL Data TypeMATLAB Detected Fill Value

CHAR, VARCHAR, LONGVARCHAR, or JSON

'' (if the VariableTypes property is char) or <missing> (if the VariableTypes property is string)

DATE or TIMESTAMP

NaT

TIME

NaN
  • TINYINT

  • SMALLINT

  • BIGINT

  • REAL

  • DOUBLE

  • DECIMAL

  • NUMERIC

  • YEAR

NaN

ENUM

<undefined>

If you are using the PostgreSQL native interface, this table maps the data type of a database column to the detected MATLAB data type.

PostgreSQL Data TypeMATLAB Detected Fill Value

boolean

false
  • smallint

  • integer

  • bigint

  • decimal

  • numeric

  • real

  • double precision

NaN

char, varchar, or text

<missing>

date or timestamp

NaT

time

NaN

interval

NaN

enum

undefined

To update the FillValues property, use the setoptions function.

Example: {'',NaN}

Data Types: cell

Type-specific variable import options, returned as an array of variable import options objects. The array contains an object corresponding to each variable specified in the VariableNames property. Each object in the array contains properties that support the importing of data with a specific data type.

To query the current (or detected) options for a variable, use the getoptions function.

To set and customize options for a variable, use the setoptions function.

Example: opts.VariableOptions returns a collection of SQLVariableImportOptions objects, one corresponding to each variable in the data.

Variable naming rule, specified as one of these values:

  • "modify" — Remove non-ASCII characters from variable names when the SQLImportOptions function imports data.

  • "preserve" — Preserve most variable names when the SQLImportOptions function imports data. For details, see Limitations.

If you are using the MySQL or PostgreSQL native interface, "preserve" is the default value.

The VariableNamingRule property has these limitations:

  • The variable names Properties, RowNames, and VariableNames are reserved identifiers for the table data type.

  • The length of each variable name must be less than the number returned by namelengthmax.

Example: "VariableNamingRule","modify"

Data Types: string

Row filter condition, specified as a matlab.io.RowFilter object. Using RowFilter you can filter by columns or other criteria not listed in the VariableNames property. To do this, set a new RowFilter property containing the new variable name. See Filter Data by Using Unique Variable Names.

Example: opt.RowFilter = opt.RowFilter.productnumber <= 5

Object Functions

getoptionsRetrieve import options for database data
previewPreview eight rows from database using import options
resetReset to default import options for database data
setoptionsCustomize import options for database data

Examples

collapse all

Customize import options when importing data from a database table. Control the import options by creating an SQLImportOptions object. Then, customize import options for different database columns. Import data using the sqlread function.

This example uses the patients.xls file, which contains the columns Gender, Location, SelfAssessedHealthStatus, and Smoker. The example also uses a Microsoft® SQL Server® Version 11.00.2100 database and the Microsoft SQL Server Driver 11.00.5058.

Create a database connection to a Microsoft SQL Server database with Windows® authentication. Specify a blank username and password.

datasource = 'MS SQL Server Auth';
conn = database(datasource,'','');

Load patient information into the MATLAB® workspace.

patients = readtable('patients.xls');

Create the patients database table using the patient information.

tablename = 'patients';
sqlwrite(conn,tablename,patients)

Create an SQLImportOptions object using the patients database table and the databaseImportOptions function.

opts = databaseImportOptions(conn,tablename)
opts = 
  SQLImportOptions with properties:

           ExcludeDuplicates: false
          VariableNamingRule: 'modify'

               VariableNames: {'LastName', 'Gender', 'Age' ... and 7 more}
               VariableTypes: {'char', 'char', 'double' ... and 7 more}
       SelectedVariableNames: {'LastName', 'Gender', 'Age' ... and 7 more}
                  FillValues: {'', '',  NaN  ... and 7 more }
                   RowFilter: <unconstrained> 

             VariableOptions: Show all 10 VariableOptions

Display the current import options for the variables selected in the SelectedVariableNames property of the SQLImportOptions object.

vars = opts.SelectedVariableNames;
varOpts = getoptions(opts,vars)
varOpts = 
    1x10 SQLVariableImportOptions array with properties:

   Variable Options:
                      (1) |      (2) |      (3) |        (4) |      (5) |      (6) |      (7) |        (8) |         (9) |                       (10)
         Name: 'LastName' | 'Gender' |    'Age' | 'Location' | 'Height' | 'Weight' | 'Smoker' | 'Systolic' | 'Diastolic' | 'SelfAssessedHealthStatus'
         Type:     'char' |   'char' | 'double' |     'char' | 'double' | 'double' | 'double' |   'double' |    'double' |                     'char'
  MissingRule:     'fill' |   'fill' |   'fill' |     'fill' |   'fill' |   'fill' |   'fill' |     'fill' |      'fill' |                     'fill'
    FillValue:         '' |       '' |      NaN |         '' |      NaN |      NaN |      NaN |        NaN |         NaN |                         ''

	To access sub-properties of each variable, use getoptions

Change the data types for the Gender, Location, SelfAssessedHealthStatus, and Smoker variables using the setoptions function. Because the Gender, Location, and SelfAssessedHealthStatus variables indicate a finite set of repeating values, change their data type to categorical. Because the Smoker variable stores the values 0 and 1, change its data type to logical. Then, display the updated import options.

opts = setoptions(opts,{'Gender','Location','SelfAssessedHealthStatus'}, ...
    'Type','categorical');
opts = setoptions(opts,'Smoker','Type','logical');

varOpts = getoptions(opts,{'Gender','Location','Smoker', ...
    'SelfAssessedHealthStatus'})
varOpts = 
    1x4 SQLVariableImportOptions array with properties:

   Variable Options:
                         (1) |           (2) |       (3) |                        (4)
         Name:      'Gender' |    'Location' |  'Smoker' | 'SelfAssessedHealthStatus'
         Type: 'categorical' | 'categorical' | 'logical' |              'categorical'
  MissingRule:        'fill' |        'fill' |    'fill' |                     'fill'
    FillValue:   <undefined> |   <undefined> |         0 |                <undefined>

	To access sub-properties of each variable, use getoptions

Import the patients database table using the sqlread function, and display the last eight rows of the table.

data = sqlread(conn,tablename,opts);
tail(data)
      LastName       Gender    Age            Location             Height    Weight    Smoker    Systolic    Diastolic    SelfAssessedHealthStatus
    _____________    ______    ___    _________________________    ______    ______    ______    ________    _________    ________________________

    {'Foster'   }    Female    30     St. Mary's Medical Center      70       124      false       130          91               Fair             
    {'Gonzales' }    Male      48     County General Hospital        71       174      false       123          79               Good             
    {'Bryant'   }    Female    48     County General Hospital        66       134      false       129          73               Excellent        
    {'Alexander'}    Male      25     County General Hospital        69       171      true        128          99               Good             
    {'Russell'  }    Male      44     VA Hospital                    69       188      true        124          92               Good             
    {'Griffin'  }    Male      49     County General Hospital        70       186      false       119          74               Fair             
    {'Diaz'     }    Male      45     County General Hospital        68       172      true        136          93               Good             
    {'Hayes'    }    Male      48     County General Hospital        66       177      false       114          86               Fair             

Display a summary of the imported data. The sqlread function applies the import options to the variables in the imported data.

summary(data)
Variables:

    LastName: 100×1 cell array of character vectors

    Gender: 100×1 categorical

        Values:

            Female       53   
            Male         47   

    Age: 100×1 double

        Values:

            Min          25   
            Median       39   
            Max          50   

    Location: 100×1 categorical

        Values:

            County General Hospital         39   
            St. Mary s Medical Center       24   
            VA Hospital                     37   

    Height: 100×1 double

        Values:

            Min          60   
            Median       67   
            Max          72   

    Weight: 100×1 double

        Values:

            Min          111  
            Median     142.5  
            Max          202  

    Smoker: 100×1 logical

        Values:

            True        34   
            False       66   

    Systolic: 100×1 double

        Values:

            Min         109   
            Median      122   
            Max         138   

    Diastolic: 100×1 double

        Values:

            Min           68  
            Median      81.5  
            Max           99  

    SelfAssessedHealthStatus: 100×1 categorical

        Values:

            Excellent       34   
            Fair            15   
            Good            40   
            Poor            11   

Now set the filter condition to import only data for patients older than 40 years and not taller than 68 inches.

opts.RowFilter = opts.RowFilter.Age > 40 & opts.RowFilter.Height <= 68
opts = 
  SQLImportOptions with properties:

           ExcludeDuplicates: false
          VariableNamingRule: 'modify'

               VariableNames: {'LastName', 'Gender', 'Age' ... and 7 more}
               VariableTypes: {'char', 'categorical', 'double' ... and 7 more}
       SelectedVariableNames: {'LastName', 'Gender', 'Age' ... and 7 more}
                  FillValues: {'',  <undefined>,  NaN  ... and 7 more }
                   RowFilter: Age > 40 & Height <= 68 

             VariableOptions: Show all 10 VariableOptions

Again, import the patients database table using the sqlread function, and display a summary of the imported data.

data = sqlread(conn,tablename,opts);
summary(data)
Variables:

    LastName: 24×1 cell array of character vectors

    Gender: 24×1 categorical

        Values:

            Female       17   
            Male          7   

    Age: 24×1 double

        Values:

            Min           41  
            Median      45.5  
            Max           50  

    Location: 24×1 categorical

        Values:

            County General Hospital         13   
            St. Mary s Medical Center        5   
            VA Hospital                      6   

    Height: 24×1 double

        Values:

            Min          62   
            Median       66   
            Max          68   

    Weight: 24×1 double

        Values:

            Min         119   
            Median      137   
            Max         194   

    Smoker: 24×1 logical

        Values:

            True         8   
            False       16   

    Systolic: 24×1 double

        Values:

            Min          114  
            Median     121.5  
            Max          138  

    Diastolic: 24×1 double

        Values:

            Min           68  
            Median      81.5  
            Max           96  

    SelfAssessedHealthStatus: 24×1 categorical

        Values:

            Excellent        7   
            Fair             3   
            Good            10   
            Poor             4   

Delete the patients database table using the execute function.

sqlquery = ['DROP TABLE ' tablename];
execute(conn,sqlquery)

Close the database connection.

close(conn)

Filter data by columns or other criteria not listed in the VariableNames property by using the RowFilter property.

Create a PostgreSQL native interface database connection to a PostgreSQL database. The database contains the table productTable.

datasource = "PostgreSQLDataSource";
username = "dbdev";
password = "matlab";

conn = postgresql(datasource,username,password);

Import data from the database table productTable. The sqlread function returns a MATLAB table that contains the product data. Display the first five rows of product data.

tablename = "productTable";
data = sqlread(conn,tablename);
head(data,5)
    productnumber    stocknumber    suppliernumber    unitcost    productdescription
    _____________    ___________    ______________    ________    __________________

          9          1.2597e+05          1003            13        "Victorian Doll" 
          8          2.1257e+05          1001             5        "Train Set"      
          7          3.8912e+05          1007            16        "Engine Kit"     
          2          4.0031e+05          1002             9        "Painting Set"   
          4          4.0034e+05          1008            21        "Space Cruiser"  

Create an SQLImportOptions object using the productTable database table and the databaseImportOptions function.

opts = databaseImportOptions(conn,tablename)
opts = 
  SQLImportOptions with properties:

           ExcludeDuplicates: false
          VariableNamingRule: 'preserve'

               VariableNames: {'productnumber', 'stocknumber', 'suppliernumber' ... and 2 more}
               VariableTypes: {'double', 'double', 'double' ... and 2 more}
       SelectedVariableNames: {'productnumber', 'stocknumber', 'suppliernumber' ... and 2 more}
                  FillValues: { NaN,  NaN,  NaN  ... and 2 more }
                   RowFilter: <unconstrained> 

             VariableOptions: Show all 5 VariableOptions

Set the RowFilter property containing the new variable name. For example, use the length function to filter the productdescription strings by the number of characters.

opts.RowFilter = rowfilter("length(productdescription)");
opts.RowFilter
ans = 
  RowFilter with no constraints and no selected variables

    <unconstrained>

  VariableNames: length(productdescription), productnumber, stocknumber, suppliernumber, unitcost, productdescription

By setting RowFilter, you added a unique VariableName to RowFilter. The VariableNames property of SQLImportOptions does not contain this variable name. Even though VariableNames of RowFilter does not completely match VariableNames of SQLImportOptions, updating VariableNames of SQLImportOptions still updates VariableNames of RowFilter.

opts.VariableNames{4} = "cost";
opts.RowFilter
ans = 
  RowFilter with no constraints and no selected variables

    <unconstrained>

  VariableNames: length(productdescription), productnumber, stocknumber, suppliernumber, cost, productdescription

Set the filtering condition using the unique variable name, length(productdescription), and the new variable name, cost.

opts.RowFilter = opts.RowFilter.("length(productdescription)") < 10 & opts.RowFilter.cost > 10;

Import data from the database table and display the first five rows of product data.

data = sqlread(conn,tablename,opts);
head(data,5)
    productnumber    stocknumber    suppliernumber    cost    productdescription
    _____________    ___________    ______________    ____    __________________

          3            4.01e+05          1009           17        "Slinky"      
         12          2.1046e+05          1010           22        "Hugsy"       
         13          4.7082e+05          1012         16.5        "Pancakes"    
         14           5.101e+05          1011           19        "Shawl"       
         15          8.9975e+05          1011           20        "Snacks"      

Limitations

  • If you use the "VariableNamingRule" name-value argument with the SQLImportOptions object opts, the data import functions return an error.

  • If you set the VariableNamingRule name-value argument to the value "modify":

    • The variable names Properties, RowNames, and VariableNames are reserved identifiers for the table data type.

    • The length of each variable name must be less than the number returned by namelengthmax.

  • The fetch and sqlread functions return an error if you specify the RowFilter name-value argument with the SQLImportOptions object opts. In this case, it is ambiguous which of the RowFilter object to use, especially if the filter conditions are different.

Version History

Introduced in R2018b

expand all