Import Large Data Using DatabaseDatastore
Object
This example shows how to create a
object for accessing collections of data stored in a relational database. After creating a databaseDatastore
DatabaseDatastore
object, you can preview data, read data in chunks, and read every record in the data set.
To analyze large data, you can run algorithms on large data sets using a tall array.
Alternatively, you can write a MapReduce algorithm that defines the chunking and reduction of the data.
This example uses a preconfigured JDBC data source to create the database connection. For more information, see the
function.configureJDBCDataSource
Create DatabaseDatastore
Object
Create a database connection to the JDBC data source MSSQLServerJDBCAuth
. This data source configures a JDBC driver to a Microsoft® SQL Server® database with Windows® authentication. Specify a blank user name and password.
datasource = "MSSQLServerJDBCAuth"; username = ""; password = ""; conn = database(datasource,username,password);
Create a DatabaseDatastore
object using the database connection and an SQL query. This SQL query retrieves all data from the airlinesmall
table.
sqlquery = 'select * from airlinesmall';
dbds = databaseDatastore(conn,sqlquery);
Preview Data in DatabaseDatastore
Object
Preview the first eight records in the data set returned by executing the SQL query.
preview(dbds)
ans = 8×29 table Year Month DayofMonth DayOfWeek DepTime CRSDepTime ArrTime CRSArrTime UniqueCarrier FlightNum TailNum ActualElapsedTime CRSElapsedTime AirTime ArrDelay DepDelay Origin Dest Distance TaxiIn TaxiOut Cancelled CancellationCode Diverted CarrierDelay WeatherDelay NASDelay SecurityDelay LateAircraftDelay ____ _____ __________ _________ _______ __________ _______ __________ _____________ _________ ________ _________________ ______________ _______ ________ ________ ______ _____ ________ ______ _______ _________ ________________ ________ ____________ ____________ ________ _____________ _________________ 1990 9 11 2 1810 1812 1939 1930 'AA' 1426 'NA' 89 78 'NA' 9 -2 'RST' 'ORD' 268 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1990 10 27 6 1353 1355 1634 1640 'US' 112 'NA' 161 165 'NA' -6 -2 'TPA' 'SYR' 1104 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1990 10 23 2 1057 1055 1205 1155 'US' 1621 'NA' 68 60 'NA' 10 2 'ROC' 'EWR' 246 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1990 10 8 1 1515 1440 1609 1535 'NW' 749 'NA' 54 55 'NA' 34 35 'MSP' 'FSD' 197 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1990 10 19 5 1130 1120 1203 1154 'UA' 369 'NA' 93 94 'NA' 9 10 'BUF' 'ORD' 473 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1990 10 12 5 1755 1733 1858 1820 'DL' 590 'NA' 63 47 'NA' 38 22 'BOS' 'BGR' 201 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 2001 11 22 4 1345 1355 1530 1549 'MQ' 4982 '#NAME?' 105 114 '90' -19 -10 'JAX' 'MIA' 334 '8' '7' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 2001 11 26 1 2105 2110 2209 2237 'AA' 1947 'N3BäA1' 64 87 '47' -28 -5 'SFO' 'LAX' 337 '6' '11' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA'
Read Data in DatabaseDatastore
Object
Read the first 10 records.
dbds.ReadSize = 10; read(dbds)
ans = 10×29 table Year Month DayofMonth DayOfWeek DepTime CRSDepTime ArrTime CRSArrTime UniqueCarrier FlightNum TailNum ActualElapsedTime CRSElapsedTime AirTime ArrDelay DepDelay Origin Dest Distance TaxiIn TaxiOut Cancelled CancellationCode Diverted CarrierDelay WeatherDelay NASDelay SecurityDelay LateAircraftDelay ____ _____ __________ _________ _______ __________ _______ __________ _____________ _________ _______ _________________ ______________ _______ ________ ________ ______ _____ ________ ______ _______ _________ ________________ ________ ____________ ____________ ________ _____________ _________________ 1987 10 30 5 1329 1329 1434 1436 'US' 683 'NA' 65 67 'NA' -2 0 'LGA' 'BUF' 292 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 11 7 6 1316 1315 1713 1647 'TW' 810 'NA' 177 152 'NA' 26 1 'STL' 'BOS' 1046 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 11 28 6 815 815 1015 1015 'WN' 441 'NA' 120 120 'NA' 0 0 'BNA' 'HOU' 670 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 11 2 1 700 700 800 800 'NW' 790 'NA' 60 60 'NA' 0 0 'BTR' 'MEM' 319 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 11 14 6 840 840 1127 1120 'CO' 733 'NA' 167 160 'NA' 7 0 'EWR' 'MCO' 938 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 11 1 7 1625 1625 1823 1758 'DL' 957 'NA' 118 93 'NA' 25 0 'EWR' 'CVG' 569 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 11 26 4 1314 1315 1538 1542 'AA' 398 'NA' 84 87 'NA' -4 -1 'BNA' 'RDU' 443 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 28 3 1140 1140 1212 1215 'US' 262 'NA' 32 35 'NA' -3 0 'CLE' 'PIT' 105 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 9 5 1155 1155 1250 1300 'US' 282 'NA' 55 65 'NA' -10 0 'ROC' 'LGA' 254 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 22 4 715 715 807 803 'US' 304 'NA' 52 48 'NA' 4 0 'DTW' 'PIT' 201 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA'
Read the DatabaseDatastore
object two more times by using counter n
. Read 10 records at a time.
n = 0; while(hasdata(dbds) && n~=2) read(dbds) n = n+1; end
ans = 10×29 table Year Month DayofMonth DayOfWeek DepTime CRSDepTime ArrTime CRSArrTime UniqueCarrier FlightNum TailNum ActualElapsedTime CRSElapsedTime AirTime ArrDelay DepDelay Origin Dest Distance TaxiIn TaxiOut Cancelled CancellationCode Diverted CarrierDelay WeatherDelay NASDelay SecurityDelay LateAircraftDelay ____ _____ __________ _________ _______ __________ _______ __________ _____________ _________ _______ _________________ ______________ _______ ________ ________ ______ _____ ________ ______ _______ _________ ________________ ________ ____________ ____________ ________ _____________ _________________ 1987 10 16 5 1553 1555 1641 1640 'US' 323 'NA' 48 45 'NA' 1 -2 'BUF' 'PIT' 186 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 30 5 1821 1815 1956 1955 'US' 346 'NA' 95 100 'NA' 1 6 'IND' 'EWR' 644 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 12 1 1300 1300 1529 1528 'US' 370 'NA' 89 88 'NA' 1 0 'STL' 'PIT' 553 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 7 3 810 810 904 900 'US' 387 'NA' 54 50 'NA' 4 0 'ELM' 'PIT' 208 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 19 1 733 735 827 831 'US' 407 'NA' 54 56 'NA' -4 -2 'ROC' 'CLE' 245 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 15 4 828 830 916 921 'US' 424 'NA' 48 51 'NA' -5 -2 'ORF' 'PHL' 211 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 4 7 1750 1735 1837 1816 'US' 449 'NA' 47 41 'NA' 21 15 'PIT' 'CLE' 105 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 16 5 959 1000 1212 1215 'US' 472 'NA' 73 75 'NA' -3 -1 'BNA' 'PIT' 462 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 17 6 2020 2020 2100 2057 'US' 490 'NA' 40 37 'NA' 3 0 'PIT' 'ERI' 109 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 6 2 1132 1135 1426 1419 'US' 516 'NA' 114 104 'NA' 7 -3 'MSP' 'PIT' 726 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' ans = 10×29 table Year Month DayofMonth DayOfWeek DepTime CRSDepTime ArrTime CRSArrTime UniqueCarrier FlightNum TailNum ActualElapsedTime CRSElapsedTime AirTime ArrDelay DepDelay Origin Dest Distance TaxiIn TaxiOut Cancelled CancellationCode Diverted CarrierDelay WeatherDelay NASDelay SecurityDelay LateAircraftDelay ____ _____ __________ _________ _______ __________ _______ __________ _____________ _________ _______ _________________ ______________ _______ ________ ________ ______ _____ ________ ______ _______ _________ ________________ ________ ____________ ____________ ________ _____________ _________________ 1987 10 24 6 944 945 1211 1213 'US' 535 'NA' 87 88 'NA' -2 -1 'STL' 'PIT' 553 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 18 7 833 835 1003 1011 'US' 571 'NA' 90 96 'NA' -8 -2 'PIT' 'ATL' 526 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 26 1 2356 2355 730 721 'US' 604 'NA' 274 266 'NA' 9 1 'SFO' 'PIT' 2254 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 29 4 1056 1055 1208 1215 'US' 627 'NA' 72 80 'NA' -7 1 'EWR' 'CLE' 404 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 1 4 2304 2255 2340 2329 'US' 655 'NA' 36 34 'NA' 11 9 'PBI' 'RSW' 103 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 3 6 1040 1040 1125 1120 'AS' 51 'NA' 45 40 'NA' 5 0 'OME' 'OTZ' 183 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 23 5 1855 1855 2158 2205 'AS' 66 'NA' 123 130 'NA' -7 0 'JNU' 'SEA' 909 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 30 5 1055 1055 1302 1315 'AS' 91 'NA' 187 200 'NA' -13 0 'SEA' 'ANC' 1449 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 28 3 NaN 1850 NaN 2050 'AS' 116 'NA' NaN 120 'NA' NaN NaN 'PDX' 'BUR' 817 'NA' 'NA' 1 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 26 1 1600 1600 1649 1640 'AS' 172 'NA' 49 40 'NA' 9 0 'LAX' 'SAN' 109 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA'
Reset DatabaseDatastore
Object
Reset the DatabaseDatastore
object to its original state, where no data has been read from it. Resetting allows you to reread from the same DatabaseDatastore
object.
reset(dbds)
Read Every Record in DatabaseDatastore
Object
Read every record in the DatabaseDatastore
object in increments of 50,000 records at a time.
dbds.ReadSize = 50000; data = readall(dbds);
Display the first three records of the full data set.
data(1:3,:)
ans = 3×29 table Year Month DayofMonth DayOfWeek DepTime CRSDepTime ArrTime CRSArrTime UniqueCarrier FlightNum TailNum ActualElapsedTime CRSElapsedTime AirTime ArrDelay DepDelay Origin Dest Distance TaxiIn TaxiOut Cancelled CancellationCode Diverted CarrierDelay WeatherDelay NASDelay SecurityDelay LateAircraftDelay ____ _____ __________ _________ _______ __________ _______ __________ _____________ _________ _______ _________________ ______________ _______ ________ ________ ______ _____ ________ ______ _______ _________ ________________ ________ ____________ ____________ ________ _____________ _________________ 1987 10 28 3 1140 1140 1212 1215 'US' 262 'NA' 32 35 'NA' -3 0 'CLE' 'PIT' 105 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 9 5 1155 1155 1250 1300 'US' 282 'NA' 55 65 'NA' -10 0 'ROC' 'LGA' 254 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA' 1987 10 22 4 715 715 807 803 'US' 304 'NA' 52 48 'NA' 4 0 'DTW' 'PIT' 201 'NA' 'NA' 0 'NA' 0 'NA' 'NA' 'NA' 'NA' 'NA'
Close DatabaseDatastore
Object and Database Connection
close(dbds)
See Also
database
| databaseDatastore
| close
| hasdata
| preview
| read
| readall
| reset