Handling Large File I/O in MEX Files

Prerequisites to Using 64-Bit I/O

MATLAB® supports 64-bit file I/O operations in your MEX file programs, so you can read and write files larger than 2 GB (2^31 - 1 bytes). Some operating systems or compilers do not support files larger than 2 GB. The following topics describe how to use 64-bit file I/O in your MEX files.

Header File

Header file io64.h defines many of the types and functions required for 64-bit file I/O. The statement to include this file must be the first #include statement in your source file and must also precede any system header include statements:

#include "io64.h"
#include "mex.h"

Type Declarations

To declare variables used in 64-bit file I/O, use the following types.

| MEX Type | Description | POSIX |
| --- | --- | --- |
| fpos_T | Declares a 64-bit int type for setFilePos() and getFilePos(). Defined in io64.h. | fpos_t |
| int64_T, uint64_T | Declares 64-bit signed and unsigned integer types. Defined in tmwtypes.h. | long long, unsigned long long |
| structStat | Declares a structure to hold the size of a file. Defined in io64.h. | struct stat |
| FMT64 | Used in mexPrintf to specify length within a format specifier such as %d. See the example in the section Printing Formatted Messages. Defined in tmwtypes.h. | %lld |
| LL, LLU | Suffixes for literal int constant 64-bit values (C Standard ISO®/IEC 9899:1999(E) Section 6.4.4.1). Used only on UNIX® systems. | LL, LLU |

Functions

Use the following functions for 64-bit file I/O. All are defined in the header file io64.h.

| Function | Description | POSIX |
| --- | --- | --- |
| fileno() | Gets a file descriptor from a file pointer | fileno() |
| fopen() | Opens the file and obtains the file pointer | fopen() |
| getFileFstat() | Gets the file size of a given file pointer | fstat() |
| getFilePos() | Gets the file position for the next I/O | fgetpos() |
| getFileStat() | Gets the file size of a given file name | stat() |
| setFilePos() | Sets the file position for the next I/O | fsetpos() |

Specifying Constant Literal Values

To assign signed and unsigned 64-bit integer literal values, use type definitions int64_T and uint64_T.

On UNIX systems, a signed literal value greater than 2^31 - 1 must carry the LL suffix when you assign it to an integer variable. An unsigned literal value greater than 2^32 - 1 must carry the LLU suffix. These suffixes are not valid on Microsoft® Windows® systems.

Note

The LL and LLU suffixes are not required for hardcoded (literal) values that fit in 32 bits (up to 2^31 - 1), even if they are assigned to a 64-bit int type.

The following example declares a 64-bit integer variable initialized with a large literal value, followed by two 64-bit integer variables initialized to zero:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs,
                 const mxArray *prhs[])
{
#if defined(_MSC_VER) || defined(__BORLANDC__)     /* Windows */
   int64_T large_offset_example = 9000222000;
#else                                              /* UNIX    */
   int64_T large_offset_example = 9000222000LL;
#endif

   int64_T offset   = 0;
   int64_T position = 0;

   /* ... remainder of the MEX function ... */
}
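
The 2^31 - 1 threshold described above can be checked in standard C. The following is a minimal sketch, not part of the MEX API: it uses int64_t from <stdint.h> as a stand-in for int64_T, and the helper names needs_ll_suffix and example_offset are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when a signed value lies outside the
 * 32-bit signed range, which is the case where the text above calls
 * for the LL suffix on UNIX systems. */
static int needs_ll_suffix(int64_t value)
{
    return (value > INT32_MAX || value < INT32_MIN) ? 1 : 0;
}

/* Hypothetical helper: returns the large offset used in the example
 * above. The literal is written with LL because this sketch assumes a
 * UNIX-style compiler. */
static int64_t example_offset(void)
{
    int64_t large_offset_example = 9000222000LL;

    /* 9000222000 exceeds 2^31 - 1, so the suffix is required here. */
    assert(needs_ll_suffix(large_offset_example));
    return large_offset_example;
}
```

A value such as 2147483647 (2^31 - 1) still fits in 32 bits, so needs_ll_suffix reports 0 for it.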

Opening a File

To open a file for reading or writing, use the C/C++ fopen function as you normally would. As long as you have included io64.h at the start of your program, fopen works correctly for large files. No changes at all are required for fread, fwrite, fprintf, fscanf, and fclose.

The following statements open an existing file for reading and updating in binary mode. If the file cannot be opened, they create a new file instead.

fp = fopen(filename, "r+b");
if (NULL == fp)
   {
   /* Could not open the file, most likely because it does not
    * exist. Create a new file for writing in binary mode.
    */
   fp = fopen(filename, "wb");
   if (NULL == fp)
      {
      /* mexErrMsgIdAndTxt accepts printf-style arguments and
       * does not return, so no buffer or return is needed.
       */
      mexErrMsgIdAndTxt("MyToolbox:myfcn:fileCreateError",
        "Failed to open/create test file '%s'", filename);
      }
   else
      {
      mexPrintf("New test file '%s' created\n", filename);
      }
   }
else
   {
   mexPrintf("Existing test file '%s' opened\n", filename);
   }

Printing Formatted Messages

You cannot print 64-bit integers using the plain %d conversion specifier. Instead, place FMT64 between the % and the conversion character to supply the length modifier appropriate for your platform. FMT64 is defined in the header file tmwtypes.h. The following example prints a message showing the size of a large file:

int64_T large_offset_example = 9000222000LL;

mexPrintf("Example large file size: %" FMT64 "d bytes.\n",
           large_offset_example);
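
The FMT64 idiom relies on the compiler joining adjacent string literals into one format string. The following is a minimal sketch of that mechanism in standard C, without the MEX headers. MY_FMT64 and format_size are hypothetical names, and the value "ll" is an assumption that suits typical UNIX-style compilers; the real FMT64 definition in tmwtypes.h may differ by platform.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for FMT64 (assumed value for this sketch). */
#define MY_FMT64 "ll"

/* Formats a 64-bit size into buf. The three adjacent literals are
 * concatenated at compile time into "...%lld bytes.\n".
 * Returns the number of characters snprintf reports. */
static int format_size(char *buf, size_t buflen, int64_t size)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen,
                    "Example large file size: %" MY_FMT64 "d bytes.\n",
                    (long long) size);
}
```

The cast to long long keeps the argument type in step with the assumed "ll" modifier, whatever underlying type the platform chooses for int64_t.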

Replacing fseek and ftell with 64-Bit Functions

The ANSI® C fseek and ftell functions are not 64-bit file I/O capable on most platforms. The functions setFilePos and getFilePos, however, are defined as the corresponding POSIX® fsetpos and fgetpos (or fsetpos64 and fgetpos64) as required by your platform/OS. These functions are 64-bit file I/O capable on all platforms.

The following example shows how to use setFilePos instead of fseek, and getFilePos instead of ftell. The example uses getFileFstat to find the size of the file. It then uses setFilePos to seek to the end of the file to prepare for adding data at the end of the file.

Note

Although the offset parameter to setFilePos and getFilePos is really a pointer to a signed 64-bit integer, int64_T, it must be cast to an fpos_T*. The fpos_T type is defined in io64.h as fpos64_t or fpos_t, whichever your platform and operating system require.

getFileFstat(fileno(fp), &statbuf);
fileSize = statbuf.st_size;
offset = fileSize;

setFilePos(fp, (fpos_T*) &offset);
getFilePos(fp, (fpos_T*) &position );

Unlike fseek, setFilePos supports only absolute seeking relative to the beginning of the file. If you want to do a relative seek, first obtain the base position: call getFilePos for the current position, or getFileFstat for the file size when seeking relative to the end of the file. Then add the relative offset to that base to get an absolute offset that you can pass to setFilePos.
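
The conversion from a relative seek to an absolute offset is plain arithmetic. The following is a minimal sketch in standard C, not part of io64.h. The function name absolute_offset is hypothetical; it borrows the SEEK_SET, SEEK_CUR, and SEEK_END constants from <stdio.h> only to name the base, and it assumes the caller has already obtained the current position and the file size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical helper: converts an fseek-style (whence, relative) pair
 * into an absolute offset from the beginning of the file.
 * Returns -1 if whence is unknown or the result would fall before the
 * start of the file or overflow a signed 64-bit integer. */
static int64_t absolute_offset(int whence, int64_t relative,
                               int64_t current, int64_t file_size)
{
    int64_t base;

    assert(current >= 0 && file_size >= 0);

    switch (whence) {
    case SEEK_SET: base = 0;         break;
    case SEEK_CUR: base = current;   break;
    case SEEK_END: base = file_size; break;
    default:       return -1;
    }

    if (relative < -base) {
        return -1;                    /* before the start of the file */
    }
    if (relative > 0 && relative > INT64_MAX - base) {
        return -1;                    /* would overflow */
    }
    return base + relative;
}
```

In a MEX file, the returned value would be stored in an int64_T variable and passed to setFilePos with the fpos_T* cast shown above.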

Determining the Size of an Open File

To get the size of an open file:

  • Refresh the record of the file size stored in memory using getFilePos and setFilePos.

  • Retrieve the size of the file using getFileFstat.

Refreshing the File Size Record

Before attempting to retrieve the size of an open file, first refresh the record of the file size residing in memory. If you skip this step on a file that is opened for writing, the file size returned might be incorrect or 0.

To refresh the file size record, seek to any offset in the file using setFilePos. If you do not want to change the position of the file pointer, you can seek to the current position in the file. This example obtains the current offset from the start of the file. It then seeks to the current position to update the file size without moving the file pointer.

getFilePos( fp, (fpos_T*) &position);
setFilePos( fp, (fpos_T*) &position);

Getting the File Size

The getFileFstat function takes a file descriptor input argument. Use the fileno function to get the file descriptor from the file pointer of the open file. getFileFstat returns the size of that file in bytes in the st_size field of a structStat structure.

structStat statbuf;
int64_T fileSize = 0;

if (0 == getFileFstat(fileno(fp), &statbuf))
   {
   fileSize = statbuf.st_size;
   mexPrintf("File size is %" FMT64 "d bytes\n", fileSize);
   }
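
For comparison, the same two steps can be sketched with the POSIX functions that the tables above list as equivalents (fgetpos, fsetpos, fileno, fstat). This is an illustration under assumptions, not the io64.h implementation: open_file_size is a hypothetical name, the sketch adds an fflush call because standard C only guarantees that buffered writes reach the file after a flush, and whether st_size is 64 bits wide depends on the platform and compile flags.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fstat() */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: returns the size in bytes of an open file,
 * or -1 on failure. The file position is left unchanged. */
static int64_t open_file_size(FILE *fp)
{
    fpos_t position;
    struct stat statbuf;

    assert(fp != NULL);

    /* Seek to the current position, mirroring the refresh step. */
    if (fgetpos(fp, &position) != 0 || fsetpos(fp, &position) != 0) {
        return -1;
    }
    /* Assumption of this sketch: flush explicitly so that pending
     * writes are counted in the reported size. */
    if (fflush(fp) != 0) {
        return -1;
    }
    if (fstat(fileno(fp), &statbuf) != 0) {
        return -1;
    }
    return (int64_t) statbuf.st_size;
}
```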

Determining the Size of a Closed File

The getFileStat function takes the file name of a closed file as an input argument. getFileStat returns the size of the file in bytes in the st_size field of a structStat structure.

structStat statbuf;
int64_T fileSize = 0;

if (0 == getFileStat(filename, &statbuf))
   {
   fileSize = statbuf.st_size;
   mexPrintf("File size is %" FMT64 "d bytes\n", fileSize);
   }
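
The closed-file case can also be sketched with POSIX stat, which the functions table lists as the equivalent of getFileStat. The names closed_file_size and create_file_of_size are hypothetical, and the sketch assumes a POSIX system; it is not the io64.h implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: writes count zero bytes to a new file and
 * closes it. Returns 0 on success, -1 on failure. */
static int create_file_of_size(const char *filename, size_t count)
{
    FILE *fp;
    size_t i;

    assert(filename != NULL);
    fp = fopen(filename, "wb");
    if (fp == NULL) {
        return -1;
    }
    for (i = 0; i < count; i++) {
        if (fputc(0, fp) == EOF) {
            fclose(fp);
            return -1;
        }
    }
    return (fclose(fp) == 0) ? 0 : -1;
}

/* Hypothetical helper: returns the size in bytes of a closed file
 * identified by name, or -1 if stat fails (for example, because the
 * file does not exist). */
static int64_t closed_file_size(const char *filename)
{
    struct stat statbuf;

    assert(filename != NULL);
    if (stat(filename, &statbuf) != 0) {
        return -1;
    }
    return (int64_t) statbuf.st_size;
}
```

Because the file is closed before stat is called, no refresh step is needed: all buffered data has already been written.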