readPDFFormData
Read data from PDF forms
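Description
data = readPDFFormData(filename) reads the data from the form fields of the PDF file filename and returns it as a struct.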
Examples
Read Data from PDF Form
Read the data from the form fields in weatherReportForm1.pdf using readPDFFormData. The function returns a struct containing the data from the PDF form fields.
filename = "weatherReportForm1.pdf";
data = readPDFFormData(filename)
data = struct with fields:
event_type: "Thunderstorm Wind"
event_narrative: "Large tree down between Plantersville and Nettleton."
Read Data From Multiple Forms
Read the data from the form fields in multiple files using a file datastore.
Create a file datastore for the weather report forms. The forms are named "weatherReportFormN.pdf", where N is the number of the form. Specify the file name using the wildcard "*" to find all file names that match this pattern. To use readPDFFormData as the read function, pass it to fileDatastore as a function handle.
fds = fileDatastore("weatherReportForm*.pdf",'ReadFcn',@readPDFFormData)
fds = FileDatastore with properties:
                       Files: {
                              ' .../tpe3bf832a/textanalytics-ex39762425/weatherReportForm1.pdf';
                              ' .../tpe3bf832a/textanalytics-ex39762425/weatherReportForm2.pdf';
                              ' .../tpe3bf832a/textanalytics-ex39762425/weatherReportForm3.pdf'
                               ... and 1 more
                              }
                     Folders: {
                              '/tmp/Bdoc24b_2725827_1114975/tpe3bf832a/textanalytics-ex39762425'
                              }
                 UniformRead: 0
                    ReadMode: 'file'
                   BlockSize: Inf
                  PreviewFcn: @readPDFFormData
      SupportedOutputFormats: ["txt" "csv" "dat" "asc" "xlsx" "xls" "parquet" "parq" "png" "jpg" "jpeg" "tif" "tiff" "wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
                     ReadFcn: @readPDFFormData
    AlternateFileSystemRoots: {}
Loop over the files in the datastore and read each PDF form.
data = [];
while hasdata(fds)
    textData = read(fds);
    data = [data; textData];
end
data
data=4×1 struct array with fields:
event_type
event_narrative
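A struct array like the one returned by the loop is often easier to inspect as a table. The following is a minimal sketch, not part of the original example: it builds a stand-in struct array with the same two fields so that it runs without the PDF files. The first element reuses the values shown for weatherReportForm1.pdf, and the second element is a hypothetical placeholder.

```matlab
% Stand-in for the struct array produced by the datastore loop.
% The second element is a hypothetical placeholder, not real form data.
data = struct( ...
    'event_type',      {"Thunderstorm Wind"; "Placeholder event type"}, ...
    'event_narrative', {"Large tree down between Plantersville and Nettleton."; ...
                        "Placeholder narrative"});

% Convert the struct array to a table with one row per form
% and one variable per form field.
T = struct2table(data);

% Show the variable names and the number of rows.
disp(T.Properties.VariableNames)
disp(height(T))
```

With the real datastore output, the call struct2table(data) would be applied directly to the data variable from the loop.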
Input Arguments
filename
— Name of file
string scalar | character vector
Name of the file, specified as a string scalar or character vector. readPDFFormData supports AcroForm PDF files (interactive forms) only.
Data Types: string | char
password
— Password to open PDF file
string scalar | character vector
Password to open the PDF file, specified as a character vector or a string scalar.
Example: "skroWhtaM"
Data Types: string | char
Output Arguments
data
— Output struct
struct
Output struct. The fields of data correspond to the names of the form fields in the PDF. If the form field names are not valid struct field names, then the function automatically edits them to construct valid names.
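To get a feel for what a valid struct field name looks like, the sketch below runs some hypothetical form field names through matlab.lang.makeValidName. This is only an illustration of name validity: the page does not say how readPDFFormData edits names internally, so the names it produces may differ from the ones shown by this code.

```matlab
% Hypothetical form field names, not taken from any real PDF form.
rawNames = ["event type", "1st-reading", "event_narrative"];

% Construct valid MATLAB identifiers from the raw names.
% Assumption: used here for illustration only; readPDFFormData
% may apply different rules.
validNames = matlab.lang.makeValidName(rawNames);

% Confirm that every resulting name is a valid identifier.
allValid = all(arrayfun(@isvarname, validNames));

% Display the original and adjusted names side by side.
disp([rawNames; validNames])
disp(allValid)
```

A name that is already valid, such as event_narrative, passes through unchanged.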
Version History
Introduced in R2018a