Importing a PDF Bank Statement into MATLAB and splitting transactions correctly
Hi, my bank statement comes as a PDF, but I cannot for the life of me import it into Excel correctly in order to do some analysis. I decided to try MATLAB and see if it imports the data as intended.
The PDF starts with a bunch of text that isn't useful, such as name and address etc. I want to start the import at the point in the document where the transactions begin.
The columns are [Trans Date], [Post Date], [Description], [Amount].
Each line below that header holds the values for those four columns.
The columns are only split by a space, but a data entry may also have spaces inside it, so spaces cannot be used to determine where one cell finishes and the next one starts.
Any ideas?
Bonus if I can export the final result as an Excel file. If not, I can keep it in MATLAB.
Answers (3)
Star Strider
2025-1-8
The Text Analytics Toolbox has several functions that can be used to extract data from PDF files. See the documentation section on Extract Text Data from PDF for details. (I don’t have that toolbox. I just wanted to see if that was a possibility.)
Jacob Mathew
2025-1-8
Hey AluminiumMan,
You could try using the extractFileText function to extract the text from the PDF as a string, then apply regex pattern matching to pull out the data. However, do note that the text within a PDF need not be structured or stored in any particular order. Hence, it is much more common to use OCR techniques to extract data from a PDF than to read its text directly. You can read more on this in the MATLAB Answer below:
You can refer to the extractFileText documentation below to learn more about reading PDFs:
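As a rough sketch of the extract-then-regex idea, the snippet below parses a few made-up statement lines. Everything about the layout is an assumption for illustration: the header text "Trans Date", the MM/DD date format, the amount format with two decimals, and the file names are all hypothetical and would need adjusting to the real statement. The point is that the two dates at the start and the amount at the end have a fixed shape, so the pattern can anchor on them and treat whatever sits in between as the description, spaces included.

```matlab
% Sample text standing in for the PDF contents. With the Text Analytics
% Toolbox this would instead come from something like:
%   txt = extractFileText("statement.pdf");   % hypothetical file name
txt = join([ ...
    "John Doe"
    "123 Example Street"
    "Trans Date Post Date Description Amount"
    "01/02 01/03 COFFEE SHOP 123 MAIN ST 4.50"
    "01/05 01/06 ONLINE PAYMENT THANK YOU -120.00"
    "01/07 01/08 GROCERY STORE #42 1,234.56"], newline);

% Split into lines and discard everything up to and including the header row.
rowText = strtrim(splitlines(txt));
headerIdx = find(startsWith(rowText, "Trans Date"), 1);   % assumed header text
rowText = cellstr(rowText(headerIdx+1:end));

% Two dates at the start, an amount at the end, description in between.
expr = '^(\d{2}/\d{2})\s+(\d{2}/\d{2})\s+(.+?)\s+(-?\$?[\d,]+\.\d{2})$';
tok = regexp(rowText, expr, 'tokens', 'once');

% Drop lines that did not match (page footers, wrapped text, and so on).
tok = tok(~cellfun(@isempty, tok));
tok = vertcat(tok{:});                        % N-by-4 cell array of char

% Build a table; strip thousands separators and currency signs from amounts.
T = table(string(tok(:,1)), string(tok(:,2)), string(tok(:,3)), ...
    str2double(erase(tok(:,4), [",", "$"])), ...
    'VariableNames', {'TransDate', 'PostDate', 'Description', 'Amount'});

% Export to Excel (hypothetical output file name).
writetable(T, "transactions.xlsx");
```

Lines that do not fit the pattern are silently dropped here, so descriptions that wrap onto a second line would need extra handling, and it is worth comparing the row count against the statement.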
Griffin
2025-2-27
Steps to Convert & Customize Bank Statement to Excel
✔ Convert to Excel:
- Open the PDF/CSV bank statement in Excel.
- If it's a PDF, use an online converter or import via Excel → Data → Get Data → From PDF.
✔ Format the Data:
- Adjust columns, remove blank rows, and set proper headings (Date, Description, Amount, Balance).
- Use bold text for headers and important values.
✔ Customize & Annotate:
- Add comments or highlight specific transactions.
- Use conditional formatting to mark debits/credits.
- Apply filters or pivot tables for better analysis.
✔ Save & Share:
- Save as .xlsx format.
- Lock important columns or add a signature note if needed.