Problem with using fopen
The goal is not just to get the words from a PDF, as extractFileText(filename) does, but also the position of each sentence. My approach is to read the PDF and then FlateDecode it to obtain this information. After decoding, the information can look like this:
I found a Python script* that works and I want to translate it into MATLAB. Here is where the problem comes in:
Python:
pdf = open("TestCOA.pdf","rb").read()  # Python reads the file perfectly
MATLAB:
fileID = fopen("TestCOA.pdf",'rb','n','us-ascii');
A = fscanf(fileID,'%c')  % reads some characters, but mixed with invalid characters <?>
pdf=py.open("TestCOA.pdf","rb").read()  % same result with the Python integration syntax
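One possible explanation for the invalid characters, offered as an assumption rather than a confirmed diagnosis: a PDF is mostly binary, and FlateDecode streams contain many bytes above 0x7F, which have no meaning in US-ASCII. The Python sketch below uses a synthetic byte string (not TestCOA.pdf) to show how decoding such bytes as ASCII text substitutes a replacement character for each of them.

```python
import zlib

# Synthetic stand-in for a PDF's binary content (NOT the poster's file):
# an ASCII header followed by zlib-compressed bytes.
raw = b"%PDF-1.4\n" + zlib.compress(b"BT 72 720 Td (Hello) Tj ET")

# Count the bytes that fall outside the 7-bit ASCII range.
n_high = sum(b >= 0x80 for b in raw)

# Interpreting the bytes as US-ASCII text replaces each of those bytes
# with U+FFFD, so the compressed data can no longer be recovered.
as_text = raw.decode("ascii", errors="replace")
n_replaced = as_text.count("\ufffd")

print(n_high, n_replaced)
```

If this is the cause, the analogous step in MATLAB would be to read the file as raw bytes, for example with fread and the '*uint8' precision, instead of fscanf with '%c'.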
I have uploaded an example PDF to try it out. I hope someone can help me figure this out. :)
*The full Python script: https://gist.github.com/averagesecurityguy/ba8d9ed3c59c1deffbd1390dafa5a3c2
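For readers who want to see the decoding step in isolation, here is a minimal Python sketch of the approach described above: find each stream ... endstream section in the raw bytes and inflate it with zlib. It is not the linked script itself. The function name inflate_streams and the sample bytes are hypothetical, and the sketch assumes every stream is either plain FlateDecode data or can be skipped.

```python
import re
import zlib

# Matches the bytes between a "stream" keyword and the next "endstream".
# The lookbehind keeps "endstream" from being mistaken for a "stream" keyword.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)

def inflate_streams(pdf_bytes):
    """Return the inflated contents of every stream that decompresses cleanly."""
    decoded = []
    for match in STREAM_RE.finditer(pdf_bytes):
        inflater = zlib.decompressobj()
        try:
            # A decompress object ignores the end-of-line bytes that
            # precede "endstream" (they end up in inflater.unused_data).
            data = inflater.decompress(match.group(1))
        except zlib.error:
            continue  # not zlib data (e.g. an image or an unfiltered stream)
        if inflater.eof:  # keep only streams that inflated completely
            decoded.append(data)
    return decoded

# Hypothetical sample object, built here so the sketch runs without a file.
payload = b"BT 72 720 Td (Hello) Tj ET"
sample = (b"1 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
          + zlib.compress(payload)
          + b"\nendstream\nendobj\n")

decoded = inflate_streams(sample)
print(decoded[0])
```

In a decoded content stream, text placement is set by operators such as Td and Tm, which is where the position information comes from. For a real file, the input would be the bytes returned by open(filename, "rb").read().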