How to download PDF files from a website?

Yara 2022-12-7
Commented: Yara 2022-12-17
I need to download all PDF files from a specific URL (I do not have a list of the file names).
I just need to download every file whose name ends with .pdf.
I've tried:
url = 'https://... '; %assume it is a real url
urlwrite(url,'*.pdf');
but it is not working.

Answers (1)

C B 2022-12-7
system('wget -r -A.pdf https://smallpdf.com/blog/sample-pdf')
--2022-12-07 15:30:38-- https://smallpdf.com/blog/sample-pdf
Resolving smallpdf.com (smallpdf.com)... 99.86.127.71
Connecting to smallpdf.com (smallpdf.com)|99.86.127.71|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 450993 (440K) [text/html]
Saving to: ‘smallpdf.com/blog/sample-pdf.tmp’
smallpdf.com/blog/sample-pdf.tmp 0%[ ] 0 --.-KB/s
smallpdf.com/blog/sample-pdf.tmp 100%[============================================================================================================>] 440.42K --.-KB/s in 0.005s
2022-12-07 15:30:38 (89.8 MB/s) - ‘smallpdf.com/blog/sample-pdf.tmp’ saved [450993/450993]
Loading robots.txt; please ignore errors.
--2022-12-07 15:30:38-- https://smallpdf.com/robots.txt
Reusing existing connection to smallpdf.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 57 [text/plain]
Saving to: ‘smallpdf.com/robots.txt.tmp’
smallpdf.com/robots.txt.tmp 0%[ ] 0 --.-KB/s
smallpdf.com/robots.txt.tmp 100%[============================================================================================================>] 57 --.-KB/s in 0s
2022-12-07 15:30:38 (16.3 MB/s) - ‘smallpdf.com/robots.txt.tmp’ saved [57/57]
Removing smallpdf.com/blog/sample-pdf.tmp since it should be rejected.
--2022-12-07 15:30:38-- https://smallpdf.com/
Reusing existing connection to smallpdf.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 445828 (435K) [text/html]
Saving to: ‘smallpdf.com/index.html.tmp’
smallpdf.com/index.html.tmp 0%[ ] 0 --.-KB/s
smallpdf.com/index.html.tmp 100%[============================================================================================================>] 435.38K --.-KB/s in 0.005s
2022-12-07 15:30:38 (82.2 MB/s) - ‘smallpdf.com/index.html.tmp’ saved [445828/445828]
Removing smallpdf.com/index.html.tmp since it should be rejected.
FINISHED --2022-12-07 15:30:38--
Total wall clock time: 0.6s
Downloaded: 3 files, 876K in 0.01s (85.8 MB/s)
ans = 0
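The wget call above depends on an external tool being installed. As a rough MATLAB-only alternative, the sketch below reads the page source, extracts every link that ends in .pdf, and saves each one with websave. This also hints at why the original attempt fails: urlwrite writes one URL to one file name and does not expand a wildcard such as '*.pdf'. The names pageUrl, outDir, doDownload and extractPdfLinks are hypothetical and introduced only for illustration, and the example.com address is a placeholder, not the URL from the question. The handling of relative links is deliberately simplistic, and pages that build their links with JavaScript would not be covered.

```matlab
% Sketch: collect .pdf links from a page's HTML, then download each one.
% pageUrl and outDir are hypothetical placeholders, not values from the question.
pageUrl    = 'https://example.com/docs/';  % assumed: page that links to the PDFs
outDir     = 'pdf_downloads';              % assumed: local target folder
doDownload = false;                        % set to true to actually fetch files

% Match href="...pdf" or href='...pdf' (case-insensitive); capture the link target.
pat = 'href\s*=\s*["'']([^"'']+\.pdf)["'']';
extractPdfLinks = @(html) cellfun(@(c) c{1}, ...
    regexp(html, pat, 'tokens', 'ignorecase'), 'UniformOutput', false);

% Demonstrate the link extraction on a small inline HTML snippet.
sampleHtml  = '<a href="a.pdf">A</a> <a href="/files/b.PDF">B</a> <a href="c.html">C</a>';
sampleLinks = extractPdfLinks(sampleHtml);
disp(sampleLinks)

if doDownload
    html  = webread(pageUrl);              % page source as text
    links = unique(extractPdfLinks(html)); % drop repeated links
    if ~exist(outDir, 'dir'), mkdir(outDir); end
    % Scheme and host, used to resolve root-relative links such as /files/b.pdf
    origin = regexp(pageUrl, '^https?://[^/]+', 'match', 'once');
    for k = 1:numel(links)
        link = links{k};
        if startsWith(link, 'http')
            fileUrl = link;                % already absolute
        elseif startsWith(link, '/')
            fileUrl = [origin link];       % root-relative
        else
            fileUrl = [pageUrl link];      % assumes pageUrl ends with '/'
        end
        [~, name, ext] = fileparts(link);
        websave(fullfile(outDir, [name ext]), fileUrl);
    end
end
```

With doDownload left at false, the script only runs the extraction demo on the inline snippet; setting it to true and pointing pageUrl at a real page would attempt the downloads into outDir.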
  3 comments
C B 2022-12-11
Sorry for the late reply. Are you using Windows, Linux, or Mac?
