Reading content from a web URL
I know how to read URLs and save their content for further analysis.
The issue I am facing is that I want to read specific content from a URL in a specific way.
For example, from this URL, https://www.gem.wiki/Almaty-2_power_station, I would like to read Table 2 in table format, or read only the tables that contain specific words.
From searching online, I found that tables can be read directly from URLs, but I am not sure whether the table I want to read is an actual table or just text content.
Any help would be great.
Accepted Answer
Rahul
2024-8-30
I understand that you are trying to read the content of 'Table 2' from the URL https://www.gem.wiki/Almaty-2_power_station.
You can achieve this with the following code, which parses the page with htmlTree (Text Analytics Toolbox):
url = 'https://www.gem.wiki/Almaty-2_power_station';
htmlContent = webread(url);            % Read the page content from the URL
tree = htmlTree(htmlContent);          % Parse the HTML into a DOM tree
tables = findElement(tree, "table");   % Find all <table> elements in the DOM tree
% 'Table 2' is at index 4 here because other elements of the HTML page
% are also matched as tables.
secondTableElement = tables(4);
% Find all rows in the second table
rows = findElement(secondTableElement, "tr");
% Extract the header text from the first row as character vectors
headerCells = findElement(rows(1), "th");
columnNames = cell(1, numel(headerCells));
for j = 1:numel(headerCells)
    columnNames{j} = char(strtrim(extractHTMLText(headerCells(j))));
end
% Extract the data rows into a cell array
tableData = {};
for i = 2:numel(rows)
    cells = findElement(rows(i), "td");
    if isempty(cells)
        continue   % Skip rows that contain no data cells
    end
    % Extract text from each cell
    rowData = cell(1, numel(cells));
    for j = 1:numel(cells)
        rowData{j} = strtrim(extractHTMLText(cells(j)));
    end
    tableData = [tableData; rowData]; %#ok<AGROW>
end
% Build the table using the 'cell2table' function
secondTable = cell2table(tableData, 'VariableNames', columnNames);
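The question also asks about picking tables that contain specific words rather than relying on a fixed index. A minimal sketch of one way to do that, using the same htmlTree functions as above, is to scan each table's header cells for a keyword. The inline HTML and the keyword "Status" are made-up stand-ins so the sketch runs without a network connection; with the real page you would pass the output of webread to htmlTree instead. It also assumes the tables of interest use th header cells, which may not hold for every page.

```matlab
% Hypothetical inline HTML standing in for the downloaded page.
html = "<html><body>" + ...
    "<table><tr><th>Menu</th></tr><tr><td>Home</td></tr></table>" + ...
    "<table><tr><th>Unit name</th><th>Status</th></tr>" + ...
    "<tr><td>Unit 1</td><td>Operating</td></tr></table>" + ...
    "</body></html>";

tree = htmlTree(html);
tables = findElement(tree, "table");

keyword = "Status";   % hypothetical search term
matchIdx = [];        % indices of tables whose header mentions the keyword

for k = 1:numel(tables)
    headerCells = findElement(tables(k), "th");
    for j = 1:numel(headerCells)
        headerText = extractHTMLText(headerCells(j));
        if contains(headerText, keyword, 'IgnoreCase', true)
            matchIdx(end+1) = k; %#ok<AGROW>
            break   % One matching header cell is enough for this table
        end
    end
end

disp(matchIdx)
```

Any index in matchIdx can then replace the hard-coded tables(4) in the code above. If findElement returns no table elements at all, the content on the page is probably plain text rather than an actual table.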
For reference, see the following documentation:
'cell2table': https://www.mathworks.com/help/releases/R2024a/matlab/ref/cell2table.html?searchHighlight=cell2table&s_tid=doc_srchtitle
Hope this helps! Thanks.