How to read the PRE html tags and replace some white spaces

I read data from an HTML file, where the data are delimited by the following tags:
<pre>
   12.0  29132  -60.3  -91.4      1   0.01    260         753.2  753.3  753.2
   10.0  30260  -57.9             1   0.01    260     58  802.4  802.5  802.4
    9.8  30387  -57.7  -89.7      1   0.01    261     61  807.8  807.9  807.8
    6.0  33631  -40.4  -77.4      1   0.17    260     88 1004.0 1006.5 1004.1
    5.9  33746  -40.3  -77.3      1   0.17                1009.2 1011.8 1009.3
</pre>
using this code:
t = regexp(html, '<PRE[^>]*>(.*?)</PRE>', 'tokens');
where t is a cell array with one cell per match, each holding the captured text as a char array.
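For reference, here is a minimal sketch of what that call returns, using a small hypothetical html string in place of the real page. With 'tokens', regexp returns one cell per match, and each of those cells is itself a cell array of captured groups, so the text sits two levels deep. The pattern is case-sensitive by default; the 'ignorecase' option is an addition made here, on the assumption that the file may contain lowercase <pre> tags.

```matlab
% Hypothetical stand-in for the downloaded page (not the real file).
html = sprintf('<html><pre>\n   12.0  29132\n    9.8  30387\n</pre></html>');

% 'tokens' returns one cell per match; each of those cells is itself
% a cell array holding the captured groups of that match.
% 'ignorecase' is an assumption added here so that <pre> also matches.
t = regexp(html, '<PRE[^>]*>(.*?)</PRE>', 'tokens', 'ignorecase');

% Unwrap the first (here: only) block into a plain char array.
content = t{1}{1};
```

If the page holds several PRE blocks, t has one entry per block and each can be unwrapped the same way with t{k}{1}.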
Now I am trying to replace the empty fields with NaN to obtain:
12.0 29132 -60.3 -91.4 1 0.01 260 NaN 753.2 753.3 753.2
10.0 30260 -57.9 NaN 1 0.01 260 58 802.4 802.5 802.4
9.8 30387 -57.7 -89.7 1 0.01 261 61 807.8 807.9 807.8
6.0 33631 -40.4 -77.4 1 0.17 260 88 1004.0 1006.5 1004.1
5.9 33746 -40.3 -77.3 1 0.17 NaN NaN 1009.2 1011.8 1009.3
In this data set the columns are not always separated by the same number of spaces, and I do not know the length of the white-space runs.
For example, in the last line of my first data set there are two "empty places" that I would like to replace with 'NaN'. The position of the elements must not change (I think the textscan function is dangerous here).
Do you have any suggestions? Maybe I should read the PRE tags another way?
Thank you

Accepted Answer

Cedric 2014-6-20
Edited: Cedric 2014-6-21
I've got to run, but here is one way (I'll come back later to discuss further if needed).
EDIT: the first solution did not work; I will update it as soon as I have more information.
6 Comments
Cedric 2014-6-21
Edited: Cedric 2014-6-21
OK, this is a table with a fixed column width of 7 characters, so you can process it as follows:
regexprep( content, ' {7}', ' NaN' )
where content is the token output by your first call to REGEXP. If you have more than 7 white spaces at the beginning of each line, e.g. because of HTML indentation, we can refine the pattern to exclude them. Just let me know.
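To make the idea concrete, here is a minimal sketch on a hypothetical two-row sample, assuming right-aligned fields of 7 characters. An empty field is then a run of seven spaces, while the padding in front of a real value is always shorter. Padding the replacement to 7 characters ('    NaN') is an assumption made here to keep the column positions unchanged, and parsing with sscanf is one option for turning the result into a matrix, not part of the original answer.

```matlab
% Hypothetical sample: two rows of 7-character, right-aligned fields.
% The blank field is built explicitly so that its width is visible.
blank = repmat(' ', 1, 7);
row1 = [sprintf('%7.1f%7d%7.1f', 12.0, 29132, -60.3), blank, ...
        sprintf('%7.1f', 753.2)];
row2 = [sprintf('%7.1f%7d', 5.9, 33746), blank, blank, ...
        sprintf('%7.1f', 1009.2)];
content = sprintf('%s\n%s\n', row1, row2);

% Replace each run of seven spaces with NaN, padded to 7 characters
% (the padding is an assumption, chosen to preserve the column layout).
filled = regexprep(content, ' {7}', '    NaN');

% Read the whitespace-separated numbers; sscanf parses 'NaN' as NaN.
% sscanf fills column by column, hence the transpose at the end.
nCols = 5;
data = sscanf(filled, '%f', [nCols, Inf]).';
```

As noted above, this relies on no line starting with seven or more spaces; leading indentation of that width would be read as an empty field.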
Stefano 2014-6-23
Ok, thank you! Good job! It's a perfect solution for my data. Answer accepted :)
