Your file is an XML file, so it should be parsed with an XML parser. Parsing it as text or HTML will always be fragile. Thankfully, MATLAB already ships with an XML parser: see xmlread.
xmlread returns a Java DOM node object (an org.w3c.dom.Document) that allows you to navigate the XML structure in different ways. While it's very powerful, it's also not particularly intuitive and can be quite daunting, so you may want to use the xml2struct File Exchange entry instead (or in addition). This will give you more or less the structure you desire.
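To give an idea of what navigating the DOM looks like, here is a minimal sketch. The file contents, the element name item and the attribute name id are made up for illustration (I don't know your actual schema), so adapt them to your file. It only uses xmlread and standard DOM methods on the returned Java object.

```matlab
% Write a small, hypothetical XML file so the example is self-contained.
xmlFile = [tempname '.xml'];
fid = fopen(xmlFile, 'w');
fprintf(fid, '<root>\n  <item id="1">alpha</item>\n  <item id="2">beta</item>\n</root>\n');
fclose(fid);

doc = xmlread(xmlFile);                     % Java org.w3c.dom.Document
items = doc.getElementsByTagName('item');   % Java NodeList, indexed from 0
n = items.getLength();

ids = cell(1, n);
texts = cell(1, n);
for k = 0:n-1
    node = items.item(k);
    % DOM methods return java.lang.String, so convert to MATLAB char.
    ids{k+1} = char(node.getAttribute('id'));
    texts{k+1} = char(node.getTextContent());
end

delete(xmlFile);                            % clean up the temporary file
```

Note the zero-based indexing of the NodeList and the char() conversions; these are the kind of Java-isms that make the DOM interface less comfortable than a plain MATLAB struct.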
As for your reformatting, I don't particularly see the point. Both versions are exactly the same XML, and any decent code that parses XML will ignore the whitespace between elements anyway. If it's for human consumption, then a) XML is not designed for human consumption, and b) there are plenty of XML beautifiers you can download.
