XMLwrite involuntary wrapping of long tags - not behaving as it should (bug) - upgrade from 2018a

Good afternoon,
today I tried to migrate to the new 2019a version.
All my current code ran pretty well, as I had hoped, but there is one big issue for me. I write some of my results to XML, and the way XML is handled by MATLAB has changed in a way that is undesirable for me: it concerns line breaks. I made a demo script to reproduce the bug so it is easier to understand what I mean.
Problem: After a certain number of attributes have been written on the same tag, the writer starts to wrap them. From that first wrap onward, it puts every remaining attribute on its own line.
The resulting output is most likely according to XML specs (it can be understood perfectly), but for a human it looks hideous.
After trying to change the behaviour and playing with some Java settings, I gave up and reverted to 2018a.
Imagine a 50-500 tag xml with this behaviour... ¯\_(ツ)_/¯
Does anybody understand what is going on here, why it has changed and if there is a way to mitigate this in the new release?
Thank you for reading
Cheers, Daniel
Script to reproduce the bug:
%% Create Document and Elements
docNode = com.mathworks.xml.XMLUtils.createDocument('Demo');
root = docNode.getDocumentElement; % Identify root element
Content = docNode.createElement('Content');
Analysis = docNode.createElement('Analysis');
Output = docNode.createElement('LongNamedElementWrapsSoonerYoullSee');
% Define the attributes
for i = 1:15
    Analysis.setAttribute(strcat('t',num2str(i)),'test');
end
for i = 1:3
    Output.setAttribute(strcat('t',num2str(i)),'AtLeastItWillGetTheTabsRight');
end
% Append everything
root.appendChild(Content);
Content.appendChild(Analysis);
Content.appendChild(Output);
% Write
xmlwrite('Demo.xml',docNode);
2018a output
<?xml version="1.0" encoding="utf-8"?>
<Demo>
<Content>
<Analysis t1="test" t10="test" t11="test" t12="test" t13="test" t14="test" t15="test" t2="test" t3="test" t4="test" t5="test" t6="test" t7="test" t8="test" t9="test"/>
<LongNamedElementWrapsSoonerYoullSee t1="AtLeastItWillGetTheTabsRight" t2="AtLeastItWillGetTheTabsRight" t3="AtLeastItWillGetTheTabsRight"/>
</Content>
</Demo>
2019a output
<?xml version="1.0" encoding="utf-8"?>
<Demo>
<Content>
<Analysis t1="test" t10="test" t11="test" t12="test" t13="test" t14="test" t15="test"
t2="test"
t3="test"
t4="test"
t5="test"
t6="test"
t7="test"
t8="test"
t9="test"/>
<LongNamedElementWrapsSoonerYoullSee t1="AtLeastItWillGetTheTabsRight" t2="AtLeastItWillGetTheTabsRight"
t3="AtLeastItWillGetTheTabsRight"/>
</Content>
</Demo>
2 Comments
Guillaume 2019-5-21
I'm not sure you can call this a bug. As you've pointed out, from an xml point of view, the two outputs are identical. And as the actual formatting of the xml is not documented at all, you shouldn't have any expectation of that formatting. Personally, I would have expected the xml to be just one long line.
You can always run the file through a beautifier. But this also begs the question: why is a human looking at the raw xml?
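To illustrate the point that the two layouts are identical from an XML point of view, here is a minimal sketch that parses a wrapped and an unwrapped version of the same tag and compares their attributes. The two XML snippets and the variable names (variants, parsed, sameContent) are made up for this illustration; they are not taken from the demo script.

```matlab
% Two serializations of the same element: one wrapped, one on a single line.
% Both snippets are hypothetical stand-ins for the 2018a and 2019a outputs.
variants = { ...
    sprintf('<Demo><Analysis t1="test"\n    t2="test"\n    t3="test"/></Demo>'), ...
    '<Demo><Analysis t1="test" t2="test" t3="test"/></Demo>'};

parsed = cell(size(variants));
for v = 1:numel(variants)
    % xmlread expects a file name, so write the text to a temporary file first.
    tmpFile = [tempname '.xml'];
    fid = fopen(tmpFile, 'w');
    fprintf(fid, '%s', variants{v});
    fclose(fid);
    doc = xmlread(tmpFile);
    delete(tmpFile);

    % Collect the attributes of <Analysis> as sorted 'name=value' entries.
    element = doc.getElementsByTagName('Analysis').item(0);
    attrs = element.getAttributes;
    pairs = cell(attrs.getLength, 1);
    for k = 1:attrs.getLength
        a = attrs.item(k - 1);   % Java DOM indices are zero-based
        pairs{k} = [char(a.getName) '=' char(a.getValue)];
    end
    parsed{v} = sort(pairs);
end

% sameContent is true when both layouts carry exactly the same attributes.
sameContent = isequal(parsed{1}, parsed{2});
disp(sameContent)
```

Any program that reads the file through an XML parser sees the same data either way; the wrapping only matters for people or tools that read the raw text.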
DanielS 2019-5-21
Thank you for your response, Guillaume!
Mathworks giveth and Mathworks taketh away? The fact of the matter is that the output of xmlwrite is not one-line, and it is being written out in some standardized way set up by the maintainer which is beyond the specification of XML - and I have relied myself upon this, no wrong in that.
The result from the 2019a release does not look proper. There is some sort of linewrap method that does not play out as intended. If it was doing the job correctly, it would not push a new line for every attribute after the first wrap. Surely, you must agree?
FYI: At our job we do use raw XML in several projects, because we use it as a plain-text vessel between different programs (don't ask why not JSON for this ;)).
//D


Accepted Answer

Josh Kahn 2024-10-10
Here are the changes that should achieve your expected behavior.
Hope this helps,
Josh
%% Create Document and Elements
docNode = matlab.io.xml.dom.Document('Demo'); %% CHANGED
root = docNode.getDocumentElement; % Identify root element
Content = docNode.createElement('Content');
Analysis = docNode.createElement('Analysis');
Output = docNode.createElement('LongNamedElementWrapsSoonerYoullSee');
% Define the attributes
for i = 1:15
    Analysis.setAttribute(strcat('t',num2str(i)),'test');
end
for i = 1:3
    Output.setAttribute(strcat('t',num2str(i)),'AtLeastItWillGetTheTabsRight');
end
% Append everything
root.appendChild(Content);
Content.appendChild(Analysis);
Content.appendChild(Output);
% Write
writer = matlab.io.xml.dom.DOMWriter; %% CHANGED
writer.Configuration.FormatPrettyPrint = true; %% CHANGED
writeToFile(writer,docNode,'Demo.xml'); %% CHANGED
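If you want to look at the formatting before anything is written to disk, the same writer can serialize to text. This is only a sketch under two assumptions: that your release ships the matlab.io.xml.dom package (as far as I know it is available from R2021a onward, so not in the 2019a release the question was about), and that a reduced document with a single Analysis element is enough to judge the layout.

```matlab
% Build a reduced version of the demo document with matlab.io.xml.dom.
docNode = matlab.io.xml.dom.Document('Demo');
root = docNode.getDocumentElement;
Analysis = docNode.createElement('Analysis');
for i = 1:15
    Analysis.setAttribute(sprintf('t%d', i), 'test');
end
root.appendChild(Analysis);

% Same writer settings as in the answer, but serialize to text instead of a file.
writer = matlab.io.xml.dom.DOMWriter;
writer.Configuration.FormatPrettyPrint = true;
xmlText = string(writeToString(writer, docNode));

% Show the result and how many lines it spans, to see whether attributes wrap.
disp(xmlText)
fprintf('Output spans %d line(s).\n', numel(splitlines(strip(xmlText))));
```

Once the layout looks right, switch back to writeToFile as in the script above.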

More Answers (2)

Guillaume 2019-5-21
"and I have relied myself upon this, no wrong in that"
Well, you're relying on undocumented behaviour. It's always risky and you can't really complain when it changes. Now if it were documented, then yes, you could complain loudly.
In any case, it looks like the behaviour is outside MathWorks' control. You can look at the source code of xmlwrite and you'll see that it relies on Java for everything. The only thing that is MATLAB-dependent is an XSLT file (in matlabroot\toolbox\matlab\iofun\+matlab\+io\+internal), which I don't believe has any effect on what you're seeing. So it looks to me that if a change has occurred, it's in Java.
I've no longer got 2018a installed. 2018b and 2019a produce the same output.

DanielS 2019-5-22
Thank you for checking, Guillaume.
Guillaume states that this behaviour is visible from 2018b and onwards.
It is suspiciously convenient that MathWorks introduced tidy (mlreportgen.utils.tidy, see the MathWorks documentation) in the 2018b release, just when xmlwrite started acting up.
Solution
Matlab command
% XML attribute wrap fix
% Problem and cure introduced in version > 2018a
if ~verLessThan('matlab','9.5')
    mlreportgen.utils.tidy("Demo.xml","OutputFile","Demo.xml","ConfigFile","tidy.cfg");
end
tidy.cfg:
Config options are documented on the HTML Tidy project page:
// Config for tidy, Matlab post 2018a fix for XMLwrite.
indent: 1
indent-spaces: 4
wrap: 0
wrap-attributes: 0
output-xml: yes
input-xml: yes

Release

R2019a
