Issue with native2unicode and windows-1252 encoding
Hi all,
I'm trying to encode some bytes into a character set using the windows-1252 encoding and I've checked that native2unicode
Answers (3)
Walter Roberson
2022-1-14
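source = char(0:511)    % characters with code points 0 to 511
bytes = unicode2native(source, 'windows-1252')    % encode them as windows-1252 bytes
backport = char(bytes)    % reinterpret each byte value as a code point
whichdiffer = find(source(1:256) ~= backport(1:256) )    % positions that did not survive
source(whichdiffer)
bytes(whichdiffer)
backport(whichdiffer)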
What this tells us is that Unicode code points 129 to 141 are not represented in Windows-1252.
bytes2 = uint8(129:141)
encodes_as = native2unicode(bytes2, 'windows-1252')
double(encodes_as)
Looks about right.
2 comments
Walter Roberson
2022-1-17
Code point 26 is the standard value substituted for code points that cannot be represented.
https://en.m.wikipedia.org/wiki/Substitute_character
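As a rough sketch of how that substitution could be detected, the snippet below flags characters whose encoded byte is 26 even though the character itself was not code point 26. The variable names substituted and lostCodePoints are made up for illustration, and the check assumes unicode2native really does use byte 26 as the replacement, as described above.

```matlab
% Sketch only: find characters that were replaced by the substitute
% character (byte 26) when encoding to windows-1252.
source = char(0:255);
bytes = unicode2native(source, 'windows-1252');

% A byte of 26 is a genuine SUB only where the source was char(26);
% everywhere else it marks a character that could not be represented.
substituted = (bytes == 26) & (double(source) ~= 26);

% Code points that were lost in the conversion (hypothetical name).
lostCodePoints = double(source(substituted))
```

Note that lostCodePoints holds code points, not indices into source; the index of a character in source is its code point plus one.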
Borja Heriz
2022-1-17
1 comment
Rik
2022-1-17
This is an answer, but it looks like a comment. Please use the comment sections to post comments. The order of answers can change, which will make reading back confusing.
Please post this as a comment and delete the answer.
When you do, I (or Walter) will post something along these lines:
Why do you think 153 and 156 are encoded as the same character? They are displayed as the same character, but that is probably due to a limitation in the display, as this could very well encode a control character without a proper symbol.
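One way to check this is to compare the numeric code points instead of the displayed glyphs. The sketch below assumes that 153 and 156 are byte values to be decoded with windows-1252, which is only a guess at what the deleted answer meant; the names decoded, codePoints and sameCharacter are illustrative.

```matlab
% Sketch only: decode two bytes and compare their code points,
% assuming 153 and 156 are windows-1252 byte values.
bytes = uint8([153 156]);
decoded = native2unicode(bytes, 'windows-1252');

% double() exposes the underlying code points, independent of how
% the Command Window or font renders the characters.
codePoints = double(decoded)

% True only if both bytes decoded to the same character.
sameCharacter = codePoints(1) == codePoints(2)
```

If sameCharacter is false, the two bytes decode to different characters and any visual similarity comes from the display.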