Decoding UTF-8 emoji codes and special characters from Facebook data
Hi, I recently downloaded my Messenger data from Facebook in JSON (".json") format.
This format was new to me, and it was quite interesting to load the file, play around with it, and lay it out as a conversation.
The problem is decoding the emojis. I have no idea what the format is. It looks something like this:
"\u00f0\u009f\u0098\u0082 \u00f0\u009f\u0098\u0082", where the emoji I actually typed was 😂 😂.
In MATLAB, as shown in the figure, it displays as garbage: "ð ð".
After a lot of searching online, I learned that this is UTF-8 encoding. So I tried reading the file as UTF-8, based on some answers from MATLAB Central:
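clear; clc
fname = 'message_keller.json';
fid = fopen(fname, 'rb');              % open the export for binary reading
raw = fread(fid, '*uint8')';           % read all bytes as a row vector
str = native2unicode(raw, 'UTF-8');    % interpret the bytes as UTF-8 text
fclose(fid);
val = jsondecode(str);                 % parse the JSON text into a struct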
But it still showed "ð ð".
The link above describes the method I found for decoding, but that was for PowerShell.
Can anyone help me decode the Unicode so that it can be viewed in MATLAB and other software? (I am currently planning to export the conversation to Excel.)
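One possible explanation, offered as a sketch rather than a confirmed answer: the four escapes \u00f0\u009f\u0098\u0082 match the UTF-8 bytes F0 9F 98 82 of U+1F602 (😂), with each byte written as its own \u00XX escape. jsondecode then returns four characters with codes 240, 159, 152 and 130 instead of one emoji, and reading the file as UTF-8 does not change this because the escapes themselves are plain ASCII. Assuming the whole export is encoded this way, casting the character codes back to bytes and decoding those bytes as UTF-8 may recover the text. The variable names sample, decoded, bytes and repaired are illustrative only.

```matlab
% JSON fragment copied from the question: each UTF-8 byte is its own \u00XX escape.
sample = '"\u00f0\u009f\u0098\u0082 \u00f0\u009f\u0098\u0082"';

% jsondecode turns every escape into one character (codes 240 159 152 130 ...).
decoded = jsondecode(sample);

% Assumption: every character code is <= 255, i.e. it is really a UTF-8 byte.
% Cast the codes back to bytes and decode those bytes as UTF-8.
bytes = uint8(double(decoded));
repaired = native2unicode(bytes, 'UTF-8');

disp(double(decoded));   % character codes as returned by jsondecode
disp(repaired);          % the repaired text; rendering depends on font support
```

A string that legitimately contains characters above 255 would be damaged by the uint8 cast, so it seems safer to check the codes before applying this to real data.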
4 Comments
Guillaume
2018-10-12
I wanted the raw JSON, not the data you have already parsed, by which point it is too late to get the right characters. You can just replace the confidential bits with x's or dots.
Or just provide the portion of the raw JSON that corresponds to an actual message, e.g. one of the
{"message":{"sender_name":"Don't care","timestamp_ms":whatever,"content":"this is what I need","type":"Generic"}}
sections.
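A raw message of that shape could be repaired after parsing along the following lines. This is only a sketch: the JSON literal is a made-up stand-in (sender name, timestamp and content are placeholders), repairText and repairChars are hypothetical helper names, and it assumes every string in the export stores its UTF-8 bytes as individual \u00XX escapes.

```matlab
% Hypothetical raw message; only the "content" escapes come from the question.
rawJson = ['{"message":{"sender_name":"A","timestamp_ms":0,' ...
           '"content":"\u00f0\u009f\u0098\u0082","type":"Generic"}}'];

val = jsondecode(rawJson);   % strings now hold one character per UTF-8 byte
val = repairText(val);       % walk the result and re-decode every string

disp(val.message.content);   % rendering depends on font support

% Local functions in a script need R2016b or later.
function v = repairText(v)
    % Recursively visit structs and cells, repairing each char value found.
    if ischar(v)
        v = repairChars(v);
    elseif isstruct(v)
        names = fieldnames(v);
        for k = 1:numel(v)
            for n = 1:numel(names)
                v(k).(names{n}) = repairText(v(k).(names{n}));
            end
        end
    elseif iscell(v)
        v = cellfun(@repairText, v, 'UniformOutput', false);
    end
end

function s = repairChars(s)
    % Treat the character codes as bytes only when all of them fit in a byte;
    % anything else is left untouched rather than clipped by the uint8 cast.
    codes = double(s);
    if ~isempty(codes) && all(codes <= 255)
        s = native2unicode(uint8(codes(:).'), 'UTF-8');
    end
end
```

Numeric fields such as timestamp_ms pass through unchanged, since only char values are re-decoded.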
Answers (0)