More efficient jsonencode for large data?

I am using MATLAB's jsonencode function to encode a structure array as a character array, which I then write to an output JSON text file. The MAT-file holding the structure array is approximately 10 GB. Encoding it takes both a long time and a lot of memory.
I then import the JSON text file into MongoDB.
Is there a more efficient way to get the data directly into MongoDB? Maybe the only option is the MATLAB Database Toolbox...?
Thanks

Answers (3)

Seb 2017-8-1
I have found a temporary workaround that handles this better.
I still use MATLAB's built-in jsonencode function and write a JSON file, but I do it in small stages (chunks). I have a division factor that determines how many chunks there are; it splits the data structure into ranges of row indices. I encode the rows in one range as JSON, write that chunk to the file, then move on to the next chunk, and so on until all the data is written.
If anyone needs the code, let me know and I can provide it. It does not take any less time than a single jsonencode call, but it greatly reduces the load and memory consumption on the computer, and it can be made 32-bit compliant (each chunk can't be larger than ~1 GB).
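The code itself was never posted in this thread, so the following is only a minimal sketch of the chunked approach described above, not Seb's actual implementation. The variable names (S, chunkSize, filename) and the small example data are assumptions made for illustration, and S is assumed to be a vector struct array. Each chunk is encoded separately, its outer brackets are stripped, and the chunks are joined with commas inside a single top-level JSON array.

```matlab
% Hypothetical example data: a 1x5 struct array with fields id and value.
S = struct('id', num2cell(1:5), 'value', num2cell((1:5) * 0.5));
filename = fullfile(tempdir, 'chunked_example.json');  % assumed output path
chunkSize = 2;                                         % elements per chunk

fid = fopen(filename, 'w', 'n', 'UTF-8');
assert(fid ~= -1, 'Could not open %s for writing.', filename);

fprintf(fid, '[');                        % open the top-level JSON array
for first = 1:chunkSize:numel(S)
    last = min(first + chunkSize - 1, numel(S));
    txt = jsonencode(S(first:last));      % only this chunk is held as text
    % A multi-element chunk encodes as '[{...},{...}]', so drop the outer
    % brackets. A single-element chunk already encodes as '{...}'.
    if last > first
        txt = txt(2:end-1);
    end
    if first > 1
        fprintf(fid, ',');                % separator between chunks
    end
    fprintf(fid, '%s', txt);
end
fprintf(fid, ']');                        % close the top-level JSON array
fclose(fid);
```

The resulting file is one JSON array, so it can be read back with jsondecode(fileread(filename)). For the MongoDB import, mongoimport has a --jsonArray option intended for array-shaped input; whether it copes well with a file of this size has not been checked here.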
  1 comment
Marcel 2020-4-17
Hi Seb, do you still have the code for this? I am interested in it. How do you piece the parts together in the JSON file? Thanks
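One way to avoid the stitching question altogether, offered as an assumption rather than as the method Seb used, is to write one JSON document per line (newline-delimited JSON) instead of a single array. Each struct element is encoded on its own, so no brackets or commas have to be matched up across chunks. The names docs and ndjsonFile and the example data below are hypothetical.

```matlab
% Hypothetical example data: a 1x3 struct array with fields id and name.
docs = struct('id', num2cell(1:3), 'name', {'a', 'b', 'c'});
ndjsonFile = fullfile(tempdir, 'ndjson_example.json');  % assumed output path

fid = fopen(ndjsonFile, 'w', 'n', 'UTF-8');
assert(fid ~= -1, 'Could not open %s for writing.', ndjsonFile);

for k = 1:numel(docs)
    % jsonencode escapes newlines inside strings, so each document
    % occupies exactly one line of the file.
    fprintf(fid, '%s\n', jsonencode(docs(k)));
end
fclose(fid);
```

The trade-off is one jsonencode call per element, which may be slower than encoding larger chunks, but memory use stays bounded by the largest single element. As far as I know, mongoimport reads this one-document-per-line layout without the --jsonArray option; that has not been verified against a 10 GB file.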

Carl 2017-7-25
Edited: Carl 2017-7-25
You can try following the example in this File Exchange function, which uses the MongoDB Java driver to insert a document:
The following Stack Overflow post also has some good suggestions:
With these approaches, you should be able to avoid writing text to disk, which may be more efficient. Note that this workflow has not been qualified, and it may or may not be more efficient than your current method.

Marco Rossi 2021-7-28
Is the MATLAB development team planning to remove this limitation? I also noticed that when the 32-bit limit is exceeded, no warnings or errors are returned.
  1 comment
Joris Brouwer 2022-8-11
I second that. I am also running into what seems to be exponentially degrading, or even halting, jsonencode performance. I am trying the chunked solution mentioned above.

