Does MATLAB support Parquet partitions?

I have a large data set written using parquet partitioning. The partition variable is called 'mdRun', and I have 10 parquet files created in 10 directories as follows:
.../events/mdRun=0/events-0.parquet
.../events/mdRun=1/events-0.parquet
and so on. I created these files using pyarrow Hive partitioning.
Using pyarrow, I can read the parquet file corresponding to a single partition using the filter argument, which will read only the parquet file stored in the appropriate directory. As a nice side effect, the mdRun column is not stored in the parquet file, but it is automatically included when I read a partition file(s).
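To make the behaviour I am describing concrete, here is a minimal sketch in plain Python of how Hive-style partition pruning works in principle. It is not pyarrow's implementation: `partition_values` and `prune` are hypothetical helper names, and the file listing is made up to mirror my directory layout. The point is only that the `mdRun` value lives in the directory name, so a reader can select files by path and then attach the value as a column.

```python
import re
from pathlib import PurePosixPath

# Hive-style partition directories are named "key=value".
_PARTITION_RE = re.compile(r"^(?P<key>[^=/]+)=(?P<value>.*)$")


def partition_values(path):
    """Return {key: value} parsed from the 'key=value' directories in a path."""
    values = {}
    for part in PurePosixPath(path).parent.parts:
        match = _PARTITION_RE.match(part)
        if match:
            values[match.group("key")] = match.group("value")
    return values


def prune(paths, key, wanted):
    """Keep only the files whose partition directory satisfies key == wanted."""
    return [p for p in paths if partition_values(p).get(key) == str(wanted)]


# Hypothetical listing that mirrors the layout described above.
files = [f"events/mdRun={i}/events-0.parquet" for i in range(10)]

selected = prune(files, "mdRun", 1)
print(selected)                       # ['events/mdRun=1/events-0.parquet']
print(partition_values(selected[0]))  # {'mdRun': '1'}
```

In this sketch the partition value stays a string; a real reader would also decide what type to give the reconstructed `mdRun` column.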
Is it possible to read a partitioned Parquet dataset in MATLAB in the same way?
Thank you!

Answers (1)

Sudarshan 2023-1-2
Hi Jerry,
To my knowledge, MATLAB R2022b does not support reading a Hive-partitioned Parquet dataset in this way. The request has already been forwarded to the relevant team.
However, MATLAB R2022b does support reading and writing individual Parquet files. I have attached a few documentation links that may help you work with the Parquet functions.
You can refer to the link below for various functions that could be useful in your case:
You can refer to the link below for detailed documentation of the data type mappings:
For help with reading Parquet files, you can refer to the link below:
I hope this helps!


Release

R2022b
