Is it possible to get a Table from an mxArray in a MEX function?

I have a bunch of data in Tables (i.e. using the table data type introduced a few releases ago) and I would like to pass this to a MEX function.
I cannot find anything in the mxArray documentation about how to get a table from an mxArray. Does such functionality exist?
If it doesn't exist, what are the alternatives and/or is it planned for the near future?
1 Comment
Todd Leonhardt 2016-5-17
Edited: Todd Leonhardt 2016-5-17
From what I have been able to determine so far, there is no support for getting a table from an mxArray, and hence no way of directly passing a table to a MEX function.
The two viable options I've found are:
1) Convert the table to a cell array using table2cell() and pass that. This has the advantage of an easy conversion, but it loses the column names from the table.
2) Convert the table to a struct of arrays manually where the field names in the struct are the column names from the table. This is a more burdensome conversion but has the advantage of preserving the names.
NOTE: I also tried using the table2struct() function to convert to an array of structs, but the performance of that option was abysmally bad to the point that I view it as not being a viable option.
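For option 1, one way to keep the column names is to pass them as a second argument alongside the cell array. This is only a minimal sketch: the table contents are made-up illustration data, and my_mex_function is a hypothetical MEX file name, not something from this thread.

```matlab
% Illustration data only (not from the original post)
t = table([1;2;3], {'x';'y';'z'}, 'VariableNames', {'id','label'});

% table2cell gives an M-by-N cell array; the variable names are dropped
c = table2cell(t);

% Grab the names separately so they can travel with the data
names = t.Properties.VariableNames;

% Hypothetical call: my_mex_function is an assumed name for your own MEX file
% result = my_mex_function(c, names);
```

On the MEX side the names would then arrive as a 1-by-N cell array in the second input, in the same order as the columns of the first.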


Accepted Answer

Todd Leonhardt 2016-5-18
Edited: Todd Leonhardt 2016-5-18
The best solution I could find is to create a MATLAB function that converts a table to a struct of arrays, as shown below, and then pass that struct to the MEX function. The function for doing so is attached.
I would very much appreciate hearing about other proposed solutions from those in the community who are more expert than myself.
function [ outStruct ] = table2structofarrays( inTable )
%TABLE2STRUCTOFARRAYS Convert a table to a struct of arrays.
%   Usage: outStruct = TABLE2STRUCTOFARRAYS( inTable )
%
%   Convert a table with M rows and N variables to a struct with N fields,
%   each of which contains a 1-dimensional array of length M and where the
%   field names in the struct are the same as the variable names in the
%   table.
%
%   NOTE: There is a built-in function TABLE2STRUCT which converts a table to
%   an array of structs. However, there are HUGE performance advantages of
%   a struct of arrays over an array of structs.

% Make sure the input really is a table
if ~isa(inTable, 'table')
    error('Error. Input to function %s must be a table, not a %s', mfilename, class(inTable))
end

% Create an empty struct with no fields
outStruct = struct;

% If the table has explicitly defined row names, then add a field for these
if ~isempty(inTable.Properties.RowNames)
    outStruct = setfield(outStruct, 'RowNames', inTable.Properties.RowNames);
end

% Iterate through all of the variables in the table
for varNum=1:width(inTable)
    % Get the variable name as a cell array with 1 element
    varNameCell = inTable.Properties.VariableNames(varNum);
    % Extract the variable name as a string
    varName = varNameCell{1};
    % Add a new field to the struct containing the data for this variable
    outStruct = setfield(outStruct, varName, inTable.(varNum));
end
end
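As a possible alternative to a hand-written converter, the built-in table2struct also accepts a 'ToScalar' option that returns a single struct with one field per table variable. The sketch below uses made-up data and assumes a release where that option is available. It is worth checking in your release whether row names survive the conversion; if they do not, add them as a field yourself.

```matlab
% Illustration data only
a = [0.1; 0.7; 0.4];
b = int32(a*100);
c = a < 0.5;
t = table(a, b, c);

% Scalar struct: one field per table variable, each holding the full column
s = table2struct(t, 'ToScalar', true);

% The field names match the table variable names, in the same order
f = fieldnames(s);
```

The resulting struct s can be passed to a MEX function in the same way as the output of table2structofarrays.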

More Answers (1)

James Tursa 2016-5-20
Edited: James Tursa 2016-5-21
As you have already discovered, the table class is not directly supported in the mex API. The table class is a classdef type of object, and the only support you get in the mex API for classdef objects is mxGetProperty and mxPutProperty. These have two drawbacks when working with tables. First, mxGetProperty and mxPutProperty both force a deep data copy ... so simply examining large properties in a mex routine can seriously blow up your memory. Not good. Second, you have to know the property names in order to use these routines. For your own classdef objects at least you know that. But for tables you don't ... at least not at the mex level. Consider the following example:
% file myclass.m
classdef myclass
    properties
        a
        b
        c
    end
end
Then at the MATLAB command line:
>> a = rand(10,1);
>> b = int32(a*100);
>> c = a<.5;
>> t = table(a,b,c)
t =
       a       b       c
    _______    __    _____
    0.15761    16    true
    0.97059    97    false
    0.95717    96    false
    0.48538    49    true
    0.80028    80    false
    0.14189    14    true
    0.42176    42    true
    0.91574    92    false
    0.79221    79    false
    0.95949    96    false
>> m = myclass;
>> m.a = a;
>> m.b = b;
>> m.c = c;
>> m
m =
  myclass with properties:
    a: [10x1 double]
    b: [10x1 int32]
    c: [10x1 logical]
Now look at the properties of each:
>> properties(t)
Properties for class table:
a
b
c
Properties
>> properties(m)
Properties for class myclass:
a
b
c
Seems pretty straightforward, right? Both of them show that they have properties named 'a', 'b', and 'c'. (And t apparently has an extra property named 'Properties'). You can even get at them with the .property syntax:
>> t.a
ans =
0.1576
0.9706
0.9572
0.4854
0.8003
0.1419
0.4218
0.9157
0.7922
0.9595
>> m.a
ans =
0.1576
0.9706
0.9572
0.4854
0.8003
0.1419
0.4218
0.9157
0.7922
0.9595
Etc.
But a look at these variables inside a mex routine reveals the truth ... that t is not a classdef object with a first level property named 'a'. E.g.,
// table_test1.c
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mxArray *property;
    mexPrintf("Class name of input is %s\n",mxGetClassName(prhs[0]));
    property = mxGetProperty(prhs[0],0,"a");
    mexPrintf("The 'a' property pointer (a deep copy) is %p\n",property);
    property = mxGetProperty(prhs[0],0,"b");
    mexPrintf("The 'b' property pointer (a deep copy) is %p\n",property);
    property = mxGetProperty(prhs[0],0,"c");
    mexPrintf("The 'c' property pointer (a deep copy) is %p\n",property);
}
>> table_test1(m)
Class name of input is myclass
The 'a' property pointer (a deep copy) is 08C08720
The 'b' property pointer (a deep copy) is 08C08E20
The 'c' property pointer (a deep copy) is 08C08100
>> table_test1(t)
Class name of input is table
The 'a' property pointer (a deep copy) is 00000000
The 'b' property pointer (a deep copy) is 00000000
The 'c' property pointer (a deep copy) is 00000000
So the thot plickens, since those NULL pointer returns indicate that the 'a', 'b', and 'c' properties are not there. As far as the physical storage is concerned, at least at the first level, tables have no properties named 'a', 'b', or 'c' the way the simple user-defined myclass classdef object does. This is, of course, a major complication for the mex programmer. Not only are classdef objects poorly supported (only deep copies with mxGetProperty and mxPutProperty), but with tables you can't even reach the data, because you don't know what property name to ask for. The only apparent clue is that the properties(t) output is formatted differently than the properties(m) output ... leading me to believe that this is a special output from MATLAB because those properties really aren't at the first level, even though you can get at them from MATLAB with the .property notation.
Well, what the heck is in that extra property named Properties?
>> t.Properties
ans =
Description: ''
VariableDescriptions: {}
VariableUnits: {}
DimensionNames: {'Row' 'Variable'}
UserData: []
RowNames: {}
VariableNames: {'a' 'b' 'c'}
At least this is something. Properties is a struct with a variety of information in it. In particular, Properties.VariableNames is a cell array of strings that contain the property names. All well and good ... at least you can get at all of this stuff in a mex routine if you want, but unfortunately there is nothing in there that connects you to the data. My guess is that the data is locked up inside a private area of the object, and there is no way to get at it from within a mex routine. In fact, when I hack into the mxArray variables inside a mex routine I can see all 3 shared copies of a, b, and c (the variables a, b, c in the workspace, the properties .a, .b, .c that are part of m, and the "properties" .a, .b, .c that are part of t). But other than the (not officially supported and highly discouraged) hack, I can think of no way to get at the data inside a mex routine.
So you are stuck with pulling the data out at the MATLAB level like you are doing before you pass it into the mex routine. My only advice here is to try to do it in a way that results in shared data copies. Cells and structs (and the old OOP class method btw) are very mex friendly since mxGetField and friends do not do a deep data copy ... they return the actual pointer to the data area inside the struct (yea!).
One thing I would change in your code would be to move the RowNames field to the end of your struct, not at the beginning. That way you always know that the field numbering 1, 2, 3, etc always corresponds to your variables. The way you have it currently coded the variables could be in either fields 1, 2, 3, etc or in fields 2, 3, 4, etc. The only way to know is to check to see if RowNames is present. Why make the user do that? In fact, why not just always append it at the end even if it is empty? E.g., something like this:
% Get the variable names
VariableNames = inTable.Properties.VariableNames;
% Create a 1x1 struct with desired fields
struct_arguments = cell(2,numel(VariableNames));
struct_arguments(1,:) = VariableNames;
outStruct = struct(struct_arguments{:});
% Iterate through all of the variables in the table
for varNum=1:width(inTable)
    % Store the table variable in the struct
    outStruct.(VariableNames{varNum}) = inTable.(varNum);
end
% Always append the row names even if they are empty
outStruct.RowNames = inTable.Properties.RowNames;
FINAL NOTE: I have checked with a mex hack of the resulting struct that the field variables are indeed shared copies (in this case reference copies) of the original table variables, so your method is a pretty good one for data efficiency.
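To illustrate the field-ordering point, here is a small self-contained check. The table contents and the names x, y, r1, and r2 are made up for illustration. With RowNames appended last, fields 1 through N line up with the table variables whether or not row names are present.

```matlab
% Illustration data only: two variables and explicit row names
t = table([1;2], [3;4], 'VariableNames', {'x','y'}, 'RowNames', {'r1','r2'});

% Build a 1x1 struct whose fields follow the table's variable order
names = t.Properties.VariableNames;
args = cell(2, numel(names));
args(1,:) = names;
s = struct(args{:});
for k = 1:width(t)
    s.(names{k}) = t.(k);
end

% Append RowNames last so it never shifts the variable fields
s.RowNames = t.Properties.RowNames;

% Field k corresponds to table variable k; RowNames is always the last field
f = fieldnames(s);
```

A MEX routine receiving s could then treat field indices 0 to N-1 (C indexing) as the table variables and the final field as the row names.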
1 Comment
Todd Leonhardt 2016-5-21
James - thanks for your very detailed answer! I had spent a little bit of time hacking around trying to "coax" the raw table data out of an mxArray at the MEX level and didn't have any luck.
And that is an excellent suggestion about moving the RowNames field to the end.
