Vectorize a table row with mixed numeric values
MATLAB has the ability to concatenate different numerical types and make an array of the type with the most precision, sort of.
If I concatenate a logical type with any numerical type, it will promote the logical value to that numerical type, examples:
[logical(1), uint8(1)]
 will give a 1x2 uint8 vector
likewise
[logical(1), uint64(1)] 
will give a 1x2 uint64 vector
and
[logical(1), 1.5] 
will give a 1x2 double vector
However, if you concatenate multiple numeric types and at least one of them is an integer type, they will all be converted to the first (leftmost) integer type listed:
[uint8(1), 1.5, uint64(256)]
will give a 1x3 uint8 vector, rounding 1.5 up to 2 and saturating 256 to 255.
[logical(1), uint16(256), 1.5, uint64(66000)] 
will give a 1x4 uint16 vector, rounding 1.5 up to 2 and saturating 66000 to 65535.
[logical(1), 1.5, uint16(256), uint64(33000)] 
will also give a 1x4 uint16 vector.
So while a logical gets promoted to the first integer type, all wider integers and floating-point values get demoted to that integer type.
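The concatenation behavior described above can be checked with a short script. This is only a sketch; the variable names a, b, c and d are made up for illustration. The last line shows one way to sidestep saturation: convert each value to uint64 explicitly before concatenating.

```matlab
% Concatenation involving an integer type yields the leftmost integer type.
a = [logical(1), uint8(1)];                         % logical is promoted to uint8
b = [uint8(1), 1.5, uint64(256)];                   % leftmost integer type is uint8
c = [logical(1), 1.5, uint16(256), uint64(33000)];  % leftmost integer type is uint16

disp(class(a))   % uint8
disp(class(b))   % uint8
disp(b)          % 1.5 is rounded to 2, 256 saturates to 255
disp(class(c))   % uint16

% Converting every value to uint64 first means nothing saturates
% (1.5 is still rounded to 2, since uint64 cannot hold fractions).
d = [uint64(uint8(1)), uint64(1.5), uint64(256)];
disp(class(d))   % uint64
disp(d)
```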
Further, if one of the values belongs to a user-defined class that is a subclass of an int or uint type, then regardless of its position, MATLAB will try to convert all the values to that user-defined class.
For example, if I create an enumeration class octal:
classdef octal < uint8
    %OCTAL Test Class Definition
    %   First 8 integers
    enumeration 
        zero    (0);
        one     (1);
        two     (2);
        three   (3);
        four    (4);
        five    (5);
        six     (6);
        seven   (7);
    end
end
And I concatenate:
[1, 2, octal.six, 5]
 it converts all of the numbers to the corresponding octal class object and I get the output 
[one two six five]
But if I add a number that isn't a part of the enumeration, such as:
[9, 1, octal.six, 5] 
I get the error: 
Error using octal
Cannot find a member of the 'octal' enumeration class that corresponds to each element of the given input argument.
So, here's my dilemma. I have a very large table variable that I am saving to disk. All of the table variables are unsigned integers. Each variable has a different range of valid values. To save memory and disk space, each variable is set to the lowest precision that contains the range of values required (e.g. if no value can be > 255, the variable is uint8). Additionally, a couple of variables are restricted not just in their range, but only to specific (non-continuous) values in that range, and I'm using an enumeration to store them (for the enumerations - each integer value represents a code, and the enumeration names are the names corresponding to those codes).
One of the columns is also a checksum for each row. So, what I want to do is verify the checksum by doing the necessary math on the other values in the row. If I could make the table row a single vector all of type uint64, I could vectorize the math for the checksum. I can, of course, do a for loop through each element in the row I'm calculating the checksum for - but once my data populates and I have thousands of rows, this takes up considerable time. Is there any way to vectorize converting a table row like this to uint64 without losing precision?
Answers (1)
Steven Lord, 2023-6-12
        If I concatenate a logical type with any numerical type, it will promote the logical value to that numerical type, examples:
However, if you concatenate multiple numeric types, they will all be converted to the first integer numeric type listed
If one of the arrays you're concatenating is of an integer type, yes, as per this documentation page.
Further, if you're using a user-defined class that is a subclass to an int or uint type, regardless of its position, it will then try to convert all the values to that user-defined class.
So, here's my dilemma. I have a very large table variable that I am saving to disk.
Saving as a MAT-file as a table array or writing to some type of file as a regular numeric array?
One of the columns is also a checksum for each row. So, what I want to do is verify the checksum by doing the necessary math on the other values in the row. If I could make the table row a single vector all of type uint64, I could vectorize the math for the checksum. I can, of course, do a for loop through each element in the row I'm calculating the checksum for - but once my data populates and I have thousands of rows, this takes up considerable time. Is there any way to vectorize converting a table row like this to uint64 without losing precision?
Variables in a table array must be of one type, but you can have data of different types in a row of a table (the Name variable may be a string array while Age a double or an integer and Smoker a logical true or false, as an example.)
Rather than computing checksums on each row separately, why not vectorize your checksum calculation?
T = array2table(magic(5));
T.Var3 = int8(T.Var3);
T.Var5 = uint64(T.Var5)
See that each variable is of the expected class.
varfun(@class, T)
Now, instead of computing using T{1, 'Var1'}, T{1, 'Var2'}, etc., just use T.Var1, T.Var2, etc. and perform the desired conversion on the variables as a whole.
y = single(T.Var3) + single(T.Var5);
class(y)
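Applied to the original question, the same idea might look like the sketch below. Each table variable is converted to uint64 as a whole column, and the checksum is then verified for all rows at once. The variable names A, B, C and Checksum are hypothetical. So is the checksum rule: a plain row sum is assumed here because the thread does not state the actual formula. Extracting the mixed-type columns directly with brace indexing would concatenate them, so the concatenation rules discussed in the question would presumably apply. That is why the conversion is done per variable first.

```matlab
% Hypothetical table with differently sized unsigned integer variables.
T = table(uint8([1; 200]), uint16([300; 65000]), uint32([70000; 5]), ...
    'VariableNames', {'A', 'B', 'C'});
T.Checksum = uint64([70301; 65205]);   % assumed rule: sum of A, B and C

dataVars = {'A', 'B', 'C'};

% Convert each variable (whole column) to uint64, then extract one matrix.
U = varfun(@uint64, T, 'InputVariables', dataVars);
M = U{:, :};                           % every column is uint64 already

% Vectorized checksum over all rows; 'native' keeps the sum in uint64.
expected = sum(M, 2, 'native');
isValid = expected == T.Checksum;

disp(class(M))    % uint64
disp(isValid)     % one logical per row
```

The loop here runs over a handful of variables inside varfun rather than over thousands of rows. Enumeration columns derived from an integer type may convert with uint64 in the same way, but that is worth confirming on the actual classes.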