converting string to num and table

65 次查看(过去 30 天)
I have a table containing string data that represents cyclone certain and uncertain tracks. Each cell in the table may contain multiple track numbers (seperated by a space), where the presence of 'x' beside a number indicates uncertainty in that track. For example, in tempData(7,1) = "30 25x", '30' is a certain track, and '25' is uncertain.
I need to write a code that parses each cell in a column of tempData, separates the track numbers and their uncertainty markers ('x'), and outputs a (in this case) 4-column table for each column in tempData. The headers of the output table should be track1, uncert1, track2, uncert2, corresponding to the parsed tracks and their certainty indicators (marked by 0 or 1). If the input has 3 tracks, the then output table should have a 6 column table
Finally, I want to store each output table in a cell array marked by the varname of the input column (y98, y99, y20, etc), as I need to process over 200 columns in my original data.
I have attached a sample input (tempData) and output for the first coumn that i want (tempDataOut).
Thanks in advance.

采纳的回答

Stephen23
Stephen23 2024-8-22,21:06
编辑:Stephen23 2024-8-23,6:29
Tip for the future: upload your raw data, not a data extract that you have already imported and modified.
S = load('sample1.mat')
S = struct with fields:
tempData: [10x3 table] tempDataOut: [10x5 table]
S.tempData % lets take a look at the input data
ans = 10x3 table
y89 y90 y20 ________ ____________ __________________ "3x 4" {0x0 double} {[ 5]} "5" {'4x' } {'14 8 11 16x 18'} "11" {[ 23]} {0x0 cell } "15x" {'24x' } {[ 13]} "22x" {'33x' } {[ 17]} "26x" {0x0 cell } {0x0 cell } "30 25x" {'48x 46x' } {[ 28]} "" {[ 51]} {[ 35]} "32" {[ 52]} {0x0 cell } "35" {'53x' } {[ 42]}
That table data is a bit of a mess, so the first thing we will do is convert the table data to a consistent type (fixing the data file import would make this task easier/superfluous):
V = S.tempData.Properties.VariableNames;
T = varfun(@mystring,S.tempData)
T = 10x3 table
mystring_y89 mystring_y90 mystring_y20 ____________ ____________ ________________ "3x 4" "" "5" "5" "4x" "14 8 11 16x 18" "11" "23" "" "15x" "24x" "13" "22x" "33x" "17" "26x" "" "" "30 25x" "48x 46x" "28" "" "51" "35" "32" "52" "" "35" "53x" "42"
C = varfun(@myvarfun,T, 'OutputFormat','cell');
C = cellfun(@myrename,C,V, 'uni',0)
C = 1x3 cell array
{10x5 table} {10x5 table} {10x11 table}
S.tempDataOut % your expected output for comparison against C{1}
ans = 10x5 table
y89 track1 uncert1 track2 uncert2 ________ ______ _______ ______ _______ "3x 4" 3 1 4 0 "5" 5 0 0 0 "11" 11 0 0 0 "15x" 15 1 0 0 "22x" 22 1 0 0 "26x" 26 1 0 0 "30 25x" 30 0 25 1 "" 0 0 0 0 "32" 32 0 0 0 "35" 35 0 0 0
C{:} % all output tables
ans = 10x5 table
y89 track1 uncert1 track2 uncert2 ________ ______ _______ ______ _______ "3x 4" 3 1 4 0 "5" 5 0 0 0 "11" 11 0 0 0 "15x" 15 1 0 0 "22x" 22 1 0 0 "26x" 26 1 0 0 "30 25x" 30 0 25 1 "" 0 0 0 0 "32" 32 0 0 0 "35" 35 0 0 0
ans = 10x5 table
y90 track1 uncert1 track2 uncert2 _________ ______ _______ ______ _______ "" 0 0 0 0 "4x" 4 1 0 0 "23" 23 0 0 0 "24x" 24 1 0 0 "33x" 33 1 0 0 "" 0 0 0 0 "48x 46x" 48 1 46 1 "51" 51 0 0 0 "52" 52 0 0 0 "53x" 53 1 0 0
ans = 10x11 table
y20 track1 uncert1 track2 uncert2 track3 uncert3 track4 uncert4 track5 uncert5 ________________ ______ _______ ______ _______ ______ _______ ______ _______ ______ _______ "5" 5 0 0 0 0 0 0 0 0 0 "14 8 11 16x 18" 14 0 8 0 11 0 16 1 18 0 "" 0 0 0 0 0 0 0 0 0 0 "13" 13 0 0 0 0 0 0 0 0 0 "17" 17 0 0 0 0 0 0 0 0 0 "" 0 0 0 0 0 0 0 0 0 0 "28" 28 0 0 0 0 0 0 0 0 0 "35" 35 0 0 0 0 0 0 0 0 0 "" 0 0 0 0 0 0 0 0 0 0 "42" 42 0 0 0 0 0 0 0 0 0
function S = mystring(C)
% Convert cell array of mixed numeric/char/string to string array.
C(cellfun(@isempty,C)) = {''};
S = string(C);
end
function T = myvarfun(S)
% Convert string array to table (with certain & uncertain columns).
assert(~any(contains(S,'-')),'Minus character is not supported')
T = table(S);
C = regexprep(S,'(\d+)x','-$1');
N = numel(C);
M = nan(N,0);
for k = 1:N
V = sscanf(C{k},'%f');
M(k,1:numel(V)) = V;
end
U = +(M<0);
M = abs(M);
for k = 1:size(M,2)
T = addvars(T,M(:,k),U(:,k),'NewVariableNames',["track","uncert"]+k);
end
end
function T = myrename(T,v1)
% Rename first table column to v1.
T.Properties.VariableNames{1} = v1;
end
  3 个评论
Sarvesh
Sarvesh 2024-8-23,0:50
thanks @Stephen23. Just what i needed. Also, the original data was even more messier than the sample I gave which is why i did not put the whole original data. But i will keep your suggestion in mind for my future questions. Thanks once again.

请先登录,再进行评论。

更多回答(1 个)

Karanjot
Karanjot 2024-8-22,6:34
Hi Sarvesh,
To achieve this task, you can write a script in MATLAB that processes each column of your tempData table, parses the track numbers and their uncertainty markers, and stores the results in a cell array. Here's a step-by-step guide to help you accomplish this:
  • Initialize Variables: Create a cell array to store the output tables, and define the headers for the output tables.
  • Parse Each Column: Loop through each column in tempData, and for each cell, split the string by spaces to separate track numbers and uncertainty markers.
  • To carry out the string manipulation as described above, You can use the strsplit function to split a string into parts based on a delimiter and the endsWith function to check if a string ends with a specified substring. For example:
parts = strsplit('30 25x', ' '); % Splits into {'30', '25x'}
isUncertain = endsWith('25x', 'x'); % Returns true
  • Create Output Table: For each parsed cell, determine the number of tracks and uncertainty indicators, and create a table with appropriate headers.
  • Store Results: Store each output table in the cell array with the corresponding variable name.
To know more about the functions mentioned above, Please refer to the following documentation:
I hope this helps!

类别

Help CenterFile Exchange 中查找有关 Tables 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by