How can I keep the first two elements from CSV values in a string

I have a table where one of the columns contains text elements separated by commas. I want to keep the first element (if there is only one) or the first two (if there are more than one). There could also be empty lines. For example, from:
abcd, efgh, tyui
[]
lkjh, poiu
wert
I want to get
abcd, efgh
[]
lkjh, poiu
wert
I know how to do this with a for loop, but I would like to use regular expressions or something similar to speed up the process, since the table has almost 2M elements.
Any suggestion would be of great help. Thanks.

Accepted Answer

S = ["";"philosopher,historian,writer,political activist,literary critic";"philosopher";"philosopher,writer"]
S = 4x1 string array
    ""
    "philosopher,historian,writer,political activist,literary critic"
    "philosopher"
    "philosopher,writer"
T = regexp(S,'^[^,]*(,[^,]+)?','match','once','emptymatch')
T = 4x1 string array
    ""
    "philosopher,historian"
    "philosopher"
    "philosopher,writer"

More Answers (1)

FilteredData = regexp(YourTable.ColumnName, '^[^,]+(,\s+[^,]+)?', 'match', 'once');

2 Comments

Thanks, Walter, for your quick response. This regexp extracts the words before the first comma, but I need to extract the words before the second comma (if there is more than one element). Suppose I have:
"philosopher,historian,writer,political activist,literary critic"
"philosopher"
"philosopher,writer"
I want to get
"philosopher,historian"
"philosopher"
"philosopher,writer"
I would appreciate it if you could provide a regexp to solve this problem.
Apparently, this regexp works:
regexp(Table.Colum, '^[^,]*(?:,[^,]*)?','match')
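
A likely reason the earlier pattern stops at the first comma is that it uses \s+ after the comma, so it only picks up a second element when the comma is followed by whitespace. The sketch below compares the two patterns on made-up sample strings; note that 'once' is added here as an assumption so that both calls return a string array, whereas without it regexp returns a cell array with one cell per row.

```matlab
% Illustrative sample data: no space after commas, space after commas, single element.
S = ["philosopher,historian,writer"; ...
     "philosopher, historian, writer"; ...
     "philosopher"];

% Pattern from the answer above: the second element is only captured
% when at least one whitespace character follows the comma.
A = regexp(S, '^[^,]+(,\s+[^,]+)?', 'match', 'once');

% Pattern from this comment: whitespace after the comma is not required.
% (?:...) groups without capturing.
B = regexp(S, '^[^,]*(?:,[^,]*)?', 'match', 'once');

disp([A B])
```

With these inputs, A keeps only "philosopher" for the first row, while B keeps "philosopher,historian"; both give the same result for the other two rows.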

Release

R2023b
