Regex for string match

ML_Analyst 2023-9-24
Commented: ML_Analyst 2023-9-25
I have a large 1×50000 string array like the one below:
Stock_field1_img
Sys_tim_valt98.qaf.rat.app.gui
Enable1.HSB_setblcondition.Enable_logic.ui
P_k12.delay.init_func_delay_update.Sys
#fat_11ks.ergaa.ths.dell
$thispt.dynmem11.ide.gra
.....
.....
I am looking for a regex that can search this array based on user input. For example:
if the user gives st*, it should return all strings starting with "st";
if the user gives *st, it should return all strings ending with "st";
if the user gives *st*, it should return all strings that contain "st" anywhere between the start and the end;
the user can also give *st*app.*sys*, which should return all strings that contain "st", followed later by "app.", followed later by "sys".
I tried several patterns, such as the one below, among other combinations:
expression = '\w* + signal + \w*';
a = regexp(str_array, ,'match','ignorecase');
but it doesn't work as intended. Could someone help with this?

Accepted Answer

Voss 2023-9-24
Getting this to work for every expression a user might enter is tricky, because any character that is special in regexp has to be translated in the user input first. You want * to stand for any character sequence, which in regexp is .*, so * has to be replaced with .* before the expression is passed to regexp. The other special characters, which you want treated literally, have to be escaped by prepending \, so that . becomes \. and $ becomes \$, and so on. The function get_matches defined below does this replacement explicitly for a few special characters, passes the result to regexpi, and returns the matches. You can add more special characters to it as needed.
str = [
"Stock_field1_img"
"Sys_tim_valt98.qaf.rat.app.gui"
"Enable1.HSB_setblcondition.Enable_logic.ui"
"P_k12.delay.init_func_delay_update.Sys"
"#fat_11ks.ergaa.ths.dell"
"$thispt.dynmem11.ide.gra"
];
user_input = "st*"; % return any string starting with st
matched_str = get_matches(str,user_input)
matched_str = "Stock_field1_img"
user_input = "*.sys"; % ending with .sys
matched_str = get_matches(str,user_input)
matched_str = "P_k12.delay.init_func_delay_update.Sys"
user_input = "*del*"; % containing del
matched_str = get_matches(str,user_input)
matched_str = 2×1 string array
    "P_k12.delay.init_func_delay_update.Sys"
    "#fat_11ks.ergaa.ths.dell"
user_input = "$*"; % starting with $
matched_str = get_matches(str,user_input)
matched_str = "$thispt.dynmem11.ide.gra"
user_input = "*.*d*.*"; % containing d somewhere between two .s
matched_str = get_matches(str,user_input)
matched_str = 3×1 string array
    "Enable1.HSB_setblcondition.Enable_logic.ui"
    "P_k12.delay.init_func_delay_update.Sys"
    "$thispt.dynmem11.ide.gra"
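function a = get_matches(str,user_input)
    % Translate the wildcard expression: * becomes .* and the special
    % characters . $ ^ are escaped so they match literally.
    regex = replace(user_input,["*",".","$","^"],[".*","\.","\$","\^"]);
    % Anchor the pattern to the whole string, match case-insensitively,
    % and drop the <missing> entries returned for non-matching strings.
    a = rmmissing(regexpi(str,"^"+regex+"$",'match','once'));
end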
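To make the translation step concrete, the sketch below applies the same replace call on its own and prints the anchored pattern that would reach regexpi. The helper name to_regex is hypothetical and is used only for illustration; it is not part of get_matches.

```matlab
% Hypothetical helper: the translation step of get_matches without the search.
to_regex = @(user_input) "^" + ...
    replace(user_input, ["*",".","$","^"], [".*","\.","\$","\^"]) + "$";

disp(to_regex("st*"))            % ^st.*$
disp(to_regex("*.sys"))          % ^.*\.sys$
disp(to_regex("$*"))             % ^\$.*$
disp(to_regex("*st*app.*sys*"))  % ^.*st.*app\..*sys.*$
```

Printing the pattern this way is a quick check when a wildcard expression returns unexpected matches, for example because it contains a special character that is not yet in the replacement list.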
3 Comments
Stephen23 2023-9-25

Note that regexptranslate can be used to escape all special characters:

https://www.mathworks.com/help/matlab/ref/regexptranslate.html
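One possible way to apply this suggestion, as a minimal sketch rather than a tested replacement for get_matches: escape the whole user input with the 'escape' option, then turn the escaped wildcard \* back into .*. The variable names (escaped, regex, is_match) and the three-element sample array are assumptions made for illustration. The 'escape' option is used here instead of 'wildcard' because, as far as I understand the documentation, 'wildcard' translates only *, ? and ., which would leave a leading $ unescaped.

```matlab
% Sample data taken from the question (shortened for the example)
str = [
    "Stock_field1_img"
    "Sys_tim_valt98.qaf.rat.app.gui"
    "$thispt.dynmem11.ide.gra"
    ];
user_input = "$*";   % strings starting with a literal $

% Escape every regexp special character; the wildcard * becomes \*
escaped = regexptranslate('escape', char(user_input));

% Turn the escaped wildcard back into "any character sequence" and anchor
% the pattern so that the whole string has to match
regex = ['^' strrep(escaped, '\*', '.*') '$'];

% regexpi returns one cell of start indices per string; an empty cell
% means that string did not match
is_match = ~cellfun(@isempty, regexpi(str, regex, 'once'));
matched_str = str(is_match)
```

Filtering with a logical mask returns the original strings unchanged and avoids maintaining a hand-written list of characters to escape.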

Release

R2021b
