Searching a string on a table to get time

2 次查看(过去 30 天)
Hi,
I have an excel spreadsheet (attached). The table is basically information from a ticket system. The column are as follows: ID, creation date & time, several comments (each one in a different column) and ticket closing date & time.
The first step I do is reading it: Tbl = readtable(filename, 'ReadVariableNames', false);
I want to calculate:
1) the time between when the ticket was acknowledged and the creation time
2) the time between when the ticket is asked to be closed and when it is actually closed.
A ticket is acknowledged in different ways, but it always says "your ticket".
A ticket is asked to be closed in different ways, it says: "can this be closed?", ''can we close this?", "is this still an issue?" or "are you happy to close this?"
So, what I'm thinking is: searching the table for key phrases (like "your ticket"), and then reading the time of the corresponding cell. However, how can I do this without using a for loop to go through the columns?
Thanks

回答(2 个)

KSSV
KSSV 2023-12-12
You may try using functions like contains, strcmp
T = readtable('https://in.mathworks.com/matlabcentral/answers/uploaded_files/1566224/test.xlsx') ;
idx = contains(T.Comment,'received your ticket') ;
T = T(idx,:)
  1 个评论
Maria
Maria 2023-12-12
Thanks, the problem is that when you read the spreadsheet into a table, all the columns that are called "Comment" will get a different header in Matlab, like: Comment, Comment_1, Comment_2, etc. So doing using contains(T.Comment...) will only search in that specific column, not all the the Comment columns.
I tried also strcmp, and it gave me all zeros, which I think it makes sense since it returns one only if the 2 strings are identical. But I don't know if there is another way to use it, since I have never used it before.
Regardless, I still have the same issue about searching into all the Comment columns.

请先登录,再进行评论。


Stephen23
Stephen23 2023-12-12
编辑:Stephen23 2023-12-12
"I still have the same issue about searching into all the Comment columns."
The MATLAB documentation explains that you can use PATTERN objects to specify the variables/columns:
tbl = readtable('test.xlsx');
pat = "Comment" + wildcardPattern;
tbl = convertvars(tbl,pat,'string')
tbl = 6×13 table
ID Start Comment Comment_1 Comment_2 Comment_3 Comment_4 Comment_5 Comment_6 Comment_7 Comment_8 Comment_9 End __ ____________________ _______________________________________________________________________ ______________________________________________________________________ ________________________________________________________ ____________________________________________ _______________________________________ _________ _________ _________ _________ _________ ____________________ 5 30-Nov-2023 22:35:10 "01/Dec/23 02:04:19; We have received your ticket, we'll reply shortly" "04/Dec/23 06:29:02;I will now close this ticket as a duplicate." "" "" "" <missing> <missing> <missing> <missing> <missing> 04-Dec-2023 06:29:02 8 25-Nov-2023 22:00:04 "25/Nov/23 22:42:43;We have received your ticket, we'll reply shortly" "26/Nov/23 09:37:15;ALARM orange" "26/Nov/23 09:39:15;another alarm orange" "26/Nov/23 22:49:06;Alarm is off now" "27/Nov/23 08:10:39;can we close this?" <missing> <missing> <missing> <missing> <missing> NaT 12 22-Nov-2023 22:28:35 "22/Nov/23 22:47:22;blah blah" "22/Nov/23 23:14:17;Your ticket was received" "23/Nov/23 09:49:45;no data" "23/Nov/23 10:02:29;can this be closed?" "" <missing> <missing> <missing> <missing> <missing> 28-Nov-2023 12:19:52 15 18-Nov-2023 07:44:36 "18/Nov/23 07:44:37;Look at this picture" "18/Nov/23 08:08:13;We are in receipt of your ticket" "23/Nov/23 11:23:17;blah blah" "28/Nov/23 13:46:22;is this still an issue?" "" <missing> <missing> <missing> <missing> <missing> 04-Dec-2023 10:53:25 17 12-Nov-2023 17:27:55 "12/Nov/23 18:44:51;Your ticket was received" "13/Nov/23 05:23:20;blah blah" "13/Nov/23 07:42:10;Are you happy to close this ticket?" "" "" <missing> <missing> <missing> <missing> <missing> 16-Nov-2023 20:18:38 20 10-Nov-2023 08:26:27 "10/Nov/23 08:32:06;blah blah" "10/Nov/23 09:59:14;We have received your ticket, we'll reply shortly" "10/Nov/23 14:48:03;is this still an issue?" "" "" <missing> <missing> <missing> <missing> <missing> 13-Nov-2023 07:30:36
idx = contains(tbl{:,pat},"your ticket")
idx = 6×10 logical array
1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
vec = ["can this be closed?","can we close this?","is this still an issue?","are you happy to close this?"];
idy = contains(tbl{:,pat},vec)
idy = 6×10 logical array
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
Then you can use ANY, FIND, etc. as required to obtain the columns or rows that you require from the table. Note that not all rows have both start and end text.
  3 个评论
Maria
Maria 2023-12-13
Thank you, this looks great. Unfortunately, it seems I can't use pattern with R2018a (it seems it started with 2020b).
But I will search how to do a similar thing in my Matlab version.
Stephen23
Stephen23 2023-12-13
编辑:Stephen23 2023-12-13
"Unfortunately, it seems I can't use pattern with R2018a (it seems it started with 2020b)."
Obtain the column/variable names, use text tools to select the ones you want, then use the names you selected, e.g.:
tbl = readtable('test.xlsx');
pat = tbl.Properties.VariableNames; % changed this line
pat = pat(startsWith(pat,'Comment')); % changed this line
tbl = convertvars(tbl,pat,'string')
tbl = 6×13 table
ID Start Comment Comment_1 Comment_2 Comment_3 Comment_4 Comment_5 Comment_6 Comment_7 Comment_8 Comment_9 End __ ____________________ _______________________________________________________________________ ______________________________________________________________________ ________________________________________________________ ____________________________________________ _______________________________________ _________ _________ _________ _________ _________ ____________________ 5 30-Nov-2023 22:35:10 "01/Dec/23 02:04:19; We have received your ticket, we'll reply shortly" "04/Dec/23 06:29:02;I will now close this ticket as a duplicate." "" "" "" <missing> <missing> <missing> <missing> <missing> 04-Dec-2023 06:29:02 8 25-Nov-2023 22:00:04 "25/Nov/23 22:42:43;We have received your ticket, we'll reply shortly" "26/Nov/23 09:37:15;ALARM orange" "26/Nov/23 09:39:15;another alarm orange" "26/Nov/23 22:49:06;Alarm is off now" "27/Nov/23 08:10:39;can we close this?" <missing> <missing> <missing> <missing> <missing> NaT 12 22-Nov-2023 22:28:35 "22/Nov/23 22:47:22;blah blah" "22/Nov/23 23:14:17;Your ticket was received" "23/Nov/23 09:49:45;no data" "23/Nov/23 10:02:29;can this be closed?" "" <missing> <missing> <missing> <missing> <missing> 28-Nov-2023 12:19:52 15 18-Nov-2023 07:44:36 "18/Nov/23 07:44:37;Look at this picture" "18/Nov/23 08:08:13;We are in receipt of your ticket" "23/Nov/23 11:23:17;blah blah" "28/Nov/23 13:46:22;is this still an issue?" "" <missing> <missing> <missing> <missing> <missing> 04-Dec-2023 10:53:25 17 12-Nov-2023 17:27:55 "12/Nov/23 18:44:51;Your ticket was received" "13/Nov/23 05:23:20;blah blah" "13/Nov/23 07:42:10;Are you happy to close this ticket?" "" "" <missing> <missing> <missing> <missing> <missing> 16-Nov-2023 20:18:38 20 10-Nov-2023 08:26:27 "10/Nov/23 08:32:06;blah blah" "10/Nov/23 09:59:14;We have received your ticket, we'll reply shortly" "10/Nov/23 14:48:03;is this still an issue?" "" "" <missing> <missing> <missing> <missing> <missing> 13-Nov-2023 07:30:36
out = rowfun(@myfun, tbl, 'NumOutputs',2, 'OutputVariableNames',["acknowledge","askclose"])
out = 6×2 table
acknowledge askclose __________________ __________________ 1/Dec/23 02:04:19 4/Dec/23 06:29:02 25/Nov/23 22:42:43 27/Nov/23 08:10:39 22/Nov/23 23:14:17 23/Nov/23 10:02:29 18/Nov/23 08:08:13 28/Nov/23 13:46:22 12/Nov/23 18:44:51 13/Nov/23 07:42:10 10/Nov/23 09:59:14 10/Nov/23 14:48:03
out.acknowledge - tbl.Start
ans = 6×1 duration array
03:29:09 00:42:39 00:45:42 00:23:37 01:16:56 01:32:47
tbl.End - out.askclose
ans = 6×1 duration array
00:00:00 NaN 122:17:23 141:07:03 84:36:28 64:42:33
function [start,close] = myfun(varargin)
X = cellfun(@isstring,varargin);
S = [varargin{X}];
Y = find(contains(S,"your ticket",'IgnoreCase',true),1,'first');
Z = find(strlength(S)>0,1,'last');
Ty = split(S(Y),';');
Tz = split(S(Z),';');
start = datetime(Ty(1), 'Format','d/MMM/yy HH:mm:ss');
close = datetime(Tz(1), 'Format','d/MMM/yy HH:mm:ss');
end

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Environment and Settings 的更多信息

标签

产品


版本

R2018a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by