How to calculate average inter-arrival time from data set?

Hi all,
I am attempting to calculate the average inter-arrival time of hospital admissions for each hour during the afternoon shift (14:00-22:00) for each of the 45 days that I have data (note: the days are not consecutive, they are only weekdays). Ultimately, I would like to save the output in a table for easier viewing if possible. I have attached both a schematic of what I am trying to do and the txt file of data.
I have attempted to use loops to do this, but my intuition says there is an easier way, so I figured I would ask the community. Help would be greatly appreciated, as I have never computed anything like this in MATLAB before.
Matt

Answers (1)

cdawg 2023-4-28
I took a shot at this even though I had for loops :-)
Enjoyed this (even if it might not be correct- haha) thanks!!
%% IMPORT TEXT FILE
opts = delimitedTextImportOptions("NumVariables", 4);
% Specify range and delimiter
opts.DataLines = [2, Inf];
opts.Delimiter = "\t";
% Specify column names and types
opts.VariableNames = ["Diagnosis", "Date", "Time", "Region"];
opts.VariableTypes = ["categorical", "datetime", "datetime", "categorical"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, ["Diagnosis", "Region"], "EmptyFieldRule", "auto");
opts = setvaropts(opts, "Date", "InputFormat", "yyyy-MM-dd");
opts = setvaropts(opts, "Time", "InputFormat", "HH:mm");
% Import the data
data = readtable("pts_admission_Feb_March.txt", opts);
clear opts
data = removevars(data,["Diagnosis","Region"]);
%% START DATA ANALYSIS
% Remove irrelevant data (not within afternoon shift)
data = data(isbetween(data.Time, datetime('14:00','InputFormat','HH:mm'),datetime('22:00','InputFormat','HH:mm')),:);
% Start sorting the data based on date/time
data.Date = categorical(data.Date,unique(data.Date),cellstr(strcat('Day ',string(1:length(unique(data.Date))))));
times = split(string(data.Time),':');
data.Hours = categorical(times(:,1), unique(times(:,1)), cellstr(unique(times(:,1))));
data.mins = str2double(times(:,2));
% Initialize tables
avs = zeros(length(categories(data.Date)), length(categories(data.Hours))-1);
ci = zeros(length(categories(data.Date)), length(categories(data.Hours))-1);
cat1 = categories(data.Date);
cat2 = categories(data.Hours);
z = 1.96; % for 95% CI
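for ii = 1:length(cat1)
    % All afternoon-shift admissions on day ii (rows assumed to be in chronological order)
    datii = data(data.Date == cat1{ii},:);
    % The final hour category (22) is skipped, so only hours 14 to 21 are summarized
    for jj = 1:length(cat2)-1
        % Admissions that fall within hour jj of day ii
        datjj = datii(datii.Hours == cat2{jj},:);
        n = length(datjj.mins);               % number of arrivals (diff yields n-1 gaps)
        avs(ii,jj) = mean(diff(datjj.mins));  % mean inter-arrival time in minutes
        s = std(diff(datjj.mins));            % sample standard deviation of the gaps
        ci(ii,jj) = z*s/sqrt(n);              % half-width of the 95% interval
    end
end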
meanTimes = array2table(avs, "VariableNames", strcat('Hour',cat2(1:end-1)), "RowNames",cat1)
meanTimes = 45×8 table
          Hour14   Hour15   Hour16   Hour17   Hour18   Hour19   Hour20   Hour21
          ______   ______   ______   ______   ______   ______   ______   ______
Day1      4        6        2.5556   5.125    3        3.8      4.6      10.2
Day2      2.8824   4.5      3.5      3.3571   6.625    5.7778   4.4545   5.3
Day3      3.8667   4.9091   4        5.8333   3.5      4.7778   4.1429   6
Day4      3        4.2      4.2857   5.2727   5.6      8        3.6      2.5
Day5      4.8      3.8      4.5833   5.5      6.1667   4.8571   4.7273   45
Day6      5.7778   4.75     3.7      6.5556   3.4286   4.7273   10       10.2
Day7      4.2727   3.8667   4.8333   3.4118   4.3      4.1538   3        4.7
Day8      2.8235   3.4667   4.6667   13.75    5.6667   5.25     7.7143   5.75
Day9      3.8571   2.9333   3.8571   4        4.7143   5.7      3.25     6.8333
Day10     2.4762   3.4667   4.0909   4.0833   3.9231   4.5      5.1      6.5
Day11     3.8889   2.3636   7.125    9.8      5.1      5.6      4.2727   6.8
Day12     5.7      4.6667   3.6667   4.2      3.6      3.7333   5.5      2.5556
Day13     5.6      3.8      3.375    5.1111   7.8333   4.2222   4.2      4.1818
Day14     3.1176   3.6667   3.8667   8.3333   4.9      7.4286   5.5556   5.8889
Day15     5.2      3.9231   4.7      4.8182   5.2222   4.1818   5.1      6.4286
Day16     2.5556   4.5      2.7222   4.6      3.5625   10.667   3.0714   10.8
CI = array2table(ci, "VariableNames", strcat('Hour',cat2(1:end-1)), "RowNames",cat1)
CI = 45×8 table
          Hour14    Hour15    Hour16   Hour17   Hour18   Hour19   Hour20   Hour21
          _______   _______   ______   ______   ______   ______   ______   _______
Day1      1.5022    4.3936    1.1969   1.8947   1.0872   1.5737   2.5563   4.3973
Day2      1.0053    2.6378    1.876    1.6321   3.5774   1.6911   1.831    1.7849
Day3      1.6447    2.4722    2.3063   3.3268   1.3426   1.8794   4.0688   4.9326
Day4      1.1644    1.9856    2.6452   1.4556   3.75     7.6987   1.6124   0.80017
Day5      3.3174    2.022     2.1677   2.4939   4.1753   1.9774   2.8861   0
Day6      3.8555    3.1793    3.0011   3.1008   1.0637   2.6549   7.2281   9.8228
Day7      1.541     1.5031    1.8664   1.8627   1.763    1.7744   0.98     2.8689
Day8      1.1461    1.8784    1.9985   8.1405   2.0084   5.1298   5.3932   2.6308
Day9      1.8816    1.404     2.0805   1.9114   3.3881   4.4578   2.2398   4.9897
Day10     0.81049   1.5031    2.2696   1.8754   1.8078   1.3141   3.5616   3.276
Day11     1.1778    0.76245   2.0194   7.0077   1.8151   3.9417   1.8781   4.4695
Day12     2.9751    2.7719    2.0844   1.5955   1.408    1.404    2.2156   1.7286
Day13     2.6894    1.9617    1.6357   4.3985   5.12     2.0031   1.9856   1.6557
Day14     1.4598    1.5896    1.6131   7.6797   2.5902   4.3801   3.6813   2.4094
Day15     3.8682    1.5636    1.7849   2.0055   2.6977   2.0372   2.9673   3.2713
Day16     0.96591   4.0008    1.1927   2.0509   1.9713   7.9816   1.0017   6.7755
So the confidence intervals are meanTimes +/- CI?
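For reference, here is a minimal sketch of how the interval bounds could be formed for a single day/hour cell. The arrival minutes are invented for illustration and are not taken from the attached file. It also assumes a normal approximation (z = 1.96), which is rough when an hour contains only a handful of arrivals. Note that this sketch divides by the number of gaps rather than the number of arrivals, since the standard deviation is computed over the gaps; that is a choice made here, not what the loop above does.

```matlab
% Hypothetical arrival minutes within one hour (invented, not from the data file)
mins = [3; 10; 14; 25; 31];

gaps = diff(mins);                  % inter-arrival times: 7, 4, 11, 6
m    = mean(gaps);                  % mean inter-arrival time in minutes
s    = std(gaps);                   % sample standard deviation of the gaps
z    = 1.96;                        % normal quantile for a 95% interval (assumption)

halfWidth = z*s/sqrt(numel(gaps));  % uses the number of gaps, not arrivals
lowerBound = m - halfWidth;
upperBound = m + halfWidth;
```

Under these assumptions the interval is simply `[lowerBound, upperBound]`, i.e. the mean plus or minus the half-width.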
1 Comment
matt beemsterboer 2023-4-28
This code seems to be correct. Thank you for the help. However, how would you "refine" the code to do exactly the same thing for each of the four diagnoses: M (mental health), R (respiratory), T (trauma) and C (cardiovascular)?
Given the way you did it, I feel like you would simply add a conditional statement but I am not sure.
Thanks,
Matt
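
One possible way to extend this to diagnoses, sketched under some assumptions: keep the Diagnosis column (the answer above drops it with removevars) and treat it as an extra grouping variable, rather than adding a conditional inside the loops. The miniature table below is invented for illustration, and the names T, avgGap and byDiagnosis are hypothetical. The sketch uses findgroups and splitapply, which also avoids the explicit loops, and hour/minute to pull the components out of the datetime values instead of splitting strings.

```matlab
% Hypothetical miniature data set (values invented for illustration)
Diagnosis = categorical(["M"; "M"; "M"; "R"; "R"; "M"; "M"]);
Day       = [1; 1; 1; 1; 1; 2; 2];
Time      = datetime(["14:05"; "14:12"; "14:20"; "14:08"; "14:30"; "15:10"; "15:40"], ...
                     'InputFormat', 'HH:mm');
T = table(Diagnosis, Day, Time);

% Extract hour and minute directly from the datetime values
T.Hour = hour(T.Time);
T.mins = minute(T.Time);

% One group per (diagnosis, day, hour) combination
[G, dx, day, hr] = findgroups(T.Diagnosis, T.Day, T.Hour);

% Mean gap between consecutive arrivals in each group.
% Groups with a single arrival have no gap and yield NaN.
avgGap = splitapply(@(x) mean(diff(sort(x))), T.mins, G);

byDiagnosis = table(dx, day, hr, avgGap, ...
    'VariableNames', {'Diagnosis', 'Day', 'Hour', 'MeanInterArrival'});
```

The result is a long-format table with one row per diagnosis, day and hour. Filtering it by diagnosis (for example `byDiagnosis(byDiagnosis.Diagnosis == "M", :)`) gives the per-diagnosis view.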
