MATLAB will recognize some strings, but not all

format long
station = input('Enter the station you want data for without the K identifier (BOS not KBOS) ','s');
Day1 = input('Enter first day ','s'); %enter without a leading zero, e.g. 1 not 01
Month1 = input('Enter first month ','s'); %enter without a leading zero, e.g. 1 not 01
Year1 = input('Enter first year ','s'); %enter the four-digit year, e.g. 2019
Day2 = input('Enter second day ','s'); %enter without a leading zero, e.g. 1 not 01
Month2 = input('Enter second month ','s'); %enter without a leading zero, e.g. 1 not 01
Year2 = input('Enter second year ','s'); %enter the four-digit year, e.g. 2019
Date1 = strcat(Month1, '-',Day1,'-',Year1);
Date2 = strcat(Month2, '-',Day2,'-',Year2);
DateNum1 = datenum(Date1);
DateNum2 = datenum(Date2);
t1 = datevec(DateNum1,'mmmm dd, yyyy HH:MM:SS');
t2 = datevec(DateNum2,'mmmm dd, yyyy HH:MM:SS');
MinBetDat = ((etime(t2,t1))/60) +1440; %+1440 to count end date
Dataurl = strcat('https://mesonet.agron.iastate.edu/cgi-bin/request/asos.py?station=',station,'&data=skyc1&data=skyc2&data=skyc3&year1=',Year1,'&month1=',Month1,'&day1=',Day1,'&year2=',Year2,'&month2=',Month2,'&day2=',Day2,'&tz=Etc%2FUTC&format=onlycomma&latlon=no&missing=M&trace=T&direct=no&report_type=1&report_type=2');
% Reference URL: 'https://mesonet.agron.iastate.edu/cgi-bin/request/asos.py?station=BOS&data=skyc1&data=skyc2&data=skyc3&year1=2019&month1=1&day1=1&year2=2019&month2=4&day2=8&tz=Etc%2FUTC&format=onlycomma&latlon=no&missing=M&trace=T&direct=no&report_type=1&report_type=2'
str = urlread(Dataurl); %pulls data from url as a single string
DATES1 = []; %blank date array
CLOUDS = []; %Blank array to separate the sky condition into, NOT FUNCTIONAL YET
Data = strsplit(str,{'\n',','}); %parses the single url string, splitting first on each new line and then on commas, so that each element has its own index
L = length(Data); % probably not necessary, but I'm not going to change it
Data = Data(6:end); %removes column headers
Data(1:5:end) = []; %removes BOS identifier in each line
k=1; %leftover from previous versions
DATES1 = Data(1:4:end); %separates dates from data into their own array, makes referencing the dates easier in the following lines
k = length(DATES1);
Days = MinBetDat/1440;
DATES1 = datenum(DATES1);
DATES1 = datevec(DATES1);
D = ones(1,length(DATES1));
DATES2 = mat2cell(DATES1,D);
k = 6; %index of second line so that it actually has something to compare against.
t = 1;
tO = 1;
tB = 1;
tS = 1;
tF = 1;
tV = 1;
tC = 1;
time = zeros(1,length(DATES2));
timeOVC = zeros(1,length(DATES2));
timeBKN = zeros(1,length(DATES2));
timeSCT = zeros(1,length(DATES2));
timeFEW = zeros(1,length(DATES2));
timeVV = zeros(1,length(DATES2));
timeCLR = zeros(1,length(DATES2));%Faster to preallocate the times array
while k<=length(Data)
if any(strcmp(Data(k:k+2), 'OVC')) || any(strcmp(Data(k:k+2), 'BKN')) || any(strcmp(Data(k:k+2), 'SCT')) || any(strcmp(Data(k:k+2), 'FEW')) || any(strcmp(Data(k:k+2), 'VV')) || any(strcmp(Data(k:k+2), 'CLR'))%checks if the current line contains any cloud conditions
if any(strcmp(Data(k-4:k-2), 'OVC')) %checks if the line above had any cloud conditions, if it does then it checks which sky condition was in the lines above and adds the time difference to that respective sky con matrix
timeOVC(tO)= ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tO = tO+1;
elseif any(strcmp(Data(k-4:k-2), 'BKN'))
timeBKN(tB)= ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tB = tB + 1;
elseif any(strcmp(Data(k-4:k-2), 'SCT'))
timeSCT(tS)= ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tS = tS + 1;
elseif any(strcmp(Data(k-4:k-2), 'FEW'))
timeFEW(tF)= ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tF = tF + 1;
elseif any(strcmp(Data(k-4:k-2), 'CLR'))
timeCLR(tC) = ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tC = tC + 1;
elseif any(strcmp(Data(k-4:k-2), "VV"))
timeVV(tV)= ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tV = tV + 1;
else
time(t) = 0;
t = t+1;
end
elseif any(strcmp(Data(k:k+2), 'OVC')) || any(strcmp(Data(k:k+2), 'BKN')) || any(strcmp(Data(k:k+2), 'SCT')) || any(strcmp(Data(k:k+2), 'FEW')) || any(strcmp(Data(k:k+2), 'VV')) || any(strcmp(Data(k:k+2), 'CLR'))%checks if the current line contains any cloud conditions
if any(strcmp(Data(k-4:k-2), 'OVC')) %checks if the line above had any cloud conditions, if it does then it checks which sky condition was in the lines above and adds the time difference to that respective sky con matrix
timeOVC(tO)= ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tO = tO+1;
elseif any(strcmp(Data(k-4:k-2), 'BKN'))
timeBKN(tB)= ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tB = tB + 1;
elseif any(strcmp(Data(k-4:k-2), 'SCT'))
timeSCT(tS)= ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tS = tS + 1;
elseif any(strcmp(Data(k-4:k-2), 'FEW'))
timeFEW(tF)= ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tF = tF + 1;
elseif any(strcmp(Data(k-4:k-2), 'CLR'))
timeCLR(tC) = ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tC = tC + 1;
elseif any(strcmp(Data(k-4:k-2), "VV"))
timeVV(tV) = ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
tV = tV + 1;
else
time(t) = 0;
t = t+1;
end
end
progress_clouds = (k/length(Data))* 100
k = k+4;
end
OVC = sum(timeOVC)
BKN = sum(timeBKN)
SCT = sum(timeSCT)
FEW = sum(timeFEW)
VV = sum(timeVV)
CLR = sum(timeCLR)
totaltime = OVC + BKN + SCT + FEW + VV;
CloudyTime = (totaltime/MinBetDat)*100
This is my code. The program pulls weather observations from a user-generated URL, parses the data down to only the sky conditions, and then calculates the total time spent at each condition when it is the worst one present. When I run it, I get values for all of the sky conditions except VV, which is always returned as 0. I've checked it against an Excel sheet with the same data, and there are several instances where VV is the worst condition, but the program doesn't perform the calculation for them. Does anyone have any insight into why it won't pick it up?

Accepted Answer

Joshua - from this code
any(strcmp(Data(k-4:k-2), "VV"))
you are comparing the two-character string "VV" with a three-character string. If k is 5, then you are extracting Data(1:3), which is three characters long, and so it will never be identical to "VV". Either you need to ensure that you only extract two characters from Data, or you could use strfind or contains to see if the shorter string is contained within the longer one. (Or use a regular expression to do the same.)
EDIT
Data is a cell array of strings and so Data(k-4:k-2) is not a single string but three strings.
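To make the difference between an exact match and a substring match concrete, here is a minimal sketch. The `slice` values are made up for illustration (including the hypothetical longer token 'VV001') and are not taken from the real download.

```matlab
% Hypothetical three-cell slice standing in for Data(k-4:k-2).
% 'VV001' is an invented token that merely starts with 'VV'.
slice = {'CLR', 'VV001', 'M'};

% strcmp on a cell array compares every cell and returns one logical per cell.
% A cell only matches if its entire contents equal 'VV'.
exactHit = strcmp(slice, 'VV');

% contains (R2016b or later) tests whether 'VV' appears anywhere in each cell.
partialHit = contains(slice, 'VV');

disp(exactHit)
disp(partialHit)
```

Here `any(exactHit)` is false while `any(partialHit)` is true, which is the kind of mismatch that would leave a VV counter at 0 if the stored token is not exactly 'VV'.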

13 Comments

I tried a different method instead, where I check whether the line is missing the data altogether. It uses the same code with "VV" replaced by "M", and it calculates that just fine. I then take all my sums and subtract them from the total time, and what's left over must be the VV time. The code searches every position in the range k-4:k-2. It just seems odd that it will find "M" just fine but the same code using "VV" returns 0.
elseif any(strcmp(Data(k-4:k-2), "M"))
timeM(M) = ((etime(DATES2{t+1},DATES2{t})/60));
t = t+1;
M = M + 1;
OK, so Data is an array of lines, is that correct? Since k starts at 6 and increments by four on each iteration of the loop, you are considering lines Data(k-4:k-2):
k == 6 --> lines 2:4
k == 10 --> lines 6:8
k == 14 --> lines 10:12
etc.
By the way, why are you skipping lines 5, 9, 13, etc.? Or am I misunderstanding the logic? What are the dimensions for your Data array? Is it a character/string array or a cell array?
Oh - so it must be a cell array
>> Data = {'test1'; 'test2'; 'M'}
>> strcmp(Data, 'M')
ans =
0
0
1
and since you are wrapping the above with any, then that is why it evaluates to true. Could it be that because you are skipping some lines, you are missing the ones with the 'VV' string?
Sort of. k = 6:9 represents cells 1:3 on line 2. The Data array is a 1x(4*length(DATES1)) cell array where each line of data is represented by k:k+3, for a total of 4 cells per line. This is why the k-4 is necessary: you have to start on the second line and compare it to the first. So the function as it is searches the 2nd, 3rd, and 4th cells to see if they meet the conditions, and if they do, it subtracts the previous time from the current time to get the elapsed time of the previous dominant sky condition.
Looking at the Excel sheet, k = 6 would be represented by B4 on the 1/1/18 0:05 line. The function here would see that line 3 contains "CLR" and that 5 minutes have passed between lines 3 and 4, and then adds that time to the variable timeCLR.
EDIT: k = 6 represents B4 when ignoring the positions of the column headers
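The layout described above (four cells per observation in one flat 1xN cell array) can be sketched with a toy array. The timestamps and sky conditions below are invented, and `flat`, `rows`, `stamps`, and `sky` are names chosen only for this illustration.

```matlab
% Toy stand-in for the flat 1xN cell array: 4 cells per observation
% (timestamp, skyc1, skyc2, skyc3). All values are invented.
flat = {'2018-01-01 00:00', 'CLR', 'M',   'M', ...
        '2018-01-01 00:05', 'FEW', 'SCT', 'M', ...
        '2018-01-01 00:54', 'VV',  'M',   'M'};

% reshape fills column by column, so build a 4xN array and transpose it
% to get one observation per row.
rows = reshape(flat, 4, []).';

stamps = rows(:, 1);     % same contents as flat(1:4:end)
sky    = rows(:, 2:4);   % the three sky-condition cells of each observation

% Row r of sky holds the same cells as flat(4*r-2 : 4*r),
% so r = 2 corresponds to k = 6 in the loop (flat(6:8)).
r = 2;
sameCells = isequal(sky(r, :), flat(4*r-2 : 4*r));
disp(sameCells)
```

Viewed this way, stepping k by 4 in the flat array is the same as moving down one row, and Data(k-4:k-2) is simply the sky-condition part of the previous row.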
But you always increment k by four on each subsequent iteration of the loop
k = k+4;
so then when k is 10, you are considering Data(k-4:k-2) which are lines 6:8. Why have you skipped line 5?
What are the dimensions of Data?
From your code
Data = Data(6:end); %removes column headers
Data(1:5:end) = []; %removes BOS identifier in each line
k=1; %leftover from previous versions
DATES1 = Data(1:4:end);
you remove the first five (header) rows from the response and then you remove every fifth row. Have you confirmed that these only correspond to BOS identifiers and that you are not removing the VV rows? In Data, which rows have the "VV" string?
I haven't skipped line 5; k = 10 would be B5 in the Excel example. The cell array in MATLAB is a 1xN cell array, so the 10th position in the cell array corresponds to B5 in the spreadsheet. I have to add four every time so I can effectively move on to the next line, when in reality I'm just moving four positions down in the array.
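One way to check what those two deletions actually remove is to capture the cells before deleting them. This is only a sketch on made-up response text in the same comma-separated shape as the download; `removed` is a name introduced here for illustration.

```matlab
% Made-up response text: one header line followed by two observations.
str = sprintf(['station,valid,skyc1,skyc2,skyc3\n' ...
               'BOS,2018-01-01 00:00,CLR,M,M\n' ...
               'BOS,2018-01-01 00:05,VV,M,M\n']);

Data = strsplit(str, {'\n', ','});
Data = Data(6:end);          % drop the five header cells

removed = Data(1:5:end);     % capture the cells that are about to be deleted
Data(1:5:end) = [];          % drop every fifth cell, starting with the first

% If only station identifiers were removed, this lists just the station ID
% (plus possibly an empty cell left over from a trailing newline).
disp(unique(removed))
```

If anything other than the station identifier shows up in `unique(removed)`, the stride of 5 is out of step with the data and sky-condition cells are being dropped.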
but when is Data(5) or Data(9) etc. ever considered?
Data(5) and Data(9) are never considered, and that's mostly just because I haven't bothered to remove them from the Data array. But I have separated these dates into their own array called DATES1 and converted it to a cell array called DATES2, which is then referenced in the classification part at the end.
The attached image shows the relative positions. The iteration by 4 is necessary to move on to the "next" line.
EDIT- I've verified and compared against the raw data that none of the VV's are missing. The raw data only has 3 sky conditions per line and this is true for the cell array as well. You can check against Data{1,4646} where there is a whole series of lines containing VV
In order to check against Data(1,4464), you'll need to post the DataUrl (since I don't know what you are choosing for station, days, months, etc.).
Looking at the data, the "VV" string seems to really be "VV ". Have you tried adding an extra space to your code to see if you can string compare on that?
any(strcmp(Data(k-4:k-2), "VV "))
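As an alternative to hard-coding the padded string, the tokens can be trimmed before comparing. The sketch below uses a hypothetical slice whose first cell carries a trailing space; it also prints each token in quotes with its length, which is a quick way to spot stray whitespace.

```matlab
% Hypothetical slice with a padded token, as if taken from Data(k-4:k-2).
slice = {'VV ', 'M', 'M'};

% Exact comparison fails because 'VV ' has three characters.
rawHit = any(strcmp(slice, 'VV'));

% strtrim accepts a cell array of character vectors and strips leading
% and trailing whitespace from every cell.
trimHit = any(strcmp(strtrim(slice), 'VV'));

% Print each token in quotes with its length to expose hidden whitespace.
for i = 1:numel(slice)
    fprintf('"%s" has %d characters\n', slice{i}, length(slice{i}));
end
```

Assuming the padding is plain whitespace, applying `Data = strtrim(Data);` once right after the strsplit call would let the existing comparisons against 'VV' stay unchanged.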
To think, this whole thing was held up by just a stray space lol
Thank you
