how could avoid too have loop in big data

I have a big data set(near to 13G). it is a table with information about different agencies at a different time(every day of 10 years). I use unique() to make groups for months and years and agencies. (is it true?) I need to calculate the mean of each agency at all month and years.how could I save time?

 采纳的回答

jonas
jonas 2018-10-21
编辑:jonas 2018-10-21
You can use findgroups/splitapply or put the data in a table and use varfun or grpstats.
Upload some data if you need code.

8 个评论

the point is that some of my data is in Persian and my date too,e.g, 1392 is persain year, and I have different sheet in excel. I will upload a small part of them. if I used func = @mean; Mean = varfun(func,data,'GroupingVariables',شهرستان) I received this error: Unrecognized variable name 'اسكو'.
Persian characters are not accepted in variable names, at least on my machine. I can load the characters fine in the table, but reading the table displays a warning that the variable names were changed. Easiest would be to change the of the variables. For a table T, with 17 columns, you can write:
T.Properties.VariableNames = sprintfc('Var_%g',1:17)
Perhaps you already know this, but make sure you only pass numeric values to varfun. You can pass a table with non-numeric data to varfun, but you must then specify the numeric columns that you want to evaluate. For example,
vmean = varfun(@mean,T,'InputVariables','Var_1','GroupingVariables','Var_14')
Var_14 is persian text and it works fine!
THANKS ALOT, but even by changing the name of the variables, I still have error. If I used the unique and cell it works but if I want to get mean cities in different months and years...I need to use loop
I edited my comment a bit after you replied. Consider my edit, try again and then give me the entire error message and explain what columns you want to use as grouping variables (column 2 and... 14?)
I'm having some trouble parsing the dates correctly, they are interpreted in persian for some reason. If you can somehow get the dates in datetime format, then you can easily add two more columns with the month and year and add them as additional grouping variables.
I use this code for months and years:1391 and 1392 are years month = unique(month(datetime(data.(16),'InputFormat','yyyy/MM/dd'))); year=unique(year(datetime(data.(16),'InputFormat','yyyy/MM/dd'))); but I want to group data by the City... but it gives the error like the last time Error using getVarIndices (line 47) Unrecognized variable name 'اسكو'.
Error in table/varfun (line 103) groupVars = getVarIndices(a,groupVars);
Error in index (line 44) Mean = varfun(func,data,'GroupingVariables',City)
The grouping variable must be a string. If your variable is called "City" then the syntax is as follows:
varfun(func,data,'GroupingVariables','City')
I suspect that this will return a different error, unless the table data includes only numerical columns, for reasons explained in my previous comment.
If you already have the dates in datetime format, then the last steps are quite straight-forward. Let's say you have a column with your dates called dates, you will want to:
  • Add two columns to your table with month number and year number
T.Yrs = year(T.dates)
T.Mnth = month(T.dates)
  • Add these to the grouping variables
varfun(func,data,'GroupingVariables',{'City','Yrs','Mnth');
  • Add the variable names you want to evaluate, as I described above.
I'm going to sleep, but will check in tomorrow. Also, you may want to try using english variable names, in the event that there is some problem with persian variable names (I have never tested).
thank a lot for your answer and time

请先登录,再进行评论。

更多回答(0 个)

类别

帮助中心File Exchange 中查找有关 Logical 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by