Main Content

innerjoin

Inner join between two tables or timetables

Description

T = innerjoin(Tleft,Tright) creates the table or timetable, T, as the inner join of Tleft and Tright using key variables. An inner join combines rows where the key variables have matching values. For example, if Tleft has variables named Key1 and Var1, and Tright has variables Key1 and Var2, then T=innerjoin(Tleft,Tright) uses Key1 as a key variable.

The output T has variables Key1, Var1, and Var2, and combines only the rows where Key1 has matching values in Tleft and Tright.

By default, the key variables are:

  • Variables that have the same names in Tleft and Tright, if both inputs are tables, or if Tleft is a timetable and Tright is a table.

  • Vectors of row times, if both Tleft and Tright are timetables.

The matching values of the key variables in the left and right tables do not have to be in the same order. Inner joins can perform one-to-many and many-to-one matches between the key variables of the two tables. That is, a value that occurs once in a key variable of the left table can have multiple matches in the right table. Similarly, a value that occurs once in a key variable of the right table can have multiple matches in the left table.

You can perform inner joins only on certain combinations of tables and timetables.

  • If Tleft is a table, then Tright must be a table. innerjoin returns T as a table.

  • If Tleft is a timetable, then Tright can be either a table or a timetable. innerjoin returns T as a timetable for either combination of inputs.

example

T = innerjoin(Tleft,Tright,Name,Value) performs the inner-join operation with additional options specified by one or more Name,Value pair arguments.

For example, you can specify the variables to use as key variables.

[T,ileft,iright] = innerjoin(___) also returns index vectors, ileft and iright indicating the correspondence between rows in T and rows in Tleft and Tright respectively. You can use this syntax with any of the input arguments in the previous syntaxes.

example

Examples

collapse all

Create a table, Tleft.

Tleft = table([5;12;23;2;6],...
        {'cereal';'pizza';'salmon';'cookies';'pizza'},...
        'VariableNames',{'Age','FavoriteFood'})
Tleft=5×2 table
    Age    FavoriteFood
    ___    ____________

     5     {'cereal' } 
    12     {'pizza'  } 
    23     {'salmon' } 
     2     {'cookies'} 
     6     {'pizza'  } 

Create a table, Tright, with one variable in common with Tleft.

Tright = table({'cereal';'cookies';'pizza';'salmon';'cake'},...
         [110;160;140;367;243],...
         {'A-';'D';'B';'B';'C-'},...
         'VariableNames',{'FavoriteFood','Calories','NutritionGrade'})
Tright=5×3 table
    FavoriteFood    Calories    NutritionGrade
    ____________    ________    ______________

    {'cereal' }       110           {'A-'}    
    {'cookies'}       160           {'D' }    
    {'pizza'  }       140           {'B' }    
    {'salmon' }       367           {'B' }    
    {'cake'   }       243           {'C-'}    

Use the innerjoin function to create a new table, T, with data from tables Tleft and Tright.

T = innerjoin(Tleft,Tright)
T=5×4 table
    Age    FavoriteFood    Calories    NutritionGrade
    ___    ____________    ________    ______________

     5     {'cereal' }       110           {'A-'}    
     2     {'cookies'}       160           {'D' }    
    12     {'pizza'  }       140           {'B' }    
     6     {'pizza'  }       140           {'B' }    
    23     {'salmon' }       367           {'B' }    

Table T is sorted by the key variable, FavoriteFood.

Create a table, Tleft.

Tleft = table({'a' 'b' 'c' 'e' 'h'}',[1 2 3 11 17]',...
        'VariableNames',{'Key1' 'Var1'})
Tleft=5×2 table
    Key1     Var1
    _____    ____

    {'a'}      1 
    {'b'}      2 
    {'c'}      3 
    {'e'}     11 
    {'h'}     17 

Create a table, Tright, with common values in the variable Key1 between tables Tleft and Tright, but also containing rows with values of Key1 not present in Tleft.

Tright = table({'a' 'b' 'd' 'e'}',[4 5 6 7]',...
         'VariableNames',{'Key1' 'Var2'})
Tright=4×2 table
    Key1     Var2
    _____    ____

    {'a'}     4  
    {'b'}     5  
    {'d'}     6  
    {'e'}     7  

Use the innerjoin function to create a new table, T, with data from tables Tleft and Tright. Retain only rows whose values in the variable Key1 match.

Also, return index vectors, ileft and iright indicating the correspondence between rows in T and rows in Tleft and Tright respectively.

[T,ileft,iright] = innerjoin(Tleft,Tright)
T=3×3 table
    Key1     Var1    Var2
    _____    ____    ____

    {'a'}      1      4  
    {'b'}      2      5  
    {'e'}     11      7  

ileft = 3×1

     1
     2
     4

iright = 3×1

     1
     2
     4

Table T is sorted by the values in the key variable, Key1, and contains the horizontal concatenation of Tleft(ileft,:) and Tright(iright,'Var2').

Create a table, Tleft.

Tleft = table([10;4;2;3;7],[5;4;9;6;1],[10;3;8;8;4])
Tleft=5×3 table
    Var1    Var2    Var3
    ____    ____    ____

     10      5       10 
      4      4        3 
      2      9        8 
      3      6        8 
      7      1        4 

Create a table, Tright, with common values in the second variable as the first variable of table Tleft.

Tright = table([6;1;1;6;8],[2;3;4;5;6])
Tright=5×2 table
    Var1    Var2
    ____    ____

     6       2  
     1       3  
     1       4  
     6       5  
     8       6  

Use the innerjoin function to create a new table, T, with data from tables Tleft and Tright. Use the first variable of Tleft and the second variable of Tright as key variables.

[T,ileft,iright] = innerjoin(Tleft,Tright,'LeftKeys',1,'RightKeys',2)
T=3×4 table
    Var1_Tleft    Var2    Var3    Var1_Tright
    __________    ____    ____    ___________

        2          9       8           6     
        3          6       8           1     
        4          4       3           1     

ileft = 3×1

     3
     4
     2

iright = 3×1

     1
     2
     3

Table T retains only the rows that match between Tleft and Tright with respect to the key variables.

Table T contains the horizontal concatenation of Tleft(ileft,:) and Tright(iright,'Var1').

Create two timetables, Tleft and Tright. They have some row times in common, but each also includes row times that are not in the other timetable.

Tleft = timetable(seconds([1;2;4;6]),[1 2 3 11]')
Tleft=4×1 timetable
    Time     Var1
    _____    ____

    1 sec      1 
    2 sec      2 
    4 sec      3 
    6 sec     11 

Tright = timetable(seconds([2;4;6;7]),[4 5 6 7]')
Tright=4×1 timetable
    Time     Var1
    _____    ____

    2 sec     4  
    4 sec     5  
    6 sec     6  
    7 sec     7  

Combine Tleft and Tright with an inner join. T matches up the rows with common row times. T does not contain any other rows from either timetable.

T = innerjoin(Tleft,Tright)
T=3×2 timetable
    Time     Var1_Tleft    Var1_Tright
    _____    __________    ___________

    2 sec         2             4     
    4 sec         3             5     
    6 sec        11             6     

Input Arguments

collapse all

Left table, specified as a table or a timetable.

Right table, specified as a table or a timetable.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'Keys',2 uses the second variable in Tleft and the second variable in Tright as key variables.

Variables to use as keys, specified as the comma-separated pair consisting of 'Keys' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.

You cannot use the 'Keys' name-value pair argument with the 'LeftKeys' and 'RightKeys' name-value pair arguments.

A vector of row labels can be a key, alone or in combination with other key variables. For more information, see the Algorithms section.

Example: 'Keys',[1 3] uses the first and third variables in Tleft and Tright as a key variables.

Example: 'Keys',{'X','Y'} uses the variables named X and Y in Tleft and Tright as key variables.

Example: 'Keys','Row' uses the vectors of row names of Tleft and Tright as key variables, if both Tleft and Tright are tables with row names.

Variables to use as keys in Tleft, specified as the comma-separated pair consisting of 'LeftKeys' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.

You must use the 'LeftKeys' name-value pair argument in conjunction with the 'RightKeys' name-value pair argument. 'LeftKeys' and 'RightKeys' both must specify the same number of key variables. innerjoin pairs key values based on their order.

A vector of row labels can be a key, alone or in combination with other key variables. For more information, see the Algorithms section.

Example: 'LeftKeys',1 uses only the first variable in Tleft as a key variable.

Variables to use as keys in Tright, specified as the comma-separated pair consisting of 'RightKeys' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.

You must use the 'RightKeys' name-value pair argument in conjunction with the 'LeftKeys' name-value pair argument. 'LeftKeys' and 'RightKeys' both must specify the same number of key variables. innerjoin pairs key values based on their order.

A vector of row labels can be a key, alone or in combination with other key variables. For more information, see the Algorithms section.

Example: 'RightKeys',3 uses only the third variable in Tright as a key variable.

Variables from Tleft to include in T, specified as the comma-separated pair consisting of 'LeftVariables' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.

You can use 'LeftVariables' to include or exclude key variables, as well as nonkey variables from the output, T.

By default, innerjoin includes all variables from Tleft.

Variables from Tright to include in T, specified as the comma-separated pair consisting of 'RightVariables' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.

You can use 'RightVariables' to include or exclude key variables, as well as nonkey variables from the output, T.

By default, innerjoin includes all the variables from Tright except the key variables.

Output Arguments

collapse all

Inner join from Tleft and Tright, returned as a table or a timetable. The output table or timetable, T, contains one row for each pair of rows in Tleft and Tright that share the same combination of values in the key variables. If Tleft and Tright contain variables with the same name, innerjoin adds a unique suffix to the corresponding variable names in T.

In general, if there are m rows in Tleft and n rows in Tright that all contain the same combination of values in the key variables, then T contains m*n rows for that combination.

T is sorted by the values in the key variables and contains the horizontal concatenation of Tleft(ileft,LeftVars) and Tright(iright,RightVars). By default, LeftVars consists of all the variables of Tleft, and RightVars consists of all the nonkey variables from Tright. Otherwise, LefttVars consists of the variables specified by the 'LeftVariables' name-value pair argument, and RightVars is the variables specified by the 'RightVariables' name-value pair argument.

You can store additional metadata such as descriptions, variable units, variable names, and row names in T. For more information, see the Properties sections of table or timetable.

Index to Tleft, returned as a column vector. Each element of ileft identifies the row in Tleft that corresponds to that row in the output table or timetable, T.

Index to Tright, returned as a column vector. Each element of iright identifies the row in Tright that corresponds to that row in the output table or timetable, T.

More About

collapse all

Key Variable

Variable used to match and combine data between the input tables, Tleft and Tright.

Algorithms

  • The vector of row labels from an input table or timetable can be used as a key variable, alone or in combination with other key variables. Row labels are the row names of a table or the row times of a timetable. To use this vector as a key, specify it as 'Row' (for the row names of a table), as the name of a timetable vector of row times, or as the value of T.Properties.DimensionNames{1}, where T is the table or timetable.

    In general, innerjoin copies row labels from the input table Tleft to the output table T.

    • If Tleft has no row labels, then T has no row labels.

    • If Tleft has row labels, then innerjoin copies row labels from Tleft to create row labels in T.

      • However, if both Tleft and Tright are tables, but you do not specify either input’s row names as a key, then innerjoin does not create row names in T.

      • If both Tleft and Tright are timetables, but you do not specify either input’s row times as a key, then innerjoin copies row times from Tleft to T.

    You cannot perform an inner join using the row labels of Tleft as the left key and a variable of Tright as the right key. To perform the inner join, convert the row labels of Tleft to a table variable and use the new table variable as a key.

Extended Capabilities

Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.

Version History

Introduced in R2013b

See Also

Functions

Live Editor Tasks