Label new data using semi-supervised graph-based classifier

To fit labels to unlabeled training data, `fitsemigraph` constructs a similarity graph with both labeled and unlabeled observations as nodes, and distributes the label information from labeled observations to unlabeled observations by using either label propagation or label spreading. The resulting `SemiSupervisedGraphModel` object stores the fitted labels and label scores for the unlabeled data in its `FittedLabels` and `LabelScores` properties, respectively.
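The graph-based fitting step described above can be illustrated with a small, language-neutral sketch. The Python below implements textbook iterative label propagation; it is not the `fitsemigraph` implementation, and the Gaussian similarity, the fully connected graph, the fixed iteration count, and the names `gaussian_similarity` and `propagate_labels` are all assumptions made for illustration.

```python
import math


def gaussian_similarity(a, b, sigma=1.0):
    """Gaussian kernel similarity between two points (an assumed choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def propagate_labels(points, labels, num_classes, iterations=100, sigma=1.0):
    """Textbook iterative label propagation on a fully connected graph.

    labels[i] is a class index for labeled points and None for unlabeled ones.
    Returns one row of class scores per point.
    """
    n = len(points)
    # Similarity graph: every observation is a node, edges carry similarities.
    weights = [
        [gaussian_similarity(points[i], points[j], sigma) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    # Initial scores: one-hot rows for labeled nodes, zeros for unlabeled nodes.
    scores = [
        [1.0 if labels[i] == k else 0.0 for k in range(num_classes)]
        for i in range(n)
    ]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            if labels[i] is not None:
                # Labeled nodes keep their original scores (clamping).
                new_scores.append(scores[i])
                continue
            total = sum(weights[i])
            # Unlabeled nodes take the similarity-weighted average of neighbors.
            new_scores.append([
                sum(weights[i][j] * scores[j][k] for j in range(n)) / total
                for k in range(num_classes)
            ])
        scores = new_scores
    return scores


# Two well-separated clusters, one labeled point in each.
points = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9), (0.1, 0.2), (2.9, 3.2)]
labels = [0, None, 1, None, None, None]
scores = propagate_labels(points, labels, num_classes=2)
fitted = [max(range(2), key=lambda k: row[k]) for row in scores]
print(fitted)
```

In this sketch, the rows of `scores` for the unlabeled points play a role analogous to `LabelScores`, and `fitted` is analogous to `FittedLabels`.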

To predict the label of a new observation *x*, the `predict` function uses a weighted average of neighboring observation scores to compute the label scores for *x*, namely $$F_{x}=\frac{\sum_{j=1}^{n}S(x,x_{j})F_{x_{j}}}{\sum_{j=1}^{n}S(x,x_{j})}$$.

- $$n$$ is the number of observations in the training data.
- $$F_{x_{j}}$$ is the row vector of label scores for the training observation $$x_{j}$$ (or node $$j$$). For more information on the computation of label scores for training observations, see Algorithms.
- $$S(x,x_{j})$$ is the pairwise similarity between the new observation $$x$$ and the training observation $$x_{j}$$, where $$S(x_{i},x_{j})=S_{i,j}$$ is as defined in Similarity Graph.

The column with the maximum score in $$F_{x}$$ corresponds to the predicted class label for *x*.
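The weighted-average formula for the label scores of a new observation can be checked numerically with a short sketch. The Python below is an illustration of the formula only, not the `predict` implementation; the Gaussian similarity, the toy training scores, and the names `predict_scores` and `predict_label` are assumptions.

```python
import math


def gaussian_similarity(a, b, sigma=1.0):
    """Gaussian kernel similarity S(a, b) (an assumed choice of S)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def predict_scores(x, train_points, train_scores, similarity=gaussian_similarity):
    """Compute F_x = sum_j S(x, x_j) F_{x_j} / sum_j S(x, x_j)."""
    sims = [similarity(x, xj) for xj in train_points]
    total = sum(sims)
    num_classes = len(train_scores[0])
    return [
        sum(s * row[k] for s, row in zip(sims, train_scores)) / total
        for k in range(num_classes)
    ]


def predict_label(x, train_points, train_scores, class_names):
    """Return the class whose column has the maximum score, plus the scores."""
    scores = predict_scores(x, train_points, train_scores)
    best = max(range(len(scores)), key=lambda k: scores[k])
    return class_names[best], scores


# Toy training data: each row of train_scores is F_{x_j} for one observation.
train_points = [(0.0, 0.0), (1.0, 0.0), (4.0, 4.0)]
train_scores = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]

label, scores = predict_label((0.5, 0.0), train_points, train_scores, ["a", "b"])
print(label, scores)
```

Because the new point sits midway between the first two training observations and far from the third, its scores are close to the average of the first two rows, and the first class is predicted.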

[1] Delalleau, Olivier, Yoshua Bengio,
and Nicolas Le Roux. “Efficient Non-Parametric Function Induction in Semi-Supervised
Learning.” *Proceedings of the Tenth International Workshop on Artificial
Intelligence and Statistics*. 2005.