distributionScores

Distribution confidence scores

Since R2023a

    Description

    scores = distributionScores(discriminator,X) returns the distribution confidence score for each observation in X using the method you specify in the Method property of discriminator. You can use the scores to separate data into in-distribution (ID) and out-of-distribution (OOD) data sets. For example, you can classify any observation with distribution confidence score less than or equal to the Threshold property of discriminator as OOD. For more information about how the software computes the distribution confidence scores, see Distribution Confidence Scores.

    scores = distributionScores(discriminator,X1,...,XN) returns the distribution confidence scores for networks with multiple inputs using the specified in-memory data.

    scores = distributionScores(___,VerbosityLevel=level) also specifies the verbosity level.
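
    For example, this minimal sketch shows the thresholding step described above. It assumes you have already created a discriminator with the networkDistributionDiscriminator function and have input data in a formatted dlarray object X; both names are placeholders for your own objects.

    scores = distributionScores(discriminator,X);

    % Observations with scores at or below the discriminator threshold count as OOD.
    isOOD = scores <= discriminator.Threshold;
    numOOD = sum(isOOD(:));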

    Examples

    Load a pretrained classification network.

    load("digitsClassificationMLPNetwork.mat");

    Load ID data. Convert the data to a dlarray object.

    XID = digitTrain4DArrayData;
    XID = dlarray(XID,"SSCB");

    Modify the ID training data to create an OOD set.

    XOOD = XID.*0.3 + 0.1;

    Create a discriminator using the networkDistributionDiscriminator function.

    method = "baseline";
    discriminator = networkDistributionDiscriminator(net,XID,XOOD,method)
    discriminator = 
      BaselineDistributionDiscriminator with properties:
    
           Method: "baseline"
          Network: [1×1 dlnetwork]
        Threshold: 0.9743
    
    

    The discriminator object contains a threshold for separating the ID and OOD confidence scores.

    Use the distributionScores function to find the distribution scores for the ID and OOD data. You can use the distribution confidence scores to separate the data into ID and OOD. The algorithm the software uses to compute the scores is set when you create the discriminator. In this example, the software computes the scores using the baseline method.

    scoresID = distributionScores(discriminator,XID);
    scoresOOD = distributionScores(discriminator,XOOD);

    Plot the distribution confidence scores for the ID and OOD data. Add the threshold separating the ID and OOD confidence scores.

    figure
    histogram(scoresID,BinWidth=0.02)
    hold on
    histogram(scoresOOD,BinWidth=0.02)
    xline(discriminator.Threshold)
    legend(["In-distribution scores","Out-of-distribution scores","Threshold"],Location="northwest")
    xlabel("Distribution Confidence Scores")
    ylabel("Frequency")
    hold off

    The figure shows histograms of the in-distribution and out-of-distribution confidence scores, with a vertical line marking the threshold.

    Load a pretrained classification network.

    load("digitsClassificationMLPNetwork.mat");

    Load ID data and convert the data to a dlarray object.

    XID = digitTrain4DArrayData;
    XID = dlarray(XID,"SSCB");

    Modify the ID training data to create an OOD set.

    XOOD = XID.*0.3 + 0.1;

    Create a discriminator.

    method = "baseline";
    discriminator = networkDistributionDiscriminator(net,XID,XOOD,method);

    Use the distributionScores function to find the distribution scores for the ID and OOD data. You can use the distribution scores to separate the data into ID and OOD.

    scoresID = distributionScores(discriminator,XID);
    scoresOOD = distributionScores(discriminator,XOOD);

    Use rocmetrics to plot a ROC curve to show how well the model performs at separating the data into ID and OOD.

    labels = [
        repelem("In-distribution",numel(scoresID)), ...
        repelem("Out-of-distribution",numel(scoresOOD))];
    scores = [scoresID',scoresOOD'];
    
    rocObj = rocmetrics(labels,scores,"In-distribution");
    figure
    plot(rocObj)

    The figure shows the ROC curve for the in-distribution class (AUC = 0.9876) and the model operating point.

    Input Arguments

    Input data X, specified as a formatted dlarray object or a minibatchqueue object that returns a formatted dlarray object. For more information about dlarray formats, see the fmt input argument of dlarray.

    Use a minibatchqueue object for a network with multiple inputs when the data does not fit in memory. If your data fits in memory and does not require additional processing, then it is usually easiest to specify the input data as in-memory arrays. For more information, see X1,...,XN.
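
    For example, this minimal sketch scores image data through a minibatchqueue object. It assumes XData is a numeric array of images with observations along the fourth dimension and that discriminator already exists; the variable names and mini-batch size are placeholders.

    dsX = arrayDatastore(XData,IterationDimension=4);   % one observation per read
    mbq = minibatchqueue(dsX, ...
        MiniBatchSize=128, ...
        MiniBatchFormat="SSCB");                        % return formatted dlarray mini-batches
    scores = distributionScores(discriminator,mbq);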

    In-memory data for a multi-input network, specified as formatted dlarray objects X1,...,XN. The input Xi corresponds to the network input discriminator.Network.InputNames(i).

    For multi-input networks, if your data fits in memory and does not require additional processing, then it is usually easiest to specify the input data as in-memory arrays. If you want to compute scores for data stored on disk, then specify X as a minibatchqueue object.
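
    For example, for a hypothetical network with two inputs, pass one formatted dlarray object per input, ordered to match discriminator.Network.InputNames. The data arrays and formats here are placeholders.

    X1 = dlarray(X1Data,"SSCB");   % data for the first network input
    X2 = dlarray(X2Data,"CB");     % data for the second network input
    scores = distributionScores(discriminator,X1,X2);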

    Verbosity level of the Command Window output, specified as one of these values:

    • "off" — Do not display progress information.

    • "summary" — Display a summary of the progress information.

    • "detailed" — Display detailed information about the progress. This option prints the mini-batch progress. If you do not specify the input data as a minibatchqueue object, then the "detailed" and "summary" options print the same information.

    Data Types: char | string
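
    For example, to display a summary of the scoring progress for a minibatchqueue input (discriminator and mbq are placeholder names):

    scores = distributionScores(discriminator,mbq,VerbosityLevel="summary");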

    Version History

    Introduced in R2023a
