
findAdversarialExamples

Find adversarial examples for MATLAB, ONNX, and PyTorch classification networks

Since R2026a

    Description

    Add-On Required: This feature requires the AI Verification Library for Deep Learning Toolbox add-on.

    dlnetwork adversarial examples

    [example,mislabel] = findAdversarialExamples(net,XLower,XUpper,label) creates untargeted adversarial examples, example, for the network net within the bounds XLower and XUpper. Specify the expected correct label using the label argument. The function also returns the actual predicted label, mislabel.


    [example,mislabel,iX,iE] = findAdversarialExamples(net,XLower,XUpper,label) also returns index vectors iX and iE. You can find adversarial examples for several sets of input bounds and labels at once. However, the findAdversarialExamples function does not always find an adversarial example. If the generated example is not misclassified as expected, then the function does not return it. Therefore, the batch dimension of example can be smaller than the batch dimensions of XLower, XUpper, and label. To find out which example corresponds to which set of inputs, use the index vectors iX and iE to index into the input and example batches, respectively.
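
    As a minimal sketch of how the index vectors recover this correspondence (assuming net, XLower, XUpper, and label are defined as in the examples below):

    [example,mislabel,iX,iE] = findAdversarialExamples(net,XLower,XUpper,label);

    % The n-th returned example corresponds to the iX(n)-th input.
    n = 1;
    matchingLowerBound = XLower(:,:,:,iX(n));
    expectedLabel = label(iX(n));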

    ___ = findAdversarialExamples(___,AdversarialLabel=adversarialLabel) creates targeted adversarial examples that the network incorrectly classifies as adversarialLabel instead of label.


    ___ = findAdversarialExamples(___,Name=Value) specifies additional options using one or more name-value arguments.

    ONNX and PyTorch network adversarial examples

    This syntax requires the Deep Learning Toolbox Interface for alpha-beta-CROWN Verifier add-on.

    [example,mislabel] = findAdversarialExamples(modelfile,XLower,XUpper,label,numClasses) creates untargeted adversarial examples, example, between XLower and XUpper from the pretrained ONNX™ or PyTorch® network in modelfile. Specify the expected correct label using the label argument and the number of classes in the network with the numClasses argument. The function also returns the actual predicted label, mislabel.


    [example,mislabel,iX,iE] = findAdversarialExamples(modelfile,XLower,XUpper,label,numClasses) also returns index vectors iX and iE. You can find adversarial examples for several sets of input bounds and labels at once. However, the findAdversarialExamples function does not always find an adversarial example. If the created example is not misclassified as expected, then the function does not return it. Therefore, the batch dimension of example can be smaller than the batch dimensions of XLower, XUpper, and label. To find out which example corresponds to which set of inputs, use the index vectors iX and iE to index into the input and example batches, respectively.

    ___ = findAdversarialExamples(___,Name=Value) specifies additional options using one or more name-value arguments.

    Examples


    Load a pretrained network. This network has been trained to classify images of digits.

    rng(1)
    load("digitsClassificationConvolutionNet.mat","net")
    classNames = categorical(0:9);

    Load the test dataset, then randomly select a subset of samples to use for generating adversarial examples.

    [XTest,TTest] = digitTest4DArrayData; 
    
    numInputs = 10;
    testIdx = randi(numel(TTest),numInputs,1);
    imgs = XTest(:,:,:,testIdx);
    labels = TTest(testIdx,:);

    Prepare the data by converting it to a dlarray object.

    X = dlarray(single(imgs),"SSCB");

    Find the labels predicted by the network.

    scores = predict(net,X);
    YTest = scores2label(scores,classNames);

    In this example, the values of the pixels are between 0 and 1, so specify a maximum perturbation size of 0.1. Clip the lower and upper bounds so that they remain within the range of the input data.

    perturbationSize = 0.1;
    
    XLower = max(X-perturbationSize,0);
    XUpper = min(X+perturbationSize,1);

    Use the findAdversarialExamples function to find adversarial examples.

    [examples,mislabels,iX] = findAdversarialExamples(net,XLower,XUpper,labels);

    For the first adversarial example, view the original image and the adversarial example side-by-side. The adversarial example is misclassified even though the adversarial image appears very similar to the original image.

    adversarialExampleIndex = 1;
    inputIndex = iX(adversarialExampleIndex);
    
    figure
    tiledlayout(1,2); 
    
    nexttile(1);
    imshow(imgs(:,:,:,inputIndex));
    title({"Original Image (Class: " + string(labels(inputIndex)) + ")", ...
        "Predicted Class: " + string(YTest(inputIndex))});
    
    nexttile(2) 
    imshow(extractdata(examples(:,:,:,adversarialExampleIndex))); 
    title({"Adversarial Example (Class: " + string(labels(inputIndex)) + ")", ...
        "Predicted Class: " + string(mislabels(adversarialExampleIndex))});

    Figure: the original image (class 4, predicted class 4) and the adversarial example (class 4, predicted class 8), shown side by side.

    Load a pretrained network. This network has been trained to classify waveforms into one of four classes: sawtooth, sine, square, or triangle.

    rng("default")
    load("trainedWaveformClassificationNetwork.mat","net")

    Load a test input.

    load("WaveformData");
    classNames = unique(labels)
    classNames = 4×1 categorical
         Sawtooth 
         Sine 
         Square 
         Triangle 
    
    
    numChannels = size(data{1},2);
    testIdx = 1;
    input = data{testIdx};
    label = labels(testIdx)
    label = categorical
         Sine 
    
    

    Prepare the input by converting it to a dlarray object.

    X = dlarray(single(input),"TC");

    Find the class predicted by the network.

    score = predict(net,X);
    YTest = scores2label(score,classNames)
    YTest = categorical
         Sine 
    
    

    Find adversarial examples. As this data has values in the range [-1,1], specify a maximum perturbation size of 0.3.

    perturbationSize = 0.3;
    
    XLower = max(X-perturbationSize,-1);
    XUpper = min(X+perturbationSize,1);

    To specify additional options, create an adversarialOptions object. Set the step size to 0.1 and the number of iterations to 50.

    options = adversarialOptions("bim",StepSize=0.1,NumIterations=50)
    options = 
      AdversarialOptionsBIM with properties:
    
                    StepSize: 0.1000
               NumIterations: 50
               MiniBatchSize: 128
        ExecutionEnvironment: 'auto'
                     Verbose: 0
    
    

    Use the findAdversarialExamples function to find an adversarial example for the test input. If no example is found, the function returns []. Find an adversarial example that misclassifies the input as "Sawtooth".

    [example,mislabel] = findAdversarialExamples(net,XLower,XUpper,label, ...
        Algorithm=options,AdversarialLabel=categorical("Sawtooth",string(classNames)));

    View the original input and the adversarial example side-by-side.

    figure
    tiledlayout(1,2);
    
    nexttile(1);
    stackedplot(input,DisplayLabels="Channel "+string(1:numChannels))
    title({"Original Input (Class: " + string(label) + ")", ...
        "Predicted Class: " + string(YTest)});
    
    nexttile(2)
    stackedplot(extractdata(squeeze(example))',DisplayLabels="Channel "+string(1:numChannels));
    title({"Adversarial Example (Class: " + string(label) + ")", ...
        "Predicted Class: " + string(mislabel)});

    Figure: stacked plots of the original waveform and the adversarial example, shown side by side.

    Load a pretrained classification network. This network is a PyTorch® model that has been trained to predict the class label of images of handwritten digits.

    rng(1)
    modelfile = "digitsClassificationConvolutionNet.pt";
    numClasses = 10;

    Load the test dataset, then randomly select a subset of samples to use for generating adversarial examples.

    [XTest,TTest] = digitTest4DArrayData; 
    numInputs = 10;
    testIdx = randi(numel(TTest),numInputs,1);
    X = XTest(:,:,:,testIdx);
    labels = TTest(testIdx,:);

    In this example, the values of the pixels are between 0 and 1, so specify a maximum perturbation size of 0.1. Clip the lower and upper bounds so that they remain within the range of the input data.

    perturbationSize = 0.1;
    XLower = max(X-perturbationSize,0);
    XUpper = min(X+perturbationSize,1);

    Use the findAdversarialExamples function to generate adversarial examples.

    [examples,mislabels,iX] = findAdversarialExamples(modelfile,XLower,XUpper,labels,numClasses, ...
        Algorithm="bim", ...
        InputDataPermutation=[4 3 1 2]);

    For the first adversarial example, view the original image and the adversarial example side-by-side.

    adversarialExampleIndex = 1;
    inputIndex = iX(adversarialExampleIndex);
    
    figure
    tiledlayout(1,2);
    nexttile(1);
    imshow(X(:,:,:,inputIndex));
    title("Original Image");
    
    nexttile(2) 
    imshow(extractdata(examples(:,:,:,adversarialExampleIndex))); 
    title("Adversarial Example");

    Load a pretrained network. This network has been trained to classify natural RGB images.

    rng("default")
    [net,classNames] = imagePretrainedNetwork;
    inputSize = net.Layers(1).InputSize(1:2);

    Load a test image and resize it to the expected network input size. This is an image of a golden retriever.

    img = imread("sherlock.jpg");
    img = imresize(img,inputSize);
    X = dlarray(single(img),"SSCB");
    
    label = categorical("golden retriever",classNames);

    Find the label predicted by the network.

    score = predict(net,X);
    YTest = scores2label(score,classNames)
    YTest = categorical
         golden retriever 
    
    

    This image has values in the range [0, 255]. Generate lower and upper bounds with a maximum perturbation size of ±10. Ensure that the values do not go below 0 or above 255.

    perturbationSize = 10;
    
    XLower = max(X-perturbationSize,0);
    XUpper = min(X+perturbationSize,255);

    The default step size is suitable for inputs with values in the range [0,1]. Because this input has values up to 255, create an adversarial options object with a step size of 1 and the number of iterations set to 2.

    options = adversarialOptions("bim",StepSize=1,NumIterations=2);
    [example,mislabel] = findAdversarialExamples(net,XLower,XUpper,label,Algorithm=options);

    View the original image and the adversarial example side-by-side. The adversarial example is misclassified even though the adversarial image appears very similar to the original image.

    figure
    tiledlayout(1,2); 
    
    nexttile(1);
    imshow(img);
    title({"Original Image (Class: " + string(label) + ")", ...
        "Predicted Class: " + string(YTest)});
    nexttile(2) 
    imshow(uint8(extractdata(example))); 
    title({"Adversarial Example", "Predicted Class: " + string(mislabel)});

    Figure: the original image (class golden retriever, predicted class golden retriever) and the adversarial example (predicted class Italian greyhound), shown side by side.

    Input Arguments


    XLower — Lower bound of search space

    Lower bound of the search space for the adversarial examples, specified as a formatted dlarray object or a numeric array.

    • If you provide a dlnetwork object as input, XLower must be a formatted dlarray object. For more information about dlarray formats, see the fmt input argument of dlarray.

    • If you provide an ONNX or PyTorch modelfile as input, XLower must be a numeric array.

    The lower and upper bounds, XLower and XUpper, must have the same size and format.

    XUpper — Upper bound of search space

    Upper bound of the search space for the adversarial examples, specified as a formatted dlarray object or a numeric array.

    • If you provide a dlnetwork object as input, XUpper must be a formatted dlarray object. For more information about dlarray formats, see the fmt input argument of dlarray.

    • If you provide an ONNX or PyTorch modelfile as input, XUpper must be a numeric array.

    The lower and upper bounds, XLower and XUpper, must have the same size and format.

    label — Expected correct label

    Expected correct label for input data between XLower and XUpper, specified as a numeric vector of class indices or as a categorical array.

    The number of elements of label must be equal to the number of observations in XLower and XUpper.

    Example: categorical("cat",["cat","dog","bird"])

    dlnetwork only

    net — Neural network

    Neural network, specified as an initialized dlnetwork object.

    The findAdversarialExamples function does not support networks that have multiple inputs or multiple outputs.

    ONNX or PyTorch only

    Since R2026a

    modelfile — ONNX or PyTorch model file name

    ONNX or PyTorch model file name, specified as a character vector or a string scalar. The model file must be a full PyTorch model (saved using torch.save()) or an ONNX model with the .onnx extension.

    Note

    The Python® classification network must be saved without a softmax layer at the output.

    numClasses — Number of output classes

    Number of output classes, specified as a positive integer. This value is the number of output classes in the pretrained ONNX or PyTorch network.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    Name-Value Arguments


    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: findAdversarialExamples(net,XLower,XUpper,label,Algorithm="bim") finds untargeted adversarial examples using the basic iterative method (BIM).

    All Model Types


    Algorithm — Algorithm to find adversarial examples

    Algorithm to find adversarial examples, specified as a character vector or string scalar of a built-in algorithm name, or as a built-in algorithm object.

    • Built-in algorithm name. Specify the algorithm as a string scalar or character vector.

      • "bim" — Basic iterative method

      • "fgsm" — Fast gradient sign method

    • Built-in algorithm object. If you need more flexibility, you can use the built-in algorithm objects.

      • Create the following algorithm objects using the adversarialOptions function. Applicable only when using a dlnetwork object as input.

        • AdversarialOptionsBIM — Basic iterative method object

        • AdversarialOptionsFGSM — Fast gradient sign method object

      • NetworkVerificationOptions — α-β-CROWN network verification object. Applicable only when using an ONNX or PyTorch modelfile as input.

    dlnetwork only


    AdversarialLabel — Adversarial label

    Adversarial label, specified as a numeric vector of class indices or as a categorical array. Use this name-value argument to find targeted adversarial examples.

    ONNX or PyTorch only


    InputDataPermutation — Input dimension ordering

    Input dimension ordering, specified as a numeric row vector. The ordering is the permutation that maps the data in XLower and XUpper from MATLAB dimension ordering to Python dimension ordering. For more information, see Input Dimension Ordering.

    Example: [4 3 1 2]
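
    For intuition, the permutation [4 3 1 2] describes the reordering from a MATLAB image batch (height-by-width-by-channels-by-batch) to the PyTorch convention (batch-by-channels-by-height-by-width). This sketch shows the equivalent permute call for illustration only, not necessarily what the function does internally:

    % MATLAB image batch: H-by-W-by-C-by-N (dimensions 1 2 3 4)
    XBatch = rand(28,28,1,10,"single");
    % Reorder to the Python convention N-by-C-by-H-by-W
    XBatchPy = permute(XBatch,[4 3 1 2]);
    size(XBatchPy)   % 10 1 28 28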

    Number of dimensions in the input data, specified as a positive integer.

    Example: 4

    Hardware resource, specified as one of these values:

    • "auto" – Use a local GPU if one is available. Otherwise, use the local CPU.

    • "cpu" – Use the local CPU.

    • "gpu" – Use the local GPU.

    The "gpu" option requires Parallel Computing Toolbox™. To use a GPU for deep learning, you must also have a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox). If you specify "gpu" and Parallel Computing Toolbox or a suitable GPU is not available, then the software returns an error.

    For more information on when to use the different execution environments, see Scale Up Deep Learning in Parallel, on GPUs, and in the Cloud.

    Dependency

    If you specify Algorithm as an AlphaCROWNOptions or a NetworkVerificationOptions object, then the execution environment specified in the options object takes precedence.
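
    Because the options object takes precedence, one way to pin the hardware for a dlnetwork input is through the algorithm options object itself; its ExecutionEnvironment property appears in the options display shown in the examples above. A minimal sketch:

    % Run the basic iterative method on the CPU only
    opts = adversarialOptions("bim",ExecutionEnvironment="cpu");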

    Output Arguments


    example — Adversarial example

    Adversarial example, returned as a dlarray object or a numeric array.

    • If you provide a dlnetwork object as input, the example is a dlarray object.

    • If you provide an ONNX or PyTorch modelfile as input, the example is a numeric array.

    The function finds a candidate adversarial example according to the algorithm described in the Adversarial Examples section. If the generated example is not misclassified as expected, then the function does not return it.

    If the function is unable to find an adversarial example, then this does not mean that the network is robust to adversarial attacks. To prove network robustness, use the verifyNetworkRobustness function.
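
    A minimal sketch of that check, assuming the same net, XLower, XUpper, and label as in the examples above:

    % Attempt to formally verify robustness for each observation;
    % the result for each is "verified", "violated", or "unproven".
    result = verifyNetworkRobustness(net,XLower,XUpper,label);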

    mislabel — Predicted class of adversarial example

    Predicted class of the adversarial example, returned as a numeric vector of class indices or as a categorical array. The data type of mislabel matches the data type of the label input argument.

    iX — Input batch index

    Input batch index, returned as a vector of integers.

    You can generate adversarial examples for several batches of input bounds and labels at once. However, the findAdversarialExamples function does not always find an adversarial example. If the generated example is not misclassified as expected, then the function does not return it. Therefore, the example output batch can be smaller than the input batches.

    Use the input batch index vector iX to index into XLower, XUpper, and label. For example, XLower(:,:,:,iX(n)) generates example(:,:,:,n).

    iE — Example batch index

    Example batch index, returned as a vector of integers.

    You can generate adversarial examples for several batches of input bounds and labels at once. However, the findAdversarialExamples function does not always find an adversarial example. If the generated example is not misclassified as expected, then the function does not return it. Therefore, the example output batch can be smaller than the input batches.

    Use the example index vector iE to index into example. For example, example(:,:,:,iE(m)) is generated by XLower(:,:,:,m).

    More About


    Algorithms


    References

    [1] Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. “Explaining and Harnessing Adversarial Examples.” Preprint, submitted March 20, 2015. https://arxiv.org/abs/1412.6572.

    Version History

    Introduced in R2026a