
moondream

Create pretrained Moondream vision-language model (VLM)

Since R2026a

    Description

    Add-On Required: This feature requires the Computer Vision Toolbox Model for Moondream Vision Language Model add-on.

    The moondream object configures a pretrained Moondream™ vision-language model (VLM).

    Use the Moondream model to quickly understand image content by generating descriptive captions. Due to the lightweight design and speed of the model, you can use it for downstream low-latency tasks like alt-text generation, image-text retrieval, and basic scene description.

    To generate image captions using Moondream, use the captionImage object function.

    Creation

    Description

    mdModel = moondream loads a pretrained Moondream vision-language model with 2 billion parameters.


    Properties


    Name of the pretrained Moondream vision-language model, specified as a string scalar or a character vector.

    Object Functions

    captionImage    Caption images using the Moondream vision-language model (VLM)

    Examples


    Load the Moondream vision-language model.

    mdModel = moondream;

    Load the image to caption into the workspace and display it.

    I = imread("peppers.png");
    imshow(I)


    Caption the image using the captionImage object function.

    captions = captionImage(mdModel,I);

    Display the generated image caption.

    display(captions)
    captions = 
    " A purple tablecloth holds a vibrant array of red, green, yellow, and white peppers, onions, and garlic, arranged in a visually appealing composition."
    
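The caption returned in this example begins with a space. If you plan to use the text directly, for instance as alt text, you may want to trim it first. The following is a minimal sketch that uses the caption string shown above as a literal, so it runs without the add-on; the variable names rawCaption and altText are illustrative only.

```matlab
% Caption text copied from the example output above (note the leading space).
% In practice this value would come from captionImage(mdModel,I).
rawCaption = " A purple tablecloth holds a vibrant array of red, green, yellow, and white peppers, onions, and garlic, arranged in a visually appealing composition.";

% Remove leading and trailing whitespace before using the caption.
altText = strtrim(rawCaption);
disp(altText)
```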

    Tips

    • The quality of Moondream outputs can vary across different data domains. Validate its predictions using a data set from a domain similar to your intended application.
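One way to carry out this kind of validation is to caption a small set of representative images and review the results by hand. The sketch below assumes a folder named myDomainImages (a hypothetical path; substitute your own data) and assumes that captionImage returns a string scalar for a single image, as in the example above. It is a starting point for manual review, not an evaluation metric.

```matlab
% Hypothetical folder of images from your target domain.
imds = imageDatastore("myDomainImages");

% Load the pretrained model once and reuse it for every image.
mdModel = moondream;

numImages = numel(imds.Files);
captions = strings(numImages,1);

for k = 1:numImages
    I = readimage(imds,k);
    % Assumes one string caption is returned per image.
    captions(k) = captionImage(mdModel,I);
end

% Collect file names and captions side by side for manual inspection.
results = table(string(imds.Files),captions,VariableNames=["File","Caption"]);
disp(results)
```

Reviewing the table against the images helps show whether the captions are accurate enough for the intended application before you rely on them.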

    References

    [1] “Moondream.” Accessed September 2, 2025. https://moondream.ai/.

    Version History

    Introduced in R2026a