Main Content

Classify Project Files Into File Sets for Precise Control of Bug Finder Analysis

When analyzing C/C++ code with Polyspace® Bug Finder™, you can define file sets in your project that need specific treatment during analysis. For instance, you might want to skip the definitions of function bodies in third-party libraries or force analysis of all functions in files that you own. You can enumerate file sets with specific behaviors in a classification XML file and fine-tune the Bug Finder analysis using this classification file.

This example shows how to create a classification file and breaks down the structure of such a file. In your projects, you can apply similar reasoning to partition files into file sets and create appropriate classification files.

Classification File Structure Based on Analysis Requirements

Typically, the file sets in your classification file reflect an actual folder structure (or file name patterns).

This example follows the structure shown in the sample classification file in polyspaceroot\polyspace\verifier\cxx\classification-template.xml (here, polyspaceroot is the Polyspace installation folder, for instance, C:\Program Files\Polyspace\R2024b). The classification file assumes a folder structure with these top-level folders and these source and header files. The nature of these files imposes requirements on how they must be analyzed.

FolderDescriptionFilesAnalysis Requirements
myproject

Folder containing the primary sources and headers to analyze.

Some header files are machine-generated. The names of these files end with the string -generated.

  • myproject/src/foo.cpp

  • myproject/src/foo.hpp

  • myproject/src/main.cpp

  • myproject/inc/bar-generated.hpp

  • myproject/inc/bar.hpp

All files excluding machine-generated ones must be fully analyzed. The analysis results must be available for review.

acmelib

Folder containing headers from a third-party library.

acmelib/inc/acme.hpp

All files must be included in analysis, but results must be suppressed from review. (Even if issues are found in third-party libraries, they cannot be typically fixed.)

usr

Folder containing system headers.

usr/include/sys/os.h

All files must be excluded from analysis. This means that only function declarations must be used in the analysis but function bodies discarded.

These requirements lead to the following file sets in the classification XML file. Each file set is defined through a fileset element inside the classification file:

<?xml version="1.0" encoding="UTF-8"?>
<specification>
   <classification product="bug-finder">
       <fileset name="Application implementation and header files">
          <!--File set definitions-->
       </fileset>
       <fileset name="Generated code">
          <!--File set definitions-->
       </fileset>
       <fileset name="Third-party libraries">
          <!--File set definitions-->
       </fileset>
       <fileset name="Everything else">
          <!--File set definitions-->
       </fileset>
    </classification>
</specification>
Note that there can be multiple classification elements inside the file as long as their product attributes are different. The product attribute takes two values:

  • bug-finder: The file sets in this classification element applies to a full Polyspace Bug Finder analysis.

  • bug-finder-access: The file sets in this classification element applies to a single-file analysis with Polyspace as You Code.

Classification File Usage

You specify a classification XML file using the Polyspace analysis option -classification. If you are running an analysis in the Polyspace user interface (desktop products only), you have to enter this option in the Other field.

For instance, suppose that your classification file is named my-classification-template.xml and is located in the folder that contains the top-level source folders listed previously. You can specify the classification file in a Polyspace options file written as follows:

-sources myproject/src/foo.cpp
-sources myproject/src/main.cpp
-I myproject/inc
-I acmelib/inc
-I usr/include/sys
-classification my-classification-template.xml
This options file is assumed to be in the same folder as my-classification-template.xml. The options file specifies:

  • Which source files must be analyzed (using option -sources).

  • Which folders must be used for header file lookup (using option -I).

  • Which classification XML file must be used to partition source and header files into file sets (using option -classification).

If the options file is named psoptions.txt, you can provide this options file to a Polyspace Bug Finder analysis as:

polyspace-bug-finder -options-file psoptions.txt
(An options file – a text file with one analysis option per line – is used in this example for convenience. You can also specify the analysis options directly at the command line with the polyspace-bug-finder command.)

Parts of Classification File

The classification file contains several file sets, defined through fileset elements. Each file set defines:

  • A set of files to include in a files-in-set element.

    For more information on how to specify files, see Select Files for Polyspace Analysis Using Pattern Matching.

  • A set of files to exclude in a files-not-in-set element.

    For more information on how to specify files, see Select Files for Polyspace Analysis Using Pattern Matching.

  • A set of behaviors to associate in a behaviors element. The currently supported behaviors are defined through these elements:

    • analyze-functions and do-not-analyze-functions: Using this element, you specify patterns for functions that must and must not be analyzed. You specify the patterns in a function-pattern element. The patterns currently supported are:

      Function PatternMeaning
      *All functions
      ::top_level_namespace::All functions in top level namespace top_level_namespace
      $(all-relevant-functions)All non-static functions and all static functions called somewhere in the code, excluding some pathological cases.

    • show-results: Using this element, you specify whether you want to see results in the files that are included in the file set. If if you do not want to see results, you set the value attribute of this element to false.

      Note that the show-results element applies to a result only if all files involved in the result are part of the file set. If a result involves two files, one in the file set and another not in the file set, and you specify a value of false for this file set, you might still see the result.

File Set 1: Application Sources and Headers

The first file set in your classification file typically defines the primary set of sources and headers to analyze. In this example, the fileset element contains the following file set definitions:

<fileset name="Application implementation and header files">
      <files-in-set>
           <file-pattern>$(source-files)</file-pattern>
           <file-pattern>$(source-headers)</file-pattern>
           <file-pattern>myproject/inc/**/*.hpp</file-pattern>
      </files-in-set>
      <files-not-in-set>
           <file-pattern>myproject/**/*-generated.hpp</file-pattern>
      </files-not-in-set>
      <behaviors>
           <analyze-functions>
                <function-pattern>*</function-pattern>
           </analyze-functions>
           <do-not-analyze-functions>
                <function-pattern>::self_test::</function-pattern>
           </do-not-analyze-functions>
           <show-results value="true"/>
     </behaviors>
</fileset>

Files Included.  This file set includes the following files in separate file-pattern elements:

  • All source files specified using the option -sources. The inclusion is done using the pattern $(source-files).

  • Header files in the same folders as source files or in subfolders thereof. The inclusion is done using the pattern $(source-headers).

  • Header files in the folder myproject/inc or in subfolders thereof. The inclusion is done using the pattern myproject/inc/**/*.hpp. The ** indicates recursive file lookup in subfolders.

Files Excluded.  This file set excludes all .hpp files in the folder myproject or in subfolders thereof, with names ending in the string -generated. The exclusion is done using the pattern myproject/**/*-generated.hpp. The ** indicates recursive file lookup in subfolders.

Behaviors.  This file set has the following behaviors:

  • All functions in the included files except ones in the C++ namespace self_test are analyzed.

    A function is analyzed if it matches a pattern in the analyze-functions element and does not match any pattern in the do-not-analyze-functions element. The pattern * indicates all functions, even ones that Bug Finder would normally skip, for instance, uncalled static functions.

  • All results in the included files are shown.

File Set 2: Generated Files

The second file set in this example covers generated files. The fileset element contains the following file set definitions:

<fileset name="Generated code">
      <files-in-set>
           <file-pattern>myproject/**/*-generated.hpp</file-pattern>
       </files-in-set>
       <behaviors>
           <analyze-functions>
               <function-pattern>$(all-relevant-functions)</function-pattern>
           </analyze-functions>
           <show-results value="false"/>
       </behaviors>
</fileset>

Files Included.  This file set includes all .hpp files in the folder myproject or in subfolders thereof, with names ending in the string -generated. The inclusion is done using the pattern myproject/**/*-generated.hpp. The ** indicates recursive file lookup in subfolders.

Behaviors.  This file set has the following behaviors:

  • All functions in the included files are analyzed, except ones that Bug Finder does not analyze by default (static uncalled functions and some pathological functions, for instance, ones containing several thousands of instructions).

    The pattern $(all-relevant-functions) indicates this default subset of functions.

  • Results in the included files are not shown.

The behaviors follow from the fact that functions in generated code can contribute to issues in handwritten code and must be included in the analysis. However, if the issues occur in the generated code itself, they are typically difficult to fix.

File Set 3: Third-Party Libraries

The third file set in this example covers third-party libraries. The fileset element contains the following file set definitions:

<fileset name="Third-party libraries">
      <files-in-set>
           <file-pattern>acmelib/**</file-pattern>
       </files-in-set>
       <behaviors>
           <analyze-functions>
               <function-pattern>$(all-relevant-functions)</function-pattern>
           </analyze-functions>
           <show-results value="false"/>
       </behaviors>
</fileset>

Files Included.  This file set includes all files in the folder acmelib or in subfolders thereof. The inclusion is done using the pattern acmelib/**/. The ** indicates recursive file lookup in subfolders.

Behaviors.  This file set has the following behaviors:

  • All functions in the included files are analyzed, except ones that Bug Finder does not analyze by default (static uncalled functions and some pathological functions, for instance, ones containing several thousands of instructions).

  • Results in the included files are not shown.

The behaviors follow from the fact that functions in third-party libraries can contribute to issues in the application code and must be included in the analysis. However, if the issues occur in the third-party libraries themselves, they are typically difficult to fix. This file set has the same behaviors as the generated code file set. However, a separate file set has been created to explicitly separate file patterns in generated code from ones in third-party libraries. This separation can make later maintenance easier.

File Set 4: Everything Else (Typically System Headers)

The fourth file set in this example is a catch-all subset that covers files not covered by included patterns in previous file sets. The fileset element contains the following file set definitions:

<fileset name="Everything else">
      <files-in-set>
           <file-pattern>*</file-pattern>
       </files-in-set>
       <behaviors>
           <do-not-analyze-functions>
               <function-pattern>*</function-pattern>
           </do-not-analyze-functions>
           <show-results value="false"/>
       </behaviors>
</fileset>

Files Included.  This file set includes all files not included by any of the previous file sets. (Note that the analysis tries to match file sets in order, so if a file matched a pattern in a previous file set, it will not fall through to this file set.)

This file set typically includes system headers. In this example, the system headers are in the folder usr/include/sys and are matched by the catch-all file pattern *.

Behaviors.  This file set has the following behaviors:

  • The bodies of functions in the included files are not analyzed.

  • Results in the included files are not shown.

The behaviors follow from the fact that system libraries typically contain many inline functions that could slow down the analysis. The behaviors make the analysis only consider function declarations but skip the function bodies.

See Also

Related Topics