Classify Project Files Into File Sets for Precise Control of Bug Finder Analysis
When analyzing C/C++ code with Polyspace® Bug Finder™, you can define file sets in your project that need specific treatment during analysis. For instance, you might want to skip the definitions of function bodies in third-party libraries or force analysis of all functions in files that you own. You can enumerate file sets with specific behaviors in a classification XML file and fine-tune the Bug Finder analysis using this classification file.
This example shows how to create a classification file and breaks down the structure of such a file. In your projects, you can apply similar reasoning to partition files into file sets and create appropriate classification files.
Classification File Structure Based on Analysis Requirements
Typically, the file sets in your classification file reflect an actual folder structure (or file name patterns).
This example follows the structure shown in the sample classification file in
(here, polyspaceroot
\polyspace\verifier\cxx\classification-template.xml
is the Polyspace installation folder, for instance, polyspaceroot
C:\Program Files\Polyspace\R2024b
). The classification file assumes a folder structure with these top-level folders and these source and header files. The nature of these files imposes requirements on how they must be analyzed.
Folder | Description | Files | Analysis Requirements |
---|---|---|---|
myproject | Folder containing the primary sources and headers to analyze. Some header files are machine-generated. The names of these files end with the string |
| All files excluding machine-generated ones must be fully analyzed. The analysis results must be available for review. |
acmelib | Folder containing headers from a third-party library. |
| All files must be included in analysis, but results must be suppressed from review. (Even if issues are found in third-party libraries, they cannot be typically fixed.) |
usr | Folder containing system headers. | usr/include/sys/os.h | All files must be excluded from analysis. This means that only function declarations must be used in the analysis but function bodies discarded. |
These requirements lead to the following file sets in the classification XML file. Each file set is defined through a fileset
element inside the classification
file:
<?xml version="1.0" encoding="UTF-8"?> <specification> <classification product="bug-finder"> <fileset name="Application implementation and header files"> <!--File set definitions--> </fileset> <fileset name="Generated code"> <!--File set definitions--> </fileset> <fileset name="Third-party libraries"> <!--File set definitions--> </fileset> <fileset name="Everything else"> <!--File set definitions--> </fileset> </classification> </specification>
classification
elements inside the file as long as their product
attributes are different. The product
attribute takes two values:
bug-finder
: The file sets in thisclassification
element applies to a full Polyspace Bug Finder analysis.bug-finder-access
: The file sets in thisclassification
element applies to a single-file analysis with Polyspace as You Code.
Classification File Usage
You specify a classification XML file using the Polyspace analysis option -classification
. If you are running an analysis in the Polyspace user interface (desktop products only), you have to enter this option in the Other
field.
For instance, suppose that your classification file is named my-classification-template.xml
and is located in the folder that contains the top-level source folders listed previously. You can specify the classification file in a Polyspace options file written as follows:
-sources myproject/src/foo.cpp -sources myproject/src/main.cpp -I myproject/inc -I acmelib/inc -I usr/include/sys -classification my-classification-template.xml
my-classification-template.xml
. The options file specifies:
Which source files must be analyzed (using option
-sources
).Which folders must be used for header file lookup (using option
-I
).Which classification XML file must be used to partition source and header files into file sets (using option
-classification
).
If the options file is named psoptions.txt
, you can provide this options file to a Polyspace
Bug Finder analysis as:
polyspace-bug-finder -options-file psoptions.txt
polyspace-bug-finder
command.)Parts of Classification File
The classification file contains several file sets, defined through fileset
elements. Each file set defines:
A set of files to include in a
files-in-set
element.For more information on how to specify files, see Select Files for Polyspace Analysis Using Pattern Matching.
A set of files to exclude in a
files-not-in-set
element.For more information on how to specify files, see Select Files for Polyspace Analysis Using Pattern Matching.
A set of behaviors to associate in a
behaviors
element. The currently supported behaviors are defined through these elements:analyze-functions
anddo-not-analyze-functions
: Using this element, you specify patterns for functions that must and must not be analyzed. You specify the patterns in afunction-pattern
element. The patterns currently supported are:Function Pattern Meaning *
All functions ::
top_level_namespace
::All functions in top level namespace top_level_namespace
$(all-relevant-functions)
All non- static
functions and allstatic
functions called somewhere in the code, excluding some pathological cases.show-results
: Using this element, you specify whether you want to see results in the files that are included in the file set. If if you do not want to see results, you set thevalue
attribute of this element tofalse
.Note that the
show-results
element applies to a result only if all files involved in the result are part of the file set. If a result involves two files, one in the file set and another not in the file set, and you specify avalue
offalse
for this file set, you might still see the result.
File Set 1: Application Sources and Headers
The first file set in your classification file typically defines the primary set of sources and headers to analyze. In this example, the fileset
element contains the following file set
definitions:
<fileset name="Application implementation and header files"> <files-in-set> <file-pattern>$(source-files)</file-pattern> <file-pattern>$(source-headers)</file-pattern> <file-pattern>myproject/inc/**/*.hpp</file-pattern> </files-in-set> <files-not-in-set> <file-pattern>myproject/**/*-generated.hpp</file-pattern> </files-not-in-set> <behaviors> <analyze-functions> <function-pattern>*</function-pattern> </analyze-functions> <do-not-analyze-functions> <function-pattern>::self_test::</function-pattern> </do-not-analyze-functions> <show-results value="true"/> </behaviors> </fileset>
Files Included. This file set includes the following files in separate file-pattern
elements:
All source files specified using the option
-sources
. The inclusion is done using the pattern$(source-files)
.Header files in the same folders as source files or in subfolders thereof. The inclusion is done using the pattern
$(source-headers)
.Header files in the folder
myproject/inc
or in subfolders thereof. The inclusion is done using the patternmyproject/inc/**/*.hpp
. The**
indicates recursive file lookup in subfolders.
Files Excluded. This file set excludes all .hpp
files in the folder myproject
or in subfolders thereof, with names ending in the string -generated
. The exclusion is done using the pattern myproject/**/*-generated.hpp
. The **
indicates recursive file lookup in subfolders.
Behaviors. This file set has the following behaviors:
All functions in the included files except ones in the C++ namespace
self_test
are analyzed.A function is analyzed if it matches a pattern in the
analyze-functions
element and does not match any pattern in thedo-not-analyze-functions
element. The pattern*
indicates all functions, even ones that Bug Finder would normally skip, for instance, uncalled static functions.All results in the included files are shown.
File Set 2: Generated Files
The second file set in this example covers generated files. The fileset
element contains the following file set
definitions:
<fileset name="Generated code"> <files-in-set> <file-pattern>myproject/**/*-generated.hpp</file-pattern> </files-in-set> <behaviors> <analyze-functions> <function-pattern>$(all-relevant-functions)</function-pattern> </analyze-functions> <show-results value="false"/> </behaviors> </fileset>
Files Included. This file set includes all .hpp
files in the folder myproject
or in subfolders thereof, with names ending in the string -generated
. The inclusion is done using the pattern myproject/**/*-generated.hpp
. The **
indicates recursive file lookup in subfolders.
Behaviors. This file set has the following behaviors:
All functions in the included files are analyzed, except ones that Bug Finder does not analyze by default (static uncalled functions and some pathological functions, for instance, ones containing several thousands of instructions).
The pattern
$(all-relevant-functions)
indicates this default subset of functions.Results in the included files are not shown.
The behaviors follow from the fact that functions in generated code can contribute to issues in handwritten code and must be included in the analysis. However, if the issues occur in the generated code itself, they are typically difficult to fix.
File Set 3: Third-Party Libraries
The third file set in this example covers third-party libraries. The fileset
element contains the following file set
definitions:
<fileset name="Third-party libraries"> <files-in-set> <file-pattern>acmelib/**</file-pattern> </files-in-set> <behaviors> <analyze-functions> <function-pattern>$(all-relevant-functions)</function-pattern> </analyze-functions> <show-results value="false"/> </behaviors> </fileset>
Files Included. This file set includes all files in the folder acmelib
or in subfolders thereof. The inclusion is done using the pattern acmelib/**/
. The **
indicates recursive file lookup in subfolders.
Behaviors. This file set has the following behaviors:
All functions in the included files are analyzed, except ones that Bug Finder does not analyze by default (static uncalled functions and some pathological functions, for instance, ones containing several thousands of instructions).
Results in the included files are not shown.
The behaviors follow from the fact that functions in third-party libraries can contribute to issues in the application code and must be included in the analysis. However, if the issues occur in the third-party libraries themselves, they are typically difficult to fix. This file set has the same behaviors as the generated code file set. However, a separate file set has been created to explicitly separate file patterns in generated code from ones in third-party libraries. This separation can make later maintenance easier.
File Set 4: Everything Else (Typically System Headers)
The fourth file set in this example is a catch-all subset that covers files not covered by included patterns in previous file sets. The fileset
element contains the following file set
definitions:
<fileset name="Everything else"> <files-in-set> <file-pattern>*</file-pattern> </files-in-set> <behaviors> <do-not-analyze-functions> <function-pattern>*</function-pattern> </do-not-analyze-functions> <show-results value="false"/> </behaviors> </fileset>
Files Included. This file set includes all files not included by any of the previous file sets. (Note that the analysis tries to match file sets in order, so if a file matched a pattern in a previous file set, it will not fall through to this file set.)
This file set typically includes system headers. In this example, the system headers are in the folder usr/include/sys
and are matched by the catch-all file pattern *
.
Behaviors. This file set has the following behaviors:
The bodies of functions in the included files are not analyzed.
Results in the included files are not shown.
The behaviors follow from the fact that system libraries typically contain many inline functions that could slow down the analysis. The behaviors make the analysis only consider function declarations but skip the function bodies.