Modify Bug Finder Checkers Through Code Behavior Specifications
Polyspace® Bug Finder™ checks C/C++ code for defects or coding rule violations. To find these issues, some Bug Finder checkers require additional information outside your code.
You can specify most external information using the option -code-behavior-specifications
. The option allows you to associate specific behaviors with elements in your code. For instance, you can map a custom library function to a standard function or specify that a function acts as a memory management function.
The option takes a file in one of these formats as argument:
XML
Datalog
The XML and Datalog format support different code behaviors. This topic describes the checker modifications supported with each file format. For the full list of options that can modify checkers, see Modify Default Behavior of Bug Finder Checkers.
XML Format
The code behavior specifications XML file associates specific behaviors with functions (or defines behaviors at global scope). The following sections list behaviors that are relevant to a Bug Finder analysis. For behaviors relevant to Code Prover, see Modify Code Prover Run-Time Checks Through Code Behavior Specifications (Polyspace Code Prover).
You can also see the supported behaviors in the sample template file code-behavior-specifications-template.xml
provided with a Polyspace installation. The file is in
where polyspaceroot
\polyspace\verifier\cxx\
is the Polyspace installation folder. polyspaceroot
Mapping to Standard Function for Precision Improvement
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <functions> <function name="custom_function" std="std_function"> </function> </functions> </specifications>
Use this entry in the XML file to reduce the number of false negatives in a Bug Finder analysis. Sometimes, the analysis does not model certain kinds of functions precisely because of inherent limitations in static verification. In those cases, if you find a standard function that is a close analog of your function, use this mapping. Though your function itself is not analyzed, the analysis is more precise at the locations where you call the function. For instance, if the
analysis cannot model your function cos32
precisely and considers full range for its return value, map it to the cos
function for a return value in [-1,1].
The analysis ignores the body of your function but emulates your function behavior in the following ways:
The analysis assumes the same return values for your function as the standard function.
For instance, if you map your function
cos32
to the standard functioncos
, the verification assumes thatcos32
returns values in [-1,1].The analysis checks for the same issues as it checks with the standard function.
For instance, if you map your function
acos32
to the standard functionacos
, theInvalid use of standard library routine
(Polyspace Code Prover) check determines if the argument ofacos32
is in [-1,1].
The functions that you can map to include:
Standard library functions from
math.h
.Memory management functions from
string.h
.
If you run Code Prover, you can map to other functions to improve the verification precision. See Modify Code Prover Run-Time Checks Through Code Behavior Specifications (Polyspace Code Prover).
See also Extend Bug Finder Checkers for Standard Library Functions to Custom Libraries.
Mapping to Standard Function for Concurrency Detection
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <functions> <function name="custom_function" std="std_function"> </function> </functions> </specifications>
Use this entry in the XML file for automatic detection of thread-creation functions and functions that begin and end critical sections. Polyspace supports automatic detection for certain families of multitasking primitives only. Extend the support using this XML entry.
If your thread-creation function, for instance, does not belong to one of the supported families, map your function to a supported concurrency primitive.
See Extend Concurrency Defect Checkers to Unsupported Multithreading Environments.
Blocking Functions, Keywords or Macros
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <functions> <function name="my_func"> <behavior name="FORBIDDEN_FUNC"/> </function> </functions> </specifications>
Use this entry in the XML file to specify that a function
must not be used in your source code.my_func
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <tokens> <token name="my_keyword" kind="keyword"> <behavior name="FORBIDDEN_KEYWORD"/> </token> </tokens> </specifications>
Use this entry in the XML file to specify that a keyword
must not be used in your source code.my_keyword
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <token name="my_macro" kind="macro"> <behavior name="FORBIDDEN_MACRO"/> </token> </specifications>
Use this entry in the XML file to specify that a macro
must not be used in your source code.my_macro
Note that you can use the *
wildcard for functions and keywords (but not for macros). For instance, to forbid all functions whose names contain DEBUG
, you can enter:
<function name="*DEBUG*"> <behavior name="FORBIDDEN_FUNC"/> </function>
name
and kind
, you have to set an attribute regex
to true
. For instance, to forbid all macros starting with DEFINED_
,
enter:<token name="DEFINED_.*" kind="macro" regex="true"> <behavior name="FORBIDDEN_MACRO"/> </token>
See Flag Deprecated or Unsafe Functions, Keywords, or Macros Using Bug Finder Checkers.
Extending Initialization Checks
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <functions> <function name="my_func"> <check name=name="ARGUMENT_POINTS_TO_INITIALIZED_VALUE" arg="n/> </function> </functions> </specifications>
Use this entry in the XML file to specify if the pointer argument to a function
must point to an initialized buffer. The number my_func
specifies which argument must be checked for buffer initialization.n
See Extend Checkers for Initialization to Check Function Arguments Passed by Pointers.
Modifying Global Behavior
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <global_scope> <parameter name="MAX_NUMBER_NESTED_LEVEL_CONTROL_FLOW" value="n1"/> <parameter name="MAX_NUMBER_NESTED_LEVEL_INCLUDES" value="n2"/> <parameter name="MAX_NUMBER_CONSTANT_IN_ENUMERATION" value="n3"/> <parameter name="MAX_NUMBER_MACROS_TRANSLATION_UNIT" value="n4"/> <parameter name="MAX_NUMBER_MEMBERS_IN_STRUCT" value="n5"/> <parameter name="MAX_NUMBER_NESTED_MEMBERS_IN_STRUCT" value="n6"/> <parameter name="NUMBER_SIGNIFICANT_CHARACTER_EXTERNAL_IDENTIFIER" value="n7"/> <parameter name="NUMBER_SIGNIFICANT_CHARACTER_INTERNAL_IDENTIFIER" value="n8"/> </global_scope> </specifications>
n1,..,n8
specifies numerical values.Use the entries
for the following parameters to specify limits on certain aspects of your program. The modifications affect the checking of n1,..,n6
MISRA C:2012 Rule 1.1
.
MAX_NUMBER_NESTED_LEVEL_CONTROL_FLOW
: Maximum depth of nesting allowed in control flow statements.MAX_NUMBER_NESTED_LEVEL_INCLUDES
: Maximum levels of inclusion allowed using include files.MAX_NUMBER_CONSTANT_IN_ENUMERATION
: Maximum number of constants allowed in an enumeration.MAX_NUMBER_MACROS_TRANSLATION_UNIT
: Maximum number of macros allowed in a translation unit.MAX_NUMBER_MEMBERS_IN_STRUCT
: Maximum number of members allowed in a structure.MAX_NUMBER_NESTED_MEMBERS_IN_STRUCT
: Maximum levels of nesting allowed in a structure.
Use the entries
and n7
to specify how many characters must be compared to determine if two identifiers as identical. The modifications affect the checking of Rules 5.x.n8
NUMBER_SIGNIFICANT_CHARACTER_EXTERNAL_IDENTIFIER
: Number of characters to compare for external identifiers. External identifiers are ones declared with global scope or storage classextern
.NUMBER_SIGNIFICANT_CHARACTER_INTERNAL_IDENTIFIER
: Number of characters to compare for internal identifiers.
Specifying Real-Time Functions
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <functions> <function name="my_func"> <behavior name="REAL_TIME_FUNC"/> </function> </functions> </specifications>
Use this entry in the XML file to specify a function
as a real-time function See my_func
AUTOSAR C++14 Rule A18-5-7
.
Specifying Functions that Manage Memory
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <functions> <function name="my_func"> <behavior name="MANAGES_MEMORY"/> </function> </functions> </specifications>
Use this entry in the XML file to specify a function
as a function that allocates or deallocates dynamic memory. See my_func
AUTOSAR C++14 Rule A18-5-7
.
Specifying Functions that Handle Exceptions
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <functions> <function name="my_func"> <behavior name="EXCEPTION_HANDLING"/> </function> </functions> </specifications>
Use this entry in the XML file to specify a function
as a function that handles one or more exceptions. See my_func
AUTOSAR C++14 Rule A15-0-7
.
Specifying Critical Data Members
XML Syntax:
<?xml version="1.0" encoding="UTF-8"?> <specifications xmlns="http://www.mathworks.com/PolyspaceCodeBehaviorSpecifications"> <members> <member name="my_critical_object" kind="variable"> <behavior name="CRITICAL_DATA"/> </member> </members> </specifications>
Use this entry in the XML file to specify the data member
as critical. See my_critical_object
Critical data member is not private
.
You can specify multiple similarly-named data members in the list of critical data members using regular expressions. To use regular expressions, in addition to the attributes name
and kind
, you have to set an attribute regex
to true
.
For instance, to consider all member names starting with user
as critical data members, enter:
<member name="user.*" kind="variable" regex="true"> <behavior name="CRITICAL_DATA"/> </member>
Datalog Format
Datalog is a declarative logic programming language that Polyspace Bug Finder uses to create abstractions of function behavior. You can enter code behavior specifications in Datalog and add to the set of functions already abstracted for the Polyspace analysis.
The Datalog language, as used in Polyspace, is based on declaring relations between elements in the code. However, you do not need to know the full Datalog language to hook into the Polyspace analysis. You can simply invoke already declared relations from the Datalog-based API, using a function-like syntax.
Datalog Example
Consider the following example. The function insertIntoDatabase()
reads a string from standard input and uses the string in the construction of an SQL
query.
#include <stdio.h> #include <string.h> #include <sqlite3.h> void addToName(char *, char *); // Function to insert names into database void insertIntoDatabase(void) { sqlite3* db; char query[256] = ""; char partialName[256]; char fullName[256]; char* error_message; int nameLength; sqlite3_open("userCredentials.db", &db); fread(partialName, 1, 128, stdin); addToName(partialName, fullName); //Function with unknown semantics strcat(query, "INSERT INTO privatenames (name) VALUES ('"); strcat(query, fullName); strcat(query, "');"); // Possible SQL injection here if (sqlite3_exec(db, query, NULL, NULL, &error_message)) { sqlite3_free(error_message); } }
addToName()
copies the content of its first argument to its second argument entirely or partly. The function is effectively passing on contents entered through standard input into the SQL query.However, the defect checker SQL injection
cannot find this defect because the function addToName()
is not defined in this code and its semantics are not known. (In some cases, if the function is sufficiently complex, the analysis might lose track of a data flow from one function argument to another despite the
function being defined.)
To help with the analysis, you can externally specify the semantics of the
addToName()
function using Datalog syntax. In a file with
extension .dl
, for instance
bufferFlows.dl
, enter the
following:
.include "models/interfaces/taint.dl" Taint.Basic.hasFlow("addToName", $InParameterDeref(0), $OutParameterDeref(1), "Source buffer copied to destination").
Includes the file
. Here,polyspaceroot
\polyspace\pql\operating_system
\models\interfaces\taint.dl
is the Polyspace installation folder, for instance,polyspaceroot
C:\Program Files\R2024b
and
is one ofoperating_system
win64
,glnxa64
, andmaci64
.This file contains declarations of the relations that you invoke in your file.
Invokes the relation
Taint.Basic.hasFlow
to specify that there is a data flow from the first parameter of the functionaddToName()
to the second parameter:$InParameterDeref(0)
indicates the first parameter and designates the parameter as an input parameter. If a parameter is designated as an input parameter, its value prior to the function call is relevant for the relation.$OutParameterDeref(1)
indicates the second parameter and designates the parameter as output parameter. If a parameter is designated as an output parameter, its value after the function call is relevant for the relation.The last argument is a message that can be associated with the function
addToName
. If Bug Finder finds a defect that uses the Datalog specification, the message appears in the event list below the defect.
Specify the Datalog file as argument for the analysis option -code-behavior-specifications
, for instance:
-code-behavior-specifications bufferFlows.dl
addToName()
and report a possible SQL injection when the function sqlite3_exec()
is executed.The two options that support Datalog file specification are:
-code-behavior-specifications
: Modifies the default specifications of existing checkers.-create-checkers
: Creates new checkers.
Behaviors Supported by Datalog Format
The following checkers can be modified using code behavior specifications in Datalog format.
SQL injection
(can be modified using option-code-behavior-specifications
)LDAP injection
(can be modified using option-code-behavior-specifications
)Tainted source used with sensitive function
(requires option-create-checkers
)
The following code behaviors can be specified to modify the checkers:
Taint.Basic.hasFlow
: Specifies a function that propagates data from one function argument to another.For instance, suppose that the following function
addToName()
propagates data from the dereference of the first parameter to the dereference of the second parameter:You can specify this behavior using the following Datalog relation:void addToName (char *, char *);
Taint.Basic.hasFlow("addToName", $InParameterDeref(0), $OutParameterDeref(1), "Source buffer copied to destination").
All supported checkers can be modified using this code behavior.
Untrusted.Basic.isSource
: Specifies a function parameter or return value that represents potentially tainted data.For instance, the following specifications:
Imply that these data are potentially tainted:Untrusted.Basic.isSource("fileOpen", $OutParameterDeref(1), "Opens an untrusted stream"). Untrusted.Basic.isSource("fileOpen_v2", $OutReturnDeref(), "Returns an untrusted stream").
Dereference of the second parameter of a
fileOpen()
function, for instance with signatureFILE * fileOpen (const char *, FILE * , const char*)
.Dereference of the return value of a
fileOpen_v2()
function, for instance with signatureFILE * fileOpen_v2 ( int, const char*)
.
All supported checkers can be modified using this code behavior.
Sql.Basic.execution
: Specifies a function that executes an SQL query.For instance,suppose that the following function
sql_exec()
executes an SQL query and the first parameter is the query string:You can specify this behavior using the following Datalog relation:int sql_exec (char*, sqlite3* , int (*callback)(void*,int,char**,char**) , void*, char**);
Sql.Basic.execution("sql_exec", $InParameterDeref(0)).
The checker that can be modified using this behavior is
SQL injection
.Sql.Basic.sanitizing
: Specifies a function that sanitizes data to be used in an SQL query.For instance, suppose that the following function
sql_sanitize()
sanitizes data to be used in an SQL query and the first parameter is the sanitized string:You can specify this behavior using the following Datalog relation:int sql_sanitize (char *, int);
Sql.Basic.sanitizing("sql_sanitize", $OutParameterDeref(0)).
The checker that can be modified using this behavior is
SQL injection
.Ldap.Basic.sensitive
: Specifies a function that executes an LDAP query.For instance, suppose that the following function
search_ldap_db()
executes an LDAP search and the fourth parameter is the search filter string:You can specify this behavior using the following Datalog relation:int search_ldap_db (LDAP *, const char *, int, const char *, char **, int);
Ldap.Basic.sensitive("search_ldap_db", $InParameterDeref(3)).
The checker that can be modified using this behavior is
LDAP injection
.Ldap.Basic.sanitizing
:Specifies a function that sanitizes data to be used in an LDAP query.For instance, suppose that the following function
ldap_validate_inputs()
sanitizes data to be used in an LDAP search and the first parameter is the sanitized string:You can specify this behavior using the following Datalog relation:int ldap_validate_inputs (char *, int);
Ldap.Basic.sanitizing("ldap_validate_inputs", $OutParameterDeref(0)).
The checker that can be modified using this behavior is
LDAP injection
.CustomTainted.Basic.taintSource
: Specifies a function that must be considered as an tainted or untrusted source for the checkerTainted source used with sensitive function
. This related must be specified as:.comp checkerConfigName : CustomTainted { Basic.taintSource("getResponse",$OutReturnValue(),"Data from getResponse() is tainted."). } .init checkerName = tainted_source_use_custom<checkerConfigName>
CustomTainted.Basic.sensitive
: Specifies a function that must be considered as a sensitive function for the checkerTainted source used with sensitive function
. This relation must be specified as:.comp checkerConfigName : CustomTainted { Basic.sensitive("changeSystemState",$InParameterValue(0),"changeSystemState() must not use tainted data."). } .init checkerName = tainted_source_use_custom<checkerConfigName>
CustomTainted.Basic.sanitizing
: Specifies a function that must be considered as a sanitizing function for the checkerTainted source used with sensitive function
. This related must be specified as:.comp checkerConfigName : CustomTainted { Basic.sanitizing("ensureSafeResponse",$OutParameterDeref(0)). } .init checkerName = tainted_source_use_custom<checkerConfigName>