Main Content

Code Prover Impact Analysis Assumptions

A Polyspace® Code Prover™ analysis can determine if one object in a C program impacts another. If you designate one object as a source and another as sink, Code Prover can determine if there is data flow from the source to the sink or if the sink acquires one of several values depending on the value of the source.

During impact analysis, Code Prover makes certain assumptions about data flow in the program. This topic lists some of the impact analysis assumptions.

Stubbed Functions

If a function is not defined during impact analysis, the analysis stubs the function and assumes that there is no data flow from one argument of the function to another argument or the function return value. To work around this assumption, you can explicitly define an impact-flow element in your impact specification file and define which function argument might affect another function argument or the function return value.

Consider the following example:

  • Source:

    int external_data;
    int sensitive_data;
    
    int copyData(int*, int);
    
    void main() {
        int local_data = external_data;
        copyData(&sensitive_data, local_data);
    }

  • Impact specification file:

    <?xml version="1.0" encoding="UTF-8"?>
    <impact-specifications xmlns="http://www.mathworks.com/PolyspaceImpactSpecifications">
         <variables>
           <variable>
               <match>
                   <match-declaration>
                       <match-qualified-name value="external_data"/>
                   </match-declaration>
               </match>
               <impact-source id="source1" event="onRead"/>
            </variable>
            <variable>
               <match>
                   <match-declaration>
                       <match-qualified-name value="sensitive_data"/>
                   </match-declaration>
               </match>
               <impact-sink id="sink1" event="onWrite"/>
            </variable>
        </variables>
        <expect>
            <impact source="source1" sink="sink1"/>
        </expect>
    </impact-specifications>

Since the function copyData() is not defined in the source code, the impact analysis assumes that there is no data flow through this function. As a result, the impact specification appears as violated. In other words, the analysis shows that there is no impact of the variable external_data on the variable sensitive_data.

If you know that copyData() writes from its second argument to the contents of the memory zone that the first argument points to, you can add an impact-flow definition in the impact specification file:

<?xml version="1.0" encoding="UTF-8"?>
<impact-specifications xmlns="http://www.mathworks.com/PolyspaceImpactSpecifications">
     <variables>
       <variable>
           <match>
               <match-declaration>
                   <match-qualified-name value="external_data"/>
               </match-declaration>
           </match>
           <impact-source id="source1" event="onRead"/>
        </variable>
        <variable>
           <match>
               <match-declaration>
                   <match-qualified-name value="sensitive_data"/>
               </match-declaration>
           </match>
           <impact-sink id="sink1" event="onWrite"/>
        </variable>
    </variables>
	<functions>
		<function>
			<match>
				<match-declaration>
					<match-qualified-name regex="copyData"/>
				</match-declaration>
			</match>
		</function>
		<impact-flow id="copy function"
					 from-memory="(Argument (Function) 1)"
					 to-memory="(Indirect(Argument (Function) 0)))"/>
	</functions>
    <expect>
        <no-impact source="source1" sink="sink1"/>
    </expect>
</impact-specifications>

Even though the impact-flow element is most useful when defining the behavior of a stubbed function, you can also use this element to override the behavior of a function already defined in your code. The impact analysis uses your impact flow specification for that function and skips the actual function definition.

Assembly Code

The analysis ignores assembly code in a C program and therefore assumes that there is no data flow between C code variables in the assembly code. You cannot work around this assumption.

Consider the following example:

  • Source:

    int external_data;
    int sensitive_data;
    
    int main()
    {
        int local_data;
        asm ("mov %1, %0\n\t"
             "add $1, %0"
             : "=r" (local_data)
             : "r" (external_data));
        sensitive_data = local_data;
        return 0;
    }
    

  • Impact specification file:

    <?xml version="1.0" encoding="UTF-8"?>
    <impact-specifications xmlns="http://www.mathworks.com/PolyspaceImpactSpecifications">
         <variables>
           <variable>
               <match>
                   <match-declaration>
                       <match-qualified-name value="external_data"/>
                   </match-declaration>
               </match>
               <impact-source id="source1" event="onRead"/>
            </variable>
            <variable>
               <match>
                   <match-declaration>
                       <match-qualified-name value="sensitive_data"/>
                   </match-declaration>
               </match>
               <impact-sink id="sink1" event="onWrite"/>
            </variable>
        </variables>
        <expect>
            <impact source="source1" sink="sink1"/>
        </expect>
    </impact-specifications>

In this example, the source variable external_data is copied to the sink variable sensitive_data through an intermediate variable local_data. However, the copy from external_data to local_data is done using assembly code instructions. Since impact analysis ignores assembly code, the impact specification appears as violated. In other words, the analysis shows that there is no impact of the variable external_data on the variable sensitive_data.

Code With Undefined Behavior

Impact analysis does not consider data flow across elements with undefined behavior. As a result, if your code has possible run-time errors, running impact analysis alone might lead to misleading conclusions. To avoid these situations, when possible, try to run a full Code Prover analysis instead of running impact analysis only, and fix red checks if any.

Consider the following example:

  • Source:

    #include <stdlib.h>
    
    struct message {
        int size;
        int payload[];
    };
    
    extern int input();
    extern void output(struct message*);
    
    int main() {
        struct message * msg = malloc(sizeof(*msg)); //Insufficient memory allocated
        msg->payload[0] = input(); //Illegal pointer dereference
        output(msg);
        return 0;
    }
  • Impact specification file:

    <?xml version="1.0" encoding="UTF-8"?>
    <impact-specifications xmlns="http://www.mathworks.com/PolyspaceImpactSpecifications">
    <functions>
    	<function name="input">
    		<impact-source id="input"
    		               event="onCallReturn"
    		               memory="(Return (Function))"/>
    	</function>
    	<function name="output">
    		<impact-sink id="output"
    		             event="onCallReturn"
    		             memory="(Array (Field (Indirect (Argument (Function) 0)) payload) (Any))"/>
    	</function>
    </functions>
    	<expect>
    		<impact source="input" sink="output"/>
    	</expect>
    </impact-specifications>

In this example, there is an impact between these two elements:

  • Source – Return value of the input() function

  • Sink – Field payload of the dereference of the pointer passed as first argument to the output() function.

However, the impact specification appears as violated. The reason is that the buffer allocated to the pointer msg does not have sufficient memory to allow dereference of the payload field, resulting in an illegal dereference on the line:

msg->payload[0] = input();
Following this illegal dereference, the behavior of the code is undefined and impact analysis does not show the possible impact.

If you change the code slightly to allocate sufficient memory to the msg pointer, the specification no longer appears as violated. A possible modification can be the following:

#include <stdlib.h>

struct message {
    int size;
    int payload[];
};

extern int input();
extern void output(struct message* );

int main() {
    struct message * msg = malloc(sizeof(*msg) + sizeof(int)); 
    msg->payload[0] = input(); 
    output(msg);
    return 0;
}

Variadic Functions

If a function has variable number of parameters, the function arguments or parameters cannot be specified as sources or sinks for impact analysis.

Compile-Time Constants

You cannot use a compile-time constant as a source for impact analysis since impact analysis occurs after the compilation phase where compile-time constants are replaced with their values.

Consider the following C++17 example:

  • Source:

    #include <numeric>
    
    constexpr int source = 42; 
    int sink;   
    
    int main() {
        sink = std::gcd(source, source);  
        return 0;
    }
    
  • Impact specification file:

    <?xml version="1.0" encoding="UTF-8"?>
    <impact-specifications xmlns="http://www.mathworks.com/PolyspaceImpactSpecifications">
         <variables>
           <variable>
               <match>
                   <match-declaration>
                       <match-qualified-name value="source"/>
                   </match-declaration>
               </match>
               <impact-source id="source1" event="onRead"/>
            </variable>
            <variable>
               <match>
                   <match-declaration>
                       <match-qualified-name value="sink"/>
                   </match-declaration>
               </match>
               <impact-sink id="sink1" event="onWrite"/>
            </variable>
        </variables>
        <expect>
            <impact source="source1" sink="sink1"/>
        </expect>
    </impact-specifications>

The results of impact analysis indicate that the variable source has no impact on the variable sink. The reason is that the variable source is a compile-time constant. Since a compile-time constant is replaced with its value following compilation, the subsequent impact analysis has no awareness of the variable.

See Also

| | |

Related Topics