-regex-replace-rgx -regex-replace-fmt

Make replacements in preprocessor directives

Syntax

-regex-replace-rgx matchFile -regex-replace-fmt replacementFile

Description

-regex-replace-rgx matchFile -regex-replace-fmt replacementFile replaces tokens in preprocessor directives for the purposes of Polyspace® analysis. The original source code is unchanged. You match a token using a regular expression in the file matchFile and replace the token using a replacement in the file replacementFile.

Use this option only to replace or remove tokens in preprocessor directives before preprocessing. Normally, if a token in your source code causes a compilation error, you can replace or remove the token from the preprocessed code by using the more convenient option Command/script to apply to preprocessed files (-post-preprocessing-command). However, that option acts on code that is already preprocessed, so you cannot use it to replace tokens in preprocessor directives. In this case, use -regex-replace-rgx -regex-replace-fmt.

For a complete list of regular expressions available with this option, see the Perl documentation. Note that:

  • Perl allows the syntax s/pattern/replacement/modifier for replacements. When using this option, you emulate this syntax only partially. You specify the pattern to match, pattern, in one file and its replacement, replacement, in another file. Search modifiers, that is, values of modifier in the Perl syntax, are not supported. For instance, by default, the option makes global replacements (that is, replaces the matched tokens wherever it finds them) and the matches are case-sensitive. You cannot change these defaults using search modifiers.

  • The option supports both numbered and named capture groups. You can define a capture group in the match file by enclosing it in parentheses and use $1, $2, etc., to refer to those capture groups in the replacement file. Alternatively, you can name the capture group and refer to the group by name. For an example, see Replace Multiple Preprocessor Directives with Different Replacements Using Capture Groups.
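
The behavior described in these two points can be previewed outside Polyspace with a short script. The following is only a sketch in Python, whose re module is close to, but not identical with, Perl regular expressions; it is not how Polyspace implements the option. The helper names to_python_pattern, expand, and replace_all are hypothetical, and treating a backslash before a punctuation character in the replacement text as "emit that character" is an assumption.

```python
import re

def to_python_pattern(perl_pattern):
    # Python spells named groups (?P<name>...) where Perl uses (?<name>...).
    # The negative lookahead leaves lookbehinds (?<=...) and (?<!...) alone.
    return re.sub(r"\(\?<(?![=!])", "(?P<", perl_pattern)

def expand(fmt, match):
    # Resolve $1, $2, ... and $+{name} against one match.
    def resolve(ref):
        if ref.group("name"):
            return match.group(ref.group("name"))
        if ref.group("num"):
            return match.group(int(ref.group("num")))
        return ref.group("esc")  # assumption: "\(" in the format means "("
    return re.sub(r"\$\+\{(?P<name>\w+)\}|\$(?P<num>\d+)|\\(?P<esc>\W)",
                  resolve, fmt)

def replace_all(text, perl_pattern, fmt):
    # No modifiers: every match is replaced and matching is case-sensitive.
    # re.MULTILINE lets ^ anchor at the start of each line.
    pattern = re.compile(to_python_pattern(perl_pattern), re.MULTILINE)
    return pattern.sub(lambda m: expand(fmt, m), text)

# Global and case-sensitive: every FOO changes, foo does not.
print(replace_all("#if defined(FOO)\n#ifdef foo\n", r"\bFOO\b", "BAR"))
# Numbered groups.
print(replace_all("#define A OLD_x\n", r"^#define\s+(\w+)\s+OLD_(\w+)",
                  "#define $1 NEW_$2"))
# Named group.
print(replace_all("#ifdef CFG_OLD\n", r"(?<macro>\w+)_OLD", "$+{macro}_NEW"))
```

A script like this is useful only for checking a pattern and its replacement on a few sample directives before you run the analysis; the analysis itself remains the reference for what the option does.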

If you are running an analysis from the user interface (Polyspace desktop products only), on the Configuration pane, you can enter this option in the Other field. See Other.

In the user interface, specify absolute paths to the text files with the search and replace patterns.

Examples

Replace Undefined Symbols in Preprocessor Directive with Simpler Alternatives

Suppose that you want to modify this #define directive:

#define ROM_BEG_ADDR (uint16_t)(&_rom_beg)
to:
#define ROM_BEG_ADDR (0x4000u)
The reason for replacement might be that _rom_beg is undefined in the code provided to Polyspace and causes compilation issues. Since the tokens (uint16_t)(&_rom_beg) indicate an address and a Polyspace analysis does not keep track of precise addresses, you can replace (uint16_t)(&_rom_beg) with a simple address such as (0x4000u). To complicate the issue slightly, suppose also that you want to allow for one or more white space characters after #define and ROM_BEG_ADDR.

To make the replacement:

  1. Specify this regular expression in a file match.txt:

    ^#define\s+ROM_BEG_ADDR\s+\(uint16_t\)\(\&_rom_beg\)
    These elements are used in the regular expression:

    • ^ asserts position at the start of a line.

    • \s+ represents one or more white space characters.

    The characters &, ( and ) are escaped with \.

  2. Specify the replacement in a file replace.txt.

    #define ROM_BEG_ADDR \(0x4000u\)
    
  3. Specify the two text files during analysis with the options -regex-replace-rgx and -regex-replace-fmt:

    • Polyspace Bug Finder™:

      polyspace-bug-finder -sources fileName -regex-replace-rgx match.txt -regex-replace-fmt replace.txt
    • Polyspace Code Prover™:

      polyspace-code-prover -sources fileName -regex-replace-rgx match.txt -regex-replace-fmt replace.txt
    • Polyspace Bug Finder Server™:

      polyspace-bug-finder-server -sources fileName -regex-replace-rgx match.txt -regex-replace-fmt replace.txt
    • Polyspace Code Prover Server:

      polyspace-code-prover-server -sources fileName -regex-replace-rgx match.txt -regex-replace-fmt replace.txt
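
Before running the analysis, you might want to confirm that the regular expression matches the directive with different amounts of white space. The following Python sketch does this check. It assumes that Python's re module treats this particular pattern the same way as the Perl-style engine, and it writes the replacement as plain text, with the backslash escapes from replace.txt already resolved. The names MATCH, REPLACEMENT, and preview are hypothetical.

```python
import re

# Contents of match.txt.
MATCH = r"^#define\s+ROM_BEG_ADDR\s+\(uint16_t\)\(\&_rom_beg\)"
# Text produced by replace.txt, with \( and \) resolved to ( and ).
REPLACEMENT = "#define ROM_BEG_ADDR (0x4000u)"

def preview(source):
    # A function replacement inserts REPLACEMENT literally, so no
    # escape processing is applied to it by re.sub.
    return re.sub(MATCH, lambda m: REPLACEMENT, source, flags=re.MULTILINE)

sample = (
    "#define ROM_BEG_ADDR (uint16_t)(&_rom_beg)\n"
    "#define   ROM_BEG_ADDR\t(uint16_t)(&_rom_beg)\n"
    "#define OTHER_ADDR (uint16_t)(&_rom_beg)\n"
)
print(preview(sample))
```

The first two lines of the sample are rewritten and the third is left alone, because the pattern requires the macro name ROM_BEG_ADDR.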

Replace Multiple Preprocessor Directives with Different Replacements Using Capture Groups

The code below defines two macros, bypass_UInt16_ and bypass_UInt32_, which contain the undefined symbols (UInt16_DO_NOT_EXIST) and (UInt32_DO_NOT_EXIST) respectively.

typedef unsigned short UInt16;
typedef signed short Int16;

typedef unsigned int UInt32;
typedef signed int Int32;

UInt16 x16;
Int16 y16, z16;

UInt32 x32;
Int32 y32, z32;

#define bypass_UInt16_(_var, _value, _add) _var = _value +/*CTO*/(UInt16_DO_NOT_EXIST) _add
#define bypass_UInt32_(_var, _value, _add) _var = _value +/*CTO*/(UInt32_DO_NOT_EXIST) _add

void main(void){
    bypass_UInt16_(x16, y16, z16); 
    bypass_UInt32_(x32, y32, z32);   
}

Both undefined symbols follow a comment /*CTO*/. Since the symbols are undefined, they cause compilation errors. Suppose that you want to modify the macro definitions so that the undefined symbols appear inside the comments instead of following them. Since you want to perform similar modifications to both definitions but do not want to replace both undefined symbols with the same replacement, you can use capture groups to keep the symbol names intact.

To make the replacements:

  1. Specify this regular expression in a file named match.txt:

    \/\*CTO\*\/([^\)]*\))
    These elements are used in the regular expression:

    • The sequence \/\*CTO\*\/ matches the comment /*CTO*/. In the matching sequence, the characters * and / are escaped with a backslash character \.

    • The sequence enclosed in parentheses ([^\)]*\)) corresponds to a capture group where:

      • [^\)]* matches zero or more characters other than a closing parenthesis.

      • The second escaped parenthesis \) matches the closing parenthesis of the expression we want to match.

      Together, the capture group matches any character that is not a closing parenthesis after the /*CTO*/ comment, and then stops at the closing parenthesis.

      This capture group can capture both the undefined symbols, (UInt16_DO_NOT_EXIST) and (UInt32_DO_NOT_EXIST).

  2. Specify this replacement text in a file named replace.txt.

    /*CTO $1*/

    The $1 stands for the text captured by the first capture group. This replacement places the captured text inside the comment, just before the closing */.

    Alternatively, you can create a named capture group and refer to the group by name in the replacement file. For example, this regular expression creates a named capture group cto_group:

    \/\*CTO\*\/(?<cto_group>[^\)]*\))
    To specify the named capture group in the replacement file, enter:
    /*CTO $+{cto_group}*/
    The name of the capture group is included in $+{ }.

  3. Specify the two text files during analysis with the options -regex-replace-rgx and -regex-replace-fmt (as shown in the previous example).

You see the following code in the analysis results with the undefined symbols now included in comments:

typedef unsigned short UInt16;
typedef signed short Int16;

typedef unsigned int UInt32;
typedef signed int Int32;

UInt16 x16;
Int16 y16, z16;

UInt32 x32;
Int32 y32, z32;

#define bypass_UInt16_(_var, _value, _add) _var = _value +/*CTO (UInt16_DO_NOT_EXIST)*/ _add
#define bypass_UInt32_(_var, _value, _add) _var = _value +/*CTO (UInt32_DO_NOT_EXIST)*/ _add

void main(void){
    bypass_UInt16_(x16, y16, z16);
    bypass_UInt32_(x32, y32, z32);
}
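
The effect of the capture group can be reproduced on a single directive with a short Python sketch. This is an illustration under assumptions, not the Polyspace implementation: the pattern is written without spaces around CTO so that it matches the comment /*CTO*/ as it appears in the macro definitions, Python writes the group references as \1 and \g<cto_group> instead of $1 and $+{cto_group}, and it spells the named group (?P<cto_group>...) instead of (?<cto_group>...). The function name move_symbol_into_comment is hypothetical.

```python
import re

# Numbered and named variants of the match expression.
NUMBERED = r"\/\*CTO\*\/([^\)]*\))"
NAMED = r"\/\*CTO\*\/(?P<cto_group>[^\)]*\))"

def move_symbol_into_comment(line, use_named_group=False):
    # The captured text, for example "(UInt16_DO_NOT_EXIST)", is
    # reinserted just before the closing */ of the comment.
    if use_named_group:
        return re.sub(NAMED, r"/*CTO \g<cto_group>*/", line)
    return re.sub(NUMBERED, r"/*CTO \1*/", line)

directive = ("#define bypass_UInt16_(_var, _value, _add) "
             "_var = _value +/*CTO*/(UInt16_DO_NOT_EXIST) _add")
print(move_symbol_into_comment(directive))
print(move_symbol_into_comment(directive, use_named_group=True))
```

Both calls print the same line, because the named and numbered forms refer to the same captured text.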

Tips

  • If you use Polyspace as You Code extensions in IDEs, enter this option in an analysis options file. See Options Files for Polyspace Analysis.

  • The Polyspace regular expression engine interprets the dot . character as matching every character including the linefeed character. If you do not intend to capture the linefeed character with your regular expression, specify tokens other than the dot character in your regular expression. For example, to match everything except the linefeed character, use the expression [^\n].

  • To make replacements in multiple kinds of preprocessor directives, enter one regular expression per line in the match file and its replacement on the corresponding line in the replacement file. Each preprocessor line that matches a regular expression in the match file is replaced with the corresponding replacement from the replacement file. You can also enter all the matches and replacements in one line separated by |. However, one entry per line improves the readability of the files.

    For instance, the match file can contain two regular expressions such as:

    ^#define\s+ROM_BEG_ADDR\s+\(uint16_t\)\(\&_rom_beg\)
    ^#define\s+ROM_END_ADDR\s+\(uint16_t\)\(\&_rom_end\)
    And the replacement file can contain these two replacements:
    #define ROM_BEG_ADDR \(0x4000u\)
    #define ROM_END_ADDR \(0x8000u\)
    With these matches and replacements, the following source code:
    #include <stdint.h>
    
    #define ROM_BEG_ADDR (uint16_t)(&_rom_beg)
    #define ROM_END_ADDR (uint16_t)(&_rom_end)
    
    
    void main() {
        uint16_t beg_addr = ROM_BEG_ADDR;
        uint16_t end_addr = ROM_END_ADDR;
    }
    is converted to the following code before preprocessing and analysis:
    #include <stdint.h>
    
    #define ROM_BEG_ADDR (0x4000u)
    #define ROM_END_ADDR (0x8000u)
    
    
    void main() {
        uint16_t beg_addr = ROM_BEG_ADDR;
        uint16_t end_addr = ROM_END_ADDR;
    }
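
The line-by-line pairing of the match file and the replacement file can be sketched in Python as follows. This is only an approximation for checking the two files against each other. It assumes that line N of the match file pairs with line N of the replacement file, as described above, and that a backslash before a punctuation character in the replacement text produces that character, which is what the example output suggests. The names MATCH_FILE, REPLACE_FILE, and apply_pairs are hypothetical.

```python
import re

# Contents of the match file, one regular expression per line.
MATCH_FILE = r"""^#define\s+ROM_BEG_ADDR\s+\(uint16_t\)\(\&_rom_beg\)
^#define\s+ROM_END_ADDR\s+\(uint16_t\)\(\&_rom_end\)
"""
# Contents of the replacement file, one replacement per line.
REPLACE_FILE = r"""#define ROM_BEG_ADDR \(0x4000u\)
#define ROM_END_ADDR \(0x8000u\)
"""

def apply_pairs(source, match_text, replace_text):
    patterns = match_text.splitlines()
    formats = replace_text.splitlines()
    if len(patterns) != len(formats):
        raise ValueError("match and replacement files need the same number of lines")
    for pattern, fmt in zip(patterns, formats):
        # Assumption: "\(" in the replacement stands for a literal "(".
        literal = re.sub(r"\\(\W)", r"\1", fmt)
        # Binding literal as a default argument keeps each pair separate.
        source = re.sub(pattern, lambda m, text=literal: text,
                        source, flags=re.MULTILINE)
    return source

code = (
    "#define ROM_BEG_ADDR (uint16_t)(&_rom_beg)\n"
    "#define ROM_END_ADDR (uint16_t)(&_rom_end)\n"
)
print(apply_pairs(code, MATCH_FILE, REPLACE_FILE))
```

Checking that both files have the same number of lines catches the most likely mistake with this layout, a replacement that has drifted out of step with its regular expression.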