-regex-replace-rgx -regex-replace-fmt
Make replacements in preprocessor directives
Syntax
-regex-replace-rgx matchFile -regex-replace-fmt replacementFile
Description
-regex-replace-rgx matchFile -regex-replace-fmt replacementFile replaces tokens in preprocessor directives for the purposes of Polyspace® analysis. The original source code is unchanged. You match a token using a regular expression in the file matchFile and replace the token using a replacement in the file replacementFile.
Use these options only to replace or remove tokens in preprocessor directives before preprocessing. Normally, if a token in your source code causes a compilation error, you can replace or remove the token from the preprocessed code by using the more convenient option Command/script to apply to preprocessed files (-post-preprocessing-command). However, you cannot use that option to replace tokens in preprocessor directives. In this case, use -regex-replace-rgx and -regex-replace-fmt.
For a complete list of regular expressions available with this option, see Perl documentation. Note that:
Perl allows the syntax s/pattern/replacement/modifier for replacements. When using this option, you emulate this syntax only partially. You specify the pattern to match, pattern, in one file and its replacement, replacement, in another file. Search modifiers, that is, values of modifier in the Perl syntax, are not supported. For instance, by default, the option makes global replacements (that is, replaces the matched tokens wherever it finds them) and the matches are case-sensitive. You cannot change these defaults using search modifiers.
The option supports both numbered and named capture groups. You can define a capture group in the match file by enclosing it in parentheses and use $1, $2, etc., to refer to those capture groups in the replacement file. Alternatively, you can name the capture group and refer to the group by name. For an example, see Replace Multiple Preprocessor Directives with Different Replacements Using Capture Groups.
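The behavior described in the two notes above can be approximated outside Polyspace to build intuition. The following Python sketch is only an emulation under stated assumptions: the helper names expand and replace_all are hypothetical, Python's re module stands in for the Polyspace engine, and Python spells a named group (?P<name>...) rather than the Perl form (?<name>...). It shows a global, case-sensitive replacement with no modifiers, with $1 and $+{name} references filled in from the match.

```python
import re

# Recognizes Perl-style references: $1, $2, ... or $+{name}.
_REF = re.compile(r"\$(\d+)|\$\+\{(\w+)\}")

def expand(fmt: str, match: re.Match) -> str:
    """Fill $N and $+{name} references in fmt from the given match (hypothetical helper)."""
    def fill(ref: re.Match) -> str:
        if ref.group(1) is not None:
            return match.group(int(ref.group(1))) or ""
        return match.group(ref.group(2)) or ""
    return _REF.sub(fill, fmt)

def replace_all(pattern: str, fmt: str, text: str) -> str:
    """Replace every match (global) without any flags (case-sensitive)."""
    return re.sub(pattern, lambda m: expand(fmt, m), text)

line = "#define PAIR(a) FOO(a) + foo(a) + FOO(a)"

# Numbered group: both FOO(...) occurrences change, foo(...) does not.
numbered = replace_all(r"FOO\((\w+)\)", "BAR($1)", line)

# Named group, written in Python's (?P<name>...) spelling.
named = replace_all(r"FOO\((?P<arg>\w+)\)", "BAR($+{arg})", line)

print(numbered)
print(named)
```

Both calls print #define PAIR(a) BAR(a) + foo(a) + BAR(a). The lowercase foo(a) is left alone because the match is case-sensitive, and both uppercase occurrences change because the replacement is global.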
If you are running an analysis from the user interface (Polyspace desktop products only), on the Configuration pane, you can enter this option in the Other field. See Other.
In the user interface, specify absolute paths to the text files with the search and replace patterns.
Examples
Replace Undefined Symbols in Preprocessor Directive with Simpler Alternatives
Suppose that you want to modify this #define directive:
#define ROM_BEG_ADDR (uint16_t)(&_rom_beg)
so that it reads:
#define ROM_BEG_ADDR (0x4000u)
_rom_beg is undefined in the code provided to Polyspace and causes compilation issues. Since the tokens (uint16_t)(&_rom_beg) indicate an address and a Polyspace analysis does not keep track of precise addresses, you can replace (uint16_t)(&_rom_beg) with a simple address such as (0x4000u). To complicate the issue slightly, suppose also that you want to allow for one or more white space characters after #define and ROM_BEG_ADDR.
To make the replacement:
Specify this regular expression in a file match.txt:
^#define\s+ROM_BEG_ADDR\s+\(uint16_t\)\(\&_rom_beg\)
These elements are used in the regular expression:
^ asserts position at the start of a line.
\s+ represents one or more white space characters.
The characters &, ( and ) are escaped with \.
Specify the replacement in a file replace.txt:
#define ROM_BEG_ADDR \(0x4000u\)
Specify the two text files during analysis with the options -regex-replace-rgx and -regex-replace-fmt:
Polyspace Bug Finder™:
polyspace-bug-finder -sources fileName -regex-replace-rgx match.txt -regex-replace-fmt replace.txt
Polyspace Code Prover™:
polyspace-code-prover -sources fileName -regex-replace-rgx match.txt -regex-replace-fmt replace.txt
Polyspace Bug Finder Server™:
polyspace-bug-finder-server -sources fileName -regex-replace-rgx match.txt -regex-replace-fmt replace.txt
Polyspace Code Prover Server:
polyspace-code-prover-server -sources fileName -regex-replace-rgx match.txt -regex-replace-fmt replace.txt
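Before running an analysis, it can help to check the pattern from match.txt against a few sample directives. The Python sketch below does only that and is not part of Polyspace. The helper name preview is hypothetical, Python's re module is assumed to treat this particular pattern the same way, and the replacement is written with plain parentheses on the assumption that the \( and \) escapes in replace.txt resolve to ( and ).

```python
import re

# Pattern exactly as stored in match.txt in this example.
PATTERN = r"^#define\s+ROM_BEG_ADDR\s+\(uint16_t\)\(\&_rom_beg\)"

# Directive expected after the replacement (escapes assumed already resolved).
EXPECTED = "#define ROM_BEG_ADDR (0x4000u)"

def preview(source: str) -> str:
    """Apply the replacement to every line that matches (hypothetical helper)."""
    # re.MULTILINE lets ^ anchor at the start of each line, not only the first.
    # A function is used as the replacement so the text is inserted literally.
    return re.sub(PATTERN, lambda m: EXPECTED, source, flags=re.MULTILINE)

samples = "\n".join([
    "#define ROM_BEG_ADDR (uint16_t)(&_rom_beg)",       # single spaces
    "#define   ROM_BEG_ADDR\t(uint16_t)(&_rom_beg)",    # extra spaces and a tab
    "#define ROM_END_ADDR (uint16_t)(&_rom_end)",       # different macro, must not match
])

preview_lines = preview(samples).splitlines()
for text in preview_lines:
    print(text)
```

The first two sample lines are rewritten to the simple address, which confirms that \s+ tolerates the extra white space. The third line is left unchanged because it names a different macro.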
Replace Multiple Preprocessor Directives with Different Replacements Using Capture Groups
The code below defines two macros, bypass_UInt16_ and bypass_UInt32_, which contain the undefined symbols (UInt16_DO_NOT_EXIST) and (UInt32_DO_NOT_EXIST), respectively.
typedef unsigned short UInt16;
typedef signed short Int16;
typedef unsigned int UInt32;
typedef signed int Int32;

UInt16 x16;
Int16 y16, z16;
UInt32 x32;
Int32 y32, z32;

#define bypass_UInt16_(_var, _value, _add) _var = _value +/*CTO*/(UInt16_DO_NOT_EXIST) _add
#define bypass_UInt32_(_var, _value, _add) _var = _value +/*CTO*/(UInt32_DO_NOT_EXIST) _add

void main(void){
    bypass_UInt16_(x16, y16, z16);
    bypass_UInt32_(x32, y32, z32);
}
Both undefined symbols follow a comment /*CTO*/.
Since the symbols are undefined, they cause compilation errors. Suppose
that you want to modify the macro definitions so that the undefined
symbols appear inside the comments instead of following them. Since you
want to perform similar modifications to both definitions but do not
want to replace both undefined symbols with the same replacement, you
can use capture groups to keep the symbol names intact.
To make the replacements:
Specify this regular expression in a file named match.txt:
\/\*CTO\*\/([^\)]*\))
These elements are used in the regular expression:
The sequence \/\*CTO\*\/ matches the comment /*CTO*/. In the matching sequence, the characters * and / are escaped with a backslash character \.
The sequence enclosed in parentheses, ([^\)]*\)), corresponds to a capture group where:
[^\)]* matches any sequence of characters other than a closing parenthesis.
The second escaped parenthesis \) matches the closing parenthesis of the expression we want to match.
Together, the capture group matches the characters that follow the /*CTO*/ comment, up to and including the first closing parenthesis.
This capture group can capture both the undefined symbols, (UInt16_DO_NOT_EXIST) and (UInt32_DO_NOT_EXIST).
Specify this replacement text in a file named replace.txt:
/*CTO $1*/
The $1 stands for the previously captured group. This replacement simply places the capture group before the */ that closes the comment.
Alternatively, you can create a named capture group and refer to the group by name in the replacement file. For example, this regular expression creates a named capture group cto_group:
\/\*CTO\*\/(?<cto_group>[^\)]*\))
To specify the named capture group in the replacement file, enter:
/* CTO $+{cto_group} */
The name of the capture group is included in $+{ }.
Specify the two text files during analysis with the options -regex-replace-rgx and -regex-replace-fmt (as shown in the previous example).
You see the following code in the analysis results with the undefined symbols now included in comments:
typedef unsigned short UInt16;
typedef signed short Int16;
typedef unsigned int UInt32;
typedef signed int Int32;

UInt16 x16;
Int16 y16, z16;
UInt32 x32;
Int32 y32, z32;

#define bypass_UInt16_(_var, _value, _add) _var = _value + /* CTO (UInt16_DO_NOT_EXIST) */ _add
#define bypass_UInt32_(_var, _value, _add) _var = _value + /* CTO (UInt32_DO_NOT_EXIST) */ _add

void main(void){
    bypass_UInt16_(x16, y16, z16);
    bypass_UInt32_(x32, y32, z32);
}
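To see what the capture group picks up, the replacement can be tried on one macro line outside Polyspace. This Python sketch is an approximation under stated assumptions: the pattern is written without spaces around CTO so that it matches the /*CTO*/ comment as it appears in the source, Python's \g<1> and \g<name> references stand in for $1 and $+{name}, and the named group uses Python's (?P<name>...) spelling. The exact white space in the result depends on the replacement text you choose.

```python
import re

# Numbered and named variants of the pattern for the /*CTO*/ comment.
NUMBERED = r"\/\*CTO\*\/([^\)]*\))"
NAMED = r"\/\*CTO\*\/(?P<cto_group>[^\)]*\))"  # Python form of (?<cto_group>...)

macro = ("#define bypass_UInt16_(_var, _value, _add) "
         "_var = _value +/*CTO*/(UInt16_DO_NOT_EXIST) _add")

# The group captures the parenthesized symbol that follows the comment.
captured = re.search(NUMBERED, macro).group(1)

# Equivalent of the replacement text /*CTO $1*/.
numbered_result = re.sub(NUMBERED, r"/*CTO \g<1>*/", macro)

# Equivalent of the replacement text /* CTO $+{cto_group} */.
named_result = re.sub(NAMED, r"/* CTO \g<cto_group> */", macro)

print(captured)
print(numbered_result)
print(named_result)
```

The captured text is (UInt16_DO_NOT_EXIST), parentheses included, so the same pattern and replacement pair also handles the UInt32 macro without naming either symbol.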
Tips
If you use Polyspace as You Code extensions in IDEs, enter this option in an analysis options file. See Options Files for Polyspace Analysis.
The Polyspace regular expression engine interprets the dot . character as matching every character including the linefeed character. If you do not intend to capture the linefeed character with your regular expression, specify tokens other than the dot character in your regular expression. For example, to match everything except the linefeed character, use the expression [^\n].
To make replacements in multiple kinds of preprocessor directives, enter one regular expression per line in the match file and its replacement on the corresponding line in the replacement file. Each preprocessor line that matches a regular expression in the match file is replaced with the corresponding replacement from the replacement file. You can also enter all the matches and replacements in one line separated by |. However, one entry per line improves the readability of the files.
For instance, the match file can contain two regular expressions such as:
^#define\s+ROM_BEG_ADDR\s+\(uint16_t\)\(\&_rom_beg\)
^#define\s+ROM_END_ADDR\s+\(uint16_t\)\(\&_rom_end\)
And the replacement file can contain these two replacements:
#define ROM_BEG_ADDR \(0x4000u\)
#define ROM_END_ADDR \(0x8000u\)
With these matches and replacements, the following source code:
#include <stdint.h>
#define ROM_BEG_ADDR (uint16_t)(&_rom_beg)
#define ROM_END_ADDR (uint16_t)(&_rom_end)

void main() {
    uint16_t beg_addr = ROM_BEG_ADDR;
    uint16_t end_addr = ROM_END_ADDR;
}
is converted to the following preprocessed code before analysis:
#include <stdint.h>
#define ROM_BEG_ADDR (0x4000u)
#define ROM_END_ADDR (0x8000u)

void main() {
    uint16_t beg_addr = ROM_BEG_ADDR;
    uint16_t end_addr = ROM_END_ADDR;
}
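The line-by-line pairing and the dot tip can both be illustrated with a short Python sketch. This is an emulation and not how Polyspace is implemented: the helper name apply_pairs is hypothetical, the replacement lines are written with plain parentheses on the assumption that \( and \) resolve to ( and ), and re.DOTALL is used to imitate an engine whose dot also matches the linefeed character.

```python
import re

# One regular expression per line, as in the match file.
match_lines = [
    r"^#define\s+ROM_BEG_ADDR\s+\(uint16_t\)\(\&_rom_beg\)",
    r"^#define\s+ROM_END_ADDR\s+\(uint16_t\)\(\&_rom_end\)",
]
# The replacement on the corresponding line (escapes assumed already resolved).
replacement_lines = [
    "#define ROM_BEG_ADDR (0x4000u)",
    "#define ROM_END_ADDR (0x8000u)",
]

def apply_pairs(source: str) -> str:
    """Apply pattern i with replacement i, in file order (hypothetical helper)."""
    for pattern, replacement in zip(match_lines, replacement_lines):
        source = re.sub(pattern, lambda m, r=replacement: r, source,
                        flags=re.MULTILINE)
    return source

source = "\n".join([
    "#include <stdint.h>",
    "#define ROM_BEG_ADDR (uint16_t)(&_rom_beg)",
    "#define ROM_END_ADDR (uint16_t)(&_rom_end)",
])
converted = apply_pairs(source)
print(converted)

# Dot tip: with a dot that matches linefeeds, .* runs on into the next line,
# while [^\n]* stops at the end of the current line.
two_lines = "#define A 1\n#define B 2"
greedy = re.search(r"#define A.*", two_lines, flags=re.DOTALL).group(0)
bounded = re.search(r"#define A[^\n]*", two_lines).group(0)
print(repr(greedy))
print(repr(bounded))
```

Here greedy spans both directives while bounded contains only #define A 1, which is why [^\n] is the safer token when a pattern should stay within a single directive.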