Expensive construction of std::string or std::regex from constant string

A constant std::string or std::regex object is constructed from constant data, resulting in inefficient code

Since R2020b

Description

Polyspace® reports this defect when both of these conditions are true:

  • You construct an std::string or std::regex object from constant data, such as a string literal or the output of a constexpr function.

  • The std::string or std::regex object remains constant or unmodified after the construction.

This defect checker does not flag class member variables or string literals that are passed as function arguments.

Risk

Consider a code block that constructs an std::string or std::regex object from constant data that remains unmodified after construction. Every time this code block executes, a new object is constructed from the same constant data. Repeated construction of such an object with no modification to the content is inefficient and difficult to detect. Consider this code:

#include <string>
#include <regex>
// A string literal converts only to const char* in C++11 and later
constexpr const char* getStrPtr() { 
	return "abcd"; 
}
void foo(){
	std::string s1 = "abcd";
	std::regex reg1{"(\\w+)"};
	std::string s2 = getStrPtr();
}
int main(){
	//...
	for(int i = 0; i<10000; ++i)
		foo();
}
In this code, main() invokes the function foo 10000 times in a loop. Each time, the function foo constructs the objects s1, reg1, and s2 from the same constant data, resulting in inefficient code. Because such inefficient code compiles and functions correctly, you might not notice the inefficient construction of these objects from constant data.
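To make the repeated construction visible, the following minimal sketch counts constructions instead of relying on timing. The type CountedString, the counter g_constructions, and the functions fooPerCall and countConstructions are hypothetical names introduced only for this illustration. They are not part of Polyspace or the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical global counter, incremented by every CountedString construction.
std::size_t g_constructions = 0;

// Hypothetical wrapper: owns a std::string and records each construction.
struct CountedString {
    std::string value;
    explicit CountedString(const char* text) : value(text) { ++g_constructions; }
};

// Mirrors foo() above: the local object is built from the same literal
// on every call and is never modified afterward.
void fooPerCall() {
    const CountedString s1{"abcd"};
    (void)s1;
}

// Calls fooPerCall() 'calls' times and returns how many constructions occurred.
std::size_t countConstructions(std::size_t calls) {
    g_constructions = 0;
    for (std::size_t i = 0; i < calls; ++i) {
        fooPerCall();
    }
    return g_constructions;
}
```

Under these assumptions, countConstructions(10000) returns 10000, that is, one construction for each call, even though every object holds identical data.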

Fix

The fix for this defect depends on how you intend to use the constant data.

  • If you do not need to reuse the constant data, use it directly as a temporary literal.

  • If you need to reuse the constant data and you need the functionalities of the std::string or std::regex class, store the constant data in a static const std::string or std::regex object.

  • If you need to reuse the constant data and you do not need the functionalities of the std::string or std::regex class, store the constant data by using a const character array or an std::string_view object. Use of std::string_view is supported by C++17 and later.

Consider this code:

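#include <string>
#include <string_view>
#include <regex>
// A string literal converts only to const char* in C++11 and later
constexpr const char* getStrPtr() { 
	return "abcd"; 
}
void foo(){
	static const std::string s1 = getStrPtr();
	static const std::regex reg1{"(\\w+)"};
	std::string_view s1a{s1};
}
int main(){
	//...
	for(int i = 0; i<10000; ++i)
		foo();
}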
In this code, you declare the objects reg1 and s1 as static const. Because these objects are static, they are constructed only once even if foo is called 10000 times. The std::string_view object s1a refers to the content of s1 and avoids the construction of another std::string object. By using std::string_view and static const objects, you avoid unnecessary construction of constant objects. This method also clarifies that the objects s1 and s1a represent the same data.
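The code above covers the static const and std::string_view options. The remaining options in the list, using the literal directly and storing the data in a const character array, might look like the following minimal sketch. The helper lengthOf and the names kData and kDataArray are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper for illustration: returns the number of characters.
std::size_t lengthOf(std::string_view text) {
    return text.size();
}

// No reuse needed: pass the literal directly instead of first
// building a named std::string from it.
std::size_t useLiteralDirectly() {
    return lengthOf("abcd");
}

// Reuse needed, but no std::string functionality required.
// Neither object allocates memory or copies the characters at run time.
constexpr std::string_view kData{"abcd"};   // requires C++17 or later
constexpr char kDataArray[] = "abcd";       // also works before C++17

std::size_t reuseWithoutStdString() {
    return lengthOf(kData) + lengthOf(kDataArray);
}
```

In this sketch, useLiteralDirectly() returns 4 and reuseWithoutStdString() returns 8, and no std::string object is constructed in either function.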

Performance improvements might vary based on the compiler, library implementation, and environment that you are using.
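Because the gain depends on your toolchain, you might want to measure it yourself. The following is a rough sketch of one way to do so with std::chrono, not a rigorous benchmark. The functions matchesPerCall, matchesStatic, elapsedMicroseconds, and compareVariants are hypothetical names, and the iteration count of 1000 is an arbitrary assumption.

```cpp
#include <cassert>
#include <chrono>
#include <regex>
#include <string>

// Constructs the regex from the same literal on every call.
bool matchesPerCall(const std::string& text) {
    const std::regex reg1{"(\\w+)"};
    return std::regex_match(text, reg1);
}

// Constructs the regex once, the first time the function is called.
bool matchesStatic(const std::string& text) {
    static const std::regex reg1{"(\\w+)"};
    return std::regex_match(text, reg1);
}

// Hypothetical helper: runs 'work' 'iterations' times and returns the
// elapsed time in microseconds, measured with a monotonic clock.
template <typename Work>
long long elapsedMicroseconds(Work work, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Times both variants on 'text' and reports whether they agree on the result.
bool compareVariants(const std::string& text, long long& perCallUs, long long& staticUs) {
    perCallUs = elapsedMicroseconds([&] { (void)matchesPerCall(text); }, 1000);
    staticUs = elapsedMicroseconds([&] { (void)matchesStatic(text); }, 1000);
    return matchesPerCall(text) == matchesStatic(text);
}
```

Both variants return the same match result. Only the two timings differ, and the size of that difference depends on the compiler, optimization level, and standard library that you use.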

Examples

In this example, you declare several std::string and std::regex objects that are not modified after their construction.

  • Because foo() constructs the objects reg1, s1, s2, s5, and s6 from string literals and these objects remain unmodified after construction, Polyspace flags these objects.

  • Because foo() constructs s3 and s4 from compile-time constants and these objects remain unmodified after construction, Polyspace flags these objects. The object s3a is not flagged because the output of getCount is not a constant.

  • Polyspace does not flag these objects when they are constructed from constant data:

    • An object that is not an std::string or std::regex, such as the character pointer p.

    • Temporary objects that are constructed as a function argument, such as the object containing the string literal message in the argument of CallFunc.

#include <cstddef>
#include <string>
#include <regex>
// A string literal converts only to const char* in C++11 and later
constexpr const char* getStrPtr() { 
	return "abcd"; 
}
constexpr size_t FOUR(){
	return 4; 
}
size_t getCount();
void CallFunc(std::string s);
void foo(){
	std::regex reg1{"(\\w+)"};
	std::string s1 = "abcd";
	std::string s2{"abcd"};
	std::string s3 = getStrPtr();  
	std::string s4("abcd", FOUR());
	std::string s5("abcd"), s6("abcd");
}

void bar(){
	std::string s3a("abcd", getCount());
	const char *p = "abcd";
	std::string s_p = p;
	CallFunc("message");
}
Correction

You can fix this defect in several ways:

  • Declare the objects s1, reg1, and s4 as static const. A static local object is constructed only once, the first time execution reaches its declaration, instead of on every call to foo(). When you need the functionalities of the std::string or std::regex class, this declaration is a good fix.

  • Declare s3 by using a constant character pointer p.

  • Declare the constant strings s2, s5 and s6 as std::string_view objects. These objects do not contain copies of the constant strings, which makes these objects efficient.


#include <cstddef>
#include <string>
#include <string_view>
#include <regex>
// A string literal converts only to const char* in C++11 and later
constexpr const char* getStrPtr() { 
	return "abcd"; 
}
constexpr size_t FOUR(){
	return 4; 
}

void foo(){
	static const std::string s1 = "abcd";
	static const std::regex reg1{"(\\w+)"};
	std::string_view s2{s1};
	const char *p = getStrPtr();
	std::string s3 = p;
	static const std::string s4("abcd", FOUR());
	std::string_view s5{s1}, s6{s4};
}
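
To see why the std::string_view objects in this correction are inexpensive, you can compare buffer addresses. The following minimal sketch assumes a hypothetical accessor sharedText() that exposes a static const string. The function names are illustrative only.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical accessor: the string is constructed only on the first call
// and stays alive until the program ends.
const std::string& sharedText() {
    static const std::string s1 = "abcd";
    return s1;
}

// True when a std::string_view of s1 points at s1's own buffer (no copy).
bool viewSharesStorage() {
    const std::string& s1 = sharedText();
    const std::string_view s2{s1};
    return s2.data() == s1.data() && s2.size() == s1.size();
}

// True when a std::string copy of s1 owns a separate buffer.
bool copyHasOwnStorage() {
    const std::string& s1 = sharedText();
    const std::string copy{s1};
    return copy.data() != s1.data();
}
```

Both functions return true: the view reuses the characters of s1, while the copy duplicates them. Because an std::string_view does not own its data, it is valid only while the object it refers to is alive. That condition holds here because the string is static.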

Result Information

Group: Performance
Language: C++
Default: Off
Command-Line Syntax: EXPENSIVE_CONSTANT_STD_STRING
Impact: Medium

Version History

Introduced in R2020b
