
whitespacePattern

Match whitespace characters

Since R2020b

Description

pat = whitespacePattern creates a pattern that matches text composed of one or more whitespace characters such as spaces and tabs.


pat = whitespacePattern(N) matches text composed of exactly N whitespace characters.


pat = whitespacePattern(minCharacters,maxCharacters) matches text composed of at least minCharacters and at most maxCharacters whitespace characters. inf is a valid value for maxCharacters. The match is greedy: whitespacePattern matches as many whitespace characters as possible, up to maxCharacters.
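The difference between the exact-count and ranged syntaxes can be sketched as follows. This is a minimal illustration, not part of the original examples: the input text and the variable names exact and ranged are made up for the sketch, and extract is used only to make the individual matches visible.

```matlab
% Illustrative input: "a" and "b" separated by four spaces.
txt = "a    b";

% Each match must contain exactly two whitespace characters.
exact = extract(txt, whitespacePattern(2));

% Each match may contain one to three whitespace characters.
% Because matching is greedy, the first match takes as many as it can.
ranged = extract(txt, whitespacePattern(1,3));

% Show how many characters each match consumed.
strlength(exact)
strlength(ranged)
```

Comparing the two length vectors shows how the greedy ranged pattern divides the same run of spaces differently from the fixed-length pattern.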

Examples


Use whitespacePattern to match nonstandard whitespace characters like char(160).

Create a cell array of character vectors that each contain a single whitespace character, including tab and newline characters. Note that ' ' and char(32) are the same character.

whitespaces = {' ' char(9) newline char(32) char(160)}
whitespaces = 1×5 cell
    {' '}    {'→'}    {'↵'}    {' '}    {' '}

Build a pattern that matches whitespace characters using whitespacePattern. Determine which character vectors contain valid whitespace characters using contains.

pat = whitespacePattern;
contains(whitespaces,pat)
ans = 1×5 logical array

   1   1   1   1   1

Use whitespacePattern to replace nonstandard whitespace characters with the standard ' ' character.

Create txt as a character vector.

txt = ['This' char(9) 'char' newline 'vector' char(160) 'has' char(32) 'nonstandard' char(8193) 'spaces']
txt = 
    'This	char
     vector has nonstandard spaces'

Create pat as a pattern object that matches individual whitespace characters using whitespacePattern. Replace the parts of text matched with a single space.

pat = whitespacePattern(1);
txt = replace(txt,pat," ")
txt = 
'This char vector has nonstandard spaces'

Use whitespacePattern to correct spacing when more than one whitespace character exists.

Create txt as a string. Create pat as a pattern object that matches 2 or more whitespace characters using whitespacePattern. Replace the parts of text matched with a single space.

txt = "Text looks   strange    with    extra    spaces";
pat = whitespacePattern(2,inf);
txt = replace(txt,pat," ")
txt = 
"Text looks strange with extra spaces"
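The two preceding examples can also be combined. Because whitespacePattern with no arguments matches one or more whitespace characters, a single replace call should both normalize nonstandard whitespace and collapse repeated whitespace. The following is a sketch under that assumption; the input text and the variable name cleaned are illustrative.

```matlab
% Illustrative text mixing repeated spaces, a tab, and two nonbreaking spaces.
txt = "Text looks   strange" + char(9) + "with" + char(160) + char(160) + "extra spaces";

% whitespacePattern with no arguments matches runs of one or more
% whitespace characters, so each run is replaced by one standard space.
cleaned = replace(txt, whitespacePattern, " ")
```

Use this form when the exact number of whitespace characters in each run does not matter and every run should become a single space.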

Input Arguments


N is the number of characters to match, specified as a nonnegative integer scalar.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

minCharacters is the minimum number of characters to match, specified as a nonnegative integer scalar.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

maxCharacters is the maximum number of characters to match, specified as a nonnegative integer scalar or inf.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Output Arguments


pat is the pattern expression, returned as a pattern object.

More About


Definitions

A whitespace is any character or series of characters that represents horizontal or vertical space. When rendered, a whitespace character does not correspond to a visible mark, but it typically does occupy an area on a page. Common whitespace characters include:

Significant Whitespace Character    Description

char(32)                            Standard whitespace character, ' '
char(133)                           Next line
char(160)                           Nonbreaking space
char(8199)                          Figure space
char(8239)                          Narrow no-break space

For more information, see Whitespace character.
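One way to check how whitespacePattern treats the characters listed above is to test each one individually with matches. This is a sketch only; the variable names codes, chars, and isWhitespace are illustrative, and the result depends on the MATLAB release in use.

```matlab
% Character codes taken from the list above.
codes = [32 133 160 8199 8239];

% Convert each code to its own one-character string (5-by-1 string array).
chars = string(char(codes)');

% Test whether each string consists of exactly one whitespace character.
isWhitespace = matches(chars, whitespacePattern(1))
```

Each element of isWhitespace corresponds to one listed character, in the same order as codes.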

Extended Capabilities

Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.

Version History

Introduced in R2020b