Reference  

Regular Expressions for Text Searches

Sometimes searching for a literal string is too limiting. For example, you may want to find a word only when it starts a line, or two words separated by any number of spaces. The text-based editors in SilverStream eXtend Workbench support regular expressions (patterns that describe the strings to match) to augment the usual search capabilities.

This chapter includes the following sections:

    Using regular expressions in search operations
    Regular expression reference

NOTE   Regular expression search is available in the Search>Find in Files dialog and in the Find dialog invoked from a native editor.

 

Using regular expressions in search operations

In addition to allowing you to type regular expression syntax directly into the Search for text box, the Find dialog has a regular expression helper menu you can use in constructing regular expression searches.

Selecting a helper menu item appends one or more characters, which together form a syntactical building block, to the expression. Most regular expression searches combine several of these building blocks with text you type directly into the Search for text box.

    For more information on the regular expression helper menu items, see the Regular expression reference.

NOTE   Not all Find dialog search options are available in a regular expression search. For example, the Match whole word option becomes meaningless, because word boundaries are specified within the regular expression itself (with \b).

To use regular expressions in a search operation:

  1. Select Search>Find in Files or select Search>Find in a native editor.

    The Find dialog displays.

  2. Select the Regular Expression check box.

    OR

    Click the right-arrow to the right of the Search for text box and make a selection from the regular expression helper menu.

  3. Type a regular expression in the Search for text box, or use a combination of literal text and selections from the regular expression helper menu to construct your regular expression.

    For example:

    To match                                      Enter
    getText or setText                            [gs]etText
    void followed by main, with one or more       void\s+main
    whitespace characters between the two words

  4. Click OK to begin the search.

    Text matching the search criteria appears highlighted in the Workbench editor.
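The two example expressions from step 3 can be tried outside the editor. The following is a minimal sketch that uses Python's re module as a stand-in; it is not the Workbench search engine, and the sample text is made up for illustration.

```python
import re

# Hypothetical sample text standing in for a file open in the editor.
source = (
    "String a = field.getText();\n"
    "field.setText(a);\n"
    "public static void   main(String[] args) {}\n"
)

# [gs]etText: a character class (g or s) followed by literal text.
accessor_hits = re.findall(r"[gs]etText", source)

# void\s+main: two words separated by one or more whitespace characters.
main_hit = re.search(r"void\s+main", source)

print(accessor_hits)      # ['getText', 'setText']
print(main_hit.group(0))  # 'void   main' (three spaces are matched)
```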

 

Regular expression reference

The tables in this section explain the syntactical building blocks of regular expressions for Workbench. Many of these building blocks are available on the regular expression helper menu.

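There are several categories of building blocks for regular expressions:

    Characters
    Character classes
    Standard POSIX character classes
    Nonstandard POSIX-style character classes
    Predefined classes
    Boundary matchers
    Closure operators
    Backreferences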

 

Characters

Syntax         Description
unicodeChar    Matches any identical Unicode character
\              Quotes a metacharacter (such as *) so that it is matched literally
\\             Matches a single backslash (\) character
\0nnn          Matches the character with the given octal value
\xhh           Matches the character with the given 8-bit hexadecimal value
\uhhhh         Matches the character with the given 16-bit hexadecimal value
\t             Matches an ASCII tab character
\n             Matches an ASCII newline character
\r             Matches an ASCII return character
\f             Matches an ASCII formfeed character
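As an illustration of quoting and character escapes, here is a small sketch that uses Python's re module as a stand-in for the Workbench engine. The sample string is hypothetical, and the octal form is left out because Python's octal syntax differs from \0nnn.

```python
import re

# Sample text: contains a backslash (index 4), a tab (index 7), "A" and "*".
text = "path\\to\tfile A*"

star = re.search(r"A\*", text)       # \* quotes the metacharacter *
backslash = re.search(r"\\", text)   # \\ matches a single backslash
tab = re.search(r"\t", text)         # \t matches a tab character
hex_a = re.search(r"\x41", text)     # \x41 is "A" as an 8-bit hexadecimal value
uni_a = re.search(r"\u0041", text)   # \u0041 is "A" as a 16-bit hexadecimal value

print(star.group(0), backslash.start(), tab.start(), hex_a.start(), uni_a.start())
```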

 

Character classes

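Syntax      Description
[abc]       Simple character class (matches a, b, or c)
[a-zA-Z]    Character class with ranges (matches any one letter from a to z or A to Z)
[^abc]      Negated character class (matches any one character except a, b, or c)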
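The three kinds of character class behave as follows in a quick sketch. Python's re module is used here as a stand-in for the Workbench engine, and the sample strings are invented for the example.

```python
import re

sample = "cab-Zeta-42"

# [abc]: each match is exactly one of the listed characters.
simple = re.findall(r"[abc]", sample)

# [a-zA-Z]+: one or more letters from either range.
ranged = re.findall(r"[a-zA-Z]+", sample)

# [^abc]: any single character that is NOT a, b, or c.
negated = re.findall(r"[^abc]", "abcd-a")

print(simple)   # ['c', 'a', 'b', 'a']
print(ranged)   # ['cab', 'Zeta']
print(negated)  # ['d', '-']
```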

 

Standard POSIX character classes

Syntax       Description
[:alnum:]    Alphanumeric characters
[:alpha:]    Alphabetic characters
[:cntrl:]    Control characters
[:digit:]    Numeric characters
[:graph:]    Characters that are both printable and visible (for example, a space is printable but not visible, whereas the letter a is both)
[:lower:]    Lowercase alphabetic characters
[:print:]    Printable characters (characters that are not control characters)
[:punct:]    Punctuation characters (characters that are not letters, digits, control characters, or space characters)
[:space:]    Space characters (such as space, tab, and formfeed)
[:upper:]    Uppercase alphabetic characters
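Python's re module does not accept the POSIX class names, so the sketch below only approximates a few of them with explicit ASCII character sets to show what each class covers. The POSIX_ASCII table and the posix_class helper are hypothetical names introduced for this example, and the ASCII-only coverage is an assumption; the Workbench engine may treat non-ASCII characters differently.

```python
import re
import string

# ASCII-only approximations of some POSIX classes (an assumption of this sketch).
POSIX_ASCII = {
    "alnum": string.ascii_letters + string.digits,
    "alpha": string.ascii_letters,
    "digit": string.digits,
    "lower": string.ascii_lowercase,
    "upper": string.ascii_uppercase,
    "punct": string.punctuation,
    "space": string.whitespace,
}

def posix_class(name):
    """Return a compiled pattern matching one character of the named class."""
    return re.compile("[" + re.escape(POSIX_ASCII[name]) + "]")

sample = "Ab1, x"
digits = posix_class("digit").findall(sample)
puncts = posix_class("punct").findall(sample)
uppers = posix_class("upper").findall(sample)

print(digits, puncts, uppers)  # ['1'] [','] ['A']
```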

 

Nonstandard POSIX-style character classes

Syntax           Description
[:javastart:]    Characters that can start a Java identifier
[:javapart:]     Characters that can be part of a Java identifier

 

Predefined classes

Syntax    Description
.         Matches any character other than newline
\w        Matches a word character (alphanumeric plus "_")
\W        Matches a nonword character
\s        Matches a whitespace character
\S        Matches a nonwhitespace character
\d        Matches a digit character
\D        Matches a nondigit character
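The predefined classes can be seen side by side in a short sketch. Python's re module stands in for the Workbench engine here, and the sample line is made up.

```python
import re

line = "int count_2 = 42;"

words = re.findall(r"\w+", line)      # runs of word characters, "_" included
numbers = re.findall(r"\d+", line)    # runs of digits
spaces = re.findall(r"\s", line)      # individual whitespace characters
non_words = re.findall(r"\W", line)   # everything that is not a word character
any_char = re.findall(r".", "a\nb")   # . skips the newline

print(words)      # ['int', 'count_2', '42']
print(numbers)    # ['2', '42']
print(non_words)  # [' ', ' ', '=', ' ', ';']
print(any_char)   # ['a', 'b']
```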

 

Boundary matchers

Syntax    Description
^         Matches only at the beginning of a line
$         Matches only at the end of a line
\b        Matches only at a word boundary
\B        Matches only at a position that is not a word boundary
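Boundary matchers match positions rather than characters, which the following sketch tries to show. It uses Python's re module as a stand-in for the Workbench engine; in Python the re.MULTILINE flag is needed to make ^ and $ work per line, which is an artifact of the stand-in and not a Workbench setting.

```python
import re

text = "set x\nreset y\nset z"

# ^set: "set" only at the beginning of a line (lines 1 and 3, not "reset").
line_starts = re.findall(r"^set", text, re.MULTILINE)

# \w$: the last word character on each line.
line_ends = re.findall(r"\w$", text, re.MULTILINE)

# \bset\b: "set" as a whole word.
whole_words = re.findall(r"\bset\b", text)

# \Bset: "set" only when it does NOT begin at a word boundary (inside "reset").
inner = [m.start() for m in re.finditer(r"\Bset", text)]

print(len(line_starts), line_ends, len(whole_words), inner)
```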

 

Closure operators

All closure operators (+, *, ?, {n,m}) are greedy by default, meaning that they match as many elements of the string as possible without causing the overall match to fail.

Syntax    Description
A*        Matches A zero or more times
A+        Matches A one or more times
A?        Matches A zero or one time
A{n}      Matches A exactly n times
A{n,}     Matches A at least n times
A{n,m}    Matches A at least n but not more than m times
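Greedy matching and the counted forms are easiest to see on small inputs. The sketch below uses Python's re module as a stand-in for the Workbench engine, with invented sample strings.

```python
import re

# Greedy +: .+ runs to the LAST ">" that still lets the match succeed.
greedy = re.search(r"<.+>", "<b>bold</b>").group(0)

# ?: the preceding element is optional.
optional = re.findall(r"colou?r", "color colour")

numbers = "12 345 6789"
exact = re.findall(r"\d{3}", numbers)      # exactly three digits per match
at_least = re.findall(r"\d{3,}", numbers)  # three or more digits
bounded = re.findall(r"\d{2,3}", numbers)  # two or three digits, preferring three

print(greedy)    # <b>bold</b>
print(optional)  # ['color', 'colour']
print(exact)     # ['345', '678']
print(at_least)  # ['345', '6789']
print(bounded)   # ['12', '345', '678']
```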

 

Backreferences

You can refer to the text matched by a parenthesized expression from within the same regular expression by using a backreference. The text matched by the first parenthesized expression is denoted by \1, the second by \2, and so on. For example, the expression:

  ([0-9]+)=\1

will match any string of the form n=n, where n is the same sequence of digits on both sides (like 0=0 or 42=42). It does not match 1=2.
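The backreference example can be checked with a short sketch. Python's re module is used as a stand-in for the Workbench engine, and the test strings are chosen only for illustration.

```python
import re

# ([0-9]+) captures a run of digits; \1 must then repeat the same text.
pattern = re.compile(r"([0-9]+)=\1")

same = pattern.fullmatch("42=42")       # both sides identical: matches
different = pattern.fullmatch("42=43")  # right side differs: no match

print(same.group(1))      # 42
print(different is None)  # True
```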

 


Copyright © 2002, SilverStream Software, Inc. All rights reserved.