java.util.regex

Class Pattern

    • Field Detail

      • UNIX_LINES

        public static final int UNIX_LINES
        Enables Unix lines mode.

        In this mode, only the '\n' line terminator is recognized in the behavior of ., ^, and $.

        Unix lines mode can also be enabled via the embedded flag expression (?d).
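
        As a minimal sketch of the difference, the example below checks whether . matches a carriage return with and without the flag. The class and method names are illustrative only, and the flag is passed through the two-argument Pattern.compile(String, int) overload.

```java
import java.util.regex.Pattern;

final class UnixLinesDemo {
    // True if the regex "a.b" matches the entire input under the given flags.
    static boolean dotMatches(String input, int flags) {
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // By default '\r' counts as a line terminator, so '.' does not match it.
        System.out.println(dotMatches("a\rb", 0));                  // false
        // In Unix lines mode only '\n' is treated as a line terminator.
        System.out.println(dotMatches("a\rb", Pattern.UNIX_LINES)); // true
        System.out.println(dotMatches("a\nb", Pattern.UNIX_LINES)); // false
    }
}
```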

        See Also:
        Constant Field Values
      • CASE_INSENSITIVE

        public static final int CASE_INSENSITIVE
        Enables case-insensitive matching.

        By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the UNICODE_CASE flag in conjunction with this flag.

        Case-insensitive matching can also be enabled via the embedded flag expression (?i).

        Specifying this flag may impose a slight performance penalty.
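
        The two ways of enabling the mode can be compared side by side. This is only a sketch with hypothetical helper names; the regex "hello" is an arbitrary choice.

```java
import java.util.regex.Pattern;

final class CaseInsensitiveDemo {
    // Enables the mode with the CASE_INSENSITIVE flag.
    static boolean viaFlag(String input) {
        return Pattern.compile("hello", Pattern.CASE_INSENSITIVE).matcher(input).matches();
    }

    // Enables the mode with the embedded flag expression (?i).
    static boolean viaEmbedded(String input) {
        return Pattern.compile("(?i)hello").matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(viaFlag("HeLLo"));                   // true
        System.out.println(viaEmbedded("HELLO"));               // true
        System.out.println(Pattern.matches("hello", "HELLO"));  // false: case-sensitive by default
    }
}
```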

        See Also:
        Constant Field Values
      • COMMENTS

        public static final int COMMENTS
        Permits whitespace and comments in pattern.

        In this mode, whitespace is ignored, and embedded comments starting with # are ignored until the end of a line.

        Comments mode can also be enabled via the embedded flag expression (?x).
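
        One possible use of comments mode is to annotate each part of a pattern. The sketch below assumes a made-up "year-month" format; the class name and the pattern are illustrative, not part of the API.

```java
import java.util.regex.Pattern;

final class CommentsDemo {
    // Whitespace and "# ..." comments inside the pattern are ignored.
    private static final Pattern YEAR_MONTH = Pattern.compile(
            "(\\d{4})  # four-digit year\n"
          + "-         # literal separator\n"
          + "(\\d{2})  # two-digit month",
            Pattern.COMMENTS);

    static boolean isYearMonth(String s) {
        return YEAR_MONTH.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isYearMonth("2020-03"));   // true
        // Only whitespace in the pattern is ignored, not whitespace in the input.
        System.out.println(isYearMonth("2020 - 03")); // false
    }
}
```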

        See Also:
        Constant Field Values
      • MULTILINE

        public static final int MULTILINE
        Enables multiline mode.

        In multiline mode the expressions ^ and $ match just after or just before, respectively, a line terminator or the end of the input sequence. By default these expressions only match at the beginning and the end of the entire input sequence.

        Multiline mode can also be enabled via the embedded flag expression (?m).
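
        To illustrate the effect on ^, the following sketch counts how many times "^\w+" is found in a three-line input. The helper name countLineStarts is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultilineDemo {
    // Counts the matches of "^\w+" in the input under the given flags.
    static int countLineStarts(String input, int flags) {
        Matcher m = Pattern.compile("^\\w+", flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "one\ntwo\nthree";
        System.out.println(countLineStarts(text, 0));                 // 1: only the start of input
        System.out.println(countLineStarts(text, Pattern.MULTILINE)); // 3: the start of each line
    }
}
```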

        See Also:
        Constant Field Values
      • LITERAL

        public static final int LITERAL
        Enables literal parsing of the pattern.

        When this flag is specified then the input string that specifies the pattern is treated as a sequence of literal characters. Metacharacters or escape sequences in the input sequence will be given no special meaning.

        The flags CASE_INSENSITIVE and UNICODE_CASE retain their impact on matching when used in conjunction with this flag. The other flags become superfluous.

        There is no embedded flag character for enabling literal parsing.
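
        A short sketch of literal parsing, including its combination with CASE_INSENSITIVE, might look like this. The helper literalMatch is an illustrative name.

```java
import java.util.regex.Pattern;

final class LiteralDemo {
    // Compiles text as a literal pattern, optionally combined with extra flags.
    static boolean literalMatch(String text, String input, int extraFlags) {
        return Pattern.compile(text, Pattern.LITERAL | extraFlags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(literalMatch("a.b", "a.b", 0)); // true
        System.out.println(literalMatch("a.b", "axb", 0)); // false: '.' is not a metacharacter here
        System.out.println(literalMatch("a.b", "A.B", Pattern.CASE_INSENSITIVE)); // true
    }
}
```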

        Since:
        1.5
        See Also:
        Constant Field Values
      • DOTALL

        public static final int DOTALL
        Enables dotall mode.

        In dotall mode, the expression . matches any character, including a line terminator. By default this expression does not match line terminators.

        Dotall mode can also be enabled via the embedded flag expression (?s). (The s is a mnemonic for "single-line" mode, which is what this is called in Perl.)
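
        The sketch below contrasts the default behavior of . with dotall mode, enabled either by the flag or by (?s). Names are illustrative.

```java
import java.util.regex.Pattern;

final class DotallDemo {
    // True if regex matches the entire input under the given flags.
    static boolean matches(String regex, String input, int flags) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("a.b", "a\nb", 0));              // false
        System.out.println(matches("a.b", "a\nb", Pattern.DOTALL)); // true
        System.out.println(matches("(?s)a.b", "a\nb", 0));          // true
    }
}
```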

        See Also:
        Constant Field Values
      • UNICODE_CASE

        public static final int UNICODE_CASE
        Enables Unicode-aware case folding.

        When this flag is specified then case-insensitive matching, when enabled by the CASE_INSENSITIVE flag, is done in a manner consistent with the Unicode Standard. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched.

        Unicode-aware case folding can also be enabled via the embedded flag expression (?u).

        Specifying this flag may impose a performance penalty.
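
        As an illustration, the sketch below matches a lowercase é (U+00E9) against an uppercase É (U+00C9). The characters and helper names are arbitrary choices for this example.

```java
import java.util.regex.Pattern;

final class UnicodeCaseDemo {
    // True if regex matches the entire input under the given flags.
    static boolean matches(String regex, String input, int flags) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String lower = "\u00e9"; // e with acute accent, lowercase
        String upper = "\u00c9"; // E with acute accent, uppercase
        // CASE_INSENSITIVE alone only folds US-ASCII letters.
        System.out.println(matches(lower, upper, Pattern.CASE_INSENSITIVE)); // false
        System.out.println(matches(lower, upper,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE));           // true
    }
}
```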

        See Also:
        Constant Field Values
      • CANON_EQ

        public static final int CANON_EQ
        Enables canonical equivalence.

        When this flag is specified then two characters will be considered to match if, and only if, their full canonical decompositions match. The expression "a\u030A", for example, will match the string "\u00E5" when this flag is specified. By default, matching does not take canonical equivalence into account.

        There is no embedded flag character for enabling canonical equivalence.

        Specifying this flag may impose a performance penalty.
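
        The example from the description above can be written out roughly as follows. This is a sketch with illustrative names; it only restates the documented "a\u030A" versus "\u00E5" case.

```java
import java.util.regex.Pattern;

final class CanonEqDemo {
    // True if regex matches the entire input under the given flags.
    static boolean matches(String regex, String input, int flags) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String decomposed = "a\u030A"; // 'a' followed by a combining ring above
        String composed = "\u00E5";    // precomposed a with ring above
        System.out.println(matches(decomposed, composed, 0));                // false
        System.out.println(matches(decomposed, composed, Pattern.CANON_EQ)); // true
    }
}
```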

        See Also:
        Constant Field Values
      • UNICODE_CHARACTER_CLASS

        public static final int UNICODE_CHARACTER_CLASS
        Enables the Unicode version of Predefined character classes and POSIX character classes.

        When this flag is specified then the (US-ASCII only) Predefined character classes and POSIX character classes are in conformance with Unicode Technical Standard #18: Unicode Regular Expressions, Annex C: Compatibility Properties.

        The UNICODE_CHARACTER_CLASS mode can also be enabled via the embedded flag expression (?U).

        The flag implies UNICODE_CASE, that is, it enables Unicode-aware case folding.

        Specifying this flag may impose a performance penalty.
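
        A small sketch of the effect on the predefined class \w, using a word that contains a non-ASCII letter. The input string and the helper name isWord are illustrative.

```java
import java.util.regex.Pattern;

final class UnicodeClassDemo {
    // True if the whole input consists of \w characters under the given flags.
    static boolean isWord(String input, int flags) {
        return Pattern.compile("\\w+", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String word = "h\u00e9llo"; // contains e with acute accent
        // By default \w covers only US-ASCII letters, digits and underscore.
        System.out.println(isWord(word, 0));                               // false
        System.out.println(isWord(word, Pattern.UNICODE_CHARACTER_CLASS)); // true
    }
}
```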

        Since:
        1.7
        See Also:
        Constant Field Values
    • Method Detail

      • compile

        public static Pattern compile(String regex)
        Compiles the given regular expression into a pattern.

        Parameters:
        regex - The expression to be compiled
        Throws:
        PatternSyntaxException - If the expression's syntax is invalid
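
        When the regular expression comes from user input, the PatternSyntaxException can be handled explicitly. The sketch below wraps compile in a hypothetical tryCompile helper that returns an empty Optional for invalid syntax; this is one possible approach, not a prescribed one.

```java
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class CompileDemo {
    // Returns the compiled pattern, or an empty Optional if the syntax is invalid.
    static Optional<Pattern> tryCompile(String regex) {
        try {
            return Optional.of(Pattern.compile(regex));
        } catch (PatternSyntaxException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCompile("a+b").isPresent()); // true
        System.out.println(tryCompile("(").isPresent());   // false: unclosed group
    }
}
```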
      • pattern

        public String pattern()
        Returns the regular expression from which this pattern was compiled.

        Returns:
        The source of this pattern
      • toString

        public String toString()

        Returns the string representation of this pattern. This is the regular expression from which this pattern was compiled.

        Overrides:
        toString in class Object
        Returns:
        The string representation of this pattern
        Since:
        1.5
      • matcher

        public Matcher matcher(CharSequence input)
        Creates a matcher that will match the given input against this pattern.

        Parameters:
        input - The character sequence to be matched
        Returns:
        A new matcher for this pattern
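
        A typical use of the returned matcher is to iterate over all matches with Matcher.find() and Matcher.group(). The findAll helper below is an illustrative sketch, not part of the API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherDemo {
    // Collects every subsequence of the input that matches the regex.
    static List<String> findAll(String regex, CharSequence input) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a1b22c333")); // [1, 22, 333]
    }
}
```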
      • flags

        public int flags()
        Returns this pattern's match flags.

        Returns:
        The match flags specified when this pattern was compiled
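
        The accessors pattern(), toString() and flags() can be exercised together, as in the sketch below. It assumes the pattern is built with the two-argument Pattern.compile(String, int) overload; the helper names are hypothetical.

```java
import java.util.regex.Pattern;

final class PatternInspectDemo {
    // Builds a sample pattern with two flags set.
    static Pattern build() {
        return Pattern.compile("a*b", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
    }

    // True if the given flag bit is set in the pattern's match flags.
    static boolean hasFlag(Pattern p, int flag) {
        return (p.flags() & flag) != 0;
    }

    public static void main(String[] args) {
        Pattern p = build();
        System.out.println(p.pattern());                           // a*b
        System.out.println(p.toString());                          // a*b
        System.out.println(hasFlag(p, Pattern.CASE_INSENSITIVE));  // true
        System.out.println(hasFlag(p, Pattern.DOTALL));            // false
    }
}
```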
      • matches

        public static boolean matches(String regex,
                      CharSequence input)
        Compiles the given regular expression and attempts to match the given input against it.

        An invocation of this convenience method of the form

         Pattern.matches(regex, input);
        behaves in exactly the same way as the expression
         Pattern.compile(regex).matcher(input).matches()

        If a pattern is to be used multiple times, compiling it once and reusing it will be more efficient than invoking this method each time.

        Parameters:
        regex - The expression to be compiled
        input - The character sequence to be matched
        Throws:
        PatternSyntaxException - If the expression's syntax is invalid
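
        The advice about reuse can be sketched as follows: oneShot recompiles the expression on every call, while reused compiles it once into a constant. Both helper names and the "\d+" expression are illustrative.

```java
import java.util.regex.Pattern;

final class MatchesDemo {
    // Compiled once and shared; Pattern instances are immutable.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Convenience form: compiles the regex on each invocation.
    static boolean oneShot(String input) {
        return Pattern.matches("\\d+", input);
    }

    // Reuses the precompiled pattern.
    static boolean reused(String input) {
        return DIGITS.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(oneShot("123") + " " + reused("123")); // true true
        System.out.println(oneShot("12a") + " " + reused("12a")); // false false
    }
}
```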
      • split

        public String[] split(CharSequence input,
                     int limit)
        Splits the given input sequence around matches of this pattern.

        The array returned by this method contains each substring of the input sequence that is terminated by another subsequence that matches this pattern or is terminated by the end of the input sequence. The substrings in the array are in the order in which they occur in the input. If this pattern does not match any subsequence of the input then the resulting array has just one element, namely the input sequence in string form.

        The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is negative then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

        The input "boo:and:foo", for example, yields the following results with these parameters:

        Regex    Limit    Result
        :        2        { "boo", "and:foo" }
        :        5        { "boo", "and", "foo" }
        :        -2       { "boo", "and", "foo" }
        o        5        { "b", "", ":and:f", "", "" }
        o        -2       { "b", "", ":and:f", "", "" }
        o        0        { "b", "", ":and:f" }

        Parameters:
        input - The character sequence to be split
        limit - The result threshold, as described above
        Returns:
        The array of strings computed by splitting the input around matches of this pattern
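
        The documented results for "boo:and:foo" can be reproduced with a few lines. The split helper below is an illustrative wrapper around Pattern.compile(regex).split(input, limit).

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitLimitDemo {
    // Splits input around matches of regex using the given limit.
    static String[] split(String regex, String input, int limit) {
        return Pattern.compile(regex).split(input, limit);
    }

    public static void main(String[] args) {
        String input = "boo:and:foo";
        System.out.println(Arrays.toString(split(":", input, 2)));  // [boo, and:foo]
        System.out.println(Arrays.toString(split(":", input, -2))); // [boo, and, foo]
        // A negative limit keeps trailing empty strings; zero discards them.
        System.out.println(split("o", input, -2).length);           // 5
        System.out.println(split("o", input, 0).length);            // 3
    }
}
```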
      • split

        public String[] split(CharSequence input)
        Splits the given input sequence around matches of this pattern.

        This method works as if by invoking the two-argument split method with the given input sequence and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.

        The input "boo:and:foo", for example, yields the following results with these expressions:

        Regex    Result
        :        { "boo", "and", "foo" }
        o        { "b", "", ":and:f" }

        Parameters:
        input - The character sequence to be split
        Returns:
        The array of strings computed by splitting the input around matches of this pattern
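
        A common use of the one-argument form is splitting delimited fields. The sketch below assumes a comma delimiter with optional surrounding whitespace; the pattern and the fields helper are illustrative choices.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitDemo {
    // A comma with optional whitespace on either side.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Splits a line into fields; trailing empty fields are dropped.
    static String[] fields(String line) {
        return COMMA.split(line);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fields("a , b,c"))); // [a, b, c]
        System.out.println(Arrays.toString(fields("a,b,,")));   // [a, b]
    }
}
```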
      • quote

        public static String quote(String s)
        Returns a literal pattern String for the specified String.

        This method produces a String that can be used to create a Pattern that would match the string s as if it were a literal pattern.

        Metacharacters or escape sequences in the input sequence will be given no special meaning.
        Parameters:
        s - The string to be literalized
        Returns:
        A literal string replacement
        Since:
        1.5
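
        As a sketch of why quoting matters, the example below matches the string "1+1=2" literally. Without quote, the + would act as a quantifier. The helper matchesLiterally is a hypothetical name.

```java
import java.util.regex.Pattern;

final class QuoteDemo {
    // Treats the literal argument as plain text rather than as a regex.
    static boolean matchesLiterally(String literal, String input) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesLiterally("1+1=2", "1+1=2")); // true
        System.out.println(matchesLiterally("1+1=2", "11=2"));  // false
        // Unquoted, "1+" means "one or more 1s", so the regex matches "11=2" instead.
        System.out.println(Pattern.matches("1+1=2", "11=2"));   // true
        System.out.println(Pattern.matches("1+1=2", "1+1=2"));  // false
    }
}
```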

Document created on 11/06/2005, last modified on 04/03/2020
Source of the printed document: https://www.gaudry.be/en/java-api-rf-java/util/regex/pattern.html

The infobrol is a personal site whose content is my sole responsibility. The text is available under CreativeCommons license (BY-NC-SA). More info on the terms of use and the author.
