A regular expression (regex for short) is a pattern language for searching text. If you need a regex, just ask an [[LLM]]. Here's some basic syntax that will help you read regex:

### Character Basics

| Pattern | Meaning | Example Matches |
|--------|---------|-----------------|
| `.` | Any single character (except newline) | `a`, `7`, `_` |
| `[abc]` | Character class: any one of `a`, `b`, or `c` | `a` |
| `[^abc]` | Negated class: anything *except* `a`, `b`, or `c` | `d`, `5` |
| `[A-Za-z]` | Any ASCII letter | `H`, `q` |
| `[0-9]` or `\d` | Any digit | `4` |
| `\w` | Word char: letters, digits, `_` | `A`, `3`, `_` |
| `\s` | Whitespace | space, tab, newline |

### Repetition Quantifiers

A quantifier applies to the character, class, or group immediately before it.

| Pattern | Meaning | Example Matches |
|--------|---------|-----------------|
| `*` | 0 or more | `""`, `a`, `aaaa` |
| `+` | 1 or more | `a`, `aaaa` |
| `?` | 0 or 1 | `""`, `a` |
| `{n}` | Exactly `n` | `aaa` for `a{3}` |
| `{n,}` | `n` or more | `aaa...` for `a{3,}` |
| `{n,m}` | Between `n` and `m` (inclusive) | `aa`, `aaa` for `a{2,3}` |

### Anchors (Start/End of String)

| Pattern | Meaning | Example Matches |
|--------|---------|-----------------|
| `^` | Start of string | `^abc` matches `"abc123"` |
| `$` | End of string | `abc$` matches `"123abc"` |
| `\b` | Word boundary | `\bcat\b` matches `"cat"` but not `"scatter"` |
| `\B` | Not a word boundary | `\Bcat` matches `"scatter"` |

### Escaping Special Characters

| Pattern | Meaning | Example |
|--------|---------|---------|
| `\.` | Literal period `.` | matches `"a.b"` |
| `\*` | Literal asterisk `*` | matches `"5*3"` |
| `\\` | Literal backslash | matches a single `\` |

### Grouping and Capturing

| Pattern | Meaning | Example Matches |
|--------|---------|-----------------|
| `(abc)` | Capturing group (stored match) | captures `"abc"` |
| `(a\|b)` | Alternation (OR) | matches `"a"` or `"b"` |
| `(?:abc)` | Non-capturing group | matches `"abc"` but not stored |
| `(?P<name>abc)` | Named group (Python syntax; JavaScript uses `(?<name>abc)`) | captures `"abc"` into `name` |

### Common Practical Patterns

| Goal | Pattern | Example Matches |
|------|---------|-----------------|
| Letters only | `^[A-Za-z]+$` | `"Hello"` |
| Alphanumeric | `^[A-Za-z0-9]+$` | `"abc123"` |
| Email prefix starting with a letter | `^[A-Za-z][A-Za-z0-9_.-]*` | `"A_user-12"` |
| Integer | `^-?\d+$` | `"-34"`, `"120"` |
| Float | `^-?\d+(\.\d+)?$` | `"34.5"` |
| Simple email | `^\w+@\w+\.\w+$` | `"user@example.com"` |

### Matching vs. Searching

| Concept | Meaning | Example |
|--------|---------|---------|
| **Match** (`^...$`) | Entire string must match | `^[A-Z]+$` matches `"HELLO"` but not `"Hello"` |
| **Search** | Pattern appears anywhere | searching for `cat` finds it in `"scattered"` |

### Putting It All Together (Example)

Rule for valid emails:

```
^[A-Za-z][A-Za-z0-9_.-]*@domain\.com$
```

| Part              | Meaning                      |
| ----------------- | ---------------------------- |
| `^`               | start of string              |
| `[A-Za-z]`        | must start with a letter     |
| `[A-Za-z0-9_.-]*` | allowed characters in prefix |
| `@domain\.com`    | required domain              |
| `$`               | end of string                |

## Python

Use the `re` module in [[Python]] for regex. Write patterns as **raw strings** (`r"pattern"`) so that backslashes reach the regex engine unchanged.

```python
import re

re.match(r"^[A-Z]+$", "HELLO")     # match object: pattern matches from the start of the string
re.search(r"cat", "scatter")       # match object: "cat" found at index 1
re.findall(r"\d+", "a1b22c333")    # ['1', '22', '333']
re.sub(r"\s+", "-", "a b c")       # 'a-b-c'
```

## JavaScript

In [[JavaScript]], a regex is written as a literal between slashes (`/pattern/`) or built from a string with `new RegExp("pattern")`.

```javascript
"HELLO".match(/^[A-Z]+$/);            // ["HELLO"]
"scatter".search(/cat/);              // 1 (index of the first match)
"1a22b".match(/\d+/g);                // ["1", "22"]
"hello world".replace(/\s+/g, "-");   // "hello-world"
```

## SQL

Regex support varies by database. For [[Postgres]], use these regex operators:

| Operator | Meaning                           |
| -------- | --------------------------------- |
| `~`      | regex match (case-sensitive)      |
| `~*`     | case-insensitive match            |
| `!~`     | does not match                    |
| `!~*`    | does not match (case-insensitive) |

```SQL
SELECT *
FROM Users
WHERE mail ~ '^[A-Za-z][A-Za-z0-9_.-]*@leetcode\.com$';
```
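
Returning to Python, the email rule from "Putting It All Together" and the match/search distinction can be tried out with a short script. This is a minimal sketch, not a complete email validator: the helper name `is_valid_email` and the sample addresses are hypothetical, and `domain.com` is just the placeholder domain used in that rule. It also shows `re.fullmatch`, which requires the whole string to match without writing `^` and `$`.

```python
import re

# The rule from "Putting It All Together"; domain.com is a placeholder domain.
EMAIL_RULE = re.compile(r"^[A-Za-z][A-Za-z0-9_.-]*@domain\.com$")


def is_valid_email(address: str) -> bool:
    """Hypothetical helper: True if the address satisfies EMAIL_RULE."""
    return EMAIL_RULE.search(address) is not None


# Hypothetical sample inputs, chosen only for illustration.
samples = [
    "A_user-12@domain.com",   # valid: starts with a letter, allowed prefix chars
    "1user@domain.com",       # invalid: starts with a digit
    "user@domain.com.evil",   # invalid: text after the required domain
    "user@domainXcom",        # invalid: \. requires a literal period
]
results = {s: is_valid_email(s) for s in samples}

# Named group (Python syntax): capture the part before the "@".
PREFIX = re.compile(r"^(?P<prefix>[A-Za-z][A-Za-z0-9_.-]*)@")
m = PREFIX.match("A_user-12@domain.com")
prefix = m.group("prefix") if m else None

# re.match only tries at the start of the string; re.search scans the whole string.
starts = re.match(r"cat", "scatter")      # None: "scatter" does not start with "cat"
anywhere = re.search(r"cat", "scatter")   # match object spanning indexes 1..4
whole = re.fullmatch(r"[A-Z]+", "HELLO")  # match object: entire string is uppercase letters

for address, ok in results.items():
    print(f"{address}: {ok}")
print(prefix, starts, anywhere.span(), whole is not None)
```

Compiling the pattern once with `re.compile` keeps the rule in one place when it is reused across many inputs.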