Regular expressions (regex for short) are a language for specifying patterns to search for in text.
If you need a regex, just ask an [[LLM]].
Here's some basic syntax that will help you read regex:
### Character Basics
| Pattern | Meaning | Example Matches |
|--------|---------|-----------------|
| `.` | Any single character (except newline) | `a`, `7`, `_` |
| `[abc]` | Character class: any one of `a`, `b`, or `c` | `a` |
| `[^abc]` | Negated class: anything *except* `a`, `b`, or `c` | `d`, `5` |
| `[A-Za-z]` | Any ASCII letter | `H`, `q` |
| `[0-9]` or `\d` | Any digit | `4` |
| `\w` | Word char: letters, digits, `_` | `A`, `3`, `_` |
| `\s` | Whitespace | space, tab, newline |
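
To see the classes above in action, here is a minimal Python sketch using `re.findall`. The sample string and variable names are made up for illustration.

```python
import re

sample = "Hi_5 there!"  # made-up sample text

digits = re.findall(r"\d", sample)         # every digit
letters = re.findall(r"[A-Za-z]", sample)  # every ASCII letter
word_chars = re.findall(r"\w", sample)     # letters, digits, and underscore
spaces = re.findall(r"\s", sample)         # whitespace characters

print(digits)      # ['5']
print(letters)     # ['H', 'i', 't', 'h', 'e', 'r', 'e']
print(word_chars)  # ['H', 'i', '_', '5', 't', 'h', 'e', 'r', 'e']
print(spaces)      # [' ']
```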
### Repetition Quantifiers
| Pattern | Meaning | Example Matches |
|--------|---------|-----------------|
| `*` | 0 or more | `""`, `a`, `aaaa` |
| `+` | 1 or more | `a`, `aaaa` |
| `?` | 0 or 1 | `""`, `a` |
| `{n}` | Exactly `n` | `a{3}` matches `aaa` |
| `{n,}` | `n` or more | `a{3,}` matches `aaa`, `aaaa`, ... |
| `{n,m}` | Between `n` and `m` (inclusive) | `a{2,3}` matches `aa`, `aaa` |
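
The quantifiers can be checked one at a time with a small Python sketch. The helper `matches` is a hypothetical name introduced here; it relies on `re.fullmatch`, so the whole string has to fit the pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches(r"a*", ""))          # True: zero repetitions are allowed
print(matches(r"a+", ""))          # False: at least one "a" is required
print(matches(r"a?", "aa"))        # False: at most one "a" is allowed
print(matches(r"a{3}", "aaa"))     # True: exactly three
print(matches(r"a{2,}", "aaaa"))   # True: two or more
print(matches(r"a{2,3}", "aaaa"))  # False: four is outside the 2..3 range
```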
### Anchors (Start/End of String)
| Pattern | Meaning | Example Matches |
|--------|---------|-----------------|
| `^` | Start of string | `^abc` matches `"abc123"` |
| `$` | End of string | `abc$` matches `"123abc"` |
| `\b` | Word boundary | `\bcat\b` matches `"cat"` but not `"scatter"` |
| `\B` | Not a word boundary | `\Bcat` matches `"scatter"` |
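
A short Python sketch of the anchors and word boundaries, using the same example strings as the table. The variable names are illustrative only.

```python
import re

starts_with = re.search(r"^abc", "abc123") is not None    # anchored at the start
ends_with = re.search(r"abc$", "123abc") is not None      # anchored at the end
whole_word = re.search(r"\bcat\b", "a cat sat") is not None
inside_word = re.search(r"\bcat\b", "scatter") is not None
not_boundary = re.search(r"\Bcat", "scatter") is not None  # "cat" preceded by a word char

print(starts_with, ends_with)   # True True
print(whole_word, inside_word)  # True False
print(not_boundary)             # True
```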
### Escaping Special Characters
| Pattern | Meaning | Example |
|--------|---------|---------|
| `\.` | Literal period `.` | matches `"a.b"` |
| `\*` | Literal asterisk `*` | matches `"5*3"` |
| `\\` | Literal backslash | matches `"\\"` |
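
The difference between an escaped and an unescaped special character can be sketched in Python as follows. `re.escape` is the standard-library helper that escapes a literal string for you; the sample strings are made up.

```python
import re

# Unescaped "." matches any character, so "axb" also matches.
loose = re.fullmatch(r"a.b", "axb") is not None
# Escaped "\." only matches a literal period.
strict = re.fullmatch(r"a\.b", "axb") is not None
literal_dot = re.fullmatch(r"a\.b", "a.b") is not None

# re.escape builds a pattern that matches the given text literally.
star_pattern = re.escape("5*3")
literal_star = re.fullmatch(star_pattern, "5*3") is not None

print(loose, strict, literal_dot)  # True False True
print(literal_star)                # True
```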
### Grouping and Capturing
| Pattern | Meaning | Example Matches |
|--------|---------|-----------------|
| `(abc)` | Capturing group (stored match) | captures `"abc"` |
| `(a\|b)` | Alternation (OR) | matches `"a"` or `"b"` |
| `(?:abc)` | Non-capturing group | matches `"abc"` but not stored |
| `(?P<name>abc)` | Named group (Python syntax; JavaScript uses `(?<name>abc)`) | captures `"abc"` into `name` |
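
Here is a minimal Python sketch of how the group types behave. The date-like sample text and the group name `year` are assumptions chosen for illustration.

```python
import re

# One named group and one plain capturing group.
m = re.search(r"(?P<year>\d{4})-(\d{2})", "released 2024-05")
whole = m.group(0)      # the full match
year = m.group("year")  # named group, also reachable as m.group(1)
month = m.group(2)      # second capturing group

# A non-capturing group is matched but not stored.
stored = re.search(r"(?:ab)+(c)", "ababc").groups()

# Alternation picks either branch.
animals = re.findall(r"cat|dog", "cat dog bird")

print(whole, year, month)  # 2024-05 2024 05
print(stored)              # ('c',)
print(animals)             # ['cat', 'dog']
```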
### Common Practical Patterns
| Goal | Pattern | Example Matches |
|------|---------|-----------------|
| Letters only | `^[A-Za-z]+$` | `"Hello"` |
| Alphanumeric | `^[A-Za-z0-9]+$` | `"abc123"` |
| Email prefix starting w/ letter | `^[A-Za-z][A-Za-z0-9_.-]*` | `"A_user-12"` |
| Integer | `^-?\d+$` | `"-34"`, `"120"` |
| Float | `^-?\d+(\.\d+)?$` | `"34.5"` |
| Simple email | `^\w+@\w+\.\w+$` | `"[email protected]"` |
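
Several of the patterns above can be bundled into a small validator. This is a sketch only: the `PATTERNS` dictionary and the `is_valid` helper are hypothetical names, not part of any library.

```python
import re

# Patterns from the table, keyed by a made-up label.
PATTERNS = {
    "letters": r"^[A-Za-z]+$",
    "alphanumeric": r"^[A-Za-z0-9]+$",
    "integer": r"^-?\d+$",
    "float": r"^-?\d+(\.\d+)?$",
}

def is_valid(kind: str, text: str) -> bool:
    """Check text against the pattern registered under kind."""
    return re.search(PATTERNS[kind], text) is not None

print(is_valid("letters", "Hello"))   # True
print(is_valid("letters", "Hello1"))  # False
print(is_valid("integer", "-34"))     # True
print(is_valid("integer", "3.4"))     # False
print(is_valid("float", "34.5"))      # True
print(is_valid("float", "34."))       # False
```

In Python, `$` also matches just before a single trailing newline, so `"Hello\n"` passes the `letters` check. If that matters, `re.fullmatch` is the stricter choice.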
### Matching vs. Searching
| Concept | Meaning | Example |
|--------|---------|---------|
| **Match** (`^...$`) | Entire string must match | `^[A-Z]+$` matches `"HELLO"` but not `"HELLO1"` |
| **Search** | Pattern appears anywhere | searching for `cat` finds `"scattered"` |
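
In Python the same distinction shows up across three functions. A minimal sketch follows; note that `re.match` only anchors at the start, so `re.fullmatch` is what corresponds to the "entire string" row.

```python
import re

# re.search: the pattern may appear anywhere.
found = re.search(r"cat", "scattered")
found_span = found.span()  # position of "cat" inside the string

# re.match: the pattern must start at index 0, but may stop early.
at_start = re.match(r"cat", "scattered")     # None, "cat" is not at the start
prefix = re.match(r"[A-Z]+", "HELLO1")       # matches the "HELLO" prefix only
prefix_text = prefix.group(0)

# re.fullmatch: the pattern must cover the entire string.
entire = re.fullmatch(r"[A-Z]+", "HELLO1")   # None, the trailing "1" does not fit

print(found_span)   # (1, 4)
print(at_start)     # None
print(prefix_text)  # HELLO
print(entire)       # None
```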
### Putting It All Together (Example)
Rule for valid emails at `domain.com`:
```
^[A-Za-z][A-Za-z0-9_.-]*@domain\.com$
```
| Part | Meaning |
| ----------------- | ---------------------------- |
| `^` | start of string |
| `[A-Za-z]` | must start with a letter |
| `[A-Za-z0-9_.-]*` | allowed characters in prefix |
| `@domain\.com` | required domain |
| `$`               | end of string                |
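
The rule can be applied in Python roughly like this. `EMAIL_RE` and `is_valid_email` are hypothetical names, and the test addresses are made up.

```python
import re

# Pattern from the breakdown above, compiled once for reuse.
EMAIL_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_.-]*@domain\.com$")

def is_valid_email(address: str) -> bool:
    """Return True when the address follows the rule above."""
    return EMAIL_RE.match(address) is not None

print(is_valid_email("A_user-12@domain.com"))  # True
print(is_valid_email("1user@domain.com"))      # False: starts with a digit
print(is_valid_email("user@domainxcom"))       # False: the dot must be literal
print(is_valid_email("user@domain.com.evil"))  # False: text after the domain
```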
## Python
Use the `re` module in [[Python]] for regex. Write patterns as **raw strings** (`r"pattern"`) so that backslashes reach the regex engine unchanged.
```python
import re
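re.match(r"^[A-Z]+$", "HELLO")    # Match object: the string is all uppercase letters
re.search(r"cat", "scatter")      # Match object: "cat" found at index 1
re.findall(r"\d+", "a1b22c333")   # ['1', '22', '333']
re.sub(r"\s+", "-", "a b c")      # 'a-b-c'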
```
## JavaScript
In [[JavaScript]], a regex is written as a literal between slashes (`/pattern/`) or built from a string with `new RegExp("pattern")`.
```javascript
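"HELLO".match(/^[A-Z]+$/);           // ["HELLO"], or null when there is no match
"scatter".search(/cat/);             // 1 (index of the first match, -1 if none)
"1a22b".match(/\d+/g);               // ["1", "22"]
"hello world".replace(/\s+/g, "-");  // "hello-world"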
```
## SQL
Regex support varies by database. [[Postgres]] provides these regex operators:
| Operator | Meaning                           |
| -------- | --------------------------------- |
| `~`      | matches regex (case-sensitive)    |
| `~*`     | matches regex (case-insensitive)  |
| `!~`     | does not match (case-sensitive)   |
| `!~*`    | does not match (case-insensitive) |
```SQL
SELECT *
FROM Users
WHERE mail ~ '^[A-Za-z][A-Za-z0-9_.-]*@leetcode\.com$';