RXP
A descriptive constructor for regular expressions
Installation
npm install rxp
# or
yarn add rxp
Dependencies
- uniqid - generate unique names used for identifying regex variables
Features
Plain English Constructor
- using a Mocha/Chai-inspired syntax, regular expressions are constructed to be readable at a glance and easy to reason about
- after initializing the constructor with a base text, various regex behavior can be applied through nested, descriptive method calls
- easily transforms to regex literals with the `construct` command
const userNameMatch = anyCharacteroccursOnceOrMoreand;const storyIntro = atStart; const ISBN = ;const multipleDigits = anyDigitoccursOnceOrMore;const ProductId = ;
Composable, Modular Units
- RXP can build regular expressions as small units that can then be reused, further modified as needed, and composed within other RXP units
const fourDigits = anyDigit;const fourDigitsWithOptionalHyphen = ;const CreditCardMatch = ;
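To make the idea of composing small units concrete, here is a minimal sketch in plain JavaScript that does not use RXP at all. The pattern strings and the card layout (four groups of four digits with optional hyphens) are assumptions chosen for illustration; they are not the arguments of the RXP example above.

```javascript
// Plain JavaScript, no RXP import. All names and patterns are hypothetical.

// Smallest unit: exactly four digits, kept as a pattern string.
const fourDigits = "\\d{4}";

// Reuse the unit inside a larger one: four digits plus an optional hyphen.
const fourDigitsWithOptionalHyphen = `(?:${fourDigits}-?)`;

// Compose both units into a full pattern (assumed layout: 4-4-4-4 digits).
const creditCardPattern = new RegExp(
  `^${fourDigitsWithOptionalHyphen}{3}${fourDigits}$`
);

const hyphenated = creditCardPattern.test("1234-5678-9012-3456"); // true
const compact = creditCardPattern.test("1234567890123456"); // true
const tooShort = creditCardPattern.test("1234-5678"); // false
```

Each unit stays a small, named piece, which is the same property the RXP units are described as having.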
Compatible with Regex Literals
- RXP can also accept regex literals (`/regex/g`) and parse these to be used within the constructor, allowing you to easily extend the behavior of regex literals you may already be using:
const myFavoriteRegex = /should this be optional\?/;
const optionalVersion = init(myFavoriteRegex).isOptional;
optionalVersion.text; // "(?:should this be optional\\?)?"
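For comparison, the same result can be produced by hand from a regex literal's `source` property. This is only a sketch of the underlying idea in plain JavaScript; `makeOptional` is a hypothetical helper, not part of RXP.

```javascript
// Plain JavaScript, no RXP import. `makeOptional` is a hypothetical helper.
const myFavoriteRegex = /should this be optional\?/;

// Wrap an existing regex's source in an optional noncapturing group.
function makeOptional(regex) {
  return `(?:${regex.source})?`;
}

const optionalText = makeOptional(myFavoriteRegex);
// optionalText === "(?:should this be optional\\?)?"

// Anchor the pattern so the optional group has to account for the whole input.
const anchored = new RegExp(`^${optionalText}$`);
const matchesEmpty = anchored.test(""); // true
const matchesFull = anchored.test("should this be optional?"); // true
const matchesOther = anchored.test("something else"); // false
```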
Convenient Shorthands and Presets
- frequently used regex behavior, such as marking text optional or providing alternatives, can be quickly defined using shorthand functions that retain RXP functionality
- common regex characters, such as `.` and `\d`, are stored as descriptive presets ("anyCharacter", "anyDigit") with full RXP functionality built in
;; atStart;anyDigitoccursOnceOrMoreandisGreedy;
Simplifies Default Behavior
- auto-escapes user-submitted strings
- groups are noncapturing
- frequency searches are lazy
- noncapturing groupings and lazy searches can easily be overridden if needed
// convert to greedy search
oneOrMore("sample").text; // (?:sample)+?
oneOrMore("sample").and.isGreedy.text; // (?:sample)+

// convert to captured groupings
init("sample").isOptional.text; // (?:sample)?
init("sample").isOptional.and.isCaptured.text; // ((?:sample)?)
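The practical difference between these defaults and their overrides is easiest to see by running the resulting patterns. The sketch below uses plain JavaScript regex literals copied from the comments above, with made-up input strings.

```javascript
// Plain JavaScript, no RXP import. Input strings are illustrative only.
const repeatedText = "samplesamplesample";

// Lazy quantifier (the stated default): stops after the first repetition.
const lazyMatch = repeatedText.match(/(?:sample)+?/)[0]; // "sample"

// Greedy quantifier: consumes as many repetitions as possible.
const greedyMatch = repeatedText.match(/(?:sample)+/)[0]; // "samplesamplesample"

// Noncapturing group (the stated default): only the full match is returned.
const nonCapturing = "sample".match(/(?:sample)?/);
const nonCapturingLength = nonCapturing.length; // 1

// Captured group: the match array gains an entry for the group.
const capturing = "sample".match(/((?:sample)?)/);
const capturedGroup = capturing[1]; // "sample"
```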
Quick Guide
The RXP constructor works by accepting a string, escaping any special characters, and then modifying the string using descriptive object methods to apply the wanted behavior. When the prepared regex string is ready, it can be transformed to a regex literal (`/regex/`) with any desired flags applied.
const regexSearch = occursOnceOrMoreand ; // => /(?<=intro: )(?:text with unescaped \[\])+?/g regexSearch; // trueregexSearch; // falseregexSearch; // "text with unescaped []"
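As a rough check of what the constructed literal in the comment above does, the sketch below writes that literal out by hand in plain JavaScript and runs it against two made-up inputs. The input strings are assumptions, and the example needs a runtime with lookbehind support.

```javascript
// Plain JavaScript, no RXP import. The literal is copied from the comment above.
const regexSearch = /(?<=intro: )(?:text with unescaped \[\])+?/g;

// With the "g" flag, String.prototype.match returns all matches or null.
const found = "intro: text with unescaped []".match(regexSearch);
// found is ["text with unescaped []"]

// Without the "intro: " prefix the lookbehind fails, so nothing matches.
const missing = "text with unescaped []".match(regexSearch);
// missing is null
```

`String.prototype.match` is used here instead of repeated `test` calls because a regex with the `g` flag keeps state in `lastIndex` between `test` calls.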
Initialize Constructor
The main `init` function is used to generate the RXP constructor. `init` accepts any number of strings, combines them in order, applies a noncapture grouping, and escapes them, storing the modified string in the `text` property.
const sample = init("sample");
sample.text; // "sample"

const escaped = init("escape . and ?");
escaped.text; // "escape \\. and \\?"
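The escaping step can be approximated in plain JavaScript as follows. This is a sketch of the general technique, not RXP's actual implementation: `escapeForRegex` and `joinEscaped` are hypothetical helpers, and the set of escaped characters is an assumption based on the commonly used escape pattern.

```javascript
// Plain JavaScript, no RXP import. Both helpers are hypothetical.

// Prefix every regex special character with a backslash.
function escapeForRegex(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Escape each part, then combine the parts in order.
function joinEscaped(...parts) {
  return parts.map(escapeForRegex).join("");
}

const escapedText = joinEscaped("escape . ", "and ?");
// escapedText === "escape \\. and \\?"

// The escaped text matches the original characters literally.
const literalMatch = new RegExp(escapedText).test("escape . and ?"); // true
const wildcardMatch = new RegExp(escapedText).test("escape x and ?"); // false
```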
The `init` function can also accept regex literals or other RXP units. When providing another RXP unit, pass the full constructor object rather than just its `.text` property, to avoid escaping special characters a second time:
const newSample = ;newSampletext; // "combine with regex literal and escape \\. and \\?"
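The reason double escaping matters can be shown without RXP. In the sketch below, `escapeForRegex` is a hypothetical stand-in for whatever escaping the library performs; passing an already escaped string through it again changes what the pattern matches.

```javascript
// Plain JavaScript, no RXP import. `escapeForRegex` is a hypothetical helper.
function escapeForRegex(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const escapedOnce = escapeForRegex("a.b"); // pattern text: a\.b
const escapedTwice = escapeForRegex(escapedOnce); // pattern text: a\\\.b

// Escaped once: matches the original text.
const onceMatchesOriginal = new RegExp(escapedOnce).test("a.b"); // true

// Escaped twice: now looks for a literal backslash before the dot.
const twiceMatchesOriginal = new RegExp(escapedTwice).test("a.b"); // false
const twiceMatchesBackslash = new RegExp(escapedTwice).test("a\\.b"); // true
```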
Modify Behavior
After initializing the constructor, a variety of properties and methods are available to modify the regex behavior:
init("this").or("that").text; // "(?:(?:this)|(?:that))"
sample.occursBetween(3, 5).text; // "(?:sample){3,5}"
sample.occurs(2).and.atEnd.text; // "(?:(?:sample){2})$"
Convert to Regex Literal
When the regex is ready to be finalized, the `construct` method can be used to convert the text string to a regex literal (`/regex/`). A flag can be passed to `construct` to apply matching behavior:
sample; // => /sample/sample; // => /sample/gsample; // => /sample/gsi
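In plain JavaScript, the equivalent of this final step is the `RegExp` constructor. The sketch below assumes flags are given as a single string, which is how `RegExp` takes them; `toRegex` is a hypothetical helper and says nothing about how `construct` accepts its arguments.

```javascript
// Plain JavaScript, no RXP import. `toRegex` is a hypothetical helper.
function toRegex(text, flags = "") {
  return new RegExp(text, flags);
}

const plainSearch = toRegex("sample"); // /sample/
const globalSearch = toRegex("sample", "g"); // /sample/g
const multiFlagSearch = toRegex("sample", "gsi"); // global, dotAll, ignoreCase

// Each flag is exposed as a boolean property on the resulting RegExp.
const isGlobal = globalSearch.global; // true
const hasAllThree =
  multiFlagSearch.global && multiFlagSearch.dotAll && multiFlagSearch.ignoreCase; // true
```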
Presets
Commonly used special characters, such as `.` and `\d`, are stored as ready-to-use RXP units (called "Presets"), which have full RXP functionality:
const sectionID = anyDigitatStart;const sectionAbbr = ;
Shorthand Functions
A variety of shorthand functions are also available for commonly used behavior, such as marking text optional, defining frequency, or declaring alternatives:
; // equivalent to init("some text").isOptional; // equivalent to init("sample").occursOnceOrMore; // equivalent to init("A").or("B") //these are especially handy for improving the readability of composed RXP units:const regexSearch = ;
API
Constructor
method | example | equivalent |
---|---|---|
init | const RXPSample = init("sample") | escapes text and initializes constructor |
or | RXPSample.or("other sample") | (?:(?:sample)\|(?:other sample)) |
occurs | RXPSample.occurs(4) | (?:sample){4} |
occursOnceOrMore | RXPSample.occursOnceOrMore | (?:sample)+? |
occursZeroOrMore | RXPSample.occursZeroOrMore | (?:sample)\*? |
occursAtLeast | RXPSample.occursAtLeast(3) | (?:sample){3,} |
occursBetween | RXPSample.occursBetween(2,4) | (?:sample){2,4} |
isGreedy | RXPSample.occursOnceOrMore.and.isGreedy | (?:sample)+ |
followedBy | RXPSample.followedBy("text") | sample(?=text) |
notFollowedBy | RXPSample.notFollowedBy("text") | sample(?!text) |
precededBy | RXPSample.precededBy("text") | (?<=text)sample |
notPrecededBy | RXPSample.notPrecededBy("text") | (?<!text)sample |
atStart | RXPSample.atStart | ^(?:sample) |
atEnd | RXPSample.atEnd | (?:sample)\$ |
isOptional | RXPSample.isOptional | (?:sample)? |
isCaptured | RXPSample.isCaptured | (sample) |
isVariable | RXPSample.isVariable | (?\<var>sample) -or- \\k\<var> |
and | RXPSample.occurs(4).and.atEnd | (?:(?:sample){4})\$ |
construct | RXPSample.construct("g") | /sample/g |
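The lookaround and named-group rows in the table are the least familiar regex features listed, so here is a short plain JavaScript sketch of how their "equivalent" patterns behave. The input strings and the group name `word` are made up for illustration.

```javascript
// Plain JavaScript, no RXP import. Patterns follow the "equivalent" column.

// Lookahead: "sample" only when (not) followed by "text".
const followedBy = /sample(?=text)/.test("sampletext"); // true
const notFollowedBy = /sample(?!text)/.test("sampletext"); // false

// Lookbehind: "sample" only when (not) preceded by "text".
const precededBy = /(?<=text)sample/.test("textsample"); // true
const notPrecededBy = /(?<!text)sample/.test("textsample"); // false

// Named group plus a backreference to the same group.
const repeatedWord = /(?<word>sample) \k<word>/;
const namedMatch = "sample sample".match(repeatedWord);
const capturedWord = namedMatch.groups.word; // "sample"
const differentWords = repeatedWord.test("sample other"); // false
```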
Shorthands
function | example | equivalent |
---|---|---|
either | either("this", "that") | init("this").or("that") |
optional | optional("text") | init("text").isOptional |
oneOrMore | oneOrMore("text") | init("text").occursOnceOrMore |
zeroOrMore | zeroOrMore("text") | init("text").occursZeroOrMore |
upperOrLowerCase | upperOrLowerCase("r") | init("r").or("R") |
wrapRXP | wrapRXP("(", ")") | new function: (innerText) => init("(", innerText, ")") |
withBoundaries | withBoundaries("land") | init(/\b/, "land", /\b/) * |

\* with `occurs` modifiers removed
Presets
RXP unit | example | equivalent |
---|---|---|
anyCharacter | anyCharacter | . |
anyCharacterExcept | anyCharacterExcept("t", "7") | [^t7] |
anyDigit | anyDigit | \d |
anyDigitExcept | anyDigitExcept("5") | [012346789] |
anyLowerCase | anyLowerCase | [a-z] |
anyLowerCaseExcept | anyLowerCaseExcept("a", "b", "c") | [d-z] |
anyUpperCase | anyUpperCase | [A-Z] |
anyUpperCaseExcept | anyUpperCaseExcept("Z") | [A-Y] |
anyLetter | anyLetter | \w |
anyLetterExcept | anyLetterExcept("a") | [b-zA-Z] |
Examples
Matching a specific email pattern
// original regex:/[A-Z]{2}[a-z]+@?company\.net/; //RXP version:const emailMatch = ;
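To see what the original email regex accepts and rejects before reading its RXP version, the sketch below runs it in plain JavaScript. The addresses are invented test inputs, not data from the library.

```javascript
// Plain JavaScript, no RXP import. The regex is the "original regex" above.
const emailPattern = /[A-Z]{2}[a-z]+@?company\.net/;

// Two uppercase letters, lowercase letters, then the company domain.
const standard = emailPattern.test("ABsmith@company.net"); // true

// The "@" is optional in this pattern.
const withoutAt = emailPattern.test("ABsmithcompany.net"); // true

// No uppercase prefix, so the pattern does not match.
const noPrefix = emailPattern.test("absmith@company.net"); // false

// The dot is escaped, so another character in its place does not match.
const wrongDot = emailPattern.test("ABsmith@companyXnet"); // false
```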
Matching someone with a specific family name
// original regex:/\w+?\s[rR]ose/; // RXP version:const nameMatch = ;
Matching a US zip code after the state abbreviation
//original regex:/\d{5}?/; // RXP version:const stateAbbreviation = ;const zipCode = anyDigit;const extendedZipCode = anyDigit;const extendedZipWithSpace = ; const zipCodeMatch = ;
Matching a phone number with extension
//original regex:/\(?\d{3}\)??\d{3}?\d{4}?/g; // RXP version:const areaCode = ; const firstThreeDigits = areaCode;const lastFourDigits = anyDigit;const extension = ; const phoneMatch = ;