js2peg
Converts a custom JavaScript syntax format into the PegJS file format. Offers the benefit of allowing more flexible building of grammars (and ready syntax highlighting by JavaScript-aware apps).
Installation
npm install js2peg
Usage
Require the module...
var $J = require('js2peg');
Invoke the constructor to set up the initial configuration:
// Optional options object (below are shown the defaults);
Instance properties
The properties of the above options object are also available as properties with the same name on the $J object.
In addition, the property output will exist on the object to indicate the string built thus far in the PegJS format (the empty string by default), and the property parser will store the last built PegJS parser object (set to null by default).
Instance methods
parse
Arguments: (str, rules, initializer = undefined)
This is a convenience method which:
- calls this.buildParser() with the supplied rules and initializer
- calls parse() (on the resulting PegJS parser object) with the supplied str string
buildParser
Arguments: (rules, initializer = undefined)
This is a convenience method allowing you to use the same style of API as PegJS which:
- calls this.convert() with the supplied rules and initializer
- calls PegJS.buildParser() with the resulting output (the this.output property)
- sets the parser property to the return value and also returns this parser object
One can then manually call the PegJS parser methods such as parse() or toSource() on the returned parser object. See this.parse to avoid the need for a separate parse() call.
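The call chain described for parse and buildParser can be sketched roughly as follows. This is not js2peg's source: SketchConverter, its placeholder convert(), and fakeCompile are hypothetical names invented for illustration, and fakeCompile merely stands in for PegJS.buildParser so that the sketch runs without PegJS installed.

```javascript
// Sketch only: mirrors the delegation described above, not js2peg's real code.
class SketchConverter {
  // `compile` stands in for PegJS.buildParser (assumption: it takes grammar
  // text and returns an object with a parse() method).
  constructor(compile) {
    this.compile = compile;
    this.output = '';   // grammar text built so far
    this.parser = null; // last built parser object
  }

  // Placeholder conversion: emits one "name = expression" line per rule.
  // The real convert() handles many more value types.
  convert(rules, initializer) {
    const lines = Object.keys(rules).map((name) => name + ' = ' + rules[name]);
    this.output = (initializer ? initializer + '\n' : '') + lines.join('\n');
    return this.output;
  }

  // convert() first, then compile the resulting output and remember the parser.
  buildParser(rules, initializer) {
    this.convert(rules, initializer);
    this.parser = this.compile(this.output);
    return this.parser;
  }

  // buildParser() first, then parse the supplied string with the result.
  parse(str, rules, initializer) {
    return this.buildParser(rules, initializer).parse(str);
  }
}

// Fake compiler so the sketch is runnable on its own.
const fakeCompile = (grammar) => ({
  parse: (input) => ({ input, ruleCount: grammar.split('\n').length }),
});

const sketch = new SketchConverter(fakeCompile);
const result = sketch.parse('42', { start: 'digits', digits: '[0-9]+' });
```

After the call, sketch.output holds the generated grammar text and sketch.parser holds the object returned by the compiler, matching the instance properties described earlier.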
convert
Arguments: (rules, initializer = undefined)
The initializer is optional. If provided, it can be a string wrapped in curly braces ({...}) or a function whose body will be copied (via Function.prototype.toString()) and then wrapped in curly braces to work in a PegJS grammar file. One can define arguments on the function, especially in ECMAScript strict mode, to prevent issues with undefined variables; the function does not actually depend on these variables, since only its string contents are used.
The keys of the rules object are rule names, each optionally followed by a colon and a string to serve as a PegJS rule name label.
The values on the rules object may be either a string to be:
- preceded by whitespace...
  - for regex matches: ".", "(", ")"
  - for PegJS matching modifiers: "&" and "!"
  - for PegJS OR condition: "/"
  - for: [...] (regex matches), {...} (actions), "..." (literals)
  - an expression label with a colon (optionally followed by content to be added directly, such as a rule name)
  - a rule name
- not preceded by whitespace
  - "*", "+", "?"
...or an object or function to be serialized and then:
- preceded by whitespace...
  - A function whose contents are to be converted into a string (via Function.prototype.toString()) and used as an action within the grammar output
  - An object with a source and optional ignoreCase property (as with a RegExp object or literal) to be added directly to the grammar output
  - An object with a literal and optional i or ignoreCase property (as with a RegExp object or literal) to be stringified and added to the grammar output
- currently not preceded by whitespace
  - An object with an expr property to be added directly to the grammar output (it is preferred to use the class methods where possible to create such objects).
Note that while functions and regex objects/literals are possible for convenience, all of their functionality can be represented by (or easily auto-converted into) JSON-friendly code. This would help, for example, if a web-based IDE wished to allow users to store their own transforming parsers (whether for syntax highlighting or for convenient custom conversion of user input into a more verbose but widely recognized format) and to validate them in an easy manner.
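The whitespace rules listed above for rule values can be summarized in a small classifier. This is only an illustrative sketch: needsLeadingSpace and NO_SPACE_STRINGS are hypothetical names, and the real converter may decide spacing differently.

```javascript
// Sketch only: decides whether a rule value would be preceded by whitespace,
// following the lists above. Hypothetical helper, not part of js2peg.
const NO_SPACE_STRINGS = ['*', '+', '?'];

function needsLeadingSpace(item) {
  if (typeof item === 'string') {
    // Only the repetition/optional suffixes attach directly to what precedes them.
    return !NO_SPACE_STRINGS.includes(item);
  }
  if (typeof item === 'function') {
    // Functions are serialized as actions and preceded by whitespace.
    return true;
  }
  if (item !== null && typeof item === 'object') {
    // Objects carrying `expr` are currently added without leading whitespace;
    // `source` and `literal` objects (including RegExp objects) get whitespace.
    return !('expr' in item);
  }
  throw new TypeError('Unsupported rule value');
}
```

For instance, needsLeadingSpace('+') is false while needsLeadingSpace('/') and needsLeadingSpace(/[a-z]/i) are both true.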
Class methods
The following methods are of use internally or in defining modules.
isECMAScriptIdentifier
Arguments: (val)
Indicates whether the supplied value val matches as a valid ECMAScript identifier. (Probably of most use internally.)
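As a rough illustration of this kind of check, the sketch below tests a value against an ASCII-only identifier pattern. It is a deliberate simplification and an assumption about the approach: it ignores Unicode identifier characters and reserved words, and the function name isSimpleIdentifier is hypothetical.

```javascript
// Sketch only: ASCII-only approximation of an ECMAScript identifier check.
// Does not handle Unicode letters, escapes, or reserved words.
function isSimpleIdentifier(val) {
  return typeof val === 'string' && /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(val);
}

console.log(isSimpleIdentifier('$J'));      // prints true
console.log(isSimpleIdentifier('rule-one')); // prints false
```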
stringify
Arguments: (str)
Returns the supplied string str surrounded by double-quotes and with all internal double-quotes escaped (useful for building literals without the likes of JSON.stringify()).
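The described behaviour can be sketched as a stand-alone function. This follows only the description above and is not the library's own code.

```javascript
// Sketch only: wraps a string in double quotes and escapes inner double quotes.
function stringify(str) {
  return '"' + str.replace(/"/g, '\\"') + '"';
}

console.log(stringify('say "hi"')); // prints "say \"hi\""
```

Unlike JSON.stringify(), this sketch leaves backslashes and control characters untouched.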
repeat
Arguments: (ct, str = ' ')
Repeats the supplied string str a ct number of times. str defaults to a single space.
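A minimal stand-alone sketch of this behaviour, assuming a modern engine with String.prototype.repeat (the library itself may build the string differently):

```javascript
// Sketch only: repeats `str` `ct` times, defaulting to a single space.
function repeat(ct, str = ' ') {
  return str.repeat(ct);
}

console.log(JSON.stringify(repeat(3, 'ab'))); // prints "ababab"
console.log(JSON.stringify(repeat(4)));       // prints "    "
```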
getFunctionContents
Arguments: (f)
Converts a function f to a string (using Function.prototype.toString()), extracts the inner contents, and surrounds the result with curly braces. (Probably of most use internally.)
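One way such an extraction could work is sketched below. It assumes the body starts at the first "{" and ends at the last "}" of the function's source text, which would break for parameter lists that themselves contain braces (for example destructuring or object defaults); the library's actual extraction may be more robust.

```javascript
// Sketch only: turns a function's body into a "{...}" block of source text.
function getFunctionContents(f) {
  const src = Function.prototype.toString.call(f);
  const start = src.indexOf('{');    // assumed start of the body
  const end = src.lastIndexOf('}');  // end of the body
  return '{' + src.slice(start + 1, end) + '}';
}

// The parameter names only keep the function valid JavaScript;
// just the body text ends up in the result.
const action = getFunctionContents(function (head, tail) { return [head].concat(tail); });
console.log(action);
```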
isRegExp
Arguments: (obj)
Duck-type checks the object obj for an exec method to determine whether the object is a regular expression object. (Probably of most use internally.)
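The duck-typing check described here amounts to something like the following stand-alone sketch (written from the description, not copied from the library):

```javascript
// Sketch only: treats anything exposing an exec() method as a regular expression.
function isRegExp(obj) {
  return obj !== null && obj !== undefined && typeof obj.exec === 'function';
}

console.log(isRegExp(/[a-z]+/i));          // prints true
console.log(isRegExp({ source: '[a-z]' })); // prints false
```

Because the check is duck-typed, any object with an exec method passes, not only genuine RegExp instances.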
mixin
Arguments: (targetObj, sourceObj, inherited = true)
Mixes the object sourceObj onto the object targetObj, optionally (and by default) copying inherited properties as well as "own" properties of the sourceObj. If a property of targetObj already exists and the sourceObj is an array, the former will be overwritten by the latter (of use when a module redefines a rule), but otherwise the object will be recursively copied (creating an empty array or object when needed and not already present on the targetObj). Regular expression literals will be converted into JSON-stringifiable RegExp objects.
or
Arguments: (...)
Takes an indefinite number of arguments, joins them together with a PegJS OR condition (" / "), adds the result onto an object via an expr property, and returns the object. The expr property is used to identify strings which should be copied directly into the grammar result without escaping.
orStrings
Arguments: (...)
A convenience method to stringify all supplied arguments and then pass them to js2peg.or().
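The behaviour of or and orStrings, as described above, can be sketched with stand-alone functions. These are written from the descriptions only; in the library they are class methods (js2peg.or() and so on), and the local stringify here is a simplified stand-in.

```javascript
// Sketch only: stand-alone versions of the described helpers.

// Simplified stand-in for the stringify class method.
function stringify(str) {
  return '"' + str.replace(/"/g, '\\"') + '"';
}

// Joins all arguments with the PegJS OR operator and tags the result with
// `expr` so it can be copied into the grammar without escaping.
function or(...args) {
  return { expr: args.join(' / ') };
}

// Quotes every argument as a literal first, then delegates to or().
function orStrings(...args) {
  return or(...args.map(stringify));
}

console.log(or('digit', 'letter').expr);   // prints digit / letter
console.log(orStrings('GET', 'POST').expr); // prints "GET" / "POST"
```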
range
Arguments: (item, min, max)
Builds an object with an expr property set to the supplied item string repeated (in PegJS format) in a space-separated list the minimum min number of times, and then encapsulated and repeated in an option group "( ... ?)" up to the maximum max number of times.
See js2peg.or() regarding the expr property.
exactly
Arguments: (item, num)
A convenience method to call js2peg.range() with its min and max arguments both set to num.
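To illustrate range and exactly, the sketch below emits the required repetitions followed by nested optional groups. The nesting and exact spacing are assumptions made for this example; the output js2peg actually produces may be laid out differently.

```javascript
// Sketch only: builds a bounded repetition in PEG syntax.
function range(item, min, max) {
  // Required part: `item` repeated `min` times.
  const required = new Array(min).fill(item);

  // Optional part: nested optional groups for the remaining max - min slots.
  let optional = '';
  for (let i = min; i < max; i++) {
    optional = '(' + item + (optional ? ' ' + optional : '') + ')?';
  }

  const parts = optional ? required.concat(optional) : required;
  return { expr: parts.join(' ') };
}

// Same minimum and maximum: a fixed number of repetitions.
function exactly(item, num) {
  return range(item, num, num);
}

console.log(range('DIGIT', 2, 4).expr); // prints DIGIT DIGIT (DIGIT (DIGIT)?)?
console.log(exactly('HEXDIG', 2).expr); // prints HEXDIG HEXDIG
```

Helpers like these are handy when transcribing RFC-style rules such as "2 to 4 digits", which PegJS cannot express with a single repetition operator.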
Todos
- See about getting RFC rule examples added to https://github.com/andreineculau/core-pegjs/tree/master/src/RFC ?
- Allow multiple characters like ")+?" and utilize in example
- Redo regex strings as regex literals in example (to take advantage of more diversity in JS expression syntax highlighting) once _or refactored to convert non-strings appropriately to strings
Possible todos
- Adapt PegJS to allow equivalents for ranges, etc.
- Adapt PegJS to allow direct parsing of this JavaScript format
- Adapt PegJS to build array of objects whose keys reflect rule names
- Use PegJS to parse PegJS files and convert to the JavaScript format
- Converter for EBNF? http://standards.iso.org/ittf/PubliclyAvailableStandards/s026153_ISO_IEC_14977_1996(E).zip
- Check the whitespace generation more carefully to prevent redundant whitespace, etc.