Syntax Analysis

Syntax analysis (compiler step just after lexer).

Install

npm i @jlguenego/syntax-analysis

Usage

You should use Typescript in order to check your grammar in a easier way.

const t = defineTerminalAlphabet(['a', 'b'] as const);
const nt = defineNonTerminalAlphabet(['S', 'A'] as const);

const spec: CFGSpecifications<typeof t, typeof nt> = {
  nt,
  t,
  productions: [
    {LHS: 'S', RHS: ['a', 'A', 'a', 'a']},
    {LHS: 'S', RHS: ['b', 'A', 'b', 'a']},
    {LHS: 'A', RHS: []},
    {LHS: 'A', RHS: ['b']},
  ],
  startSymbol: 'S',
};
export const cfg = new ContextFreeGrammar(spec);

// coming from a lexer (ex: @jlguenego/lexer)
const sentence: Sentence = 'abaa'.split('').map(str => ({
  name: str,
}));

// the real job: get the parse tree.
const parseTree = parse(sentence, cfg, {
  method: 'LLk',
  lookaheadTokenNbr: 2,
});

Top down algorithm

Breadth First Search

BFS1: Naive Breadth First Search with nothing else (very slow, may take many days...).
BFS2: Like BFS1 with 2 checks for speeding BFS (slow, may take many hours...).
- checks the length of sentential form
- checks the sentence prefix of the sentential form
BFS3: Like BFS2 with LeftMost Derivation strategy (not so slow, mak take some minutes...).

Depth First Search

DFS1: Leftmost derivation strategy. (not so slow execpt for left recursive grammar)
DFS2: Like DFS1 but use one lookahead terminal to speed up a little bit.
LL1: Like DFS2 but use a LL1 table to know exactly wich production rule to use for the next sentential form. This one is linear O(ng), n is the size of the string to parse, and g is the size of the grammar.
- Warning: the grammar must be LL(1) compatible. So you may have to refactor your grammar in some case:
  - Convert left recursion to right recursion.
  - Left factoring
LLk: This one do not use anymore search tree algorithm with possible backtracking but a k predictive algorithm, exactly as described in the Aho Ullman book (see Theory). It parses real LLk grammars (ie not only the strong LLk grammar), k can be any integer ≥ 1.

Bottom up algorithm

LR0: Use an LR0 automaton, and decide to shift or reduce without lookahead.
LR1: Use an LR1 automaton, and decide to shift or reduce with one lookahead.
SLR1: Use the LR0 automaton augmented with the FOLLOW terminals, and decide to shift or reduce with one lookahead.
LALR1: Use the LALR1 automaton (constructed with the "Lazy Merging" technique), and decide to shift or reduce with one lookahead.

Note about grammar: LR0 < SLR1 < LALR1 < LR1.

Project related

Theory

Author

Jean-Louis GUENEGO jlguenego@gmail.com

@jlguenego/syntax-analysis

Syntax Analysis

Install

Usage

Top down algorithm

Breadth First Search

Depth First Search

Bottom up algorithm

Project related

Theory

Author

Readme

Keywords

Package Sidebar

Install

Repository

Homepage

Weekly Downloads

Version

License

Unpacked Size

Total Files

Last publish

Collaborators

@jlguenego/syntax-analysis

Syntax Analysis

Install

Usage

Top down algorithm

Breadth First Search

Depth First Search

Bottom up algorithm

Project related

Theory

Author

Readme

Keywords

Package Sidebar

Install

Repository

Homepage

DownloadsWeekly Downloads

Version

License

Unpacked Size

Total Files

Last publish

Collaborators

Weekly Downloads