Abstract

The lexical scanner uses Python regular expressions. The text is split into tokens before being parsed by the grammar rules.
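
To make the scanning step concrete, here is a minimal sketch of a regular-expression scanner written with the standard re module. It is not TPG's actual scanner: the TOKEN_SPEC table, the tokenize function, and the token names are assumptions chosen for illustration, loosely modelled on the token declarations of the calculator example.

```python
import re

# Hypothetical token table (not TPG's internal format): (name, regex) pairs.
# Order matters: "real" is listed before "integer" so that "3.5" is not
# split into an integer followed by stray characters.
TOKEN_SPEC = [
    ("space",   r"\s+"),               # separator: matched but not emitted
    ("real",    r"\d+\.\d*|\.\d+"),
    ("integer", r"\d+"),
    ("ident",   r"[a-zA-Z_]\w*"),
    ("op",      r"\*\*|[-+*/%^()=,]"),
]

# One master pattern with a named group per token kind.
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))

def tokenize(text):
    """Split text into (kind, lexeme) pairs before any grammar rule runs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError("unexpected character %r at position %d"
                             % (text[pos], pos))
        if m.lastgroup != "space":      # drop separators
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With this sketch, tokenize("x = 3.5 * 2") yields an identifier, an operator, a real, an operator, and an integer, in that order.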

Syntactic parser

TPG is not based on table-driven predictive algorithms such as LL(k). Instead, the main idea is to try every possible choice in order and to accept the first one that matches the input. So when a choice point is reached, say A|B|C, the parser first tries to recognize A. If this fails it tries B and, if necessary, C. Contrary to LL(k) parsers, the order of the branches of a choice point is therefore very important in TPG. This method was inspired by Prolog DCG parsers. But remember that once a choice has been made, it cannot be undone, even if more choices remain (in Prolog it can). The text to be parsed has to be stored in a string in memory (backtracking is simpler this way). During parsing, the current position is stored in internal TPG variables for all terminal and non-terminal symbols.
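
The effect of branch order can be sketched in a few lines of plain Python. This is an illustration of first-match choice with commitment, not TPG's generated code: the helpers literal, choice, sequence, and matches are hypothetical names, and each parser is simply a function from (text, position) to the new position, or None on failure.

```python
def literal(s):
    """Parser that matches the exact string s at the current position."""
    def parse(text, pos):
        if text.startswith(s, pos):
            return pos + len(s)
        return None
    return parse

def choice(*branches):
    """Ordered choice: try each branch in turn, keep the first that matches."""
    def parse(text, pos):
        for branch in branches:
            end = branch(text, pos)
            if end is not None:
                return end      # commit: the remaining branches are never tried
        return None
    return parse

def sequence(*parts):
    """Match all parts one after the other, threading the position through."""
    def parse(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None     # no way to reopen an earlier choice
        return pos
    return parse

def matches(parser, text):
    """True if the parser consumes the whole text."""
    return parser(text, 0) == len(text)

# Same two branches, different order.
short_first = sequence(choice(literal("a"), literal("ab")), literal("c"))
long_first = sequence(choice(literal("ab"), literal("a")), literal("c"))
```

Here matches(short_first, "abc") is False, because the choice commits to "a" and the following "c" then fails against "b", while matches(long_first, "abc") is True. A Prolog-style parser would go back and try the second branch; this sketch, like the behaviour described above, does not.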

This algorithm is easy to implement. Each rule is translated into a class method, with no prediction table to compute. The main drawback of this method is that you have to be careful when you write your grammar (as in Prolog).
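
As a rough picture of what "one method per rule" means, here is a hand-written sketch for the two rules Expr -> Term ([+-] Term)* and Term -> integer. The class MiniCalc and its methods are hypothetical and far simpler than the code TPG actually generates; the point is only that each grammar rule maps to one method and that no prediction table is involved.

```python
import re

class MiniCalc:
    """Hand-written analogue of:
           Expr -> Term ( [+-] Term )* ;
           Term -> integer ;
    """

    def __init__(self, text):
        # Simplistic scanner for the sketch: unknown characters are skipped.
        self.tokens = re.findall(r"\d+|[+-]", text)
        self.pos = 0            # current position in the token list

    def peek(self):
        """Return the current token, or None at end of input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        """Rule Expr: a Term followed by any number of (op Term) pairs."""
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.tokens[self.pos]
            self.pos += 1
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        """Rule Term: a single integer token."""
        tok = self.peek()
        if tok is None or not tok.isdigit():
            raise SyntaxError("integer expected at token %d" % self.pos)
        self.pos += 1
        return int(tok)
```

Calling MiniCalc("1 + 2 - 4").expr() evaluates the expression from left to right, in the same way the Expr rule of the calculator example folds its operands.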

Example

#!/usr/bin/env python

import math
import operator
import string
import tpg

if tpg.__python__ == 3:
    operator.div = operator.truediv
    raw_input = input

def make_op(op):
    return {
        '+'   : operator.add,
        '-'   : operator.sub,
        '*'   : operator.mul,
        '/'   : operator.div,
        '%'   : operator.mod,
        '^'   : lambda x,y:x**y,
        '**'  : lambda x,y:x**y,
        'cos' : math.cos,
        'sin' : math.sin,
        'tan' : math.tan,
        'acos': math.acos,
        'asin': math.asin,
        'atan': math.atan,
        'sqr' : lambda x:x*x,
        'sqrt': math.sqrt,
        'abs' : abs,
        'norm': lambda x,y:math.sqrt(x*x+y*y),
    }[op]

class Calc(tpg.Parser, dict):
    r"""
        separator space '\s+' ;

        token pow_op    '\^|\*\*'                                               $ make_op
        token add_op    '[+-]'                                                  $ make_op
        token mul_op    '[*/%]'                                                 $ make_op
        token funct1    '(cos|sin|tan|acos|asin|atan|sqr|sqrt|abs)\b'           $ make_op
        token funct2    '(norm)\b'                                              $ make_op
        token real      '(\d+\.\d*|\d*\.\d+)([eE][-+]?\d+)?|\d+[eE][-+]?\d+'    $ float
        token integer   '\d+'                                                   $ int
        token VarId     '[a-zA-Z_]\w*'                                          ;

        START/e ->
                'vars'                  $ e=self.mem()
            |   VarId/v '=' Expr/e      $ self[v]=e
            |   Expr/e
        ;

        Var/$self.get(v,0)$ -> VarId/v ;

        Expr/e -> Term/e ( add_op/op Term/t     $ e=op(e,t)
                         )*
        ;

        Term/t -> Fact/t ( mul_op/op Fact/f     $ t=op(t,f)
                         )*
        ;

        Fact/f ->
                add_op/op Fact/f                $ f=op(0,f)
            |   Pow/f
        ;

        Pow/f -> Atom/f ( pow_op/op Fact/e      $ f=op(f,e)
                        )?
        ;

        Atom/a ->
                real/a
            |   integer/a
            |   Function/a
            |   Var/a
            |   '\(' Expr/a '\)'
        ;

        Function/y ->
                funct1/f '\(' Expr/x '\)'               $ y = f(x)
            |   funct2/f '\(' Expr/x1 ',' Expr/x2 '\)'  $ y = f(x1,x2)
        ;

    """

    def mem(self):
        vars = sorted(self.items())
        memory = [ "%s = %s"%(var, val) for (var, val) in vars ]
        return "\n\t" + "\n\t".join(memory)

print("Calc (TPG example)")
calc = Calc()
while True:
    l = raw_input("\n:")
    if l:
        try:
            print(calc(l))
        except Exception:
            print(tpg.exc())
    else:
        break

Documentation

Installation

Linux / Unix / Sources

Windows

The Windows installer is no longer available because of a virus infection. I now distribute only source packages.
