Forums
Hello Ed -2! - Printable Version

+- Forums (http://euradioboard.createmybb3.com)
+-- Forum: Programming (/forum-5.html)
+--- Forum: Interpreters (/forum-6.html)
+--- Thread: Hello Ed -2! (/thread-36.html)

Pages: 1 2 3


Hello Ed -2! - Aurel - 11-20-2018 04:24 AM

Hi
Here are those 3 functions
I hope that understand my question now..Smile
same as in email:
Is this 3 functions enough for interpreter expression evaluator?
Of course code would be first tokenized .
And if you remember tokenizer i already build (need some crystalization
- read fine tuning ) -
Is this way OK ?

Code:
'fn expression -----------------------------------------------
function Expression() as Float
float res
if tokType = tNUM OR tokType = tVAR  'num or numeric variable
    if nextTokenType = tPLUS OR nextTokenType = tMINUS
        if tPLUS '(+)
            GetToken() ' get token from token list/array
            res = token + Term()
        elseif tMINUS '(-)
            GetToken()
            res = token - Term()
        end if
    end if
end if
return res
end function

'term -------------------------------------------------------
function Term() as Float
float res
res = Factor()
if nextTokenType = tMULTI OR nextTokenType = tDIVIDE
    if tMULTI '(*)
        GetToken()
        res = res * Factor()
    elseif tDIVIDE '(/)
        GetToken()
        res = res / Factor()
    end if
end if
return res
end function

'factor -----------------------------------------------------
function Factor() as Float
float res
if tokType = tNUM
    return GetNum(token)
end if

if tokType = tVAR
    return GetVar(token)
end if

if tokType = tLParen '(
Match("(")
res = Expression()
Match(")")
end if
'else?
return res ' from func->Expression()
end function



RE: Hello Ed -2! - Ed Davis - 11-20-2018 05:41 AM

(11-20-2018 04:24 AM)Aurel Wrote:  Hi
Here are those 3 functions
I hope that understand my question now..Smile
same as in email:
Is this 3 functions enough for interpreter expression evaluator?
Of course code would be first tokenized .
And if you remember tokenizer i already build (need some crystalization
- read fine tuning ) -
Is this way OK ?

Code:
'fn expression -----------------------------------------------
function Expression() as Float
float res
if tokType = tNUM OR tokType = tVAR  'num or numeric variable
    if nextTokenType = tPLUS OR nextTokenType = tMINUS
        if tPLUS '(+)
            GetToken() ' get token from token list/array
            res = token + Term()
        elseif tMINUS '(-)
            GetToken()
            res = token - Term()
        end if
    end if
end if
return res
end function

'term -------------------------------------------------------
function Term() as Float
float res
res = Factor()
if nextTokenType = tMULTI OR nextTokenType = tDIVIDE
    if tMULTI '(*)
        GetToken()
        res = res * Factor()
    elseif tDIVIDE '(/)
        GetToken()
        res = res / Factor()
    end if
end if
return res
end function

'factor -----------------------------------------------------
function Factor() as Float
float res
if tokType = tNUM
    return GetNum(token)
end if

if tokType = tVAR
    return GetVar(token)
end if

if tokType = tLParen '(
Match("(")
res = Expression()
Match(")")
end if
'else?
return res ' from func->Expression()
end function

If you only want to have three precedence levels, then this is ok.

With 3 (expr(), term(), factor()) functions, you can only handle
3 levels of precedence, for instance:

A grammar with 3 levels of precedence:

Code:
expr   = term   { "+|-" term }.
  term   = factor { "*"|"/" factor }.
  factor = number | "(" expr ")" | ("+"|"-") factor .

Code:
factor() {
    if (accept(number)) return num;
    if (accept(minus)) return -factor();
    if (accept(lparen)) {
        local n = expression();
        if (!accept(rparen))
            error("expecting ')'");
        return n;
    }
    return error("factor: syntax error");
}

term() {
    local n = factor();
    while (sym == star || sym == slash) {
        Symbol op = sym;
        sym = getsym();
        switch (op) {
            case star:  n *= factor(); break;
            case slash: n /= factor(); break;
        }
    }
    return n;
}

expression() {
    local n = term();
    while (sym == plus || sym == minus) {
        Symbol op = sym;
        sym = getsym();
        switch (op) {
            case plus:  n += term(); break;
            case minus: n -= term(); break;
        }
    }
    return n;
}

But you'll need more levels if you want to have standard BASIC
precedence. Standard BASIC appears to have 8 precedence levels.
Standard BASIC precedence appears to be:

Code:
* highest to lowest:
- Parenthesis
- Exponentiation (^)
- multiply, divide
- addition, subtraction
- equal, not equal, less than, greater than, less than equal,
    greater than equal
- and
- or
- Equivalence (eqv), Implication (imp), Exclusive Disjunction
    (exp)

A grammar for (simple) standard BASIC expressions:

Code:
level1 = level2 [ "eqv"|"imp"|"exp" level2 ].
  level2 = level3 [ "or" level3 ].
  level3 = level4 [ "and" level4 ].
  level4 = level5 [ "="|"<>"|"<"|">"|"<="|">=" level5 ].
  level5 = level6 { "+|-" level6 }.
  level6 = level7 { "*"|"/" level7 }.
  level7 = level8 [ "^" level8 ].
  level8 = number | "(" expr ")" | ("+"|"-") level8 .

Given the grammar, it is pretty easy to turn these into
functions. The "{" indicates looping - 0 or more, and the "["
indicates an optional sequence, e.g., 0 or 1. Purely mechanical.

So, for standard BASIC, you'd need to have 8 functions - assuming
one for primaries/atoms.

That is the main reason I like precedence climbing - for a
language like BASIC, you only need two functions expr() and
primary(). For me, it is much easier to follow.

I hope this makes sense!


RE: Hello Ed -2! - Aurel - 12-17-2018 07:21 AM

Well yes
I agree with you
but in Ruben i have only four functions
expr()
term()
factor()
match() - for parenthesis

but still i am not sure which algo is faster
i prpbably need to test them..

Hey ..Ed ...what about Early parser?
I see that is used in JavaScript version of simplified QB ?
do you know that?


RE: Hello Ed -2! - Aurel - 12-17-2018 07:52 AM

Recursive descent even in C looking very clear:

C implementation

What follows is a implementation of a recursive descent parser for the above language in C. The parser reads in source code, and exits with an error message if the code fails to parse, exiting silently if the code parses correctly.

Notice how closely the predictive parser below mirrors the grammar above. There is a procedure for each nonterminal in the grammar. Parsing descends in a top-down manner, until the final nonterminal has been processed. The program fragment depends on a global variable, sym, which contains the current symbol from the input, and the function nextsym, which updates sym when called.

The implementations of the functions nextsym and error are omitted for simplicity.

typedef enum {ident, number, lparen, rparen, times, slash, plus,
minus, eql, neq, lss, leq, gtr, geq, callsym, beginsym, semicolon,
endsym, ifsym, whilesym, becomes, thensym, dosym, constsym, comma,
varsym, procsym, period, oddsym} Symbol;

Symbol sym;
void nextsym(void);
void error(const char msg[]);

int accept(Symbol s) {
if (sym == s) {
nextsym();
return 1;
}
return 0;
}

int expect(Symbol s) {
if (accept(s))
return 1;
error("expect: unexpected symbol");
return 0;
}

void factor(void) {
if (accept(ident)) {
;
} else if (accept(number)) {
;
} else if (accept(lparen)) {
expression();
expect(rparen);
} else {
error("factor: syntax error");
nextsym();
}
}

void term(void) {
factor();
while (sym == times || sym == slash) {
nextsym();
factor();
}
}

void expression(void) {
if (sym == plus || sym == minus)
nextsym();
term();
while (sym == plus || sym == minus) {
nextsym();
term();
}
}

void condition(void) {
if (accept(oddsym)) {
expression();
} else {
expression();
if (sym == eql || sym == neq || sym == lss || sym == leq || sym == gtr || sym == geq) {
nextsym();
expression();
} else {
error("condition: invalid operator");
nextsym();
}
}
}

void statement(void) {
if (accept(ident)) {
expect(becomes);
expression();
} else if (accept(callsym)) {
expect(ident);
} else if (accept(beginsym)) {
do {
statement();
} while (accept(semicolon));
expect(endsym);
} else if (accept(ifsym)) {
condition();
expect(thensym);
statement();
} else if (accept(whilesym)) {
condition();
expect(dosym);
statement();
} else {
error("statement: syntax error");
nextsym();
}
}

void block(void) {
if (accept(constsym)) {
do {
expect(ident);
expect(eql);
expect(number);
} while (accept(comma));
expect(semicolon);
}
if (accept(varsym)) {
do {
expect(ident);
} while (accept(comma));
expect(semicolon);
}
while (accept(procsym)) {
expect(ident);
expect(semicolon);
block();
expect(semicolon);
}
statement();
}

void program(void) {
nextsym();
block();
expect(period);
}



RE: Hello Ed -2! - Ed Davis - 12-17-2018 12:16 PM

(12-17-2018 07:21 AM)Aurel Wrote:  Well yes
I agree with you
but in Ruben i have only four functions
expr()
term()
factor()
match() - for parenthesis

Typically, in a recursive descent parser, the number of functions coincides to the number of precedence levels.

For simple languages (e.g., very few operators) that is sufficient.

Quote:but still i am not sure which algo is faster
i prpbably need to test them..

Someone has already done that.
Remember, a typical compiler has the following phases:
  1. Scanner (or Lexer or tokenizer - all the same thing)
  2. Parser
  3. Semantic analyzer
  4. Code generator

I believe it was Donald Knuth, who many years analyzed this problem, and discovered that, in general, the speed of a compiler was heavily proportional to the speed of the first phase, the Scanner.

As long as your parsing algorithm is reasonable, and your language grammar is reasonable, then parsing will not be the bottle-neck.

Of course home-grown parsers that don't faithfully implement tried and true parsing algorithms, or are written by poor programmers, can be very, very slow.

Quote:Hey ..Ed ...what about Early parser?
I see that is used in JavaScript version of simplified QB ?
do you know that?

Yes, I've seen that. It was really, really cool.
However, if you are hand-coding a compiler/interpreter, I don't see any need to use anything other than precedence climbing for expressions, and recursive descent for everything else.

(12-17-2018 07:52 AM)Aurel Wrote:  Recursive descent even in C looking very clear:

C implementation

...

Yes, that is from the Wikipedia recursive descent parsing page.

It is a C implementation of Wirth's Pascal implementation of a byte-code compiler for Wirth's PL/0 teaching language.

You can find very similar code around the web in Pascal, Java, Python and probably others.

PL/0 was explicitly created to teach students how to write compilers. Very simple, but just enough features for a student to get a good feel of what a compiler entails.

It was very popular in the late 70's - it was introduced in Wirth's book Algorithms + Data Structures = Programs, in 1976 - but has fallen out of favor in recent years. This was a great book, and is one of my favorites.

Now many intro compiler courses use tiny or mini Java as the language to be compiled by students.


RE: Hello Ed -2! - Aurel - 12-18-2018 04:48 AM

Quote:and discovered that, in general, the speed of a compiler was heavily proportional to the speed of the first phase, the Scanner.

..and then i think that he was wrong.
why? simple

1.step - tokenize
2 step - execute

Inside step execute is our expression parser which must do
hard work to execute math expressions properly.
Do you agree with that ?

So ..as you see i need only two steps.
In your toy you speed up processing transformig code - tokens
to bytecode and that is why is fast not because
precedence climbing parser....
do I have a right about that OR i "miss the point" again?
sorry if I bothering you with that ?

In addition
I know now why my stupid/simple ruben is slow
simply because his expression parser is not just parser
than in same time lexer/parser processing each char to token
then execute token ..and that is time consuming.


RE: Hello Ed -2! - Aurel - 12-18-2018 04:54 AM

Quote:Now many intro compiler courses use tiny or mini Java as the language to be compiled by students.

Oh man ...
I don't know that...but also
i have found simple interpreter(just tokenized- i think?)
called JImage - or ImageJ ( source code is on web)
which is very clean and easy to understand written in Java.


RE: Hello Ed -2! - Ed Davis - 12-18-2018 07:28 AM

(12-18-2018 04:48 AM)Aurel Wrote:  
Quote:and discovered that, in general, the speed of a compiler was heavily proportional to the speed of the first phase, the Scanner.

..and then i think that he was wrong.
why? simple

1.step - tokenize
2 step - execute

Inside step execute is our expression parser which must do
hard work to execute math expressions properly.
Do you agree with that ?

Nope, I disagree. But do note, you missed what I said. I said Compiler, and not interpreter.

Yes, in a interpreter, the lions share of the time will be spent "interpreting/executing" the code.

But still, actual parsing, done correctly can be quite fast. But of course as you have to repeatedly parse loops, you will take an enormous hit.

If you tokenize the input, so that you have integers for comparing with, and you using a language that compares integers faster than it does strings (C, C++, Assembly, Java, C#) then

Quote:So ..as you see i need only two steps.
In your toy you speed up processing transformig code - tokens
to bytecode and that is why is fast not because
precedence climbing parser....
do I have a right about that OR i "miss the point" again?
sorry if I bothering you with that ?

No bother - I love to talk about compilers and interpreters! :-)

Correct, the byte code made it faster, then interpreting other ways.
But even with that, precedence climbing is faster for expressions that recursive descent, because in precedence climbing you only have one recursive procedure. In recursive descent, you have one for each level of precedence. For simple languages, that only have 2-3 levels of precedence, then no big deal. But (for instance), BASIC (which has 5+ levels), C (which has 10+) and others, precedence climbing is faster than recursive descent. You can read the precedence climbing paper if you want the details.

Quote:In addition
I know now why my stupid/simple ruben is slow
simply because his expression parser is not just parser
than in same time lexer/parser processing each char to token
then execute token ..and that is time consuming.

Yep, I agree.


RE: Hello Ed -2! - Aurel - 12-18-2018 09:07 AM

Quote:But even with that, precedence climbing is faster for expressions that recursive descent, because in precedence climbing you only have one recursive procedure

OK ..
That is what i want to know...
I must say that i am little bit confused about with what i read
around net.
Some says ... recursive descent is more robust and faster
and is implemented in gcc and some other power languages
some says that prece-climbing is better and more flexibile
etc..etc

I mean so many different opinions is out there?


RE: Hello Ed -2! - Ed Davis - 12-18-2018 09:16 AM

(12-18-2018 09:07 AM)Aurel Wrote:  
Quote:But even with that, precedence climbing is faster for expressions that recursive descent, because in precedence climbing you only have one recursive procedure

OK ..
That is what i want to know...
I must say that i am little bit confused about with what i read
around net.
Some says ... recursive descent is more robust and faster
and is implemented in gcc and some other power languages
some says that prece-climbing is better and more flexibile
etc..etc

I mean so many different opinions is out there?

Note that precedence climbing is only used for expressions. Recursive descent is used for everything else.
And, do note that precedence climbing is really just a different form of recursive descent.

In really life, the speed difference between the two doesn't matter.
If you want to use recursive descent for everything, go for it! It won't hurt at all.

For me, precedence climbing seems simpler. But again, it is only used for expressions.

So, use what works best for you. Neither one is really "better". What is best is what you understand, and can implement.

For instance, several folks think that Pratt parsing (for expressions) is the best. For me, Pratt parsing is a little hard to follow, e.g., it is a little too abstract for me.

So again, use whatever works best for you.