 Hello Ed -2! - Printable Version +- Forums (http://euradioboard.createmybb3.com) +-- Forum: Programming (/forum-5.html) +--- Forum: Interpreters (/forum-6.html) +--- Thread: Hello Ed -2! (/thread-36.html) Pages: 1 2 3 4 5 Hello Ed -2! - Aurel - 11-20-2018 04:24 AM Hi Here are those 3 functions I hope that understand my question now.. same as in email: Is this 3 functions enough for interpreter expression evaluator? Of course code would be first tokenized . And if you remember tokenizer i already build (need some crystalization - read fine tuning ) - Is this way OK ? Code: ```'fn expression ----------------------------------------------- function Expression() as Float float res if tokType = tNUM OR tokType = tVAR  'num or numeric variable     if nextTokenType = tPLUS OR nextTokenType = tMINUS         if tPLUS '(+)             GetToken() ' get token from token list/array             res = token + Term()         elseif tMINUS '(-)             GetToken()             res = token - Term()         end if     end if end if return res end function 'term ------------------------------------------------------- function Term() as Float float res res = Factor() if nextTokenType = tMULTI OR nextTokenType = tDIVIDE     if tMULTI '(*)         GetToken()         res = res * Factor()     elseif tDIVIDE '(/)         GetToken()         res = res / Factor()     end if end if return res end function 'factor ----------------------------------------------------- function Factor() as Float float res if tokType = tNUM     return GetNum(token) end if if tokType = tVAR     return GetVar(token) end if if tokType = tLParen '( Match("(") res = Expression() Match(")") end if 'else? return res ' from func->Expression() end function``` RE: Hello Ed -2! - Ed Davis - 11-20-2018 05:41 AM (11-20-2018 04:24 AM)Aurel Wrote:  Hi Here are those 3 functions I hope that understand my question now.. same as in email: Is this 3 functions enough for interpreter expression evaluator? Of course code would be first tokenized . And if you remember tokenizer i already build (need some crystalization - read fine tuning ) - Is this way OK ? Code: ```'fn expression ----------------------------------------------- function Expression() as Float float res if tokType = tNUM OR tokType = tVAR  'num or numeric variable     if nextTokenType = tPLUS OR nextTokenType = tMINUS         if tPLUS '(+)             GetToken() ' get token from token list/array             res = token + Term()         elseif tMINUS '(-)             GetToken()             res = token - Term()         end if     end if end if return res end function 'term ------------------------------------------------------- function Term() as Float float res res = Factor() if nextTokenType = tMULTI OR nextTokenType = tDIVIDE     if tMULTI '(*)         GetToken()         res = res * Factor()     elseif tDIVIDE '(/)         GetToken()         res = res / Factor()     end if end if return res end function 'factor ----------------------------------------------------- function Factor() as Float float res if tokType = tNUM     return GetNum(token) end if if tokType = tVAR     return GetVar(token) end if if tokType = tLParen '( Match("(") res = Expression() Match(")") end if 'else? return res ' from func->Expression() end function``` If you only want to have three precedence levels, then this is ok. With 3 (expr(), term(), factor()) functions, you can only handle 3 levels of precedence, for instance: A grammar with 3 levels of precedence: Code: ```expr   = term   { "+|-" term }.   term   = factor { "*"|"/" factor }.   factor = number | "(" expr ")" | ("+"|"-") factor .``` Code: ```factor() {     if (accept(number)) return num;     if (accept(minus)) return -factor();     if (accept(lparen)) {         local n = expression();         if (!accept(rparen))             error("expecting ')'");         return n;     }     return error("factor: syntax error"); } term() {     local n = factor();     while (sym == star || sym == slash) {         Symbol op = sym;         sym = getsym();         switch (op) {             case star:  n *= factor(); break;             case slash: n /= factor(); break;         }     }     return n; } expression() {     local n = term();     while (sym == plus || sym == minus) {         Symbol op = sym;         sym = getsym();         switch (op) {             case plus:  n += term(); break;             case minus: n -= term(); break;         }     }     return n; }``` But you'll need more levels if you want to have standard BASIC precedence. Standard BASIC appears to have 8 precedence levels. Standard BASIC precedence appears to be: Code: ```* highest to lowest: - Parenthesis - Exponentiation (^) - multiply, divide - addition, subtraction - equal, not equal, less than, greater than, less than equal,     greater than equal - and - or - Equivalence (eqv), Implication (imp), Exclusive Disjunction     (exp)``` A grammar for (simple) standard BASIC expressions: Code: ```level1 = level2 [ "eqv"|"imp"|"exp" level2 ].   level2 = level3 [ "or" level3 ].   level3 = level4 [ "and" level4 ].   level4 = level5 [ "="|"<>"|"<"|">"|"<="|">=" level5 ].   level5 = level6 { "+|-" level6 }.   level6 = level7 { "*"|"/" level7 }.   level7 = level8 [ "^" level8 ].   level8 = number | "(" expr ")" | ("+"|"-") level8 .``` Given the grammar, it is pretty easy to turn these into functions. The "{" indicates looping - 0 or more, and the "[" indicates an optional sequence, e.g., 0 or 1. Purely mechanical. So, for standard BASIC, you'd need to have 8 functions - assuming one for primaries/atoms. That is the main reason I like precedence climbing - for a language like BASIC, you only need two functions expr() and primary(). For me, it is much easier to follow. I hope this makes sense! RE: Hello Ed -2! - Aurel - 12-17-2018 07:21 AM Well yes I agree with you but in Ruben i have only four functions expr() term() factor() match() - for parenthesis but still i am not sure which algo is faster i prpbably need to test them.. Hey ..Ed ...what about Early parser? I see that is used in JavaScript version of simplified QB ? do you know that? RE: Hello Ed -2! - Aurel - 12-17-2018 07:52 AM Recursive descent even in C looking very clear: C implementation What follows is a implementation of a recursive descent parser for the above language in C. The parser reads in source code, and exits with an error message if the code fails to parse, exiting silently if the code parses correctly. Notice how closely the predictive parser below mirrors the grammar above. There is a procedure for each nonterminal in the grammar. Parsing descends in a top-down manner, until the final nonterminal has been processed. The program fragment depends on a global variable, sym, which contains the current symbol from the input, and the function nextsym, which updates sym when called. The implementations of the functions nextsym and error are omitted for simplicity. typedef enum {ident, number, lparen, rparen, times, slash, plus, minus, eql, neq, lss, leq, gtr, geq, callsym, beginsym, semicolon, endsym, ifsym, whilesym, becomes, thensym, dosym, constsym, comma, varsym, procsym, period, oddsym} Symbol; Symbol sym; void nextsym(void); void error(const char msg[]); int accept(Symbol s) { if (sym == s) { nextsym(); return 1; } return 0; } int expect(Symbol s) { if (accept(s)) return 1; error("expect: unexpected symbol"); return 0; } void factor(void) { if (accept(ident)) { ; } else if (accept(number)) { ; } else if (accept(lparen)) { expression(); expect(rparen); } else { error("factor: syntax error"); nextsym(); } } void term(void) { factor(); while (sym == times || sym == slash) { nextsym(); factor(); } } void expression(void) { if (sym == plus || sym == minus) nextsym(); term(); while (sym == plus || sym == minus) { nextsym(); term(); } } void condition(void) { if (accept(oddsym)) { expression(); } else { expression(); if (sym == eql || sym == neq || sym == lss || sym == leq || sym == gtr || sym == geq) { nextsym(); expression(); } else { error("condition: invalid operator"); nextsym(); } } } void statement(void) { if (accept(ident)) { expect(becomes); expression(); } else if (accept(callsym)) { expect(ident); } else if (accept(beginsym)) { do { statement(); } while (accept(semicolon)); expect(endsym); } else if (accept(ifsym)) { condition(); expect(thensym); statement(); } else if (accept(whilesym)) { condition(); expect(dosym); statement(); } else { error("statement: syntax error"); nextsym(); } } void block(void) { if (accept(constsym)) { do { expect(ident); expect(eql); expect(number); } while (accept(comma)); expect(semicolon); } if (accept(varsym)) { do { expect(ident); } while (accept(comma)); expect(semicolon); } while (accept(procsym)) { expect(ident); expect(semicolon); block(); expect(semicolon); } statement(); } void program(void) { nextsym(); block(); expect(period); } RE: Hello Ed -2! - Ed Davis - 12-17-2018 12:16 PM (12-17-2018 07:21 AM)Aurel Wrote:  Well yes I agree with you but in Ruben i have only four functions expr() term() factor() match() - for parenthesis Typically, in a recursive descent parser, the number of functions coincides to the number of precedence levels. For simple languages (e.g., very few operators) that is sufficient. Quote:but still i am not sure which algo is faster i prpbably need to test them.. Someone has already done that. Remember, a typical compiler has the following phases: Scanner (or Lexer or tokenizer - all the same thing) Parser Semantic analyzer Code generator I believe it was Donald Knuth, who many years analyzed this problem, and discovered that, in general, the speed of a compiler was heavily proportional to the speed of the first phase, the Scanner. As long as your parsing algorithm is reasonable, and your language grammar is reasonable, then parsing will not be the bottle-neck. Of course home-grown parsers that don't faithfully implement tried and true parsing algorithms, or are written by poor programmers, can be very, very slow. Quote:Hey ..Ed ...what about Early parser? I see that is used in JavaScript version of simplified QB ? do you know that? Yes, I've seen that. It was really, really cool. However, if you are hand-coding a compiler/interpreter, I don't see any need to use anything other than precedence climbing for expressions, and recursive descent for everything else. (12-17-2018 07:52 AM)Aurel Wrote:  Recursive descent even in C looking very clear: C implementation ... Yes, that is from the Wikipedia recursive descent parsing page. It is a C implementation of Wirth's Pascal implementation of a byte-code compiler for Wirth's PL/0 teaching language. You can find very similar code around the web in Pascal, Java, Python and probably others. PL/0 was explicitly created to teach students how to write compilers. Very simple, but just enough features for a student to get a good feel of what a compiler entails. It was very popular in the late 70's - it was introduced in Wirth's book Algorithms + Data Structures = Programs, in 1976 - but has fallen out of favor in recent years. This was a great book, and is one of my favorites. Now many intro compiler courses use tiny or mini Java as the language to be compiled by students. RE: Hello Ed -2! - Aurel - 12-18-2018 04:48 AM Quote:and discovered that, in general, the speed of a compiler was heavily proportional to the speed of the first phase, the Scanner. ..and then i think that he was wrong. why? simple 1.step - tokenize 2 step - execute Inside step execute is our expression parser which must do hard work to execute math expressions properly. Do you agree with that ? So ..as you see i need only two steps. In your toy you speed up processing transformig code - tokens to bytecode and that is why is fast not because precedence climbing parser.... do I have a right about that OR i "miss the point" again? sorry if I bothering you with that ? In addition I know now why my stupid/simple ruben is slow simply because his expression parser is not just parser than in same time lexer/parser processing each char to token then execute token ..and that is time consuming. RE: Hello Ed -2! - Aurel - 12-18-2018 04:54 AM Quote:Now many intro compiler courses use tiny or mini Java as the language to be compiled by students. Oh man ... I don't know that...but also i have found simple interpreter(just tokenized- i think?) called JImage - or ImageJ ( source code is on web) which is very clean and easy to understand written in Java. RE: Hello Ed -2! - Ed Davis - 12-18-2018 07:28 AM (12-18-2018 04:48 AM)Aurel Wrote:  Quote:and discovered that, in general, the speed of a compiler was heavily proportional to the speed of the first phase, the Scanner. ..and then i think that he was wrong. why? simple 1.step - tokenize 2 step - execute Inside step execute is our expression parser which must do hard work to execute math expressions properly. Do you agree with that ? Nope, I disagree. But do note, you missed what I said. I said Compiler, and not interpreter. Yes, in a interpreter, the lions share of the time will be spent "interpreting/executing" the code. But still, actual parsing, done correctly can be quite fast. But of course as you have to repeatedly parse loops, you will take an enormous hit. If you tokenize the input, so that you have integers for comparing with, and you using a language that compares integers faster than it does strings (C, C++, Assembly, Java, C#) then Quote:So ..as you see i need only two steps. In your toy you speed up processing transformig code - tokens to bytecode and that is why is fast not because precedence climbing parser.... do I have a right about that OR i "miss the point" again? sorry if I bothering you with that ? No bother - I love to talk about compilers and interpreters! :-) Correct, the byte code made it faster, then interpreting other ways. But even with that, precedence climbing is faster for expressions that recursive descent, because in precedence climbing you only have one recursive procedure. In recursive descent, you have one for each level of precedence. For simple languages, that only have 2-3 levels of precedence, then no big deal. But (for instance), BASIC (which has 5+ levels), C (which has 10+) and others, precedence climbing is faster than recursive descent. You can read the precedence climbing paper if you want the details. Quote:In addition I know now why my stupid/simple ruben is slow simply because his expression parser is not just parser than in same time lexer/parser processing each char to token then execute token ..and that is time consuming. Yep, I agree. RE: Hello Ed -2! - Aurel - 12-18-2018 09:07 AM Quote:But even with that, precedence climbing is faster for expressions that recursive descent, because in precedence climbing you only have one recursive procedure OK .. That is what i want to know... I must say that i am little bit confused about with what i read around net. Some says ... recursive descent is more robust and faster and is implemented in gcc and some other power languages some says that prece-climbing is better and more flexibile etc..etc I mean so many different opinions is out there? RE: Hello Ed -2! - Ed Davis - 12-18-2018 09:16 AM (12-18-2018 09:07 AM)Aurel Wrote:  Quote:But even with that, precedence climbing is faster for expressions that recursive descent, because in precedence climbing you only have one recursive procedure OK .. That is what i want to know... I must say that i am little bit confused about with what i read around net. Some says ... recursive descent is more robust and faster and is implemented in gcc and some other power languages some says that prece-climbing is better and more flexibile etc..etc I mean so many different opinions is out there? Note that precedence climbing is only used for expressions. Recursive descent is used for everything else. And, do note that precedence climbing is really just a different form of recursive descent. In really life, the speed difference between the two doesn't matter. If you want to use recursive descent for everything, go for it! It won't hurt at all. For me, precedence climbing seems simpler. But again, it is only used for expressions. So, use what works best for you. Neither one is really "better". What is best is what you understand, and can implement. For instance, several folks think that Pratt parsing (for expressions) is the best. For me, Pratt parsing is a little hard to follow, e.g., it is a little too abstract for me. So again, use whatever works best for you.