osiyuk
Kiril Osiyuk
6 posts

#14683
Understanding grammar and syntax in EBNF 9 months, 3 weeks ago Edited by Kiril Osiyuk on March 25, 2018, 1:17 a.m. Reason: Initial post
Help! I am stuck on Day 2 homework. I have read chapters 2 through 4.1 of Wirth book.
And I have issues implementing parser outputting sexpression. I believe understand method of recursive descent. I do not understand construction of grammar. I have done that:
I think, there some ambiguity issues. How to resolve those issues? How to check grammar being deterministic? Are this grammar is regular? 
pervognsen
Per Vognsen
49 posts
/ 1 project

#14686
Understanding grammar and syntax in EBNF 9 months, 3 weeks ago Edited by Per Vognsen on March 25, 2018, 6:23 a.m.
You say you can't figure out how to output Sexpressions. As a first step, I would recommend writing a recognizer that can tell whether a given string matches a valid expression. Did you do that?
Feel free to look at the parser code I wrote on stream if you get stuck or confused. All the theory can be easily ignored if you just focus on what the code is doing. I'm going to focus on theory in this post because you asked about theory. But I want to make it clear that I don't think understanding this stuff formally is necessary or even that helpful if you can just think in terms of code. As for your other questions: The binary operator part of the grammar looks good to me. Your rule for unary only needs to change slightly: unary = INT  (''  '~') unary I'm using INT in all caps to signify that it's a primitive token. To check that the grammar is deterministic, for each alternative you need to compute the FIRST sets. Starting with unary, the FIRST set of the first alternative is {INT}, and the FIRST set of the second alternative is {'', '~'}. These are disjoint, hence unary is deterministic. Any repetition pattern implicitly presents an alternative between matching the pattern vs matching a followup pattern. I can't remember how exactly Wirth chooses to define it, but the idea here is that since the pattern enclosed by {} in term and factor starts with a certain set of operator tokens, and even for a symbol like  that occurs as both a binary operator and a unary operator, in the context where you're trying to match an infix  there's no alternative that could match a prefix , so it's deterministic. This stuff is actually a lot clearer when you just write out the code, following Wirth's template. In code terms, determinism just means you don't have multiple ifelseif cases that could be simultaneously true (if their FIRST sets overlap, the first case in the ifelseif chain to match would arbitrarily be selected by the parser as the right one). The grammar is not regular in its current form, according to Wirth's definition. A rule (the syntactic equation for a nonterminal) is regular if all nonterminal references occur solely in tail position. And a grammar is regular if all its rules are regular. That is true for unary, but it is not true for term and factor. E.g. in the rule term = factor {('+'  ''  ''  '^') factor}, factor occurs in the head position. That said, you can actually rewrite the grammar to be regular. You just have to "copy and paste" righthand side definitions for every occurrence of the lefthand side. Let me show you how the transformation goes:
Ugly, right? I do not recommend doing this! And you wouldn't be able to regularize the grammar as soon as you added support for parenthesized expressions, so it's not really useful. But it shows that it's possible for a nonregular grammar to be equivalent to a regular grammar after appropriate transformations. A formal language (a set of strings) is said to be regular if there exists a regular grammar which matches the language. But like this example shows, it's possible for a regular language to also have corresponding nonregular grammars. 
osiyuk
Kiril Osiyuk
6 posts

#14843
Understanding grammar and syntax in EBNF 9 months, 2 weeks ago
Finally, I have done parser. Results:
I have used classic approach to recursivedescent parsing, based on the idea of grammar with nonterminals for each level of precedence.
Now looking into shunting yard algorithm and precedence climbing. I have published my code: https://github.com/osiyuk/Ion/blob/master/sexpr.c Thank you Per Vognsen for challenging homework. References http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm 