Discussion:
On parsers, yet again
Johann 'Myrkraverk' Oskarsson
2022-02-16 01:36:01 UTC
Dear r.a.i-f,

I have been wondering, how are IF parsers generally constructed? Is
there literature on this topic? As in, is it more like programming
language parsing, for which there's abundant literature in compilers,
or is it more like natural language parsing, which I guess is slightly
different? Or neither?

For creating a game, I would probably use TADS, or Inform 6, or some
other ready made environment for exactly that. However, I have been
wondering if parsers are really that /hard/ to do, or just more like
/annoying/ to make?

Anyone here to share anything on the subject?
--
Johann | email: invalid -> com | www.myrkraverk.com/blog/
I'm not from the Internet, I just work there. | twitter: @myrkraverk
Greg Ewing
2022-02-16 12:19:57 UTC
> I have been wondering, how are IF parsers generally constructed?  Is
> there literature on this topic?  As in, is it more like programming
> language parsing, for which there's abundant literature in compilers,
> or is it more like natural language parsing, which I guess is slightly
> different?  Or neither?
In my experience they're much more like programming language
parsers than natural language parsers. IF input languages are
usually a very restricted subset of natural languages, so you
don't tend to have the same problems of vagueness and ambiguity
that you get when trying to parse natural languages.
> I have been
> wondering if parsers are really that /hard/ to do, or just more like
> /annoying/ to make?
They're not really hard, especially if you have some familiarity
with the techniques used for parsing programming languages. In
fact, IF input languages are usually a lot simpler than typical
programming languages. Most of the complexity comes in figuring
out what to *do* in response to what the player typed.
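
As an illustration of how small such a parser can be, here is a minimal
sketch. It is not taken from any IF system mentioned in this thread; the
vocabulary tables and the function name `parse` are hypothetical, and it
only handles the "verb followed by noun words" shape.

```python
# Hypothetical vocabulary: each synonym maps to one canonical verb.
VERBS = {
    "take": "take", "get": "take",
    "drop": "drop",
    "look": "look",
    "examine": "examine", "x": "examine",
}

# Filler words that are dropped before parsing (an assumption of this sketch).
NOISE = {"the", "a", "an", "at"}


def parse(command):
    """Return (canonical_verb, noun_words), or None if no known verb leads."""
    words = [w for w in command.lower().split() if w not in NOISE]
    if not words or words[0] not in VERBS:
        return None
    return VERBS[words[0]], words[1:]


if __name__ == "__main__":
    print(parse("Take the brass lamp"))
    print(parse("x lamp"))
    print(parse("dance"))
```

Deciding what "take" should actually do to the lamp is deliberately left
out, since that is where most of the work lies.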
--
Greg
Johann 'Myrkraverk' Oskarsson
2022-02-16 13:02:51 UTC
Post by Greg Ewing
> > I have been wondering, how are IF parsers generally constructed?  Is
> > there literature on this topic?  As in, is it more like programming
> > language parsing, for which there's abundant literature in compilers,
> > or is it more like natural language parsing, which I guess is slightly
> > different?  Or neither?
> In my experience they're much more like programming language
> parsers than natural language parsers. IF input languages are
> usually a very restricted subset of natural languages, so you
> don't tend to have the same problems of vagueness and ambiguity
> that you get when trying to parse natural languages.
Right.
Post by Greg Ewing
> > I have been
> > wondering if parsers are really that /hard/ to do, or just more like
> > /annoying/ to make?
> They're not really hard, especially if you have some familiarity
> with the techniques used for parsing programming languages. In
> fact, IF input languages are usually a lot simpler than typical
> programming languages. Most of the complexity comes in figuring
> out what to *do* in response to what the player typed.
I see. I have to say I'm not /very familiar/ with parsing programming
languages. Recently, however, I have been reading several compiler
books, and I think I'm starting to get -- at least some of -- it. [*]

Then I was thinking, if all of this has been written about compilers,
hasn't /something/ been written about IF parsers? Maybe it hasn't,
and it's all in the compiler literature? One thing is different in IF,
IME: the game itself can add keywords and nouns. Though maybe that's
not too different from adding types in languages like C++. The
difference is that the compiler grammar is /fixed/, while the IF
grammar is more flexible, with verbs being added and nouns changing as
the game progresses.

[*] To name two, /Modern Compiler Implementation in ML/ by Appel, and
/Compiler Design in C/ by Holub. The latter is available on the
author's website as a PDF. I also have a usable familiarity with flex
and yacc.
--
Johann | email: invalid -> com | www.myrkraverk.com/blog/
I'm not from the Internet, I just work there. | twitter: @myrkraverk
Adam Thornton
2022-02-16 18:34:07 UTC
Post by Johann 'Myrkraverk' Oskarsson
> Then I was thinking, if all of this has been written about compilers,
> hasn't /something/ been written about IF parsers? Maybe it hasn't
> and it's all in the compiler literature? One thing is different, IME,
> in IF, and that's the game itself can add keywords and nouns. Though
> maybe that's not too different from adding types in languages like C++.
> The difference being that the compiler grammar is /fixed/ while the IF
> grammar is more flexible with verbs being added and nouns changing as
> the game progresses.
Maybe? But plenty of languages let you extend the syntax. FORTH is
my favorite example, but anything LISP-like (and FORTH's stack is just
a LISP expression stood up on end) encourages you to do exactly that.
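
To make the "extensible language" idea concrete, here is a rough sketch
in Python rather than FORTH or LISP. A game adds verbs to a table at
runtime, loosely as a FORTH program adds words to its dictionary. It is
not how Inform or TADS does it; the names `VERB_HANDLERS`, `verb`, and
`dispatch` are hypothetical.

```python
# Hypothetical verb table: maps each verb word to its handler function.
VERB_HANDLERS = {}


def verb(*words):
    """Decorator that registers a handler under one or more verb words."""
    def register(handler):
        for word in words:
            VERB_HANDLERS[word] = handler
        return handler
    return register


@verb("take", "get")
def do_take(noun_words):
    return "Taken: " + " ".join(noun_words)


def dispatch(command):
    """Look up the first word in the verb table and call its handler."""
    words = command.lower().split()
    if not words or words[0] not in VERB_HANDLERS:
        return "I don't know that verb."
    return VERB_HANDLERS[words[0]](words[1:])


# A game (or an extension) can add a verb later without touching dispatch().
@verb("xyzzy")
def do_xyzzy(noun_words):
    return "Nothing happens."


if __name__ == "__main__":
    print(dispatch("get lamp"))
    print(dispatch("xyzzy"))
```

The parsing code never changes here; only the table grows.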

If you're not scared of wading through source...even though Inform 7
isn't yet open-source, you can wade through its implementation of the
parser and standard library, since that's written in Inform 6 and 7
and bundled with the application.

I'm working with the Mac app, so inside the Inform.app directory,
you'd want to go to Contents/Resources. Linux and Windows will have
analogous structures. Once inside there...Library/6.11 contains a
bunch of Inform 6, including parserm.h, which contains the input
tokenizer and parser. The I6 standard world model is in that
directory as well. Going back up to Contents/Resources, and then down
to Internal/Extensions/Graham\ Nelson will bring you to Standard\
Rules.i7x, which is both the definition of the I7 standard model and
the glue that binds it to I6.

It's an enlightening read, if you want to see how the sausage is made.
What you will find is what Greg Ewing said: the tokenizer is pretty
straightforward, and the parser...recognizes a lot less than you think
it might. The language extensibility is the cool bit, and Inform 7 is
a really neat experiment in making extending the language -- which is
to say, writing Interactive Fiction -- an awful lot like playing a
game written in the language.

Adam
Greg Ewing
2022-02-18 00:37:31 UTC
Post by Johann 'Myrkraverk' Oskarsson
> I have to say I'm not /very familiar/ with parsing programming
> languages, however, recently I have been reading several compiler books,
> and I think I'm starting to get -- at least some of -- it.
Don't worry about getting deeply into the theory of parsing;
most of it is overkill for this purpose.
Post by Johann 'Myrkraverk' Oskarsson
> the game itself can add keywords and nouns.  Though
> maybe that's not too different from adding types in languages like C++.
> The difference being that the compiler grammar is /fixed/ while the IF
> grammar is more flexible with verbs being added and nouns changing as
> the game progresses.
I'm not sure that's a helpful way to think about it. Rather than
the grammar changing, it's more like different variables being in
scope in different places in a program. The set of verbs, nouns,
adjectives etc. understood by the game is fixed by the game author,
but different objects become accessible at different times.
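
The scoping analogy can be sketched roughly as follows. This is an
illustration under assumed names (`Thing`, `WORLD`, `in_scope`,
`resolve`), not code from any real IF library. The full object list is
fixed up front, and only objects in the current room or carried by the
player are considered when matching noun words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    name: str
    words: frozenset  # every word the player may use for this object
    location: str     # a room name, or "player" when carried


# Hypothetical world, fixed by the game author.
WORLD = [
    Thing("brass lamp", frozenset({"brass", "lamp"}), "cellar"),
    Thing("green plant", frozenset({"green", "plant"}), "garden"),
    Thing("rusty key", frozenset({"rusty", "key"}), "player"),
]


def in_scope(world, room):
    """Objects the player can refer to right now."""
    return [t for t in world if t.location in (room, "player")]


def resolve(noun_words, world, room):
    """Return in-scope objects whose vocabulary covers all the noun words."""
    wanted = set(noun_words)
    if not wanted:
        return []
    return [t for t in in_scope(world, room) if wanted <= t.words]


if __name__ == "__main__":
    print([t.name for t in resolve(["lamp"], WORLD, "cellar")])
    print([t.name for t in resolve(["lamp"], WORLD, "garden")])
```

The word "lamp" is understood everywhere, but it only resolves to an
object while the lamp is in scope, much as a variable name only resolves
inside the block that declares it.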

What might be a bit different is that whereas in many programming
languages you have reserved words such as "if", "while", etc. that
can't be used for any other purpose, an IF parser needs to treat
tokens more flexibly. E.g. if you decide that the word "plant"
is always a noun so that you can have an object called "green plant",
you're going to have trouble with a command like "plant the plant".

For that reason you may find tools like yacc that are designed
for keyword-oriented languages don't help very much.
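
One way to handle "plant the plant" is to classify each word by its
position in the command rather than giving it a single fixed token type.
The sketch below assumes a simple "verb first, nouns after" command
shape; the word sets and the function name `classify` are hypothetical.

```python
# Hypothetical vocabularies; "plant" is deliberately in both sets.
VERBS = {"plant", "take", "water"}
NOUNS = {"plant", "seed", "pot"}
NOISE = {"the", "a", "an"}


def classify(command):
    """Tag each word by position: the first word is tried as a verb,
    every later word is tried as a noun."""
    words = [w for w in command.lower().split() if w not in NOISE]
    if not words:
        return []
    first = words[0]
    tagged = [(first, "verb" if first in VERBS else "unknown")]
    for word in words[1:]:
        tagged.append((word, "noun" if word in NOUNS else "unknown"))
    return tagged


if __name__ == "__main__":
    print(classify("plant the plant"))
    print(classify("take plant"))
```

A lexer in the lex/yacc style would normally assign "plant" one token
type before the parser ever sees it, which is the mismatch described
above.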
--
Greg