# How parsing works
`identifierToken` basically takes one word or symbol (read: `@chunk`) at a time, assigns it a name or type, and creates a token in the form of a token tuple `[tag, value, offsetInChunk, length, origin]`. This is what the functions `token` and, subsequently, `makeToken` create.
In `identifierToken` there are a few key variables and functions that are needed:
- `@chunk`: the current string to handle; it is split up into `[input, id, colon]` with the `IDENTIFIER` regular expression at the bottom of the file.
- `id`: in the case of an import, this is literally `'import'`.
- `@tag()`: gets the `tag` (the first value of the token tuple) of the last processed token. When processing `foo` (as in the second chunk of `import 'foo'`), `@tag()` will return `'IMPORT'`.
- `@value()`: gets the `value` (the second value of the token tuple) of the last processed token. When processing `foo` (as in the second chunk of `import 'foo'`), `@value()` will return `import`, the very string that was held in `id` while handling the previous chunk.
So basically what I added to `identifierToken` are the tags `IMPORT`, `IMPORT_AS` and `IMPORT_FROM`, as well as the variable `@seenImport`, so that when I encounter an `as` or a `from` I know it belongs to an import and not to a yield or similar. This also means that, in theory, `from` can still be used as a regular identifier. We have to test that though. :)
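Here is a hand-wavy sketch of that idea (not a copy of the actual `lexer.coffee` code; the names `id` and `tag` follow the surrounding text):

```coffee
# Sketch: remember that an `import` was seen and retag the following
# `as` / `from` words so they don't end up as plain identifiers.
if id is 'import'
  @seenImport = yes
  tag = 'IMPORT'
else if @seenImport
  tag = 'IMPORT_AS'   if id is 'as'
  tag = 'IMPORT_FROM' if id is 'from'
```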
These three tags are then used in `grammar.coffee`.
There's also code to reset `@seenImport` when the statement is terminated (in `lineToken`, iirc).
For this part I took a look at the spec for imports and basically copied the structure from there.
The DSL used here basically mixes and matches tags and named grammar forms. In this case the tags are `'IMPORT'`, `'IMPORT_AS'` and `'IMPORT_FROM'`, as assigned in `lexer.coffee`'s `identifierToken`. The other parts of those strings are just other named grammar forms (`ImportsList`, `OptComma`, `Identifier`, etc.).
The structure builds up through references to other grammar forms and through functions that create and return data structures, like `-> new Import $2`. The `$n` variables are just references to the nth word in the rule string.
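Putting that together, the new rules look roughly like this (simplified; the `console.log` variant further down shows the same rules instrumented for debugging):

```coffee
# Simplified sketch of the grammar rules in grammar.coffee:
Import: [
  o 'IMPORT String',                          -> new Import $2
  o 'IMPORT ImportClause IMPORT_FROM String', -> new Import $4, $2
]
```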
This process leads to an AST that is passed to the `Import` class defined in `nodes.coffee`.
Off the top of my head this should look as follows:
```coffee
# import 'foo' will yield something like:
new Import(Value { value: 'foo' })

# import { foo } from 'foo' will yield something like:
new Import(Value { value: 'foo' }, ImportsList { .... })
```
You can look at this AST quite easily by just prepending a `console.log` before calling `new Import`:
```coffee
Import: [
  o 'IMPORT String',                          -> console.log($2); new Import $2
  o 'IMPORT ImportClause IMPORT_FROM String', -> console.log($4, $2); new Import $4, $2
]
```
Taking the AST from `grammar.coffee`, the classes in `nodes.coffee` are supposed to create tuples of "code" through the `@makeCode` and `compileNode` functions. I'm not entirely clear on this part yet, but each node is compiled to a string by calling `compileNode` or `compileToFragments`. What `Import.compileNode` basically does is look at the AST and either return an array of strings passed through `@makeCode` directly, OR call the child node's `compileNode` function.
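As a rough, hypothetical sketch of those two paths (this is not the real `Import.compileNode` from `nodes.coffee`; the class shape, fields and output are made up for illustration):

```coffee
# Hypothetical sketch only: either wrap plain strings with @makeCode,
# or delegate to a child node's compileToFragments and splice its
# fragments into the result.
class Import extends Base
  constructor: (@source, @clause) ->

  compileNode: (o) ->
    fragments = [@makeCode 'import ']
    if @clause?
      fragments.push @clause.compileToFragments(o)...
      fragments.push @makeCode ' from '
    fragments.push @source.compileToFragments(o)...
    fragments
```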
This part is still a bit of magic to me; the function names and processes don't quite line up with my way of thinking, it seems.
In this video Jeremy explains the concepts and the parts where 'cheating' happens.