
Interested in CYK and Earley chart parsers? #183

Open
Hugo-ter-Doest opened this issue Aug 22, 2014 · 25 comments

@Hugo-ter-Doest
Collaborator

I implemented CYK and Earley chart parsers in Node. Interested in including it in the natural package? Please let me know.

Regards,
Hugo

@kkoch986
Member

Hugo,

I've been really interested in parsers lately (working on something like an LR(1) parser myself right now), I'd love to check them out and we can talk about the best way to integrate them into natural.

-Ken

@Hugo-ter-Doest
Collaborator Author

Hi Ken,

I created a repository with the chart parsers:
https://github.com/Hugo-ter-Doest/chart_parsers
The parsers are in /routes. I built a web app around them for testing and demonstration purposes. I think we/I can strip the parsers down to the algorithms and integrate them into natural. Test code and documentation also need to be added.

Regards, Hugo

@kkoch986
Member

kkoch986 commented Sep 3, 2014

Hugo,

Sorry for the delay on this, it looks pretty solid. I think if you could isolate the parsers and stand some tests up around them we could definitely merge them in.

-Ken

@Hugo-ter-Doest
Collaborator Author

Here is a sign of life...
I did a lot of work on the parsers:

  • Parsers now work with the same Chart, Item and Grammar objects.
  • The data structure of items follows the InfoVis format so that parse trees can be visualised easily.
  • The Grammar object is based on a PEG built with http://pegjs.majda.cz/. It reads unification grammars as well; that's for later, when I add feature structures and unification.
  • Added a Left-Corner parser.
  • The Earley and Left-Corner parsers inherit from a generic Chart Parser.
  • Turned the recognisers into parsers: they now record the parse by keeping track of the children used to recognise an item (a rule).
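A minimal sketch of what a shared Chart/Item design like the one described above might look like (the class and property names here are illustrative, not the actual chart_parsers API):

```javascript
// Illustrative sketch of a chart-parser item that records its children,
// so a full parse tree can be rebuilt from the chart. Names are
// hypothetical, not the real chart_parsers API.
class Item {
  constructor(rule, dot, from, to) {
    this.rule = rule;     // e.g. { lhs: 'S', rhs: ['NP', 'VP'] }
    this.dot = dot;       // position of the dot in rhs
    this.from = from;     // start index in the sentence
    this.to = to;         // end index in the sentence
    this.children = [];   // completed sub-items used to recognise this item
  }

  isComplete() {
    return this.dot === this.rule.rhs.length;
  }
}

class Chart {
  constructor(n) {
    // one set of items per position 0..n
    this.sets = Array.from({ length: n + 1 }, () => []);
  }

  add(item) {
    this.sets[item.to].push(item);
  }
}

const chart = new Chart(3);
const item = new Item({ lhs: 'S', rhs: ['NP', 'VP'] }, 2, 0, 3);
chart.add(item);
console.log(item.isComplete()); // true: the dot is at the end of the rhs
```

Recording children on each item (rather than only recognising) is what turns a recogniser into a parser, as the last bullet above describes.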

Right now I'm working on tests using the Jasmine framework. At first I started writing tests using assert, but I saw that natural uses Jasmine.

Maybe we should look at the interfacing between the existing modules in natural and the parsers. At the moment the parsers accept a tagged sentence of the form:
[['I', 'NP'],
['saw', 'V'],
['the', 'DET'],
['man', 'N'],
['with', 'P'],
['the', 'DET'],
['telescope', 'N']]

Best regards,
Hugo
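In the tagged-sentence format above, the terminal symbols the grammar sees are the tags rather than the words themselves. A small sketch of that relationship (plain JavaScript, no chart_parsers API assumed):

```javascript
// A tagged sentence in the format the parsers accept: each entry is
// [word, tag].
const taggedSentence = [
  ['I', 'NP'], ['saw', 'V'], ['the', 'DET'], ['man', 'N'],
  ['with', 'P'], ['the', 'DET'], ['telescope', 'N']
];

// The grammar's terminal symbols are the tags, not the words:
const tags = taggedSentence.map(([, tag]) => tag);
console.log(tags); // ['NP', 'V', 'DET', 'N', 'P', 'DET', 'N']
```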

@kkoch986
Member

Hugo,

Looks really exciting! I'll take a closer look tonight if I can. To be honest, I'll have to quickly look through how our parsers work, but it sounds like the tagged-sentence format makes sense.

-Ken

@kkoch986
Member

Hugo,

Sorry for the delay on this; it's still on my list, I've just been traveling for work the past week, so I've been a little distracted. Going to pull everything down now, look at our parsers, and see what makes sense.

-Ken

@Hugo-ter-Doest
Collaborator Author

Tests are in the spec folder. ChartParser_spec will fail because I'm using it to test the Head-Corner parser, which is still under development. If you switch these lines:
//[LeftCornerParser, EarleyParser].forEach(function(ChartParser) {
[HeadCornerParser].forEach(function(ChartParser) {

the Left-Corner and Earley parsers will be tested.

Regards,
Hugo

@kkoch986
Member

@Hugo-ter-Doest, looking over some of the stuff today. I'm trying to think of what the best fit with our existing code will be.

It seems to me that, since the parser requires the sentence to be tagged, for now one would have to use the WordNet module, as we don't have a working POS tagger currently. It would be cool to have an example of that in the docs.

Something seems wrong in the unit tests: I tried changing a few values in an effort to break them, and they didn't break. I'll work on them a bit and see what I can come up with.

@Hugo-ter-Doest
Collaborator Author

I will take a look at the Wordnet module to see how I can connect it to the parsers.

Regarding the unit tests, I cannot see what is going on. In my environment they fail if I change expected values in the spec files. For instance, if you exchange the indices of parse_trees on lines 71/72 of ChartParser_spec.js, the test will/should fail. If you let me know what happens, I will look into this.

Regards,
Hugo

@kkoch986
Member

OK, I'll take a look; I was just running through it quickly, maybe I missed something. Also, re: WordNet, I don't think it needs to be integrated with the parsers, but maybe an example of taking a string, tokenizing it, looking it up in WordNet, and producing a list of tagged tokens to pass to the parser. I think it'll be a useful example for people who want to use the parsers in the future.

Thanks!
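The pipeline Ken describes (tokenize, look up, emit tagged tokens) could be sketched roughly as follows. The lexicon here is a small stub standing in for a real WordNet lookup (natural's WordNet module does this asynchronously against the actual database), so the function names and data are illustrative only:

```javascript
// Sketch of: string -> tokens -> dictionary lookup -> tagged sentence.
// The lexicon below is a stub for a WordNet lookup, not real WordNet data.
const lexicon = {
  i: ['n'], saw: ['v', 'n'], man: ['n', 'v'], telescope: ['n', 'v']
};

function tagSentence(sentence) {
  return sentence
    .toLowerCase()
    .split(/\s+/)                                   // naive tokenizer
    .map(token => [token, ...(lexicon[token] || ['unknown'])]);
}

console.log(tagSentence('I saw the man'));
// [['i','n'], ['saw','v','n'], ['the','unknown'], ['man','n','v']]
```

Tokens missing from the dictionary are tagged 'unknown', matching the format Hugo shows later in this thread.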


@Hugo-ter-Doest
Collaborator Author

Regarding Wordnet, I will create an example.

Unit tests: it's a good idea to pull the latest code, because I'm working on it on a daily basis.

Hugo


@kkoch986
Member

Yeah, I think it was my mistake on the unit tests, sorry about that; I was just playing with them and everything seems fine.

Would you be able to outline for me just the files we would need to include the parsers in natural, as well as what new external dependencies we would have to add? That way I know exactly what I need to do to integrate them.

Thanks!

@Hugo-ter-Doest
Collaborator Author

External dependencies are:
"lodash": "*",
"fs": "*",
"log4js": "*",
"typeof": "*"

plus jasmine-node for testing.

Files you need are:
from lib:
Agenda.js
ChartParser.js
CYKParser.js
EarleyItem.js
GrammarParser.js
LeftCornerParser.js
Chart.js
CYK_Item.js
DoubleDottedItem.js
EarleyParser.js
GoalItem.js
HeadCornerParser.js
PEG-grammar-for-unification-grammar.txt

And from spec:
ChartParser_spec.js
Chart_spec.js
CYKParser_spec.js
EarleyItem_spec.js
GrammarParser_spec.js
HeadCornerParser_spec.js

Hugo

@Hugo-ter-Doest
Collaborator Author

I forgot to mention the data files for the unit tests. From data you need:
math_expressions.txt
test_grammar_for_CYK.txt
minimal_grammar.txt
test_grammar_for_CFG.txt

@Hugo-ter-Doest
Collaborator Author

Did some work on the example with Wordnet. I found out that Wordnet supports the following POS tags:
n NOUN
v VERB
a ADJECTIVE
s ADJECTIVE SATELLITE
r ADVERB

which is quite limited for full parsing. I will think of an example that makes some sense. In the meantime you can check the example; it is in the example folder (where else :-) of chart_parsers.

Also, I made the parsers work with a tagged sentence that may have multiple tags per token, like this:
[["I","s","n"],["saw","v","n"],["the","unknown"],["man","v","n"],["with","unknown"],["the","unknown"],["telescope","v","n"]]

Hugo
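With multiple tags per token, a chart parser's scan step has to try every tag alternative at each position. A small illustrative sketch of that expansion (the item shape is hypothetical, not the chart_parsers API):

```javascript
// Each entry is [word, tag1, tag2, ...]; the scanner adds one lexical
// item per tag alternative, spanning position i..i+1.
const tagged = [['saw', 'v', 'n'], ['the', 'unknown']];

const lexicalItems = [];
tagged.forEach(([word, ...tags], i) => {
  tags.forEach(tag => {
    lexicalItems.push({ tag, word, from: i, to: i + 1 });
  });
});

console.log(lexicalItems.length); // 3: saw:v, saw:n, the:unknown
```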

@kkoch986
Member

OK, cool; I was just looking for some end-to-end kind of example, but I agree WordNet may not be the best. I think a better method for POS tagging is definitely a high priority for me as soon as I get some time to work on it.

@Hugo-ter-Doest
Collaborator Author

I did something else to complement the Wordnet tags: I wrote a module, FunctionWordTagger, that reads a set of files with function words and tags the rest of the sentence. The module is in the lib folder; the dictionary files are in the data folder.

Hugo
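The idea behind a function-word tagger is that closed-class words (determiners, prepositions, etc.) come from small fixed word lists, while open-class words are left to WordNet. A rough sketch (the word lists and function name are illustrative; the real FunctionWordTagger reads its lists from files in the data folder):

```javascript
// Illustrative function-word lists; the real module loads these from
// dictionary files rather than hard-coding them.
const functionWords = {
  DET: ['the', 'a', 'an'],
  P:   ['with', 'in', 'on']
};

function tagFunctionWords(tokens) {
  return tokens.map(token => {
    for (const [tag, words] of Object.entries(functionWords)) {
      if (words.includes(token)) return [token, tag];
    }
    // leave open-class words for WordNet (or mark them unknown)
    return [token, 'unknown'];
  });
}

console.log(tagFunctionWords(['the', 'man', 'with', 'the', 'telescope']));
// [['the','DET'], ['man','unknown'], ['with','P'], ['the','DET'], ['telescope','unknown']]
```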

@kkoch986
Member

Hugo,

Sorry again for the slowness on this; it's been a crazy few weeks at work. Can you just let me know if there are any lodash features you use that aren't included in underscore? I don't really want to have both dependencies, since they're pretty interchangeable.

I think eventually we could port natural over to lodash, but for now I just want to find the quickest path to getting the parsers integrated.

I just created a branch for this, so it's at the top of my list; hopefully by the end of the weekend I'll have made some serious progress.

-Ken

@Hugo-ter-Doest
Collaborator Author

I will take a look at the underscore features. If I remember rightly, I use lodash only for deep comparison of objects. I checked, and this is supported by underscore as well, so it should be no problem to swap them.

Hugo

@Hugo-ter-Doest
Collaborator Author

Oh and there's no hurry. I'm writing this stuff just for fun and to learn a new language and libraries.

@Hugo-ter-Doest
Collaborator Author

I replaced lodash with underscore!

Hugo

@kkoch986
Member

Perfect! I was playing around with it and replaced it in a few places as well, so I had a feeling it would work.

@silentrob
Member

This looks amazing. We are still missing a CCG or minimalist grammar. But that is a great list.

@Hugo-ter-Doest
Collaborator Author

Thanks! The plan is to integrate this into the natural module.

Regards,
Hugo

@Hugo-ter-Doest
Collaborator Author

I published the chart parsers on npm:
npm install chart-parsers

Regards,
Hugo

@Hugo-ter-Doest Hugo-ter-Doest self-assigned this Apr 5, 2018