Named entity recognition #107

Open
robdefeo opened this issue Nov 26, 2013 · 14 comments

Comments

@robdefeo

Do you have any plans for named entity recognition? From what I have seen, it would require a sequential classifier. It would be useful to be able to train it on your own data set (a JSON document) of POS tags and other key attributes.

@sabatinim

This is very interesting! Is there a plan?

@kkoch986
Member

We don't currently have a plan. If anyone wants to tackle this, it would be great though!

@mbc1990
Contributor

mbc1990 commented Dec 15, 2014

Does anyone have more thoughts on which algorithm might be best to implement for this?

@liwenzhu
Contributor

@mbc1990 I think a CRF is the best model for NER. The pipeline is tokenize -> POS tag -> NER. The challenge is that you need to find NER training data, which is hard work.
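To make the tokenize -> POS tag -> NER pipeline a bit more concrete, here is a rough sketch of the kind of per-token features a sequence model such as a CRF usually consumes. It is only an illustration: `TaggedToken` and `tokenFeatures` are made-up names, not part of natural, and the feature set is an assumption rather than a recommendation from anyone in this thread.

```typescript
// Hypothetical types and names: nothing here comes from natural's API.
interface TaggedToken {
  token: string;
  pos: string; // POS tag produced by the previous pipeline stage
}

// Build feature strings for the token at position i. A sequence labeller
// (for example a CRF) would be trained on these features plus gold NER labels.
function tokenFeatures(sentence: TaggedToken[], i: number): string[] {
  const { token, pos } = sentence[i];
  const features: string[] = [
    `word=${token.toLowerCase()}`,
    `pos=${pos}`,
    `isCapitalized=${/^[A-Z]/.test(token)}`,
    `hasDigit=${/\d/.test(token)}`,
    `suffix3=${token.slice(-3).toLowerCase()}`,
  ];

  // Context features: previous token, or a beginning-of-sentence marker.
  if (i === 0) {
    features.push("BOS");
  } else {
    const prev = sentence[i - 1];
    features.push(`prevWord=${prev.token.toLowerCase()}`, `prevPos=${prev.pos}`);
  }

  // Context features: next token, or an end-of-sentence marker.
  if (i === sentence.length - 1) {
    features.push("EOS");
  } else {
    const next = sentence[i + 1];
    features.push(`nextWord=${next.token.toLowerCase()}`, `nextPos=${next.pos}`);
  }

  return features;
}
```

The training step itself is left out on purpose; the labelled data would still have to come from somewhere, which is the hard part mentioned above.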

@hbakhtiyor

Any news on this feature?

@gagan-bansal

+1

@gagan-bansal

A detailed approach to NER extraction is given in the NLTK documentation.

@diegodorgam

Hi there everyone, I was just studying this subject and found some really interesting stuff about NER that I want to share:

There are several ways of implementing this feature. CharWNN seems to be the one with the best results, but not by far. The others seem to need a specific training corpus. To me it looks pretty similar to the PoS tagger.
I'm still not able to reproduce the algorithm detailed in those papers, and I haven't found anything in JavaScript, only a few examples in Python.
I hope this helps inspire a talented developer here =)

@Hugo-ter-Doest
Collaborator

Hugo-ter-Doest commented Apr 28, 2018

I'm working on named entity recognition for natural. I'm working on three ways of recognition:

  • regular expressions: can be used for times, dates, URIs, currency, etc.
  • vocabulary of named entities
  • a (trained) model that classifies named entities

It will be possible to combine these approaches into a hybrid approach.
The methods return a list of edges of the form (recognised string, start index, end index, category, score). The score only makes sense for the trained model. I'm thinking of using a maximum entropy model. Is that a viable route? Any ideas on useful feature functions?
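For illustration, the first two recognisers and the edge format could look roughly like the sketch below. This is not the code from the NER work described here; `Edge`, `matchRegex` and `matchVocabulary` are hypothetical names, the exclusive end index is an assumption, and the vocabulary matcher is a naive case-insensitive substring search that ignores word boundaries.

```typescript
// Edge shape follows the tuple described above; field names are assumptions.
interface Edge {
  text: string;     // recognised string
  start: number;    // index of the first character
  end: number;      // index one past the last character (assumed exclusive)
  category: string;
  score: number;    // fixed at 1 for rule-based matches
}

// Find every match of a pattern and report it as an edge.
function matchRegex(input: string, pattern: RegExp, category: string): Edge[] {
  // Rebuild the pattern with the global flag so exec() walks through the input.
  const re = new RegExp(pattern.source, pattern.ignoreCase ? "gi" : "g");
  const edges: Edge[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(input)) !== null) {
    if (m[0].length === 0) { re.lastIndex++; continue; } // skip empty matches
    edges.push({ text: m[0], start: m.index, end: m.index + m[0].length, category, score: 1 });
  }
  return edges;
}

// Look up every vocabulary entry in the input (case-insensitive).
// Simplification: assumes lowercasing does not change string length.
function matchVocabulary(input: string, vocabulary: { [category: string]: string[] }): Edge[] {
  const edges: Edge[] = [];
  const haystack = input.toLowerCase();
  for (const category of Object.keys(vocabulary)) {
    for (const entry of vocabulary[category]) {
      const needle = entry.toLowerCase();
      if (needle.length === 0) continue;
      let from = haystack.indexOf(needle);
      while (from !== -1) {
        const end = from + needle.length;
        edges.push({ text: input.slice(from, end), start: from, end, category, score: 1 });
        from = haystack.indexOf(needle, from + 1);
      }
    }
  }
  return edges;
}
```

Because both functions return the same edge shape, their results can simply be concatenated before a later stage decides which edges to keep.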

Hugo

@diegodorgam

How is this going, @Hugo-ter-Doest? Have you made any progress on this?

@Hugo-ter-Doest
Collaborator

Yes, I did some work on this:
https://github.com/Hugo-ter-Doest/natural/tree/NER

I am trying to build a hybrid approach: first find the entities that are easy to define and match with regular expressions and lexicons, then apply a statistical model for more advanced detection.
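One possible way to wire up such a two-stage approach is sketched below. It is not taken from the linked branch: `Recognizer`, `recognizeHybrid` and the `minScore` threshold are hypothetical, and the policy of letting rule-based matches win over overlapping model predictions is just one assumption about how conflicts might be resolved.

```typescript
// Same edge tuple as described earlier in the thread; field names are assumptions.
interface Edge {
  text: string;
  start: number;
  end: number;      // assumed exclusive
  category: string;
  score: number;
}

// Any recogniser (regex, lexicon, or trained model) maps input text to edges.
type Recognizer = (input: string) => Edge[];

// Two edges overlap when their half-open character ranges intersect.
function overlaps(a: Edge, b: Edge): boolean {
  return a.start < b.end && b.start < a.end;
}

// Stage 1: accept rule-based edges in the order the recognisers are given.
// Stage 2: accept model edges that are confident enough and do not collide
// with anything already accepted.
function recognizeHybrid(
  input: string,
  ruleBased: Recognizer[],
  statistical: Recognizer,
  minScore: number = 0.5 // hypothetical threshold
): Edge[] {
  const accepted: Edge[] = [];
  for (const recognize of ruleBased) {
    for (const edge of recognize(input)) {
      if (!accepted.some((a) => overlaps(a, edge))) accepted.push(edge);
    }
  }
  for (const edge of statistical(input)) {
    if (edge.score >= minScore && !accepted.some((a) => overlaps(a, edge))) {
      accepted.push(edge);
    }
  }
  // Return edges in reading order.
  return accepted.sort((a, b) => a.start - b.start);
}
```

Giving the rules priority keeps the cheap, high-precision matches stable and leaves the model to cover only what the rules miss; other policies (for example, preferring the longest edge) would fit the same structure.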

Hugo

@jseijas

jseijas commented Jul 31, 2018

Hi! Perhaps I can help with that. I did a NER, but only for "enumerateds", with similar search, and my next step was to add regular expression entities (I see that you already have them! Great job!).

@GeorgeNance

Are there any plans to incorporate this?

@dorgan

dorgan commented Sep 1, 2020

Any update on this?
