improving Readme file
loribeiro committed Feb 12, 2021
1 parent aff56f3 commit 31ff5d2
Showing 2 changed files with 29 additions and 2 deletions.
31 changes: 29 additions & 2 deletions README.md
@@ -1,6 +1,29 @@
# Singularity-POS-Tagger
Portuguese POS tagger written in core Node.js, without any external modules.

I developed this library to use as a base for another personal project. There is plenty of room to improve accuracy with heuristics and tweaks.

It is designed especially for Node.js streams, which can improve speed and memory use when working on servers or with large corpora. Nonetheless, a built-in method for working with plain strings is also available.

There is no need to pre-process the corpus: a built-in function cleans everything before the POS classification.

Because JavaScript is single-threaded and NLP jobs are usually very resource-intensive, almost everything in this package runs asynchronously.
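
To illustrate why a stream-based design helps with large corpora, here is a minimal sketch of a line-by-line `Transform` stream built only on Node.js core modules. It is not this package's implementation: `createLineTagger` and `tagLine` are hypothetical names, and `tagLine` is a placeholder that marks every word with `?` instead of a real tag.

```javascript
const { Transform } = require('node:stream');
const { StringDecoder } = require('node:string_decoder');

// Hypothetical placeholder: a real tagger would assign a grammatical class here.
function tagLine(line) {
  return line
    .split(/\s+/)
    .filter(Boolean)
    .map((word) => `${word}/?`)
    .join(' ');
}

// Builds a Transform stream that processes text one line at a time,
// so the whole corpus never has to be held in memory at once.
function createLineTagger(tagFn = tagLine) {
  // StringDecoder keeps multi-byte characters (e.g. "ã") intact
  // when they are split across two chunks.
  const decoder = new StringDecoder('utf8');
  let pending = ''; // incomplete last line carried over to the next chunk

  return new Transform({
    transform(chunk, encoding, callback) {
      const lines = (pending + decoder.write(chunk)).split('\n');
      pending = lines.pop();
      for (const line of lines) {
        this.push(tagFn(line) + '\n');
      }
      callback();
    },
    flush(callback) {
      // Emit whatever is left once the input ends.
      pending += decoder.end();
      if (pending) {
        this.push(tagFn(pending) + '\n');
      }
      callback();
    },
  });
}
```

A stream like this could be wired up as `process.stdin.pipe(createLineTagger()).pipe(process.stdout)`, and back-pressure is then handled by `pipe` rather than by the caller.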

# Installation
In a Node.js environment, run the following in your terminal:
>npm i singularity-tagger
# What is a POS Tagger?

A POS tagger, or part-of-speech tagger, is a piece of software that analyzes a corpus and tags each word with its respective grammatical class.
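
As a rough illustration of the idea only, the sketch below tags words by looking them up in a tiny hand-written lexicon. It is not how this package works (the package uses a trained stochastic model); `LEXICON` and `tagSentence` are hypothetical names, and the labels `ART`, `N`, `ADJ` and `V` are used here simply as abbreviations for article, noun, adjective and verb.

```javascript
// Hypothetical toy lexicon: each known word maps to one grammatical class.
const LEXICON = new Map([
  ['o', 'ART'],
  ['gato', 'N'],
  ['preto', 'ADJ'],
  ['dorme', 'V'],
]);

// Returns [word, tag] pairs; words missing from the lexicon get 'UNKNOWN'.
function tagSentence(sentence) {
  return sentence
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .map((word) => [word, LEXICON.get(word) ?? 'UNKNOWN']);
}

console.log(tagSentence('O gato preto dorme'));
// → [ [ 'o', 'ART' ], [ 'gato', 'N' ], [ 'preto', 'ADJ' ], [ 'dorme', 'V' ] ]
```

A plain lookup cannot resolve words that belong to more than one class depending on context, which is the gap a statistical tagger is meant to fill.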

# Applications of POS Taggers
- Sentiment analysis
- Question answering
- Word sense disambiguation

Many Natural Language Processing tasks use a POS tagger as a subtask.

# Implementation
- Model trained on the annotated Mac-Morpho corpus, available at: http://nilc.icmc.usp.br/macmorpho/
- Stochastic algorithm used:
@@ -13,7 +36,7 @@ Portuguese POS-Tagger written in core Node.JS, without any external modules.
*__Singularity__* is designed to be used inside async functions or with ES6 promises.
There are two main methods available:
- __analyzeString__
* receives one string as a parameter and returns an array with the normalized words
alongside their tags
- __analyzeStream__
* receives an input stream and an output stream as parameters and writes the normalized text, alongside its tags, to the output stream
@@ -49,4 +72,8 @@ There are two main methods available:
>
>...
>
> PosTagger().then((tagger)=> tagger.analyzeStream(inputStream, outputStream)).catch(err=>console.log(err))
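
The same promise-based pattern can also be written with `async`/`await`. The sketch below assumes only what the text above states: that `PosTagger()` returns a promise resolving to a tagger object with an `analyzeString` method. The helper name `tagText` is hypothetical, and the factory is passed in as an argument so the sketch makes no assumption about how the package is imported. Whether `analyzeString` returns the array directly or a promise of it, `await` handles both.

```javascript
// `createTagger` stands for the package's PosTagger factory (an assumption:
// it is taken as a parameter rather than imported, since the import line
// is not shown here).
async function tagText(createTagger, text) {
  const tagger = await createTagger();
  // Works whether analyzeString returns an array or a promise of one.
  return await tagger.analyzeString(text);
}

// Stand-in factory for demonstration only; its output shape is invented
// and does not reflect the real tagger's result format.
const fakePosTagger = async () => ({
  analyzeString: async (text) =>
    text.split(/\s+/).filter(Boolean).map((word) => [word, '?']),
});

tagText(fakePosTagger, 'o gato dorme')
  .then((result) => console.log(result))
  .catch((err) => console.error(err));
```

In real use, the stand-in would be replaced by the `PosTagger` function exported by the package.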
# Tags meaning table

![image info](./assets/table.png)
Binary file added assets/table.png