bobanj/article-categorizer
Guesses the category of a given article, based on previously parsed articles.
The decision is derived from articles parsed from www.a1.com.mk

Installation:

Get the project
  git clone git://github.com/bobanj/article_categorizer_mk.git

Create your database.yml in the config dir.

Install the libs needed for Nokogiri
  Linux:
    install these libs: libxslt1-dev libxml2-dev
    install the gem with: gem install nokogiri
  Windows:
    install with: gem install nokogiri -s http://tenderlovemaking.com/

Install the other gems by running:
  rake gems:install

Prepare the database:
  rake db:create
  rake db:migrate
  rake db:seed

Get the articles:
First make sure that http://www.a1.com.mk is alive :)
You can run a single rake process to get all the articles, if you don't mind
waiting for one rake task to parse all of them:
  rake media:get_articles
  # (better to use the start and count params, because the training is done
  # after getting the articles - due to performance)
Or you can run multiple rake tasks to get all the data faster by setting the
start and count parameters:
  rake media:get_articles start=50 count=10000
where start is the newsid from a1 and count is the number of articles that
will be parsed, so you can run multiple rake tasks to get more articles at
the same time:
  rake media:get_articles start=10000 count=9999
  rake media:get_articles start=20000 count=9999
  rake media:get_articles start=30000 count=9999
  (you get the point)
Don't worry if there is an overlap: one article can't be inserted multiple times.

After getting all the articles you need to start the training:

Training
  rake media:train_all # Please be patient

After the training is finished you can set up a cron task to get new articles
every day and train with them:
  rake media:get_articles train=true

Create one account manually # I'll change this in future commits
  User.create :email => your_email, :password => your_pass, :password_confirmation => your_pass

Start the server

#########
# TO DO #
#########
- Manage Users ?
- Write learning in db ?
- New layout
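
The parallel fetch described above (several rake tasks, each with its own start and count) can be planned with a small script. This is only a sketch: batch_commands is a hypothetical helper that is not part of the project, and the newsid range in the example is an assumed value, not a known property of a1.com.mk. It assumes that count is the number of consecutive newsids a task will parse, as the README states.

```ruby
# Hypothetical helper (not part of the project): split a newsid range into
# non-overlapping batches and build one rake command line per batch.
def batch_commands(first_id, last_id, batch_size)
  raise ArgumentError, "batch_size must be positive" unless batch_size.positive?

  commands = []
  start = first_id
  while start <= last_id
    # The last batch may be shorter than batch_size.
    count = [batch_size, last_id - start + 1].min
    commands << "rake media:get_articles start=#{start} count=#{count}"
    start += count
  end
  commands
end

# Assumed example range; adjust it to the newsids you actually want.
puts batch_commands(10_000, 39_999, 10_000)
```

Each printed line can then be started in its own terminal. Overlapping ranges would also be safe, since an article can't be inserted more than once.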
About
Article Categorizer Using Bayes
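
For readers unfamiliar with Bayes-based categorization, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing. It is an illustration only: the class TinyBayes and its methods are hypothetical names, and nothing here is taken from the project's own training code, which may work differently.

```ruby
# Illustrative naive Bayes sketch; names are hypothetical, not the project's API.
class TinyBayes
  def initialize
    @word_counts = Hash.new { |h, k| h[k] = Hash.new(0) } # category => word => count
    @doc_counts  = Hash.new(0)                            # category => trained articles
    @vocabulary  = {}                                     # every word seen in training
  end

  # Record the words of one article under the given category.
  def train(category, text)
    @doc_counts[category] += 1
    tokenize(text).each do |word|
      @word_counts[category][word] += 1
      @vocabulary[word] = true
    end
  end

  # Return the category with the highest log-probability, or nil if untrained.
  def classify(text)
    return nil if @doc_counts.empty?

    total_docs = @doc_counts.values.sum.to_f
    words = tokenize(text)
    scores = @doc_counts.keys.map do |category|
      total_words = @word_counts[category].values.sum
      score = Math.log(@doc_counts[category] / total_docs) # prior
      words.each do |word|
        # Add-one smoothing keeps unseen words from zeroing the probability.
        likelihood = (@word_counts[category][word] + 1.0) / (total_words + @vocabulary.size)
        score += Math.log(likelihood)
      end
      [category, score]
    end
    scores.max_by { |_, score| score }.first
  end

  private

  # Lowercase the text and keep runs of letters (works for Cyrillic too).
  def tokenize(text)
    text.downcase.scan(/[[:alpha:]]+/)
  end
end
```

Usage would look like calling train("sport", article_text) for each parsed article and then classify(new_article_text) to get the guessed category. Log-probabilities are summed instead of multiplying raw probabilities to avoid numeric underflow on long articles.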