Update DMOZ categories and domains categories

Forked from dmoz2db.

Installation

  • Make sure you have pip and SQLAlchemy 0.6.5 or higher installed.

  • Download structure.rdf.u8 from DMOZ.

  • Download content.rdf.u8 from DMOZ.

  • Create a database named dmoz:
    createdb dmoz

  • Copy src/db.sample.conf to db.conf and update the config.

  • Import the DMOZ dumps into the database:

# Should be run from src folder
python dmoz2db.py --keep-db -s structure.rdf.u8 -c content.rdf.u8

  • Normalize the tables by renaming column and table names:
    psql dmoz < src/normalize.sql

  • Back up the tables for upload to the live db server:
    pg_dump --table headlines_domains_categories --table headlines_categories --data-only dmoz > dmoz_categories.sql
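
The command steps above can be chained in a small driver script. The following is only a sketch, not part of this repository: the `Step` class and the `build_steps` and `run` helpers are hypothetical names, and it assumes the script is started from the repository root with the two RDF files already placed in src. It defaults to a dry run that only prints each command.

```python
import subprocess
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One shell command from the installation list (hypothetical helper)."""
    argv: List[str]
    cwd: Optional[str] = None     # directory to run the command in
    stdin: Optional[str] = None   # file fed to the command, like `< file`
    stdout: Optional[str] = None  # file receiving the output, like `> file`


def build_steps(db: str = "dmoz") -> List[Step]:
    """Return the README commands in order, without running anything."""
    return [
        Step(["createdb", db]),
        # The README says the import must be run from the src folder.
        Step(["python", "dmoz2db.py", "--keep-db",
              "-s", "structure.rdf.u8", "-c", "content.rdf.u8"], cwd="src"),
        Step(["psql", db], stdin="src/normalize.sql"),
        Step(["pg_dump",
              "--table", "headlines_domains_categories",
              "--table", "headlines_categories",
              "--data-only", db], stdout="dmoz_categories.sql"),
    ]


def run(steps: List[Step], dry_run: bool = True) -> None:
    """Print each command, or execute it and stop at the first failure."""
    for step in steps:
        if dry_run:
            print(" ".join(step.argv))
            continue
        # Redirect files are opened here, relative to the repository root.
        src = open(step.stdin, "rb") if step.stdin else None
        dst = open(step.stdout, "wb") if step.stdout else None
        try:
            subprocess.run(step.argv, cwd=step.cwd,
                           stdin=src, stdout=dst, check=True)
        finally:
            for handle in (src, dst):
                if handle is not None:
                    handle.close()


if __name__ == "__main__":
    run(build_steps(), dry_run=True)
```

Calling `run(build_steps(), dry_run=False)` would execute the commands for real. Editing db.conf and uploading dmoz_categories.sql to the live server are left as manual steps, since the README does not say how they are done.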