@TurkuNLP

TurkuNLP Group - IT Department - University of Turku

Popular repositories

  1. Turku-neural-parser-pipeline Public

    A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more than 50 languages. Top ranker in the CoNLL-18 Shared Task.

    Python 112 31

  2. FinBERT Public

    BERT model trained from scratch on Finnish

    Shell 96 7

  3. Finnish-dep-parser Public

    The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:

    Python 49 10

  4. wikibert Public

    BERT models for many languages created from Wikipedia texts

    34 1

  5. Text_Mining_Course Public

    Stuff for the Text Mining course

    Jupyter Notebook 28 9

  6. ocr-correction Public

    Post-processing OCR errors with seq2seq models

    Python 28 2

Repositories

Showing 10 of 129 repositories
  • textual-data-analysis-course
    Jupyter Notebook 4 Apache-2.0 0 0 0 Updated Jan 28, 2025
  • Text_Mining_Course Public

    Stuff for the Text Mining course

    Jupyter Notebook 28 9 0 0 Updated Jan 28, 2025
  • forest-in-s24 Public
    0 0 0 0 Updated Jan 28, 2025
  • list-of-publications Public

    Turku NLP list of publications

    TeX 0 2 0 0 Updated Jan 28, 2025
  • LLM_document_descriptors
    Jupyter Notebook 0 0 0 0 Updated Jan 22, 2025
  • htr-table-pipeline Public

    Handwritten text recognition pipeline for table data

    Jupyter Notebook 0 Apache-2.0 0 0 0 Updated Jan 21, 2025
  • finerweb-10bt Public

    Code for FinerWeb-10BT: tools for cleaning web data line by line using LLMs

    Python 0 MIT 1 0 0 Updated Jan 16, 2025
  • ECCO-ocr-large-run Public

    Code for the large LUMI run of ECCO OCR correction

    Python 0 Apache-2.0 0 0 0 Updated Jan 16, 2025
  • DIKI1002-Working-with-Text-in-Python
    5 0 0 0 Updated Jan 14, 2025
  • Keyword-embeddings-clusters Public

    Clusters with keywords grouped based on their word embeddings

    0 0 0 0 Updated Jan 14, 2025
