TreasureTroves.md


Aspect Oriented Programming

Authentication, Single Sign-on and Authorization

BDD + TDD

Chat

Content Management Systems

JSON Schema

Web-based XML Editors

  • CKEditor.XML

  • FontoXML

    649 Euro per concurrent user per year

  • jquery.xmleditor is a browser-based XML editor. It provides a general-purpose graphical tool for creating new XML documents or modifying existing ones in your web browser. It reads an XML schema (XSD file) to tell the user which elements, subelements and attributes are available at each point in the structure, and offers a GUI for adding them to or removing them from the document.

  • <oXygen/> XML Editor Authoring SDK

  • ProseMirror is an in-browser semantic rich text editor component

  • Xeditor

    2500 Euro + 250 Euro per named user

  • SDL Xopus

  • XStandard

    from 1.90 US$ per user for 1000 users (Σ 1899 US$) to 19.90 US$ per user for 10 users (Σ 199 US$)

Data Exchange, Import + Export

AtomPub

Java

Ruby

  • Alumina is a Ruby gem that implements Atom, both the syndication format and the publishing protocol

  • Atom Protocol Exerciser is a tool to test your AtomPub protocol server

  • atompub-server to add AtomPub server capabilities to your Rails application

CMIS

  • ActiveCMIS is a Ruby library aimed at easing the interaction with various CMIS providers. It creates Ruby objects for CMIS objects, and creates Ruby classes that correspond to CMIS types

  • cmis client for JRuby

  • CmisSync: Dropbox-like sync for your enterprise file server (GitHub)

  • yaccl is a Ruby CMIS browser binding client library implementation

WebDAV

  • sabre/dav is an open source CardDAV, CalDAV and WebDAV server written in PHP

  • WsgiDAV is a generic WebDAV server written in Python and based on WSGI

Rails 2:

Rails 3:

Web Services

Desktop Integration

  • Add-in Express are sets of components, visual designers and wizards for developing secure, managed, isolated, deployable and version-neutral Microsoft Office extensions, including COM add-ins, Outlook plug-ins, Office smart tags, Excel RTD servers, Excel XLL and UDF add-ons.

Documentation

  • Rocco is a Ruby port of Docco, the quick-and-dirty, hundred-line-long, literate-programming-style documentation generator

  • GitBook, GitBook Editor

  • YARD is a documentation generation tool for the Ruby programming language

Linguistics + Statistics

Linguistics @ StackExchange

Natural Language Toolkit is a book, too

Natural Language Processing Tools

NLP @ StackOverflow

Stanford NLP Annotated List of Resources

Algorithms

  • Plagiarism detection algorithms

    • Greedy String Tiling (GST), Running Karp-Rabin Greedy String Tiling (RKR-GST)

    • n-gram overlap

    • SPEX

    • Winnowing
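
As a rough illustration of the winnowing entry above, the sketch below fingerprints two texts and compares the fingerprint sets. It is a minimal sketch under stated assumptions, not a reference implementation: the function names (`fnv1a`, `normalize`, `winnow`, `fingerprintSimilarity`), the FNV-1a hash, the ASCII-only normalization, the defaults `k = 5` and `w = 4`, and the Jaccard comparison are all choices made for this example.

```javascript
// 32-bit FNV-1a hash of a string (an illustrative hash choice, not part of winnowing itself).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the FNV prime, keep as unsigned 32-bit
  }
  return h;
}

// Assumption: lowercase the text and keep only ASCII letters and digits.
function normalize(text) {
  return text.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Hash every k-gram (substring of length k) of the normalized text.
function kgramHashes(text, k) {
  const s = normalize(text);
  const hashes = [];
  for (let i = 0; i + k <= s.length; i++) {
    hashes.push(fnv1a(s.slice(i, i + k)));
  }
  return hashes;
}

// Winnowing: slide a window of w consecutive k-gram hashes over the text and
// keep the minimum hash of each window (the rightmost one on ties).
// Texts with fewer than w hashes are treated as a single short window.
function winnow(text, k, w) {
  const hashes = kgramHashes(text, k);
  const picked = new Map(); // position -> hash, so a position is recorded only once
  if (hashes.length === 0) return new Set();
  const lastStart = Math.max(hashes.length - w, 0);
  for (let start = 0; start <= lastStart; start++) {
    const end = Math.min(start + w, hashes.length);
    let minPos = start;
    for (let j = start + 1; j < end; j++) {
      if (hashes[j] <= hashes[minPos]) minPos = j;
    }
    picked.set(minPos, hashes[minPos]);
  }
  return new Set(picked.values());
}

// Jaccard similarity of the two fingerprint sets (0 = nothing shared, 1 = identical sets).
function fingerprintSimilarity(a, b, k = 5, w = 4) {
  const fa = winnow(a, k, w);
  const fb = winnow(b, k, w);
  if (fa.size === 0 && fb.size === 0) return 0;
  let shared = 0;
  for (const h of fa) if (fb.has(h)) shared++;
  return shared / (fa.size + fb.size - shared);
}

console.log(fingerprintSimilarity(
  "The quick brown fox jumps over the lazy dog.",
  "A quick brown fox leaped over a sleepy dog."
));
```

With these parameters, any passage the two texts share that is at least `w + k - 1` normalized characters long contributes at least one common fingerprint, which is the property that makes winnowing useful for detecting copied passages.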

Java / JVM

JavaScript

Strip HTML from text (Source)

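// Wrap the input in a detached <div> so that strings not starting with "<" are parsed as HTML rather than as a selector
jQuery("<div>").html(html).text();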

Libraries

  • bjspell is a JavaScript-based spell checker compatible with Hunspell dictionaries

  • Hyphenator.js is a JavaScript library that implements client-side hyphenation of HTML documents

  • Hypher is a JavaScript hyphenation engine (GitHub)

  • jspos is a JavaScript port of Mark Watson's FastTag part-of-speech tagger

  • TextStatistics.js generates information about texts including syllable counts and Flesch-Kincaid, Gunning-Fog, Coleman-Liau, SMOG and Automated Readability scores (English only) (a port of TextStatistics.php)

  • Typo.js is a client-side JavaScript spellchecker
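
To make the readability scores mentioned for TextStatistics.js more concrete, here is a small sketch of the two Flesch formulas. It does not use or mirror the TextStatistics.js API; the function names (`countSyllables`, `textCounts`, `fleschReadingEase`, `fleschKincaidGrade`) are hypothetical, and the syllable counter is a deliberately crude heuristic for English (count vowel groups, ignore a trailing "e"), so the scores will differ from those of a real library.

```javascript
// Crude English syllable estimate: count vowel groups, ignoring a trailing "e".
// This is an assumption for the sketch; real libraries use more elaborate rules.
function countSyllables(word) {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length === 0) return 0;
  const groups = w.replace(/e$/, "").match(/[aeiouy]+/g);
  return Math.max(groups ? groups.length : 0, 1);
}

// Count sentences, words and syllables in a text.
function textCounts(text) {
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0).length;
  const words = text.split(/\s+/).filter((w) => /[a-zA-Z]/.test(w));
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);
  return { sentences, words: words.length, syllables };
}

// Flesch Reading Ease: higher scores mean easier text.
function fleschReadingEase(text) {
  const c = textCounts(text);
  if (c.sentences === 0 || c.words === 0) return null; // undefined for empty input
  return 206.835 - 1.015 * (c.words / c.sentences) - 84.6 * (c.syllables / c.words);
}

// Flesch-Kincaid Grade Level: approximate US school grade needed to read the text.
function fleschKincaidGrade(text) {
  const c = textCounts(text);
  if (c.sentences === 0 || c.words === 0) return null; // undefined for empty input
  return 0.39 * (c.words / c.sentences) + 11.8 * (c.syllables / c.words) - 15.59;
}

const sample = "The cat sat on the mat.";
console.log(textCounts(sample));
console.log(fleschReadingEase(sample), fleschKincaidGrade(sample));
```

Both scores depend only on average sentence length and average syllables per word, so the quality of the syllable counter dominates the accuracy of the result.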

Python

To integrate a Python library like NLTK with a JVM-based application, we could resort to

See also Communication between Java and Python

Ruby

  • Completeness-Fu

  • Classifier is a general module to allow Bayesian and other types of classifications.

  • CRM114.rb is a Ruby interface to CRM114 (Controllable Regex Mutilator), an advanced and fast text classifier that uses sparse binary polynomial matching with a Bayesian Chain Rule evaluator and a hidden Markov model to categorize data with up to 99.87% accuracy.

  • ffi-hunspell, Ruby FFI bindings for Hunspell

  • Ruby Linguistics Framework

  • treetagger-ruby is a Ruby-based wrapper for the TreeTagger by Helmut Schmid, which is the foundation of LinguLab

Ruby + GNU Scientific Library

  • GNU Scientific Library

  • RubyGSL is a Ruby interface to the GSL (GNU Scientific Library) for numerical computing with Ruby.

  • Ruby/GSL-ng is a new generation Ruby/GSL wrapper that strives for code simplicity while retaining acceptable performance. Other GSL wrappers are either utterly complicated (lots of C code) or poorly documented. Ruby/GSL-ng uses Ruby/FFI and little bits of C code to achieve a simple implementation that integrates neatly with Ruby's standard classes and follows most of its conventions.

Ruby + R

R

R is a free software environment for statistical computing and graphics. Rweb is a web-based interface to R.

Stefan Th. Gries' web page has more links.

Misc

  • Copyfind

  • METER (MEasuring TExt Reuse) Project, including algorithm implementations written in Perl

  • Snowball is a language in which stemming algorithms can be easily represented. The Snowball compiler translates a Snowball script (a .sbl file) into either a thread-safe ANSI C program or a Java program.

Linked Data + Semantic Web

  • Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods

Markups

  • Radius is a powerful tag-based template language for Ruby inspired by the template languages used in MovableType and TextPattern. It uses tags similar to XML, but can be used to generate any form of plain text (HTML, e-mail, etc…).

Markdown

  • Babelmark Markdown Testbed

  • MultiMarkdown

    A derivative of the original Markdown.pl with added features by Fletcher T. Penney.

Ruby

  • kramdown is yet-another-markdown-parser but fast, pure Ruby, using a strict syntax definition and supporting several common extensions

  • Maruku: a Markdown-superset interpreter

    Andrea Censi's Markdown parser with additional features, some borrowed from Markdown Extra.

  • RDiscount

Project Management + Team Coordination

Resource Description Framework (RDF)

Telephony + VoIP

Web Services

VAT number (Mehrwertsteuernummer, MWSt-ID), VAT identification number (Umsatzsteuernummer, USt-ID)

Web Server

Misc