Markdown-inspired code to turn office documents into clean HTML, using simple cues like indentation and headings
What do we have here?
tools/w2html5
is JavaScript code that cleans up HTML documents produced by Word and the OpenOffice family and turns them into clean HTML5. It can run in a browser, but it is designed to run unattended ("lights-off" mode) via PhantomJS.
tools/commandline
contains a Python script that automates turning .doc(x)? files (that is, .doc or .docx) into HTML.
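As a rough illustration of the `.doc(x)?` pattern mentioned above, the sketch below shows one way a script could decide which files to convert and what to call the output. The function names and the "same name with an .html extension" rule are assumptions made for this example; they are not taken from the script in tools/commandline.

```python
import re

# The ".doc(x)?" pattern: a .doc or .docx extension at the end of the name.
# Case-insensitive matching is an assumption, not documented behaviour.
WORD_DOC = re.compile(r"\.docx?$", re.IGNORECASE)


def is_word_document(filename: str) -> bool:
    """Return True if the filename ends in .doc or .docx."""
    return WORD_DOC.search(filename) is not None


def html_target(filename: str) -> str:
    """Hypothetical naming rule: replace the Word extension with .html."""
    return WORD_DOC.sub(".html", filename)


if __name__ == "__main__":
    for name in ["report.doc", "thesis.docx", "notes.txt"]:
        if is_word_document(name):
            print(name, "->", html_target(name))
        else:
            print(name, "(skipped)")
```

The pattern is anchored at the end of the name, so files such as `notes.docm` or `slides.odt` are left alone.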