findSimilarLines

Find and extract similar lines in text files. The purpose is to find similar sentences in the Common Voice dataset, such as an incorrect sentence and its corrected version.
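To illustrate what "similar lines" can mean in practice, here is a minimal sketch in Python. It is not this tool's implementation: the README does not say which similarity measure or language the tool uses, so the function name `find_similar_lines`, the `threshold` parameter, and the use of `difflib.SequenceMatcher` from the standard library are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_similar_lines(lines, threshold=0.8):
    """Return (i, j, ratio) for each pair of lines whose similarity is >= threshold.

    Hypothetical helper: the similarity measure (difflib's ratio) and the
    default threshold are assumptions, not taken from the original tool.
    """
    # Keep the original line indices, but skip blank lines.
    indexed = [(i, line.strip()) for i, line in enumerate(lines) if line.strip()]

    pairs = []
    # Compare every pair of lines once (i < j).
    for (i, a), (j, b) in combinations(indexed, 2):
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            pairs.append((i, j, ratio))
    return pairs


if __name__ == "__main__":
    sample = [
        "The cat sat on the mat.",
        "I like green apples.",
        "The cat sat on teh mat.",  # typo variant of the first sentence
    ]
    for i, j, ratio in find_similar_lines(sample):
        print(f"{i}: {sample[i]!r} ~ {j}: {sample[j]!r} ({ratio:.2f})")
```

This sketch compares every pair of lines, so the work grows quadratically with the number of lines. A real implementation may use a different measure or a faster strategy.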

Make sure to turn on compiler optimizations, especially if you are dealing with text files of more than 1000 lines.