Find and extract similar lines in text files. The tool is intended for finding similar sentences in the Common Voice dataset, such as an incorrect sentence and its corrected version.
Make sure to turn on compiler optimizations, especially when processing text files of more than 1000 lines.
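
To illustrate what "similar lines" means here, the following is a minimal Python sketch and not this project's actual implementation. It assumes similarity is measured with the standard library's `difflib.SequenceMatcher`; the function name `find_similar_lines` and the `threshold` value of 0.8 are hypothetical choices made for this example only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_similar_lines(lines, threshold=0.8):
    """Return (i, j, ratio) for each pair of distinct lines at or above threshold.

    Hypothetical helper for illustration; the metric and the threshold
    are assumptions, not taken from this project.
    """
    pairs = []
    # Compare every line with every later line (all unordered pairs).
    for (i, a), (j, b) in combinations(enumerate(lines), 2):
        if a == b:
            continue  # exact duplicates are not "similar but different"
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            pairs.append((i, j, ratio))
    return pairs


if __name__ == "__main__":
    sentences = [
        "I went to the the store yesterday.",  # sentence with a typo
        "I went to the store yesterday.",      # corrected version
        "The weather is nice today.",          # unrelated sentence
    ]
    for i, j, ratio in find_similar_lines(sentences):
        print(f"{i} ~ {j} ({ratio:.2f}): {sentences[i]!r} / {sentences[j]!r}")
```

This sketch compares every pair of lines, so its running time grows quadratically with the number of lines. A naive approach like this becomes slow on large files, which is where an optimized build matters.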