
Convert any video and its SRT subtitle file into a dataset for training an ASR model. Each subtitled sentence is sliced out as an audio sample, and its transcript is recorded in a CSV file.


sine2pi/mp4_to_dataset_converter


I created this out of a need for audio samples of specific lengths with corresponding transcripts for NLP datasets. There are programs that kind of do this, but not exactly to Hugging Face specifics.
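The slicing idea can be sketched roughly as follows. This is not the repository's actual code, only a minimal illustration under stated assumptions: the helper names (`parse_srt`, `ffmpeg_slice_command`, `build_manifest`, `manifest_to_csv`), the `clips/` output folder, the 16 kHz mono WAV output, and the `file_name` / `transcription` CSV columns are all hypothetical choices made for the example. The sketch parses SRT cues with the standard library, builds one `ffmpeg` command per cue (without running it), and writes the matching CSV rows.

```python
import csv
import io
import re

# SRT cue timing lines look like: 00:00:01,000 --> 00:00:04,200
TIME_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(h, m, s, ms):
    """Convert the four timestamp fields to seconds as a float."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(text):
    """Yield (start_seconds, end_seconds, transcript) for each non-empty cue."""
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIME_RE.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the cue text.
                transcript = " ".join(l.strip() for l in lines[i + 1:]).strip()
                if transcript:
                    yield to_seconds(*g[:4]), to_seconds(*g[4:]), transcript
                break


def ffmpeg_slice_command(video_path, start, end, out_path):
    """Build (but do not run) an ffmpeg command that cuts one audio clip.

    Assumption: 16 kHz mono output, a common choice for ASR training data.
    """
    return [
        "ffmpeg", "-y", "-i", video_path,
        "-ss", f"{start:.3f}", "-to", f"{end:.3f}",
        "-vn", "-ac", "1", "-ar", "16000",
        out_path,
    ]


def build_manifest(srt_text, video_path, clip_dir="clips"):
    """Return CSV rows and the ffmpeg commands that would create each clip."""
    rows, commands = [], []
    for index, (start, end, transcript) in enumerate(parse_srt(srt_text)):
        file_name = f"{clip_dir}/sample_{index:05d}.wav"
        # Hypothetical column names; adjust to whatever your loader expects.
        rows.append({"file_name": file_name, "transcription": transcript})
        commands.append(ffmpeg_slice_command(video_path, start, end, file_name))
    return rows, commands


def manifest_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["file_name", "transcription"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

To actually cut the clips, each command list could be passed to `subprocess.run(command, check=True)` with `ffmpeg` installed and the output folder created beforehand; the CSV text can then be written next to the clips.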
