Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


I'm making a Windows desktop application that needs to transcribe videos, and I'm looking for a good free API to help me do that. I have searched a lot, but most of the APIs I've found have poor accuracy.


1 Answer

Google's Speech-to-Text API has state-of-the-art accuracy, a simple interface, and client libraries in many languages. The free tier gives you 60 minutes of audio per month.

Link: https://cloud.google.com/speech-to-text/
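Since you are starting from video, here is a rough sketch of how the pieces might fit together in Python. It is only an illustration under some assumptions that are not in your question: ffmpeg is installed and on the PATH, the `google-cloud-speech` package is installed, and credentials are configured through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. The function names and file names are made up for the example. The synchronous `recognize` call is meant for short clips (roughly a minute of audio); longer files need the long-running variant of the API.

```python
from pathlib import Path


def ffmpeg_extract_cmd(video_path: str, wav_path: str) -> list[str]:
    """Build an ffmpeg command that pulls the audio track out of a video
    as mono, 16 kHz, 16-bit PCM WAV (a format the API accepts as LINEAR16)."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", video_path,        # input video
        "-vn",                   # drop the video stream
        "-ac", "1",              # one audio channel (mono)
        "-ar", "16000",          # 16 kHz sample rate
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        wav_path,
    ]


def transcribe_short_clip(wav_path: str, language_code: str = "en-US") -> str:
    """Send a short WAV clip to Speech-to-Text and return the transcript."""
    # Imported here so the rest of the module works without the package.
    from google.cloud import speech

    client = speech.SpeechClient()  # picks up credentials from the environment
    audio = speech.RecognitionAudio(content=Path(wav_path).read_bytes())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,  # must match the -ar value used above
        language_code=language_code,
    )
    response = client.recognize(config=config, audio=audio)
    # Each result covers one consecutive portion of the audio; take the
    # top alternative of each and join them.
    return " ".join(
        r.alternatives[0].transcript for r in response.results if r.alternatives
    )


# Example usage (hypothetical file names):
#   import subprocess
#   subprocess.run(ffmpeg_extract_cmd("input.mp4", "audio.wav"), check=True)
#   print(transcribe_short_clip("audio.wav"))
```

Keeping the ffmpeg command in its own function makes it easy to check the arguments without running anything, and the sample rate in the config has to agree with the one used during extraction.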

If you want an online API that is totally free, you most likely will not find one.

If you are willing to go offline, you will probably have to come up with a custom solution using the weights of some openly available deep learning model. Read some papers on state-of-the-art transcription models and see if any of the weights are available on GitHub. Keep in mind that performing such a task offline is very computationally expensive, and might require a GPU to give you results in a reasonable amount of time.
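As one possible shape for the offline route, here is a minimal sketch using the Hugging Face `transformers` speech recognition pipeline. The choice of library and the model name `openai/whisper-small` are my assumptions for illustration, not a specific recommendation; any openly available model that the pipeline supports could be swapped in. It assumes `transformers`, a backend such as PyTorch, and ffmpeg (for decoding audio files) are installed, and the first run downloads the weights.

```python
from typing import Any, Callable, Optional


def load_asr(model_name: str = "openai/whisper-small") -> Callable[[str], Any]:
    """Load a local speech recognition pipeline (model name is an assumption)."""
    # Imported here so the module can be loaded without transformers installed.
    from transformers import pipeline

    # chunk_length_s splits long recordings into 30-second windows.
    # Pass device=0 as well to run on the first GPU if one is available.
    return pipeline(
        "automatic-speech-recognition",
        model=model_name,
        chunk_length_s=30,
    )


def transcribe_offline(
    audio_path: str, asr: Optional[Callable[[str], Any]] = None
) -> str:
    """Transcribe an audio file locally and return the cleaned-up text."""
    if asr is None:
        asr = load_asr()
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"].strip()


# Example usage (hypothetical file name):
#   print(transcribe_offline("audio.wav"))
```

On a CPU this can be slow for long recordings, which is where the GPU mentioned above comes in. The `asr` parameter is there so you can load the model once and reuse it across many files.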

