I am trying to use text data for domain adaptation, to improve the results of Google Cloud Speech-to-Text. I have already done this with the Azure and AWS speech-to-text systems. There you simply give the system a large text corpus of domain-specific language, and the results usually improve.
For the Google speech-to-text system I have not found anything like that. What I found is this tutorial: https://cloud.google.com/speech-to-text/docs/speech-adaptation
Sadly, this only allows very specific adaptations (manually adding words that should be recognized better).
I have tried running keyword extraction on my text corpus and putting the extracted words in the speech_contexts[{"phrases": []}] parameter, but this didn't change my results.
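For illustration, a keyword-based attempt of this kind might look roughly like the sketch below. Only `speech_contexts` and `phrases` come from the question; the helper names `top_terms` and `build_config`, the stop-word list, the `language_code` value, and the sample corpus are hypothetical. The sketch only builds a plain dict and does not call the API.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (hypothetical; a real one would be larger).
STOP_WORDS = {"the", "and", "for", "with", "was", "into", "that", "this"}


def top_terms(corpus, n):
    """Return the n most frequent non-stop-word terms in the corpus."""
    words = re.findall(r"[a-z]+", corpus.lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOP_WORDS)
    return [term for term, _ in counts.most_common(n)]


def build_config(phrases, language_code="en-US"):
    """Build a config dict that carries the extracted terms as phrase hints."""
    return {
        "language_code": language_code,  # assumed value, not from the question
        "speech_contexts": [{"phrases": list(phrases)}],
    }


# Made-up sample corpus standing in for the domain-specific text.
corpus = (
    "The stent was placed with the catheter. "
    "The stent and the catheter were checked; the stent held."
)

config = build_config(top_terms(corpus, 2))
print(config)
```

Note that this only passes a short list of hint phrases. It does not retrain the language model on the full corpus, which is the limitation the question is about.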
Is there any way to train the Google speech-to-text service (its language model) with a large text corpus for domain adaptation?
Question from: https://stackoverflow.com/questions/65602081/google-speech-to-text-domain-adaptation