Typically, with blobs this large, it is best to pre-process them. This also has the advantage of staging the extracted content, in case you ever need to geo-replicate it or quickly restore from a backup. For example, Azure Functions has Blob triggers that fire code when a blob is added or updated. In that function you could leverage Apache Tika to extract the text from the files and write it to a separate blob container, then have Cognitive Search index the extracted text from there. Please note, extracting this much text from files this large can be quite compute- and memory-intensive, so your pre-processing tier might actually need higher compute / memory than the defaults.
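To make the pipeline concrete, here is a rough sketch in Python of that pre-processing step, using the `tika` and `azure-storage-blob` packages. The container names (`raw-docs`, `extracted-text`), the connection string, and the naming helper are all assumptions for illustration, not anything prescribed by Cognitive Search; in practice you would wire `extract_and_stage` up to a Blob trigger rather than call it directly.

```python
def extracted_blob_name(source_name: str) -> str:
    # Hypothetical naming convention: stage the plain text under the
    # original name plus a .txt suffix, e.g. report.pdf -> report.pdf.txt
    return source_name + ".txt"


def extract_and_stage(source_name: str, connection_string: str) -> None:
    """Download a large source blob, extract its text with Apache Tika,
    and stage the result in a separate container for Cognitive Search."""
    # Third-party imports kept local so the helper above stays dependency-free.
    from tika import parser  # pip install tika (requires a Java runtime)
    from azure.storage.blob import BlobServiceClient  # pip install azure-storage-blob

    service = BlobServiceClient.from_connection_string(connection_string)

    # Download the source blob. Note this reads the whole blob into memory,
    # which is exactly why very large files may need a higher-memory tier.
    raw = service.get_blob_client("raw-docs", source_name)
    data = raw.download_blob().readall()

    # Extract plain text with Apache Tika.
    text = parser.from_buffer(data).get("content") or ""

    # Write the extracted text to the staging container that
    # Cognitive Search's indexer is pointed at.
    out = service.get_blob_client("extracted-text", extracted_blob_name(source_name))
    out.upload_blob(text, overwrite=True)
```

Keeping the extracted text in its own container also means the expensive Tika pass runs once per document, while the search indexer only ever sees small plain-text blobs.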
The code is a little older now, but hopefully this example of using TikaDotNet in an Azure Function also helps: https://github.com/liamca/AzureSearch-AzureFunctions-CognitiveServices/blob/master/ApacheTika/run.csx
Please note, I have never tried that code on a file this large.