Comments:
How can you have it process multilingual audio?
Very cool. Can we use OpenAI Whisper for IVR telephony? It needs to address clients in multiple languages like Hindi, Telugu, Malayalam, Tamil, and English and respond accordingly.
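For multilingual input, Whisper can detect the spoken language itself before transcribing, so one model covers Hindi, Telugu, Malayalam, Tamil, and English. A minimal sketch, not from the video ("caller.wav" is just a placeholder filename):

```python
import whisper

model = whisper.load_model("medium")

# The low-level API works on a 30-second window of audio
audio = whisper.load_audio("caller.wav")
audio = whisper.pad_or_trim(audio)
mel = whisper.log_mel_spectrogram(audio).to(model.device)

# Whisper predicts the spoken language itself
_, probs = model.detect_language(mel)
language = max(probs, key=probs.get)
print("Detected language:", language)

# Transcribe the full file in the detected language (omit language= to auto-detect)
result = model.transcribe("caller.wav", language=language)
print(result["text"])
```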
It is so confusing.
Do you have a tutorial for translation? Can you please help me out? Just like the transcribe function, I want a translate function that I can integrate with Gradio, but I can't get translation to work. Can you please share the code for translate?
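Not the exact code from the video, but a minimal sketch of a translate function wired into Gradio; Whisper's translate task always outputs English:

```python
import gradio as gr
import whisper

model = whisper.load_model("base")

def translate(audio_path):
    # task="translate" makes Whisper translate the speech into English
    result = model.transcribe(audio_path, task="translate")
    return result["text"]

# type="filepath" passes the uploaded/recorded audio to the function as a file path
demo = gr.Interface(fn=translate, inputs=gr.Audio(type="filepath"), outputs="text")
demo.launch()
```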
I would recommend Streamlit for building the front-end interface.
Hello, the video is really very helpful for me. I am trying to build ASR for Sanskrit, but it is not working for that language. Could you explain how to train on Sanskrit data, or point me to any videos that will help me build a Sanskrit ASR system? I have parallel Sanskrit data.
To run the OpenAI Whisper large model, how does the RTX 4090 compare to this setup on AWS: an NVIDIA A10G Tensor Core GPU (g5.xlarge) with 16GB RAM? Can I expect faster or slower transcription with the 4090?
Can the RTX 4090 run the OpenAI Whisper large model well on a 12th-gen i9 rig with a 1TB NVMe SSD and 64GB DDR5 RAM?
Hi, thank you for your tutorial. I've tried your web UI and hope it can help me transcribe some discussions related to my job. Unfortunately, every time I tried to use it, it failed. Is it because of the size of the recorded audio (more than 1 hour)? Please help. Thank you.
Great video, thanks for sharing.
Non-coder here, but I see great applications for this in terms of improving productivity. I was wondering:
1. How straightforward would it be for a non-coder to install it on Windows?
2. I see it cannot currently differentiate between two different speakers; is that something in the pipeline?
Use case: I have been looking for a tool that will take my recorded meeting conversations and transcribe them with proper formatting, differentiating between the participants. I wonder if it's possible to achieve this with Whisper or another tool?
Thanks
Hello, thank you so much for your tutorial. I am trying to use Whisper for my master's thesis in translation technologies. The only issue I had was that after importing Gradio and recording a short audio clip live for Whisper to transcribe, it doesn't work; it just keeps loading forever, even though it's only a 6-second clip. What do you suggest I do? Thank you again from Spain!
Best content! Thanks.
Can we calculate a confidence interval for each transcribed word?
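Whisper doesn't give a statistical confidence interval, but newer openai-whisper releases can return a per-word probability when word_timestamps=True. A rough sketch ("speech.mp3" is a placeholder):

```python
import whisper

model = whisper.load_model("base")
result = model.transcribe("speech.mp3", word_timestamps=True)

for segment in result["segments"]:
    for word in segment["words"]:
        # "probability" is the model's confidence in that word, not a true confidence interval
        print(f'{word["word"]}\t{word["probability"]:.2f}')
```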
Hello sir, I have seen all your videos.
I am a big follower of yours!
Please tell me how I can convert a long Bangla-language MP3 to Bengali text?
Please make a video about this topic, sir.
Can you use this in Unreal Engine?
Love from Pakistan :)
May I ask, once the web demo with a basic UI is done using Gradio, how can we migrate it to a proper standalone web app? Can you please guide a little?
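One common route is to keep the same transcribe function and expose it from a small backend that any front-end can call over HTTP. A rough sketch with FastAPI (the endpoint name and file handling here are assumptions, not from the video):

```python
import tempfile

import whisper
from fastapi import FastAPI, File, UploadFile

app = FastAPI()
model = whisper.load_model("base")

@app.post("/transcribe")
async def transcribe(file: UploadFile = File(...)):
    # Write the uploaded audio to a temporary file so Whisper (via ffmpeg) can read it
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        tmp.write(await file.read())
        path = tmp.name
    result = model.transcribe(path)
    return {"text": result["text"]}

# Run with: uvicorn main:app --reload   (assuming this file is main.py)
```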
Thank you for this. Subbed!
Bro, really amazing content, hats off to you.
Hi! Do you know if it's possible to do it in Node.js? How can you use Whisper in a web app?
Can it do real-time transcription instead of processing an audio file?
Hey @1littlecoder, can we train this model on our own dataset?
Hey guys, please can anyone help me with this issue? I am trying to run Whisper on my machine and I am getting this warning in cmd:
UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
I use Windows 10 with an RTX 2060 GPU. It also seems it runs on my CPU instead of the NVIDIA GPU. For more detail, I created a Python virtual environment and pip-installed Whisper in that virtual environment.
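That warning just means Whisper is running on the CPU, usually because the installed PyTorch build has no CUDA support. A quick check, plus loading the model on the GPU explicitly ("audio.mp3" is a placeholder):

```python
import torch
import whisper

# If this prints False, reinstall PyTorch with a CUDA build
# (pytorch.org lists the exact pip command for your CUDA version)
print(torch.cuda.is_available())

device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("base", device=device)
result = model.transcribe("audio.mp3")
print(result["text"])
```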
Indian accent vs. British accent: does it show any difference, or does it only report the language as English?
Love the channel, you should have many more subs! ❤
Thanks to OpenAI and thanks to you, thala. I thought that maybe, if that 5 sec limitation weren't there, we could use this for our podcast transcription. Thanks for making this video.
Thanks for sharing.
Great work!
Thank you!
I have two questions:
1) When I try to run the notebook, it says "no such file" after I upload my audio file. How should I make sure it can access the audio file?
2) When I do model.device, it returns cpu instead of cuda. How do I change this?
Kudos to you if you prepared the Colab files!
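Both issues usually come down to the Colab runtime: the uploaded file has to be referenced by its full path, and model.device only shows cuda if the notebook is on a GPU runtime (Runtime -> Change runtime type -> GPU). A rough sketch (the filename is a placeholder):

```python
import torch
import whisper

# Files uploaded through the Colab sidebar land in /content
audio_path = "/content/my_audio.mp3"

# Load the model on the GPU if the runtime has one, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("base", device=device)
print(model.device)

result = model.transcribe(audio_path)
print(result["text"])
```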
This is a great demo, thank you!
I am new to programming. Can our local machines handle this, or should we do it in Google Colab?
Thank you! Are you able to explain how to do simple voice activity detection with this model?
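Whisper isn't a dedicated voice activity detector, but each transcribed segment carries a no_speech_prob that can be thresholded as a crude VAD. A sketch (the 0.6 threshold and the filename are arbitrary choices):

```python
import whisper

model = whisper.load_model("base")
result = model.transcribe("meeting.wav")

# Keep only segments the model is fairly sure contain speech
speech_segments = [seg for seg in result["segments"] if seg["no_speech_prob"] < 0.6]

for seg in speech_segments:
    print(f'{seg["start"]:.1f}-{seg["end"]:.1f}s: {seg["text"]}')
```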
Can I integrate Whisper in my Android application? Are there any API keys for this?
Hello, how do you import audio files? I'm stuck on the 4th step.
How do I download the models and weights without using Colab, e.g. in a local conda env? I can't see how to do this on the GitHub page.
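There's no separate download step: whisper.load_model fetches the weights on first use (to ~/.cache/whisper by default). A rough local-setup sketch (the environment name and paths are just examples):

```python
# Setup (run once in a terminal):
#   conda create -n whisper python=3.10
#   conda activate whisper
#   pip install -U openai-whisper   # ffmpeg must also be installed on the system
import whisper

# download_root controls where the weights are stored locally
model = whisper.load_model("base", download_root="./whisper_models")
result = model.transcribe("audio.mp3")  # placeholder filename
print(result["text"])
```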
Superb
Hi, and thank you! I find your content so inspiring! Definitely trying this app.
A comparison with speech recognition on the Google Pixel 6 Pro would be interesting.
Thank you for the tutorial.
When I tried to step through your Gradio app, I got errors when trying to import your audio clips.
When I disconnected and copied your code to my own Google Drive, I was able to at least record audio with my own microphone and see Whisper transcribe up to 30 seconds.
Golden Content! Just started working on a project and this is a very helpful resource to implement. Thank you!
If it could distinguish and tag/timestamp multiple speakers in a recording (e.g. of a meeting) then that would be awesome.