OpenAI Whisper - MultiLingual AI Speech Recognition Live App Tutorial

1littlecoder

1 year ago

40,891 views

Comments:

layla Bitar
layla Bitar - 31.05.2023 10:15

How can you have it process multilingual audio?

George Mathew
George Mathew - 25.05.2023 02:20

Very cool. Can we use OpenAI Whisper for IVR telephony? It needs to address clients in multiple languages, like Hindi, Telugu, Malayalam, Tamil, and English, and respond accordingly.

Bassam
Bassam - 17.05.2023 17:56

It is so confusing.

Anukiran Ghosh
Anukiran Ghosh - 12.05.2023 15:28

Do you have a tutorial for translation? Can you please help me out? Just like the `def transcribe` function, I want a `translate` function that I can integrate with Gradio, but I can't make the translation work. Can you please share the code for translation?

Rayden X
Rayden X - 19.04.2023 02:35

I would recommend Streamlit for building the front-end interface.

Tapan Ray
Tapan Ray - 12.03.2023 23:34

Hello, the video is really helpful for me. I am trying to build ASR for the Sanskrit language, but it is not working. Could you explain how to train on Sanskrit data, or point me to any videos that will help me build a Sanskrit ASR? I have parallel Sanskrit data.

George Patronus
George Patronus - 11.02.2023 12:51

To run the OpenAI Whisper large model, how does the RTX 4090 compare to this setup on AWS: an NVIDIA A10G Tensor Core GPU (g5.xlarge with 16GB RAM)? Can I expect faster or slower transcription with the 4090?

George Patronus
George Patronus - 11.02.2023 12:10

Can the RTX 4090 run the OpenAI Whisper large model well on a 12th Gen i9 rig with a 1TB NVMe SSD and 64GB of DDR5 RAM?

Dos Hanif
Dos Hanif - 07.02.2023 05:13

Hi, thank you for your tutorial. I've tried your web UI and hope it can help me transcribe some of the discussions related to my job. Unfortunately, every time I tried to use it, it failed. Is it because of the size of the recorded audio (more than one hour)? Please help. Thank you.

David Thompson
David Thompson - 17.01.2023 17:37

Great video, thanks for sharing.
Non-coder here, but I see great application for this in terms of improving productivity. I was wondering:
1. How straightforward would it be for a non-coder to install it on Windows?
2. I see it cannot currently differentiate between two different speakers; is that something in the pipeline?

Use case: I have been looking for a tool that will take my recorded meeting conversations and transcribe them with proper formatting, differentiating between the participants. I wonder if it's possible to achieve this with Whisper or another tool?

Thanks

Anna Acedo Ortega
Anna Acedo Ortega - 12.01.2023 14:24

Hello, thank you so much for your tutorial. I am trying to use Whisper for my master's thesis in translation technologies. The only issue I had was that after importing Gradio and recording a short clip live for Whisper to transcribe, it doesn't work; it just keeps loading forever, even if it's only a 6-second audio clip. What do you suggest I do? Thank you again from Spain!

Tejas Narola
Tejas Narola - 06.01.2023 14:51

Best content! Thanks.
Can we calculate a confidence interval for each transcribed word?

Avijit barua
Avijit barua - 25.12.2022 07:25

Hello sir, I have watched all your videos.
I am a big follower of yours!
Please tell me how I can convert a long Bangla-language MP3 to Bengali text.
Please make a video about this topic.

REAL VIBES TV
REAL VIBES TV - 22.12.2022 11:31

Can you use this in Unreal Engine?

Engg M. Ali Mirza Short Clips, Whatsapp Status
Engg M. Ali Mirza Short Clips, Whatsapp Status - 12.12.2022 12:34

love from Pakistan :)

App Stuff
App Stuff - 10.12.2022 14:55

May I ask: once the web demo with a basic UI is done in Gradio, how can we migrate it to a proper standalone web app? Can you please guide a little?

App Stuff
App Stuff - 10.12.2022 12:32

Thank you for this. Subbed!

Gowtham Dora
Gowtham Dora - 18.11.2022 20:47

Bro, really amazing content, hats off to you.

IdeaAi
IdeaAi - 18.11.2022 20:37

Hi! Do you know if it's possible to do it in Node.js? How can you use Whisper in a web app?

ABHIGNA CONSCIENCE
ABHIGNA CONSCIENCE - 04.11.2022 01:20

Can it do real-time transcription instead of processing an audio file?

Danish a
Danish a - 26.10.2022 22:19

Hey @1littlecoder, can we train this model on our own dataset?

Dimoris Chinyui
Dimoris Chinyui - 30.09.2022 00:12

Hey guys, can anyone please help me with this issue? I am trying to run Whisper on my machine and I am getting this error in cmd: `UserWarning: FP16 is not supported on CPU; using FP32 instead`.
I use Windows 10 with an RTX 2060 GPU, and it seems to run on my CPU instead of the NVIDIA GPU. For more detail: I created a Python virtual environment and pip-installed Whisper into it.

Arun Kumar
Arun Kumar - 28.09.2022 12:41

Indian accent vs. British accent: does it show any difference, or does it just recognize both as the English language?

concretec0w
concretec0w - 27.09.2022 15:43

Love the channel, you should have many more subs! ❤

Shyam Siddarth
Shyam Siddarth - 27.09.2022 12:57

Thanks to OpenAI and thanks to you, boss. If that 5-second limitation weren't there, I was thinking we could use this for our podcast transcription. Thanks for making this video.

Abhilekh Kalita
Abhilekh Kalita - 26.09.2022 09:11

Thanks for sharing.

Ashutosh Kumar
Ashutosh Kumar - 26.09.2022 08:00

Great work !

Homeless in America
Homeless in America - 26.09.2022 01:19

Thank you!

I have two questions:

1) When I try to run the notebook, it says "no such file" after I upload my audio file. How do I make sure it can access the audio file?

2) When I check model.device, it returns cpu instead of cuda. How do I change this?

byGDur
byGDur - 25.09.2022 00:28

Kudos to you if you prepared the Colab files!

Flawed Thoughts
Flawed Thoughts - 23.09.2022 04:27

This is a great demo, thank you!

I am new to programming. Can our local machines handle this, or should we do it in Google Colab?

Alastair van Heerden
Alastair van Heerden - 22.09.2022 23:52

Thank you! Are you able to explain how to do simple voice activity detection with this model?

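Whisper itself does not expose voice activity detection, but a crude energy-based VAD can decide which chunks are worth sending to the model. A self-contained sketch of that idea (in practice a trained detector such as Silero VAD is far more robust):

```python
import numpy as np


def voice_activity(samples: np.ndarray, sr: int = 16000,
                   frame_ms: int = 30, threshold: float = 0.02):
    """Return (start_sec, end_sec) spans whose RMS energy exceeds threshold.

    A crude energy-based VAD over fixed-size frames; `threshold` is relative
    to full-scale float audio in [-1, 1] and needs tuning per recording.
    """
    frame = int(sr * frame_ms / 1000)
    n_frames = len(samples) // frame
    spans, start = [], None
    for i in range(n_frames):
        chunk = samples[i * frame:(i + 1) * frame]
        rms = float(np.sqrt(np.mean(chunk ** 2)))
        if rms > threshold and start is None:
            start = i * frame / sr          # speech begins
        elif rms <= threshold and start is not None:
            spans.append((start, i * frame / sr))  # speech ends
            start = None
    if start is not None:                   # audio ended mid-speech
        spans.append((start, n_frames * frame / sr))
    return spans
```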
Ayush Singhal
Ayush Singhal - 22.09.2022 15:44

Can I integrate Whisper into my Android application? Are there any API keys for this?

CosmicVibing
CosmicVibing - 22.09.2022 13:58

Hello, how do you import audio files? I'm stuck on the 4th step.

Chris Lloyd
Chris Lloyd - 22.09.2022 13:07

How do I download the models and weights without using Colab, e.g. into a local conda env? I can't see any way to do this on the GitHub page.

Sathish Kumar
Sathish Kumar - 22.09.2022 04:39

Superb

fedahumada
fedahumada - 22.09.2022 03:13

Hi, and thank you! I find your content so inspiring! Definitely trying this app.

Azmo
Azmo - 22.09.2022 01:01

A comparison to speech recognition on the Google Pixel 6 Pro would be interesting.

Chrontexto
Chrontexto - 22.09.2022 00:21

Thank you for the tutorial.

When I tried to step through your Gradio app, I got errors when trying to import your audio clips.
When I disconnected and copied your code to my own Google Drive, I was able to at least record audio with my own microphone and see Whisper transcribe up to 30 seconds.

chaithanya vamshi
chaithanya vamshi - 22.09.2022 00:20

Golden Content! Just started working on a project and this is a very helpful resource to implement. Thank you!

musicspinner
musicspinner - 22.09.2022 00:00

If it could distinguish and tag/timestamp multiple speakers in a recording (e.g. of a meeting), that would be awesome.
