Comparing 2 Data Curation Methods for Training AI Voice in RVC

9 months ago

8,108 views

Comments:

@Samuel-wl4fw - 27.12.2023 05:20

I think the noise truncation is really useful! It might benefit from having its own video, or from being mentioned in the first data curation video! I am making my own Melina dataset to test with, and the video I use as a reference has a lot of silence after applying UVR to it.
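
As a rough illustration of what silence truncation does, here is a minimal sketch. It assumes mono samples as floats in [-1, 1]; the function name `truncate_silence`, the -40 dB threshold, the frame size, and the maximum silence length are all illustrative assumptions, not the behavior of Audacity or any RVC tool.

```python
import math

def truncate_silence(samples, sample_rate, threshold_db=-40.0,
                     frame_ms=20, max_silence_ms=200):
    """Shorten every silent run to at most max_silence_ms (hypothetical helper)."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    max_silent_frames = max_silence_ms // frame_ms
    threshold = 10 ** (threshold_db / 20)  # dBFS -> linear amplitude
    out, silent_run = [], 0
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms < threshold:
            silent_run += 1
            if silent_run > max_silent_frames:
                continue  # drop silence beyond the allowed length
        else:
            silent_run = 0  # speech resets the silence counter
        out.extend(frame)
    return out
```

Frames below the threshold are kept until the run of silence exceeds the limit, so short natural pauses survive while long gaps are cut down.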


@user-fl3fb1vv1s - 14.12.2023 03:35

That's why the split dataset is better :)


@seeyou2winyou - 10.12.2023 20:13

[W CUDAGuardImpl.h:124] Warning: CUDA warning: out of memory (function destroyEvent)
Got this error. Is running a 1-hour unsplit set too much?
Do I need to split it?


@blackarrow3138 - 10.12.2023 00:36

What if you mix both separated and unseparated? Also, how many epochs was this example trained for?


@JR-bn3ev - 01.12.2023 14:11

But shouldn't the dataset consist of a vocal file in MONO???


@denblindedjaligator5300 - 17.10.2023 22:34

If I choose that a model should have no pitch guidance and I train it in the new version of RVC, I can still choose which pitch algorithm to use. This means that it still uses RMVPE, i.e. the new version, and the quality is not particularly good either. Hope it gets fixed. Try choosing false in both the old and the new version.


@user-km5ry2zn1n - 02.10.2023 02:51

Hi Jarod, thank you so much for your videos!

I still have a question though. So I have 12-14 minutes of audio of pure voice. I truncated silence and removed noise, reverb, echo, and sibilance. What should I do? So you are telling us that simply dividing the audio file into 10-second pieces is not desirable, right? And I should clip the audio into meaningful bits with complete sentences, for which you, by the way, use whisperx?

If so, is whisperx good for, let's say, non-English languages? For example, languages of Central Asia or, let's say, exotic languages?
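
The difference between fixed 10-second cuts and sentence-based clips can be sketched as follows. This assumes you already have sentence timestamps as `(start, end)` pairs in seconds from some transcription or alignment tool; the function name `group_segments` and the `max_gap` parameter are hypothetical and do not reflect the whisperx API or the video author's actual script.

```python
def group_segments(segments, max_len=10.0, max_gap=1.0):
    """Greedily merge consecutive sentence segments into clips.

    A segment joins the current clip only if the pause before it is at
    most max_gap seconds and the merged clip stays within max_len seconds.
    A single sentence longer than max_len is kept whole as its own clip.
    """
    clips = []
    for start, end in segments:
        if (clips and start - clips[-1][1] <= max_gap
                and end - clips[-1][0] <= max_len):
            clips[-1] = (clips[-1][0], end)  # extend the current clip
        else:
            clips.append((start, end))  # start a new clip at a sentence boundary
    return clips

sentences = [(0.0, 3.0), (3.5, 7.0), (7.2, 11.0), (11.5, 14.0), (30.0, 42.0)]
clips = group_segments(sentences)
```

Every clip boundary falls between sentences, so no word is cut in half, which is the property a blind 10-second split cannot guarantee.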


@ahmedsarosh578 - 29.09.2023 22:52

Hi, RVC is no longer allowed on free Colab notebooks. What's the alternative?


@SosyalMedyaArge-so5bs - 13.09.2023 20:11

Dude, couldn't you get a better quality result if the silences in the single-piece file had been left in?
I mean, wouldn't you have gotten a better result if you didn't truncate?
How would you know?


@KenDoStudios - 10.09.2023 07:28

What do you have to say about the Google Colab crash? Many users cannot use Colab anymore, as Google is cracking down on deepfake code and banning IPs.


@__-mk8dv - 03.09.2023 19:26

How do I uninstall the AI voice-changer program? On the apps page where we would uninstall it, there is no program named AI voice-changer. Or can we just delete the extracted files right away, since it runs from the command line?


@bigdaveproduction168 - 03.09.2023 18:42

And the ideal duration? How much? :/


@blueplanet... - 02.09.2023 23:58

I've tested it, and the Splits version is way way better. The non-splits one trained faster, but the result is worse.


@aji9666 - 02.09.2023 20:59

So I can tell you the cause of the problem that I have.


@aji9666 - 02.09.2023 20:59

Do you have an Instagram where I can connect with you?


@benman36 - 02.09.2023 20:47

The sound files I recorded are 44.1 kHz, but there is no such sample rate in RVC training. There are only 40k and 48k sample rates. Which one should we choose in this case? After UVR, I cleaned the silence in Audacity as you explained in the video, then I set the sample rate to 48000 in the settings and saved it. I trained in two different ways with the same dataset (by selecting the 40k and 48k sample rates in RVC). In TensorBoard, the 48k run showed lower loss than the 40k run.
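
For reference, the 44.1 kHz to 48 kHz conversion described here has an exact ratio of 160/147. The sketch below shows that ratio and a deliberately naive linear-interpolation resampler in pure Python. The function name `resample_linear` is hypothetical, and real tools typically use band-limited filtering instead, so treat this only as an illustration of the arithmetic. It does not settle which RVC sample rate to pick.

```python
from fractions import Fraction

def resample_linear(samples, src_rate, dst_rate):
    """Naive linear-interpolation resampler (illustrative, not production quality)."""
    if not samples:
        return []
    ratio = Fraction(dst_rate, src_rate)   # 48000/44100 reduces to 160/147
    n_out = int(len(samples) * ratio)      # number of output samples
    out = []
    for i in range(n_out):
        pos = i / ratio                    # exact position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = float(pos - left)
        # Blend the two neighbouring source samples.
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out

ratio = Fraction(48000, 44100)
ten_ms = resample_linear([0.0] * 441, 44100, 48000)  # 10 ms of audio
```

Ten milliseconds of audio is 441 samples at 44.1 kHz and 480 samples at 48 kHz, which is the length change the resampler produces.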


@CiniVoice - 02.09.2023 18:43

How do I install RVC on Windows? Please explain.


@user-ku2hc3mr3m - 02.09.2023 15:39

Hello! Thanks for the video. Could you say where to get well prepared voice audios for training, please?


@Dare2Dream.Official - 02.09.2023 14:00

Can I do all this with a mobile phone? Someone please answer.


@Dare2Dream.Official - 02.09.2023 13:58

Bro, how do you expect someone to understand what you're talking about when you talk as if the people watching are professionals? Please explain in simple terms.


@fizskip9136 - 02.09.2023 12:30

Great video as always, I was wondering if you know any way to separate voices, for example if there are two or more people talking. Would love to see a video on that!


@miguelangel-nj8cq - 02.09.2023 03:30

I have always wondered whether I should also normalize the sound. For example, the audio of video game characters includes dialogue in which they scream, get excited, and use a great variety of voice tones, very different from your dataset, which looks very drab. Besides trimming silences with Audacity, is it worth using the Normalize Audio option to avoid spikes caused by shouting or loud dialogue? Or should they stay natural? Should I do some other transformation?
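
To make the question concrete, here is a minimal sketch of plain peak normalization, assuming that is roughly what a "Normalize" option does. The function name `peak_normalize` and the -1 dBFS default are illustrative assumptions. Whether normalizing actually helps RVC training is not answered in this thread.

```python
def peak_normalize(samples, target_dbfs=-1.0):
    """Scale the whole clip so its loudest sample hits target_dbfs (hypothetical helper)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = 10 ** (target_dbfs / 20) / peak  # one gain for the entire clip
    return [s * gain for s in samples]

quiet_and_loud = [0.1, -0.5, 0.25]
normalized = peak_normalize(quiet_and_loud, target_dbfs=0.0)
```

Because a single gain is applied to the whole clip, the ratio between shouted and quiet passages stays the same; evening out those differences would need a different process, such as dynamic range compression.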


@aji9666 - 02.09.2023 02:55

Please 🥺🙏 I have 500 tracks on my PC that I need to convert all at once 😭


@Lord_V20 - 02.09.2023 02:29

Is an RTX 3080 12GB any good for AI?


@cleatersv - 02.09.2023 01:50

I think the separated version is much better, and I will continue using it as well. As for the whisperer tool you mentioned, is it the audio splitter you made (the one that removes silence and splits files that are over 10 sec)?
