AI-powered dialog research

  • very important to have an extra word at the beginning (ideally a number, as it helps order the audio tracks)

  • set the xVASynth sample rate to 44100 Hz (check ffmpeg first)

  • in 2.3.0, male V sounds metallic out of the box, and that is just how it is: applying a gate helps somewhat but is not perfect.

  • recording one's own voice and importing it into xVASynth can help with the phonetics, for better pronunciation.
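
the numbering note above can be sketched as follows. this is only an illustration: it assumes the "extra word" is a zero-padded index placed at the start of each track name, and the helper name numbered_tracks is hypothetical, not something xVASynth provides. zero-padding matters because a plain alphabetical sort would otherwise put "10" before "2".

```python
def numbered_tracks(lines, width=3):
    """Prefix each dialog line with a zero-padded index (hypothetical helper).

    Zero-padding keeps alphabetical order identical to numeric order,
    so the generated audio tracks sort in the order they were written.
    """
    return [f"{i:0{width}d} {text}" for i, text in enumerate(lines, start=1)]


if __name__ == "__main__":
    # Placeholder dialog lines, for illustration only.
    for name in numbered_tracks(["first line", "second line", "third line"]):
        print(name)
```

a width of 3 covers up to 999 lines; a larger width would be needed beyond that.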

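for the sample rate note, a minimal sketch of doing the same thing outside xVASynth. it assumes "check ffmpeg first" means confirming that ffmpeg is installed and reachable on the PATH, which is only one possible reading. the function names and file names are hypothetical; -ar is ffmpeg's audio sample rate option and -y overwrites an existing output file.

```python
import shutil
import subprocess


def ffmpeg_available():
    """Return True if an ffmpeg executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None


def resample_command(src, dst, rate=44100):
    """Build the ffmpeg argument list that resamples src to dst at `rate` Hz."""
    return ["ffmpeg", "-y", "-i", src, "-ar", str(rate), dst]


def resample(src, dst, rate=44100):
    """Resample an audio file, failing early if ffmpeg is missing."""
    if not ffmpeg_available():
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(resample_command(src, dst, rate), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(resample_command("001_line.wav", "001_line_44100.wav")))
```

building the command as a list, rather than a single shell string, avoids quoting problems with file names that contain spaces.
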
it might be worth having a look at ElevenLabs (TBC: train models?).

it might be worth trying other vocoders too.

credits: thanks to bespokecomp on GitHub for helping out