Challenges using Hugging Face example, line "speaker_embeddings = np.load("xvector_speaker_embedding.npy")"
Hi there,
I'm trying to use the example code for speecht5_vc on the model card. I'm running into issues when I get to the line:
speaker_embeddings = np.load("xvector_speaker_embedding.npy")
where I get the error:
FileNotFoundError: [Errno 2] No such file or directory: 'xvector_speaker_embedding.npy'
I'm not sure if this is because I have an incorrect numpy version or if there is another issue. Please let me know how I might get around this.
The speaker embeddings are not included in this repo, so xvector_speaker_embedding.npy is a placeholder name. You can find a more complete example in the blog post: http://hf.co/blog/speecht5
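Since the file simply does not exist, the error is not related to the numpy version: a FileNotFoundError from np.load means the path could not be opened. As a minimal sketch, a check like the one below could run before np.load to make the cause obvious. The helper name require_embedding_file is hypothetical and not part of numpy or transformers.

```python
from pathlib import Path


def require_embedding_file(path):
    """Return the path if the file exists, else raise a descriptive error.

    Hypothetical helper for illustration only; not part of any library.
    """
    p = Path(path)
    if not p.is_file():
        # The model card uses a placeholder file name, so the file has to be
        # created or downloaded by the user before it can be loaded.
        raise FileNotFoundError(
            f"{p.name} was not found in {p.parent.resolve()}. "
            "It is a placeholder name: create or download a speaker "
            "embedding first, then load it."
        )
    return p
```

Calling require_embedding_file("xvector_speaker_embedding.npy") right before the np.load line would then report where the file was expected.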
The blog post mentions that you can get the speaker embedding (xvector) from the dataset "Matthijs/cmu-arctic-xvectors":
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0)
Then use it to generate the new voice:
speech = model.generate_speech(inputs["input_values"], speaker_embeddings, vocoder=vocoder)
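If you would rather keep the np.load line from the model card, one option is to write the xvector to a .npy file once and load it afterwards. This is only a sketch: the helper name save_speaker_embedding is hypothetical, and it assumes the xvector is a flat sequence of floats, as in the dataset snippet above.

```python
import numpy as np


def save_speaker_embedding(xvector, path="xvector_speaker_embedding.npy"):
    """Save an x-vector as a 1-D float32 array in .npy format.

    Hypothetical helper for illustration; `xvector` is assumed to be a
    flat sequence of floats.
    """
    arr = np.asarray(xvector, dtype=np.float32).reshape(-1)
    np.save(path, arr)
    return arr.shape


# Possible usage with the dataset from the snippet above (needs `datasets`):
#   save_speaker_embedding(embeddings_dataset[7306]["xvector"])
#   speaker_embeddings = np.load("xvector_speaker_embedding.npy")
```

The saved array is 1-D, so it still needs torch.tensor(...).unsqueeze(0) to add the batch dimension before it is passed to generate_speech, as in the snippet above.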
Does anyone know how to convert speech to an xvector?