Hi, I'm testing this model with Turkish. I used a script shown in the docs, modified slightly to run on the GPU, but the output wasn't good, and it gets worse with each run.
from transformers import AutoProcessor, BarkModel
import gradio as gr
import torch
import pandas as pd

processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark")

# Move model to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Read voice presets from CSV
df = pd.read_csv('data.csv', header=None)
voice_presets = {f"{row[0]} ({row[2]})": row[1] for _, row in df.iterrows()}


def bark(text, voice_preset):
    # Process text with padding and attention mask
    inputs = processor(
        text, voice_preset
    )
    # Move inputs to GPU first
    inputs = {
        k: v.to(device) if hasattr(v, 'to') else v for k, v in inputs.items()
    }
    # Get attention mask from the moved inputs
    # attention_mask = inputs["attention_mask"]
    audio_array = model.generate(
        input_ids=inputs["input_ids"],
        pad_token_id=processor.tokenizer.pad_token_id
    )
    audio_array = audio_array.cpu().numpy().squeeze()
    sample_rate = model.generation_config.sample_rate
    return (sample_rate, audio_array)


interface = gr.Interface(
    fn=bark,
    inputs=[
        gr.Textbox(label="Text to speak", placeholder="Enter text here..."),
        gr.Dropdown(
            choices=list(voice_presets.items()),
            value=list(voice_presets.items())[0][1],  # Set first value as default
            label="Voice"
        )
    ],
    outputs=gr.Audio(label="Generated Speech"),
    title="Bark Text-to-Speech",
    description="Generate speech from text using different voices",
)

if __name__ == "__main__":
    interface.launch()
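One thing that may be worth checking in the script above: `model.generate` is called with only `input_ids`, so the attention mask and the voice preset data that the processor returns (under `history_prompt`, as far as I understand) never reach the model. Below is a minimal sketch of forwarding the whole processor output instead. `to_device` and `bark_full_inputs` are hypothetical helpers introduced here for illustration, and the explicit `pad_token_id` argument is left out on the assumption that the model's generation config already covers it. I have not verified that this improves the Turkish output.

```python
from collections.abc import Mapping


def to_device(obj, device):
    """Recursively move everything that has a .to() method to `device`.

    Hypothetical helper: nested mappings (for example a voice preset
    stored under "history_prompt") are walked and returned as plain dicts.
    """
    if isinstance(obj, Mapping):
        return {key: to_device(value, device) for key, value in obj.items()}
    if hasattr(obj, "to"):
        return obj.to(device)
    return obj


def bark_full_inputs(text, voice_preset, processor, model, device):
    """Generate audio while passing every processor output to generate()."""
    inputs = processor(text, voice_preset=voice_preset)
    inputs = to_device(inputs, device)
    # Forward input_ids, attention_mask and the voice preset together
    # instead of input_ids alone.
    audio_array = model.generate(**inputs)
    audio_array = audio_array.cpu().numpy().squeeze()
    return model.generation_config.sample_rate, audio_array
```

In the Gradio app this would replace the body of `bark`, with the already loaded `processor`, `model` and `device` passed in.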
Then I cloned the Hugging Face Space, and the results were better but still far from good for Turkish. Generation also became much slower, around 1 minute.
What do you suggest?