Spanish GPT-2 as backbone
A GPT-2 model fine-tuned for Spanish conversation on the OpenSubtitles dataset. The backbone is a Spanish GPT-2 model that, according to the Flax/JAX Community by Hugging Face, was trained from scratch on the Spanish portion of the OSCAR dataset.
Model description and fine-tuning
The backbone architecture is OpenAI's GPT-2, introduced in the paper "Language Models are Unsupervised Multitask Learners" by Alec Radford et al. Transfer learning on a large Spanish dataset was then used to adapt the text-generation model to conversational tasks. Special tokens play a key role in the fine-tuning process:
tokenizer.add_special_tokens({
    "pad_token": "<pad>",
    "bos_token": "<startofstring>",
    "eos_token": "<endofstring>",
})
tokenizer.add_tokens(["<bot>:"])
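The tokens registered above suggest how a single training example might be laid out. The following is only a minimal sketch under that assumption: the layout is inferred from the prompt format used at inference time in this card, and `format_pair` is a hypothetical helper, not part of the published training code.

```python
# Hypothetical helper: the exact training layout is not documented in this card.
BOS = "<startofstring>"
EOS = "<endofstring>"
BOT = "<bot>:"


def format_pair(user_text: str, bot_text: str) -> str:
    """Join one user/bot exchange into a single delimited string."""
    return f"{BOS} {user_text.strip()} {BOT} {bot_text.strip()} {EOS}"


example = format_pair("Hola, ¿cómo estás?", "Muy bien, gracias.")
print(example)
```

Keeping the markers in named constants makes it easier to keep the training strings and the inference prompt consistent.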
How to use
You can use this model directly with the Auto classes for causal language modeling:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("erikycd/chatbot_hadita")
model = AutoModelForCausalLM.from_pretrained("erikycd/chatbot_hadita")
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
model = model.to(device)
def infer(inp):
    # Wrap the user text in the prompt format expected by the fine-tuned model.
    inp = "<startofstring> " + inp + " <bot>: "
    inp = tokenizer(inp, return_tensors="pt")
    X = inp["input_ids"].to(device)
    attn = inp["attention_mask"].to(device)
    output = model.generate(X, attention_mask=attn, pad_token_id=tokenizer.eos_token_id)
    # Decode the first (and only) generated sequence back to text.
    output = tokenizer.decode(output[0], skip_special_tokens=True)
    return output
exit_commands = ('bye', 'quit')
while True:
    text = input('\nUser: ')
    # Stop before querying the model when the user types an exit command.
    if text in exit_commands:
        break
    output = infer(text)
    print('Bot: ', output)
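The string returned by `infer` is the decoded full sequence, so it will likely still contain the user prompt and the `<bot>:` marker. That marker was registered with `add_tokens` rather than as a special token, so `skip_special_tokens` is not expected to remove it. The sketch below shows one way to keep only the reply. `extract_reply` is a hypothetical helper, and the sample strings are illustrative, not real model output.

```python
BOT_MARKER = "<bot>:"


def extract_reply(decoded: str) -> str:
    """Return the text after the first bot marker, or the whole string if absent."""
    _, sep, reply = decoded.partition(BOT_MARKER)
    # If the marker is missing, fall back to the full decoded text.
    return reply.strip() if sep else decoded.strip()


print(extract_reply("hola <bot>: que tal"))
```

In the chat loop, this could be applied as `extract_reply(infer(text))` before printing.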