[Learning; Long-thread] Where to begin to learn how to configure your models' generation output?
Start here: https://huggingface.co/blog/how-to-generate
There are numerous configurations and trade-offs to learn, and a well-chosen setup often produces far better results than the out-of-the-box configuration. If you post your efforts and experiments here, I will try to run them myself. Maybe we can learn together.
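To make the trade-offs concrete, here is a minimal sketch of comparing several decoding setups side by side. The parameter names are `generate` keyword arguments covered in the linked post. `PRESETS`, `run_presets`, and `fake_generate` are hypothetical names used only for illustration, and the values are starting points rather than tuned settings.

```python
# Keyword-argument presets for `model.generate`, one per decoding strategy.
# The values are illustrative starting points, not recommendations.
PRESETS = {
    "greedy": {},
    "beam_search": {"num_beams": 5, "no_repeat_ngram_size": 2, "early_stopping": True},
    "sampling_temperature": {"do_sample": True, "top_k": 0, "temperature": 0.7},
    "top_k": {"do_sample": True, "top_k": 50},
    "top_p": {"do_sample": True, "top_k": 0, "top_p": 0.92},
    "top_k_top_p": {"do_sample": True, "top_k": 50, "top_p": 0.95},
}


def run_presets(generate_fn, presets=PRESETS, **shared_kwargs):
    """Call generate_fn once per preset and collect the results by preset name."""
    results = {}
    for name, kwargs in presets.items():
        results[name] = generate_fn(**shared_kwargs, **kwargs)
    return results


def fake_generate(**kwargs):
    """Stand-in so the sketch runs without a model; it just echoes its arguments."""
    return kwargs


if __name__ == "__main__":
    for name, out in run_presets(fake_generate, max_length=200).items():
        print(name, out)
```

With a loaded model, the stand-in could be swapped for something like `lambda **kw: model.generate(input_ids, **kw)`, assuming `model` and `input_ids` already exist.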
For me this worked well:
import torch

# Assumes `model` and `tokenizer` have already been loaded.
input_text = """
# Why genetic diversity is important for biodiversity conservation?
## INTRODUCTION
"""
randomizer_value = 251
repetitions = 1  # or 4

# Set the seed to reproduce results; change it to get different samples.
torch.manual_seed(randomizer_value)

# CPU only:
# input_ids = tokenizer(input_text, return_tensors="pt").input_ids
# GPU (the model must be on the same device):
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

# Sample with top_k=50 and top_p=0.95, returning `repetitions` sequences.
sample_outputs = model.generate(
    input_ids,
    do_sample=True,
    max_length=2000,
    top_k=50,
    top_p=0.95,
    num_return_sequences=repetitions,
)

print("Output:\n" + 100 * "-")
for i, sample_output in enumerate(sample_outputs):
    print("{}: {}".format(i, tokenizer.decode(sample_output, skip_special_tokens=False)))
    # Separator after each returned sequence.
    print("--------------------------------------------------------------")
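For anyone wondering what `top_k` and `top_p` actually do to the next-token distribution, here is a simplified, standard-library-only sketch on a toy distribution. It is an illustration of the idea (keep the k most probable tokens, then the smallest set whose cumulative probability reaches p, then renormalize), not the library's implementation, and edge-case behavior may differ. `filter_top_k_top_p` and `sample_token` are hypothetical names.

```python
import random


def filter_top_k_top_p(probs, top_k=50, top_p=0.95):
    """Return {token_index: renormalized probability} after top-k, then top-p."""
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Top-k: keep only the k most probable tokens (top_k <= 0 disables the cut).
    if top_k > 0:
        ranked = ranked[:top_k]
    k_total = sum(probs[i] for i in ranked)
    # Top-p: keep the smallest prefix whose renormalized cumulative
    # probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_total
        if cumulative >= top_p:
            break
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


def sample_token(filtered, seed=None):
    """Draw one token index from the filtered distribution."""
    rng = random.Random(seed)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_probs = [0.5, 0.3, 0.1, 0.06, 0.04]
    filtered = filter_top_k_top_p(toy_probs, top_k=3, top_p=0.8)
    print(filtered)
    print(sample_token(filtered, seed=251))
```

Lower `top_k` or `top_p` values cut off more of the unlikely tail, which tends to make output more focused and less varied; higher values do the opposite.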
Yes. Those are essentially the parameters described here: https://huggingface.co/blog/how-to-generate
You are probably making smart decisions by adding the header of the first section you want to compile.
I have noticed that Galactica often struggles to get started without it.
What is your use case?
"You are probably making smart decisions by adding the header of the first section you want to compile."
What do you mean?
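Going by the prompt posted earlier in the thread, the remark most likely refers to ending the prompt with the `## INTRODUCTION` header, so that the model continues that section instead of having to open the document on its own. A minimal sketch of building such a prompt, where `build_prompt` is a hypothetical helper and not part of any library:

```python
def build_prompt(title, first_section="INTRODUCTION"):
    """Build a Markdown-style prompt: a title line, then the first section
    header left open for the model to continue."""
    return f"\n# {title}\n## {first_section}\n"


if __name__ == "__main__":
    prompt = build_prompt(
        "Why genetic diversity is important for biodiversity conservation?"
    )
    print(prompt)
```

The resulting string has the same shape as the `input_text` in the snippet above, so it can be passed to the tokenizer in the same way.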