Amazon Review Generator T5

This model is a fine-tuned version of T5 (t5-base) that generates Amazon product reviews from a product title and a star rating. It was fine-tuned on software product reviews from the "McAuley-Lab/Amazon-Reviews-2023" dataset.

Use Case

The primary use case of this model is generating realistic, coherent product reviews for Amazon products. It can be useful for producing sample reviews for product listings, creating example text for sentiment analysis work, and other natural language generation tasks in e-commerce.

Model Architecture

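The model is based on the T5 (Text-to-Text Transfer Transformer) architecture, an encoder-decoder transformer that frames every task as mapping an input string to an output string. Here, the input is a short prompt containing the product title and star rating, and the output is the review text.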

Training Data

The model was fine-tuned on Amazon software product reviews. The data was filtered to keep only verified purchases whose review text is longer than 100 characters. A total of 100,000 samples were used for fine-tuning.
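
The filtering described above can be sketched as follows. This is a minimal illustration, not the author's preprocessing script: the field names `text` and `verified_purchase` are assumptions about the record layout, and `filter_reviews` is a hypothetical helper that works on any iterable of dictionaries.

```python
from typing import Dict, Iterable, Iterator

MIN_REVIEW_CHARS = 100   # from the text: keep reviews longer than 100 characters
MAX_SAMPLES = 100_000    # from the text: number of samples used for fine-tuning


def filter_reviews(
    records: Iterable[Dict],
    min_chars: int = MIN_REVIEW_CHARS,
    max_samples: int = MAX_SAMPLES,
) -> Iterator[Dict]:
    """Yield verified-purchase reviews whose text exceeds min_chars.

    Assumed (hypothetical) record fields: "text" and "verified_purchase".
    Stops after max_samples records have been kept.
    """
    kept = 0
    for record in records:
        if kept >= max_samples:
            break
        text = record.get("text") or ""
        if record.get("verified_purchase") and len(text) > min_chars:
            kept += 1
            yield record


# Tiny in-memory example; real data would come from the dataset itself.
sample = [
    {"text": "x" * 150, "verified_purchase": True},   # kept
    {"text": "x" * 150, "verified_purchase": False},  # dropped: not verified
    {"text": "short", "verified_purchase": True},     # dropped: too short
    {"text": "x" * 100, "verified_purchase": True},   # dropped: not longer than 100
    {"text": "y" * 101, "verified_purchase": True},   # kept
]
filtered = list(filter_reviews(sample))
```

Because the helper is a generator, it can be applied to a streamed dataset without loading every review into memory.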

Training Procedure

The training was performed using the Hugging Face transformers library with the following settings:

  • Model: t5-base
  • Number of Epochs: 3
  • Batch Size: 16 for training, 32 for evaluation
  • Optimizer: AdamW
  • Learning Rate: Default settings
  • Hardware: NVIDIA RTX 3060 GPU
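
To give a sense of scale for these settings, the number of optimizer updates can be estimated with simple arithmetic. The sketch below assumes all 100,000 samples were in the training split, a single device, and no gradient accumulation; none of these are stated in this card, so treat the result as an upper-bound estimate. The dictionary keys and the `optimizer_steps` helper are illustrative names only.

```python
import math

# Settings listed above, collected in one place (key names are illustrative).
settings = {
    "model": "t5-base",
    "num_train_epochs": 3,
    "train_batch_size": 16,
    "eval_batch_size": 32,
}


def optimizer_steps(num_train_samples: int, batch_size: int, epochs: int) -> int:
    """Number of optimizer updates, assuming one update per batch."""
    steps_per_epoch = math.ceil(num_train_samples / batch_size)
    return steps_per_epoch * epochs


# Assumption: all 100,000 samples are used for training.
total_steps = optimizer_steps(
    num_train_samples=100_000,
    batch_size=settings["train_batch_size"],
    epochs=settings["num_train_epochs"],
)
print(total_steps)  # 6250 steps per epoch * 3 epochs = 18750
```

If part of the data was held out for evaluation, the real step count would be proportionally lower.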

Model Performance

No quantitative evaluation metrics are reported for this project. Informal inspection of sample outputs suggests that the model generates coherent and contextually relevant reviews.

Example Usage

Here’s how you can use the model to generate reviews:

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "RSPRIMES1234/Amazon-Review-Generator-T5"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Use a GPU if one is available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

def generate_review(product_title, star_rating):
    """Generate a review for the given product title and star rating."""
    # Build the prompt in the format the model expects
    input_text = f"review: {product_title}, {star_rating} Stars!"
    inputs = tokenizer(input_text, return_tensors="pt", max_length=128, padding="max_length", truncation=True)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    # Pass the attention mask with the input IDs so padding tokens are ignored
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=128,
        no_repeat_ngram_size=3,
        num_beams=6,
        early_stopping=True,
    )
    # Decode the highest-scoring beam into plain text
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
product_title = "Example Product"
star_rating = 5
print(generate_review(product_title, star_rating))
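
The prompt string built inside generate_review can also be factored into a small helper, which makes it easy to check inputs before calling the model. This is a sketch under assumptions: `build_prompt` is a hypothetical name, the whitespace stripping is an addition not present in the original code, and the 1 to 5 range check assumes integer star ratings, as in the example above.

```python
def build_prompt(product_title: str, star_rating: int) -> str:
    """Return the input string in the format used by generate_review.

    Assumption: star ratings are whole numbers from 1 to 5.
    """
    if not isinstance(star_rating, int) or not 1 <= star_rating <= 5:
        raise ValueError("star_rating must be an integer from 1 to 5")
    title = product_title.strip()
    if not title:
        raise ValueError("product_title must not be empty")
    return f"review: {title}, {star_rating} Stars!"


# Build prompts for several (title, rating) pairs without loading the model.
prompts = [build_prompt(t, r) for t, r in [("Example Product", 5), ("Another App", 2)]]
print(prompts[0])  # review: Example Product, 5 Stars!
```

Keeping the prompt format in one place helps ensure inference inputs match the format used in the usage example, since a mismatched prompt may degrade output quality.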

Limitations and Considerations

  • Data Bias: The model was trained on reviews for software products, which may bias its performance when generating reviews for other types of products.
  • Ethical Use: Generated reviews should be used responsibly and ethically. Misuse of generated content can lead to misinformation and ethical concerns.

Citation

If you use this model in your research or applications, please cite the original T5 paper and provide a link to this model on Hugging Face.

Model size: 223M params (Safetensors, tensor type F32)