Access to the gated gemma-7b-it model repo on Hugging Face
I want to access the model so that I can complete my project, which translates text from English into various other languages.
The model is used by the GitHub repository linked below, which I have cloned to my local machine and opened in VS Code. GitHub repo link: https://github.com/doctranslate-io/viet-translation-llm
Running it gives me the error below:
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/google/gemma-7b-it.
401 Client Error. (Request ID: Root=1-67057062-38b576a8416d6a5720107b86;9d47d074-89e2-45ba-86a9-809d9cd8250e)
Cannot access gated repo for url https://huggingface.co/google/gemma-7b-it/resolve/main/config.json.
Access to model google/gemma-7b-it is restricted. You must have access to it and be authenticated to access it. Please log in.
P.S. Can someone point me in the right direction as to whom I should ask for this access?
Hi @hk199
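You're requesting access to a gated Gemma model. Two things are needed: your Hugging Face account must have been granted access on the model page (https://huggingface.co/google/gemma-7b-it), and your requests must be authenticated with that account's token.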
Important: Accessing gated models directly from client-side environments (like web browsers) is not supported due to security risks.
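A gated repo needs both granted access and an authenticated request, and the HTTP status in the error usually hints at which one is missing. The sketch below is only an illustration: `diagnose_gated_error` is a hypothetical helper, and it assumes the Hub answers 401 for unauthenticated requests (as in the traceback in the question) and 403 for a valid account that has not been granted access yet.

```python
def diagnose_gated_error(status_code: int, repo_id: str) -> str:
    """Map an HTTP status from a gated-repo request to a suggested next step.

    Hypothetical helper for illustration; not part of any Hugging Face library.
    """
    model_page = f"https://huggingface.co/{repo_id}"
    if status_code == 401:
        # Matches the traceback in the question: no valid token was sent.
        return "Not authenticated: log in or pass a User Access Token."
    if status_code == 403:
        # Assumption: the token is valid but the account lacks access.
        return f"Access not granted: request it on {model_page}."
    return f"Unexpected status {status_code}: check the full error message."


print(diagnose_gated_error(401, "google/gemma-7b-it"))
```

In the case from the question the status is 401, so the steps that follow (creating a token and passing it to the library) are the relevant ones.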
Here's how to authenticate on your server/machine:
Step 1: Generate a Hugging Face User Access Token
Go to your Hugging Face settings: https://huggingface.co/settings/tokens
Click "New token".
Give your token a descriptive name (e.g., "HF_TOKEN").
We recommend keeping the default "Read" access.
Click "Generate a token" and copy the token to your clipboard.
Step 2: Set Up Authentication in Your Server-Side/Local-Machine Code
You'll need to set the HF_TOKEN environment variable within your server-side environment. How you do this depends on your specific setup, but here's a general example:
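from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import os

# Set the access token as an environment variable
# (replace the placeholder with the token from Step 1; never commit a real token)
os.environ["HF_TOKEN"] = "YOUR_TOKEN_HERE"
hf_token = os.environ["HF_TOKEN"]

# Pass the token explicitly to both calls; `token` replaces the deprecated `use_auth_token`
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it", token=hf_token)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b-it",
    torch_dtype=torch.bfloat16,
    token=hf_token,
)

# Tokenize a prompt, generate a completion, and decode it back to text
input_text = "Write me a poem about Machine Learning."
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0]))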
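The example above writes the token into the script, which is fine for a quick test but easy to leak through version control. A minimal sketch of the alternative, assuming you export HF_TOKEN in your shell before starting Python: `load_hf_token` is a hypothetical helper (not part of transformers) that reads the variable and fails early with a readable message if it is missing or still the placeholder.

```python
import os


def load_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the token stored in `env_var`, or raise a descriptive error.

    Hypothetical helper for illustration; the variable name defaults to
    HF_TOKEN, the name used elsewhere in this thread.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"{env_var} is not set; export it before running the script.")
    if token == "YOUR_TOKEN_HERE":
        raise RuntimeError(f"{env_var} still holds the placeholder; use your real token.")
    return token


# Usage (assumes HF_TOKEN was exported in the shell):
# hf_token = load_hf_token()
# tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it", token=hf_token)
```

Failing before the model download starts makes a missing token easier to spot than a 401 buried in a long traceback.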
If you encounter any further issues or have specific questions about your local-machine setup, feel free to ask!