---
library_name: transformers
tags:
- sentiment-analysis
- aspect-based-sentiment-analysis
- transformers
- bert
language:
- tr
metrics:
- accuracy
base_model:
- dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
datasets:
- Sengil/Turkish-ABSA-Wsynthetic
---

# Aspect Based Sentiment Analysis with Turkish 🇹🇷 Data

This model performs **Aspect-Based Sentiment Analysis (ABSA) 🚀** on Turkish text. Given a sentence and an aspect mentioned in it, it predicts the sentiment polarity (Positive, Neutral, or Negative) expressed towards that aspect.

---

## Model Details

### Model Description

This model is fine-tuned from the pretrained BERT model `dbmdz/bert-base-turkish-cased`. It is trained on the **Turkish-ABSA-Wsynthetic** dataset, which contains Turkish restaurant reviews annotated with aspect-based sentiments. The model identifies the sentiment polarity for specific aspects mentioned in Turkish sentences, e.g., "servis" (service) or "fiyatlar" (prices).

- **Developed by:** Sengil
- **Language(s):** Turkish 🇹🇷
- **License:** Apache-2.0
- **Finetuned from model:** `dbmdz/bert-base-turkish-cased`
- **Number of Labels:** 3 (Negative, Neutral, Positive)

### Sources

- **Notebook:** [ABSA_Turkish_BERT_Based_Small](https://www.kaggle.com/code/mertsengil/absa-train-w-synthetic-restaurant-reviews)

---

## Uses

### Direct Use

This model can be used directly to analyze aspect-specific sentiment in Turkish text, especially in domains like restaurant reviews.

### Downstream Use

It can be fine-tuned for similar tasks in other domains (e.g., e-commerce, hotel reviews, or customer feedback analysis).

### Out-of-Scope Use

- Not suitable for tasks other than sentiment analysis or for languages other than Turkish.
- May not perform well on datasets whose domain-specific vocabulary differs significantly from restaurant reviews.

---

### Limitations

- May struggle with rare or ambiguous aspects not covered in the training data.
- May exhibit biases present in the training dataset.
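One way to soften the first limitation is to treat low-confidence predictions as undecided instead of forcing one of the three labels. The sketch below is illustrative only: the helper name `label_with_confidence` and the `min_confidence` cutoff of 0.6 are assumptions made for this example, not part of the model or the training notebook, and the logits are made-up numbers rather than real model output. The label ids follow the mapping used in this card's inference example.

```python
import math

# Label ids as used in this model card's inference example.
LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits, min_confidence=0.6):
    """Return (label, probability) for one example.

    The label is "Uncertain" when the top probability is below
    min_confidence. The cutoff is a hypothetical value chosen for
    illustration, not one recommended by the model author.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return "Uncertain", probs[best]
    return LABELS[best], probs[best]


# Made-up logits for illustration only.
print(label_with_confidence([0.1, 0.2, 3.0]))  # one class clearly dominates
print(label_with_confidence([0.9, 1.0, 1.1]))  # nearly flat distribution
```

With the inference snippet in this card, the input to this helper would be `outputs.logits[0].tolist()`. A suitable cutoff depends on the application and is best chosen on held-out data.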

## How to Get Started with the Model

Install or upgrade `transformers`. The leading `!` is for notebook cells; drop it in a regular shell. The example below also requires PyTorch.

```
!pip install -U transformers
```

Use the code below to get started with the model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Sengil/ABSA-Turkish-bert-based-small")
model = AutoModelForSequenceClassification.from_pretrained("Sengil/ABSA-Turkish-bert-based-small")

# Example inference: one sentence and the aspect to evaluate
text = "Servis çok yavaştı ama yemekler lezzetliydi."  # "The service was very slow, but the food was delicious."
aspect = "servis"  # "service"

# Pair the sentence with the aspect, separated by [SEP]
formatted_text = f"[CLS] {text} [SEP] {aspect} [SEP]"
inputs = tokenizer(formatted_text, return_tensors="pt", padding="max_length", truncation=True, max_length=128)

# Inference only, so gradient tracking is not needed
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=1).item()

# Map prediction to label
labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
print(f"Sentiment for '{aspect}': {labels[predicted_class]}")
```

## Training Details

### Training Data

The model was fine-tuned on the Turkish-ABSA-Wsynthetic.csv dataset, which contains semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis.

### Training Procedure

- Optimizer: AdamW
- Learning Rate: 2e-5
- Batch Size: 16
- Epochs: 5
- Max Sequence Length: 128

## Evaluation

The model achieved the following scores on the test set:

- Accuracy: 95.48%
- F1 Score (Weighted): 95.46%

## Citation

```
@misc{absa_turkish_bert_based_small,
  title={Aspect-Based Sentiment Analysis for Turkish},
  author={Sengil},
  year={2024},
  url={https://huggingface.co/Sengil/ABSA_Turkish_BERT_Based_Small}
}
```

## Model Card Contact

For any questions or issues, please open an issue in the repository or contact the author on [LinkedIn](https://www.linkedin.com/in/mertsengil/).