ardaorcun committed on
Commit 3e07768 · verified · 1 Parent(s): 964a601

Update README.md

Files changed (1):
  1. README.md +34 -30
README.md CHANGED
@@ -9,45 +9,49 @@ pipeline_tag: text-generation
 
 # Model Card for Model ID
 
- This model is finuted version of YTU's Cosmos GPT2 Language Model
 
 ## Training Details
 
- Model fine-tuned by using LoRA and QLoRA. Training parameters is defined below.
 
- LoRA configs:
 
- r=16,
- lora_alpha=32,
- target_modules=['c_proj',
- 'c_fc',
- 'gate_proj',
- 'c_proj',
- 'c_attn'],
- bias="lora_only",
- use_rslora=True,
- fan_in_fan_out=True,
- lora_dropout=0.05,
- task_type="CAUSAL_LM",
-
- Train Parameters:
- num_train_epochs=5,
- per_device_train_batch_size=10,
- gradient_accumulation_steps=1,
- gradient_checkpointing=True,
- optim="paged_lion_8bit",
- logging_steps=11,
- save_strategy="epoch",
- learning_rate=2e-4,
- max_grad_norm=0.3,
- warmup_ratio=0.03,
- lr_scheduler_type="linear"
 
- ### Training Data
 
- For training i used Merve's Turkish Instructions Dataset you can check here -> https://huggingface.co/datasets/merve/turkish_instructions
 
 # Model Card for Model ID
 
+ This model is a fine-tuned version of YTU's Cosmos GPT2 Language Model.

 ## Training Details

+ The model was fine-tuned using LoRA and QLoRA techniques. Training parameters are defined below.
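
Since the card mentions QLoRA, the base model was presumably loaded in 4-bit before the LoRA adapters were attached. A minimal sketch of such a setup with `bitsandbytes`; the model ID and quantization settings below are assumptions, not taken from this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model_id = "ytu-ce-cosmos/turkish-gpt2-large"  # assumed base model ID

# QLoRA-style 4-bit quantization of the frozen base weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # assumed quantization type
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
```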

+ ### LoRA configs:

+ - **r**=16
+ - **lora_alpha**=32
+ - **target_modules**=c_proj, c_fc, gate_proj, c_proj, c_attn
+ - **lora_dropout**=0.05
+ - **bias**="lora_only"
+ - **fan_in_fan_out**=True
+ - **max_seq_length**=512
+ - **use_rslora**=True
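
Taken together, these settings correspond to a `peft` `LoraConfig` roughly like the sketch below. Note that `max_seq_length` is not a `LoraConfig` field (it is usually passed to the trainer instead), and `task_type` is carried over from the earlier version of this card:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    # The duplicated "c_proj" is copied from the card as-is; PEFT matches
    # module names by suffix, so the duplicate is harmless.
    target_modules=["c_proj", "c_fc", "gate_proj", "c_proj", "c_attn"],
    lora_dropout=0.05,
    bias="lora_only",
    fan_in_fan_out=True,  # GPT-2 uses Conv1D layers, which store weights transposed
    use_rslora=True,
    task_type="CAUSAL_LM",  # from the earlier version of this card
)
```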

+ ### Train Parameters:

+ - **num_train_epochs**=5
+ - **optim**="paged_lion_8bit"
+ - **learning_rate**=2e-4
+ - **warmup_ratio**=0.03
+ - **max_grad_norm**=0.3
+ - **lr_scheduler_type**="linear"
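
As a sketch, these map onto Hugging Face `TrainingArguments` as follows; the batch-size, checkpointing, logging, and saving values are taken from the earlier version of this card, and the output path is hypothetical:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="cosmos-gpt2-instruct",  # hypothetical output path
    num_train_epochs=5,
    per_device_train_batch_size=10,   # from the earlier card version
    gradient_accumulation_steps=1,    # from the earlier card version
    gradient_checkpointing=True,      # from the earlier card version
    optim="paged_lion_8bit",          # paged 8-bit Lion via bitsandbytes
    logging_steps=11,                 # from the earlier card version
    save_strategy="epoch",            # from the earlier card version
    learning_rate=2e-4,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="linear",
)
```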

+ ### Training Data

+ For training, I used Merve's Turkish Instructions Dataset, which you can check here: [merve/turkish_instructions](https://huggingface.co/datasets/merve/turkish_instructions)
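
The dataset can be loaded with the `datasets` library; a small sketch (note that two of its column names carry a leading space, which the template below relies on):

```python
from datasets import load_dataset

dataset = load_dataset("merve/turkish_instructions", split="train")

# The template below uses the columns "talimat" (instruction),
# " giriş" (input) and " çıktı" (output) -- note the leading spaces.
print(dataset[0]["talimat"])
```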

+ ### Instruction template:
+
+ ```python
+ def format_instruction(sample):
+     # System prompt (Turkish): "You are a helpful language model that loves to answer."
+     # " giriş" and " çıktı" keep the leading spaces of the dataset's column names.
+     return f"""Sen cevap vermeyi seven yardımcı bir dil modelisin.
+ ### Input:
+ {sample["talimat"]}
+
+ ### Context:
+ {sample[" giriş"]}
+
+ ### Response:
+ {sample[" çıktı"]}
+ """
+ ```