---
language:
  - en
  - th
  - pt
  - es
  - de
  - fr
  - it
  - hi
license: apache-2.0
library_name: transformers
tags:
  - unsloth
  - trl
  - sft
  - text-generation-inference
base_model:
  - meta-llama/Llama-3.2-3B-Instruct
datasets:
  - jeggers/competition_math
pipeline_tag: text-generation
model-index:
  - name: Komodo-Llama-3.2-3B-v2-fp16
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: IFEval (0-Shot)
          type: HuggingFaceH4/ifeval
          args:
            num_few_shot: 0
        metrics:
          - type: inst_level_strict_acc and prompt_level_strict_acc
            value: 63.41
            name: strict accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: BBH (3-Shot)
          type: BBH
          args:
            num_few_shot: 3
        metrics:
          - type: acc_norm
            value: 20.2
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MATH Lvl 5 (4-Shot)
          type: hendrycks/competition_math
          args:
            num_few_shot: 4
        metrics:
          - type: exact_match
            value: 6.27
            name: exact match
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GPQA (0-shot)
          type: Idavidrein/gpqa
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 3.69
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MuSR (0-shot)
          type: TAUR-Lab/MuSR
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 3.37
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU-PRO (5-shot)
          type: TIGER-Lab/MMLU-Pro
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 20.58
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
---

Komodo-Logo

This version of Komodo is Llama-3.2-3B-Instruct fine-tuned on the jeggers/competition_math dataset to improve the base model's math performance.

This model is stored in fp16. You should load it with `torch_dtype="float16"` (or `torch.float16`).
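A minimal loading sketch with 🤗 Transformers (assuming the `transformers` and `torch` packages are installed; `device_map="auto"` is an optional convenience, not a requirement of this model):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "suayptalha/Komodo-Llama-3.2-3B-v2-fp16"


def load_model():
    """Load the tokenizer and the fp16 model weights."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # the checkpoint is stored in fp16
        device_map="auto",          # place layers on available devices
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_model()
```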

Finetune system prompt:

```
You are a highly intelligent and accurate mathematical assistant.
You will solve mathematical problems step by step, explain your reasoning clearly, and provide concise, correct answers.
When the solution requires multiple steps, detail each step systematically.
```

You can use the ChatML format.
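For illustration, a ChatML prompt can be assembled as plain string formatting (a minimal sketch; the helper name and the example question are placeholders, and the system prompt is the fine-tuning prompt shown above):

```python
# The system prompt used during fine-tuning (see above).
SYSTEM_PROMPT = (
    "You are a highly intelligent and accurate mathematical assistant.\n"
    "You will solve mathematical problems step by step, explain your reasoning "
    "clearly, and provide concise, correct answers.\n"
    "When the solution requires multiple steps, detail each step systematically."
)


def build_chatml_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Wrap a system prompt and a user message in ChatML delimiters,
    leaving the assistant turn open for generation."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt("What is the derivative of x^2?")
print(prompt)
```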

## Open LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16).

| Metric              | Value |
|---------------------|------:|
| Avg.                | 19.59 |
| IFEval (0-Shot)     | 63.41 |
| BBH (3-Shot)        | 20.20 |
| MATH Lvl 5 (4-Shot) |  6.27 |
| GPQA (0-shot)       |  3.69 |
| MuSR (0-shot)       |  3.37 |
| MMLU-PRO (5-shot)   | 20.58 |

Buy Me A Coffee