---
language:
- en
license: mit
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
- pytorch
datasets:
- gsm8k
base_model: microsoft/phi-2
pipeline_tag: text-generation
model-index:
- name: phi-2-basic-maths
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 55.8
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Menouar/phi-2-basic-maths
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 71.15
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Menouar/phi-2-basic-maths
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 47.27
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Menouar/phi-2-basic-maths
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 75.3
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Menouar/phi-2-basic-maths
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 41.4
      name: mc2
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Menouar/phi-2-basic-maths
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 30.71
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Menouar/phi-2-basic-maths
      name: Open LLM Leaderboard
---
# phi-2-basic-maths

This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on the [GSM8K dataset](https://huggingface.co/datasets/gsm8k).

## Model Description

The objective of this model is to evaluate Phi-2's ability to provide correct solutions to reasoning problems after fine-tuning. It was trained with TRL, LoRA with quantization, and Flash Attention.

To test it, you can use the following code:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, pipeline

# Specify the model ID
peft_model_id = "Menouar/phi-2-basic-maths"

# Load Model with PEFT adapter
model = AutoPeftModelForCausalLM.from_pretrained(
    peft_model_id,
    device_map="auto",
    torch_dtype=torch.float16
)
tokenizer = AutoTokenizer.from_pretrained(peft_model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
```
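
The snippet above builds `pipe` but does not call it. The following is a minimal sketch of one way to query it. Both helpers are illustrative and not part of any library, and the prompt layout in `build_prompt` is an assumption: the template actually used during fine-tuning is defined in the training notebook and may differ.

```python
def build_prompt(question: str) -> str:
    """Hypothetical plain-text prompt; the training template may differ."""
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(pipe, question: str, max_new_tokens: int = 256) -> str:
    """Call a text-generation pipeline and return only the completion."""
    outputs = pipe(
        build_prompt(question),
        max_new_tokens=max_new_tokens,
        do_sample=False,         # greedy decoding for repeatable answers
        return_full_text=False,  # drop the prompt from the returned text
    )
    return outputs[0]["generated_text"]
```

With the `pipe` object from the previous block, `generate_answer(pipe, "Tom has 3 apples and buys 4 more. How many apples does he have?")` would return the model's completion as a string.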

## Training procedure

The complete training procedure can be found in my [notebook](https://colab.research.google.com/drive/1mvfoEqc0mwuf8FqrABWt06qwAsU2QrvK).

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0002
- train_batch_size: 42
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 84
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 30
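
The `total_train_batch_size` listed above is the per-step batch size multiplied by the number of gradient accumulation steps. The sketch below reproduces that arithmetic. It assumes a single device, which this card does not state, and the example dataset size of 7,473 (the size of the GSM8K `main` training split) is only an assumption about what was used for training. The function names are illustrative.

```python
import math


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer update."""
    return per_device * accumulation_steps * num_devices


def optimizer_steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer updates needed to see every example once (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


total = effective_batch_size(per_device=42, accumulation_steps=2)
print(total)                                   # 84, matching total_train_batch_size
print(optimizer_steps_per_epoch(7473, total))  # assumed dataset size, see lead-in
```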

### Training results

The training results can be found on [TensorBoard](https://huggingface.co/Menouar/phi-2-basic-maths/tensorboard).

## Evaluation procedure

The complete evaluation procedure can be found in my [notebook](https://colab.research.google.com/drive/1xsdxOm-CgZmLAPFgp8iU9lLFEIIHGiUK).

- Accuracy: 36.16%
- Unclear answers: 7.81%
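
The two figures above could be produced by a scorer along the following lines. This is a sketch under stated assumptions, not the procedure from the evaluation notebook: it assumes the model ends its answer with the `#### <number>` marker that GSM8K reference solutions use, and it counts a prediction as "unclear" when no such marker can be parsed. The function names are illustrative.

```python
import re
from typing import Dict, List, Optional

# GSM8K reference solutions end with "#### <number>".
_FINAL_ANSWER = re.compile(r"####\s*(-?[\d,]*\.?\d+)")


def extract_final_number(text: str) -> Optional[float]:
    """Return the number after '####', or None if no marker is found."""
    match = _FINAL_ANSWER.search(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def score(predictions: List[str], references: List[str]) -> Dict[str, float]:
    """Percentage of correct and of unparseable ("unclear") predictions."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and equal in length")
    correct = unclear = 0
    for prediction, reference in zip(predictions, references):
        expected = extract_final_number(reference)
        if expected is None:
            raise ValueError("reference has no '####' final answer")
        got = extract_final_number(prediction)
        if got is None:
            unclear += 1  # assumption: no parseable answer counts as unclear
        elif got == expected:
            correct += 1
    total = len(references)
    return {"accuracy": 100.0 * correct / total, "unclear": 100.0 * unclear / total}
```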

### Framework versions

- PEFT 0.8.2
- Transformers 4.38.0.dev0
- PyTorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Menouar__phi-2-basic-maths).

| Metric                          |Value|
|---------------------------------|----:|
|Avg.                             |53.60|
|AI2 Reasoning Challenge (25-Shot)|55.80|
|HellaSwag (10-Shot)              |71.15|
|MMLU (5-Shot)                    |47.27|
|TruthfulQA (0-shot)              |41.40|
|Winogrande (5-shot)              |75.30|
|GSM8k (5-shot)                   |30.71|
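
The `Avg.` row is consistent with an unweighted mean of the six benchmark scores. A quick check, using only the values from the table (the leaderboard presumably averages unrounded scores, so the last digit can differ slightly):

```python
from statistics import mean

# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 55.80,
    "HellaSwag (10-Shot)": 71.15,
    "MMLU (5-Shot)": 47.27,
    "TruthfulQA (0-shot)": 41.40,
    "Winogrande (5-shot)": 75.30,
    "GSM8k (5-shot)": 30.71,
}

average = mean(scores.values())
print(f"{average:.3f}")  # compare with the reported Avg. of 53.60
```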