LLaMaPaca

Model Details Model Name: LLaMaPaca Base Model: unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit Adapter Type: LoRA (Low-Rank Adaptation) Library: PEFT (Parameter-Efficient Fine-Tuning) Pipeline Tag: text-generation

Description

LLaMaPaca is a LoRA adapter fine-tuned on the LLaMA 3.2 1B Instruct model using Unsloth's optimized training framework. This adapter enables parameter-efficient customization of the base model for specific tasks or domains while maintaining the core capabilities of LLaMA 3.2. The adapter was trained using 4-bit quantization via bitsandbytes, making it memory-efficient and suitable for deployment on consumer-grade hardware.

Technical Specifications

Architecture: LLaMA 3.2 with LoRA adapters Base Model Size: ~1B parameters Quantization: 4-bit (bitsandbytes) Training Framework: Unsloth + PEFT Adapter Format: PEFT LoRA

Usage

Installation

bash

pip install transformers peft accelerate bitsandbytes

Loading the Model

python

from transformers import AutoModelForCausalLM, AutoTokenizerfrom peft import PeftModel# Load base modelbase_model_name = "unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit"adapter_name = "your-username/LLaMaPaca"  # Replace with actual repotokenizer = AutoTokenizer.from_pretrained(base_model_name)model = AutoModelForCausalLM.from_pretrained(    base_model_name,    load_in_4bit=True,    device_map="auto")# Load LoRA adaptermodel = PeftModel.from_pretrained(model, adapter_name)# Generate textprompt = "Your instruction here..."inputs = tokenizer(prompt, return_tensors="pt").to("cuda")outputs = model.generate(**inputs, max_new_tokens=256)print(tokenizer.decode(outputs[0], skip_special_tokens=True)) Using with Text Generation Pipeline

python

from transformers import pipelinepipe = pipeline(    "text-generation",    model=base_model_name,    model_kwargs={"load_in_4bit": True},    adapter_name=adapter_name)result = pipe("Your prompt here...", max_new_tokens=256)

Training Details

Method: LoRA (Low-Rank Adaptation) Optimization: Unsloth acceleration Quantization: 4-bit precision with bitsandbytes Framework: PEFT + Transformers Intended Use Cases Instruction following and conversational AI Domain-specific text generation Custom task adaptation with minimal resource requirements Edge deployment scenarios requiring efficient models Limitations Performance depends on the quality and quantity of fine-tuning data May inherit biases from the base LLaMA 3.2 model 4-bit quantization may result in slight accuracy trade-offs Adapter is specific to the base model architecture Citation If you use this model in your research, please cite: bibtex & TensorVizion

@misc{llamapaca,  title={LLaMaPaca: LoRA Adapter for LLaMA 3.2 1B Instruct},  author={Tensorizion},  year={2026},  publisher={Hugging Face},  howpublished={\url{https://huggingface.co/TensorVizion/LLaMaPaca}}}

License

Please refer to the base model license (LLaMA 3.2 Community License) and specify any additional licensing terms for your adapter.

Downloads last month
41
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support