How to use from the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="louhless/Ycoder-small",
	filename="Ycoder-small-f16.gguf",
)
llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
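
The call above returns an OpenAI-style completion dictionary rather than a plain string. The following is a minimal sketch for pulling the reply text out of it, assuming the usual choices[0] / message / content layout. `extract_reply` is a hypothetical helper name and not part of llama-cpp-python, and `example_response` is a hand-written stand-in, not real model output.

```python
from typing import Any, Mapping


def extract_reply(response: Mapping[str, Any]) -> str:
    """Return the assistant text from a chat completion response dict.

    Hypothetical helper; assumes the OpenAI-style layout
    response["choices"][0]["message"]["content"].
    """
    choices = response.get("choices") or []
    if not choices:
        # No choices in the response: return an empty reply instead of raising.
        return ""
    message = choices[0].get("message") or {}
    return (message.get("content") or "").strip()


# Hand-written stand-in for the value returned by llm.create_chat_completion(...).
example_response = {
    "choices": [
        {"message": {"role": "assistant", "content": " Paris. "}}
    ]
}
print(extract_reply(example_response))
```

With a loaded model, the same helper could be applied to the return value of `llm.create_chat_completion(...)` directly.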

Ycoder-small

Ycoder-small is a tiny experimental code-focused language model created by louhless and fine-tuned for short programming prompts, lightweight problem solving, simple chat, and optional thinking-style output.

Join Discord: https://discord.gg/Dq4MWuJm

Model Details

  • Model name: Ycoder-small
  • Creator: louhless
  • Base model: HuggingFaceTB/SmolLM2-135M-Instruct
  • Architecture: Llama-style causal language model
  • Context length: 8192
  • Language: English, with limited support for German greetings
  • Export: GGUF available
  • Status: experimental

Focus

The model is mainly tuned for:

  • Python
  • GLSL
  • JavaScript
  • SQL
  • Bash
  • simple math
  • short normal assistant replies

LM Studio Thinking Toggle

The GGUF metadata includes a chat template with an enable_thinking variable.

When enable_thinking is enabled, the model is prompted to use:

<think>
short reasoning summary
</think>
<answer>
final answer
</answer>
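
Client code that consumes this format usually needs to separate the reasoning summary from the final answer. The sketch below shows one way to do that with regular expressions. `parse_thinking_output` and `ParsedOutput` are hypothetical names introduced for illustration, and the fallback for missing tags is an assumption, since a model this small may not always close its tags.

```python
import re
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    thinking: Optional[str]  # text inside <think>...</think>, or None if absent
    answer: str              # text inside <answer>...</answer>, or a fallback


_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def parse_thinking_output(text: str) -> ParsedOutput:
    """Split model output into its thinking part and its final answer."""
    think = _THINK_RE.search(text)
    answer = _ANSWER_RE.search(text)
    thinking = think.group(1).strip() if think else None
    if answer:
        final = answer.group(1).strip()
    else:
        # Assumed fallback: no complete <answer> block, so drop any
        # <think> block and treat the remaining text as the answer.
        final = _THINK_RE.sub("", text).strip()
    return ParsedOutput(thinking, final)


sample = (
    "<think>\nshort reasoning summary\n</think>\n"
    "<answer>\nfinal answer\n</answer>"
)
parsed = parse_thinking_output(sample)
print(parsed.thinking)
print(parsed.answer)
```

When thinking is disabled and the reply carries no tags, the function returns the whole text as the answer with `thinking` set to `None`.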