---
license: mit
tags:
- llama-cpp-python
- prebuilt-wheels
- huggingface-spaces
- cpu-only
- python-3.13
---
# Llama-CPP-Python Pre-built Wheels (Python 3.13)

### The solution for Hugging Face "Build Timeout" errors on the Free CPU Tier.

If you are using **Python 3.13** on a Hugging Face Free Space, compiling `llama-cpp-python` from source usually crashes or times out. This repository provides pre-compiled **manylinux** wheels that install in seconds.

---
## Why use these wheels?

* **No Compilation:** Skips the 15+ minute build process.
* **Python 3.13 Support:** Built specifically for CPython 3.13 (`cp313`).
* **Generic CPU Optimization:** Compiled with `GGML_NATIVE=OFF`, so the binary does not depend on CPU instructions specific to the build machine. This lets it run on HF's shared CPUs without "Illegal Instruction" or "Core Dump" errors.
* **Lightweight:** Only ~4.3 MB to download, versus the time and resources a source build consumes.
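
Because the wheel targets one specific interpreter and platform, it can help to confirm that your environment matches before installing. The following is a minimal sketch, not part of this repository: the helper names `parse_wheel_tags` and `interpreter_matches` are hypothetical, and the check only compares the tags in the wheel filename against the running interpreter using the standard library.

```python
import platform
import sys

# Filename of the wheel published in this repository.
WHEEL = "llama_cpp_python-0.3.16-cp313-cp313-linux_x86_64.whl"


def parse_wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # Wheel filenames end with: -{python tag}-{abi tag}-{platform tag}
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def interpreter_matches(filename):
    """Rough check that the running interpreter fits the wheel's tags."""
    python_tag, _abi_tag, platform_tag = parse_wheel_tags(filename)
    running_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    same_python = (
        platform.python_implementation() == "CPython"
        and python_tag == running_tag
    )
    same_os = sys.platform.startswith("linux") and "linux" in platform_tag
    machine = platform.machine().lower()
    same_arch = bool(machine) and platform_tag.endswith(machine)
    return same_python and same_os and same_arch


if __name__ == "__main__":
    print(parse_wheel_tags(WHEEL))
    print("compatible:", interpreter_matches(WHEEL))
```
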

---

## How to use in your HF Space

### Option A: Using `requirements.txt`

Paste this direct link into your `requirements.txt` file, as a bare URL on its own line:

```text
https://huggingface.co/James040/llama-cpp-python-wheels/resolve/main/llama_cpp_python-0.3.16-cp313-cp313-linux_x86_64.whl
```

### Option B: Using a Dockerfile

If you are using a custom Docker setup, add this line:

```dockerfile
RUN pip install https://huggingface.co/James040/llama-cpp-python-wheels/resolve/main/llama_cpp_python-0.3.16-cp313-cp313-linux_x86_64.whl
```
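
After installing with either option, a quick import check at the top of your app can confirm that the wheel was picked up. This is a minimal sketch under the assumption that the package exposes `__version__` (as `llama_cpp` does); the helper name `check_llama_cpp` is hypothetical. Note that it can only report import-time failures: an illegal-instruction crash terminates the process and cannot be caught from Python.

```python
import importlib


def check_llama_cpp():
    """Return the installed llama_cpp version, or None if it cannot be loaded."""
    try:
        module = importlib.import_module("llama_cpp")
    except (ImportError, OSError, RuntimeError):
        # Package missing, or its shared library failed to load.
        return None
    return getattr(module, "__version__", "unknown")


version = check_llama_cpp()
if version is None:
    print("llama_cpp is not importable; check that the wheel URL was installed.")
else:
    print(f"llama_cpp {version} imported successfully.")
```
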
## Build Specifications

These wheels were built using a high-performance automated pipeline on GitHub.

| Specification | Value |
| --- | --- |
| Python Version | 3.13 |
| Platform | Linux x86_64 (Manylinux) |
| Build Flags | `GGML_NATIVE=OFF`, `GGML_BLAS=OFF` |
| Build Source | Jameson040/my_lama-wheels |
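
For readers who want to reproduce a comparable build, the flags in the table can be passed to a source build through the `CMAKE_ARGS` environment variable, which `llama-cpp-python` reads when compiling. The sketch below is an assumption about how such a build could be invoked, not the actual pipeline used for these wheels; the helper names and the `dist` output directory are hypothetical.

```python
import os
import sys

# Build flags listed in the specification table.
BUILD_FLAGS = {"GGML_NATIVE": "OFF", "GGML_BLAS": "OFF"}


def cmake_args(flags):
    """Format a flag mapping as CMake -D definitions."""
    return " ".join(f"-D{name}={value}" for name, value in flags.items())


def build_invocation(flags, version="0.3.16", out_dir="dist"):
    """Return (env, argv) for building a wheel from source with pip."""
    env = dict(os.environ, CMAKE_ARGS=cmake_args(flags))
    argv = [
        sys.executable, "-m", "pip", "wheel",
        f"llama-cpp-python=={version}",
        "--no-binary", "llama-cpp-python",  # force a source build
        "--no-deps",
        "--wheel-dir", out_dir,
    ]
    return env, argv


env, argv = build_invocation(BUILD_FLAGS)
print("CMAKE_ARGS=" + env["CMAKE_ARGS"])
print(" ".join(argv))
# To actually run the build (slow, needs a C/C++ toolchain):
# import subprocess; subprocess.run(argv, env=env, check=True)
```

The command is only printed here, since the build itself is the slow step these wheels are meant to avoid.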