Articles Model

DOI: 10.57967/hf/9118
Demo Space: COMING SOON
Author: Roberto Lofaro
License: CC BY-SA 4.0


Model Overview

This is a GGUF quantisation of Qwen/Qwen3.5-4B, fine-tuned via a structured system prompt and optional retrieval layer to serve as a Q&A and recommendation assistant over a corpus of 350+ articles extracted from robertolofaro.com.

THE MODEL WILL BE RELEASED BY 2026-06-30

The model is designed to answer questions about the articles and, primarily, to act as an arguments outlining and guided brainstorming system.

Its answers should be considered just representation of the material within the source articles, coupled with the capabilities of the underlying Qwen3.5-4B model.

No answer represents or should be considered advice, as the training material did not include your own specific context and a professional assessment of your context.

Hence, before acting on the answers, consult professional advice.

This model is an evolution of the previously released https://huggingface.co/robertolofaro/articles-model.

The update is up to 2026-06-15.


Intended Use

Use Supported
Interactive Q&A on the 350+ articles
Offline / local inference (CPU)
General-purpose assistant ⚠️ Not the primary intent
Commercial deployment without attribution ❌ (see license)

Change vs. reference model

The changes vs. the previous model are:

  • instead of generating embeddings using the same model used for inference (Qwen3.5-4B Q4_K_M GGUF), a specialized embeddings models has been used (Qwen3-Embedding Q8_0 GGUF)
  • instead of training the DoRA with the same model used for inference (Qwen3.5-4B Q4_K_M GGUF), which had been selected to allow inference also on CPU-only computers with enough memory (>= 32GB), the training is done using a higher quality model (Qwen3.5-4B Q8_0 GGUF)
  • instead of Title+Summary+Content, the search indexes now include also the between 10 and 20 keyphrases associated to each article by an LLM (used both Kimi and Grok, but you can use any LLM that is at least 8-9B)
  • ChromaDB has been dropped, and faiss-hnsw and Qdrant both use the new extended search index material
  • Turboquant has been added
  • all the indexes creation and training can be done incrementally (the pipeline will be eventually released in a mini-book), albeit, as the update release is expected to be generally on a quarterly basis, also if the quality reduction could be marginal, it has been used to re-train the full DoRA

The training has retained the use of unsloth (the previous version has been trained for comparison with transformers first and then again with unsloth).

The new extended training material allowed a better convergence without overfitting with just 3 epochs and error rate ADDERRORRATE AND OTHER INFO ONCE DONE

Primary Task COMING SOON

Case 1 COMING SOON

Case 2 COMING SOON

Case 3 COMING SOON


About the articles

The 350+ articles cover topics spanning organizational change, business transformation, knowledge management, AI adoption, and programme management, drawing on the author's 35+ years of experience in consulting and C-level advisory roles across European industrial and, in Italy, also public-sector missions.

The abstract and content of each article (including those after the update date of the model) is on GitHub.

The metadata of the articles are on Kaggle.

You can searh the articles on robertolofaro.com either by cluster or by "tag cloud", as well as see click on each article within the sections available or directly on the latest released, most read, or latest read.

As some articles span over multiple releases, and even across multiple sections (i.e. are mini-book drafts in disguise), there is also a list of multi-part articles where you can navigate across the sections of an article.

Access to each article is free and CC-BY-SA-4.0, this model is just to further ease access vs. the existing research facilities on the website, and to ensure permanent availability online.


Available Quantisations

Quantisation File Size Recommended For
Q4_K_M articles-Q4_K_M.gguf ~2.71 GB CPU inference, everyday use

The Q4_K_M variant is recommended for CPU-only environments and is the one used in the companion Space; it has been generated by merging a DoRA produced using Qwen3.5 Q8_0 GGUF, with embeddings generated on the corpus by using Qwen3-Embedding Q8_0 GGUF.


Usage

Sample system prompt COMING SOON

Sample of results by using a custom Python script COMING SOON

Quick Start with Ollama COMING SOON

Quick Start with llama.cpp COMING SOON

Quick Start with llama-cpp-python COMING SOON


Companion Space COMING SOON


Limitations

  • The model is designed to support a system of arguments outlining and guided brainstorming using the articles within the training corpus.
  • Recommendations are bounded by the 350+ article in the corpus; the model will not recommend external works.
  • The model does not have live internet access; content reflects the corpus as indexed at build time; if you want access, you have to build the application.
  • Already tested application variants enabling integration with e.g. an AI-generated MorningNews and websearch with DuckDuckGo
  • CPU inference with Q4_K_M typically yields response times of 15–60 seconds depending on hardware; within the huggingface space, could take few minutes.

Ethical Considerations

  • The corpus consists entirely of original works by the author; no third-party copyrighted content is embedded.
  • The system is informational; it does not collect user data.
  • The model inherits any biases present in the Qwen3.5-4B base model; users should apply standard critical judgement to outputs.

Citation

If you use this model or the associated scripts in research or derivative work, please cite:

@misc{roberto_lofaro_2026,
    author       = { Roberto Lofaro },
    title        = { articlesextended-model (Revision ff09620) },
    year         = 2026,
    url          = { https://huggingface.co/robertolofaro/articlesextended-model },
    doi          = { 10.57967/hf/9118 },
    publisher    = { Hugging Face }
}```

---

## License

This model card and associated scripts are released under **[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)**.  
The base model weights are subject to the [Qwen3 License](https://huggingface.co/Qwen/Qwen3.5-4B/blob/main/LICENSE).

---

*Published openly as part of Roberto Lofaro's AI-assisted knowledge production initiative.  
GitHub · Patreon · [robertolofaro.com](https://robertolofaro.com)*
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for robertolofaro/articlesextended-model

Finetuned
Qwen/Qwen3.5-4B
Quantized
(242)
this model