rskill-robometer-4b-nf4
Pre-quantized NF4 build of robometer/Robometer-4B
(a Qwen3-VL-4B robotic reward foundation model, arXiv 2603.02115), packaged as an
OpenRAL kind: reward rSkill (ADR-0057).
It runs in parallel with a VLA policy and scores the live rollout: given the
robot's camera frames + the task instruction, it emits per-frame normalized
progress (0–1) and per-frame success probability. The OpenRAL reasoner polls
it on demand (read-only query_task_progress tool) to decide whether to continue,
advance, or replan — advisory only, never on the control path.
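As an illustration of how the two per-frame signals could feed an advisory decision, here is a minimal sketch. Everything in it is an assumption made for the example: the `RewardSample` container, the `decide` function, and the thresholds are hypothetical and are not the OpenRAL reasoner's actual logic or the `query_task_progress` response schema.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"
    ADVANCE = "advance"
    REPLAN = "replan"


@dataclass
class RewardSample:
    """Hypothetical container for one poll of the reward monitor."""
    progress: list[float]      # per-frame normalized progress in [0, 1]
    success_prob: list[float]  # per-frame success probability in [0, 1]


def decide(
    sample: RewardSample,
    success_threshold: float = 0.9,  # assumed value, not from OpenRAL
    stall_window: int = 5,           # assumed value, not from OpenRAL
    min_gain: float = 0.02,          # assumed value, not from OpenRAL
) -> Decision:
    """Map the latest reward signals to an advisory decision."""
    if not sample.progress or not sample.success_prob:
        # Nothing scored yet: keep executing the current step.
        return Decision.CONTINUE
    if sample.success_prob[-1] >= success_threshold:
        # The latest frame looks like success: move to the next step.
        return Decision.ADVANCE
    if len(sample.progress) > stall_window:
        # Compare the latest progress with the value stall_window frames ago.
        gain = sample.progress[-1] - sample.progress[-1 - stall_window]
        if gain < min_gain:
            return Decision.REPLAN
    return Decision.CONTINUE
```

Because the output is only a recommendation, a caller can ignore or override it without touching the control path, which matches the advisory role described above.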
What's in this repo
A self-contained checkpoint that the OpenRAL reward sidecar loads directly as 4-bit, with no bf16 materialization and no requantization:
- model.safetensors: 236 Linear modules packed to bitsandbytes NF4 (~3.32 GB resident), plus the folded non-persistent rotary inv_freq buffers.
- config.json: model config (resized vocab 151674).
- config.yaml: the robometer ExperimentConfig (lets the sidecar rebuild the RBM graph offline).
- tokenizer / processor files (incl. added_tokens.json, the model's added progress token).
- quantization_metadata.json: provenance.
The model class is RBM (robometer.models.rbm). The upstream config.json advertises architectures: ["RFM"] with no auto_map, so vanilla transformers.AutoModel cannot load it. The OpenRAL sidecar installs the pinned robometer package (commit a669dffc) with transformers==4.57.1 in an isolated venv, builds the skeleton on the meta device, then installs these packed NF4 weights via Params4bit.from_prequantized.
Provenance & verification
- Source: robometer/Robometer-4B@beef63bc914c5c189329d49c6d712d96d632aa34 (Apache-2.0).
- Quantization: bitsandbytes NF4 (double-quant), compute dtype bf16, the OpenRAL rule nn.Linear.numel ≥ 4e6 → Linear4bit. Built by tools/build_robometer_nf4_checkpoint.py.
- Bit-identical to loading the upstream bf16 weights and quantizing in place: same-process forward max|Δ| = 0; 4-bit dequant round-trip 0. For a byte-stable reward ramp across process launches, the sidecar pins the math SDP kernel + use_deterministic_algorithms(True) + CUBLAS_WORKSPACE_CONFIG=:4096:8 + cudnn.allow_tf32=False.
- Footprint: ~3.32 GB resident on an 8 GB GPU; co-resident with the sim (and a small NF4 VLA). The reward forward subsamples the frame window to bound activation.
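The size rule in the Quantization bullet can be sketched in plain Python. This is an illustration only: the helper names and example shapes are hypothetical, it is not the contents of tools/build_robometer_nf4_checkpoint.py, and it assumes that "numel" counts the weight matrix alone (in_features × out_features), without the bias.

```python
# Hypothetical helpers illustrating the "numel >= 4e6 -> Linear4bit" rule.
NUMEL_THRESHOLD = 4_000_000  # the 4e6 cutoff from the rule


def should_quantize(
    in_features: int,
    out_features: int,
    threshold: int = NUMEL_THRESHOLD,
) -> bool:
    """True when the Linear weight has at least `threshold` elements."""
    # Assumption: only the weight matrix is counted, not the bias.
    return in_features * out_features >= threshold


def select_layers(shapes: dict[str, tuple[int, int]]) -> list[str]:
    """Return the layer names that the rule would convert to 4-bit."""
    return [
        name
        for name, (in_f, out_f) in shapes.items()
        if should_quantize(in_f, out_f)
    ]


# Illustrative shapes only; not read from the real checkpoint.
example_shapes = {
    "big_proj": (2560, 9728),     # 24,903,680 elements: quantized
    "square_proj": (2000, 2000),  # exactly 4,000,000: quantized (>= is inclusive)
    "small_head": (2560, 1),      # 2,560 elements: left in higher precision
}
```

Under this rule small layers such as scalar heads stay unquantized, while the large projection matrices that dominate the memory footprint are packed to NF4.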
Usage (OpenRAL)
This is consumed by OpenRAL, not loaded standalone. The kind: reward manifest
points weights_uri here:
weights_uri: "hf://OpenRAL/rskill-robometer-4b-nf4"
and in deploy-sim:
openral deploy sim --config scenes/deploy/<scene>.yaml --enable-reward-monitor
brings up the reward monitor parallel to the VLA and lets the reasoner poll
/openral/perception/query_task_progress. See
ADR-0057.
License
Apache-2.0, inherited from the upstream robometer/Robometer-4B. See LICENSE.
The upstream robometer package is pinned by commit and executed only in an
isolated sidecar venv (it is not an OpenRAL-trusted org).