BusyBeaver-50M
BusyBeaver-50M is a compact agent-policy model for strict JSON tool-call prediction. It is not a general chatbot. It receives a compact agent state, goal, recent observations, and available tool schemas, then predicts exactly one next tool call for a local agent harness.
Intended Adapter Use
BusyBeaver-50M is intended to work with the BusyBeaver Hermes Adapter / harness. In production, use the two together: the model selects the tool, and a deterministic harness argument resolver fills in the concrete arguments.
This repository currently packages the RunPod-trained V12 path-grounding checkpoint 250. The full checkpoint archive is GestaltLabs/BusyBeaver-50M-v12-path-grounding-runpod.
Hermes Adapter
A standalone BusyBeaver Hermes adapter package is available on GitHub:
https://github.com/DJLougen/BusyBeaver-Hermes-Adapter
The adapter runs BusyBeaver as a compact OpenAI-compatible policy endpoint, detects BusyBeaver model selections inside Hermes-style harnesses, and maps strict JSON BusyBeaver actions into harness-native tool events and deterministic artifacts.
BusyBeaver should not replace the full Hermes controller. It is a tiny local tool-policy helper for deterministic operations: inspect, test, patch, diff, safe shell, recovery, memory, cron/message routing, and escalation gates.
Production Contract
BusyBeaver-50M is strongest when the harness supplies compact state and then validates/resolves the emitted action:
- Model emits one strict JSON object.
- Harness validates tool name and schema.
- Harness resolves concrete arguments from structured state when needed, especially file paths, commands, cron fields, and message targets.
- Harness enforces safety gates before execution.
This keeps the model tiny while working around the main weakness of sub-100M models: they cannot reliably copy arbitrary long paths or commands from context verbatim.
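The contract above can be sketched in a few lines of Python. This is a minimal illustration, not the adapter's actual code: `TOOL_ARGS`, `parse_action`, and `resolve_path` are hypothetical names, the tool registry is a stand-in for the real tool schemas, and the basename-matching policy is an assumption about how a resolver might ground a path.

```python
import json
import posixpath

# Hypothetical tool registry: tool name -> required argument names.
# A real harness would derive this from the tool schemas it supplies.
TOOL_ARGS = {
    "read_file": {"path"},
    "run_tests": set(),
}

# Keys taken from the output schema shown in this card.
REQUIRED_KEYS = {"tool", "args", "confidence", "state_update"}


def parse_action(raw: str) -> dict:
    """Parse model output and validate tool name and schema."""
    action = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(action, dict) or set(action) != REQUIRED_KEYS:
        raise ValueError("expected one object with exactly the schema keys")
    if action["tool"] not in TOOL_ARGS:
        raise ValueError(f"unknown tool: {action['tool']!r}")
    if not isinstance(action["args"], dict):
        raise ValueError("args must be an object")
    missing = TOOL_ARGS[action["tool"]] - set(action["args"])
    if missing:
        raise ValueError(f"missing args: {sorted(missing)}")
    return action


def resolve_path(action: dict, known_paths: list[str]) -> dict:
    """Ground a model-emitted path against paths from structured state.

    Assumed policy: keep an exact match, substitute a unique basename
    match, and refuse to guess otherwise.
    """
    path = action["args"].get("path")
    if path is None or path in known_paths:
        return action
    name = posixpath.basename(path)
    matches = [p for p in known_paths if posixpath.basename(p) == name]
    if len(matches) != 1:
        raise ValueError(f"cannot resolve path {path!r} unambiguously")
    return {**action, "args": {**action["args"], "path": matches[0]}}
```

Raising on an ambiguous match, rather than picking a candidate, keeps the resolver deterministic and leaves the decision to an escalation step.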
Input Format
<|system|>
You are BusyBeaver, a small tool-policy model. Emit exactly one JSON object matching the schema. Do not explain.
<|goal|>
...
<|state|>
...
<|tools|>
...
<|output_schema|>
{"tool":"string","args":"object","confidence":"number","state_update":"string"}
<|assistant|>
Expected output is strict JSON only:
{"tool":"read_file","args":{"path":"src/parser.py"},"confidence":0.97,"state_update":"Read the referenced file before editing."}
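A prompt in this format could be assembled as follows. This is a sketch under assumptions: `build_prompt` is a hypothetical helper, the goal, state, and tools sections are passed in as already-serialized strings because this card does not specify their internal format, and the single-newline separators mirror the layout shown above rather than a documented requirement.

```python
# System line and output schema copied from the template above.
SYSTEM = (
    "You are BusyBeaver, a small tool-policy model. "
    "Emit exactly one JSON object matching the schema. Do not explain."
)
OUTPUT_SCHEMA = (
    '{"tool":"string","args":"object",'
    '"confidence":"number","state_update":"string"}'
)


def build_prompt(goal: str, state: str, tools: str) -> str:
    """Join the tagged sections in template order, ending at <|assistant|>."""
    sections = [
        ("<|system|>", SYSTEM),
        ("<|goal|>", goal),
        ("<|state|>", state),
        ("<|tools|>", tools),
        ("<|output_schema|>", OUTPUT_SCHEMA),
    ]
    lines = []
    for tag, body in sections:
        lines.append(tag)
        lines.append(body)
    # The model is expected to continue with the JSON action after this tag.
    lines.append("<|assistant|>")
    return "\n".join(lines)
```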
Canonical Tools
- read_file
- list_files
- run_shell / Hermes shell
- run_tests
- apply_patch
- git_diff
- remember / Hermes memory_write
- retrieve_memory
- cron_create, cron_update
- message_send
- clarify, escalate
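Two of the tools above are paired with a Hermes name (run_shell with shell, remember with memory_write). A harness might translate between them with a small lookup table. The sketch below assumes the first name is the one BusyBeaver emits and the second is the Hermes-native one; `HERMES_ALIASES` and `to_hermes_tool` are hypothetical names, not part of the adapter's documented API.

```python
# Assumed direction: BusyBeaver tool name -> Hermes tool name.
HERMES_ALIASES = {
    "run_shell": "shell",
    "remember": "memory_write",
}


def to_hermes_tool(name: str) -> str:
    """Return the Hermes-side tool name, or the name unchanged if no alias exists."""
    return HERMES_ALIASES.get(name, name)
```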
Evaluation
V12 checkpoint 250 raw checkpoint validation:
| Metric | Score |
|---|---|
| JSON validity | 1.0000 |
| Schema validity | 0.9792 |
| Correct tool | 0.9818 |
| Arg semantic | 0.6510 |
V12 with harness argument resolver on frozen evals:
| Eval | JSON | Schema | Correct Tool | Arg Semantic | Unsafe Cmd | Placeholder |
|---|---|---|---|---|---|---|
| frozen_path_grounding_v2 | 1.0000 | 1.0000 | 1.0000 | 0.9792 | 0.0000 | 0.0000 |
| frozen_harness_v1 | 1.0000 | 1.0000 | 1.0000 | 0.9000 | 0.0000 | 0.0000 |
The unresolved V11 baseline on a 24-row adversarial path-copy sample was correct_tool=0.4167 and arg_sem=0.0000; V12 plus resolver fixes that product-level failure mode.
Model Size
- Parameters: 49,382,784
- Tokenizer: 16k BusyBeaver policy tokenizer
- Context length used in training/eval: 2048 tokens
- Architecture: BusyBeaver QDelta causal LM
- Reloadable weights: busybeaver_state.pt
The included model.safetensors is kept for compatibility with training output, but the current local loader should prefer busybeaver_state.pt.
Loading
Use the BusyBeaver local implementation from the adapter or training repo. The loader instantiates BusyBeaverQDeltaForCausalLM from config.json, then loads busybeaver_state.pt.
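import torch
from busybeaver.modeling import BusyBeaverQDeltaConfig, BusyBeaverQDeltaForCausalLM

# Local directory containing config.json and busybeaver_state.pt
model_dir = "path/to/BusyBeaver-50M"

# Build the architecture from config.json, then load the reloadable weights
cfg = BusyBeaverQDeltaConfig.from_pretrained(model_dir)
model = BusyBeaverQDeltaForCausalLM(cfg)
state = torch.load(f"{model_dir}/busybeaver_state.pt", map_location="cpu")
model.load_state_dict(state, strict=True)  # strict=True fails on missing or unexpected keys
model.eval()  # switch to inference mode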
Harness Integration
Expose BusyBeaver to normal agent harnesses through the OpenAI-compatible adapter server:
python scripts/busybeaver_openai_server.py --model GestaltLabs/BusyBeaver-50M --host 127.0.0.1 --port 8765
Use http://127.0.0.1:8765/v1 as the OpenAI-compatible base URL and BusyBeaver-50M as the model id. Native support in engines such as llama.cpp, vLLM, or Ollama requires either a BusyBeaver architecture adapter or a future export through a compatible runtime wrapper.
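A client call against that base URL might look like the sketch below, using only the Python standard library. The `/completions` route and the response shape (`choices[0].text`) are assumptions based on the usual OpenAI-compatible convention; this card does not state which routes the adapter server implements, and `build_request` and `complete` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8765/v1"  # base URL from this card
MODEL_ID = "BusyBeaver-50M"            # model id from this card


def build_request(prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build a POST request for the (assumed) OpenAI-style completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # assumed: deterministic decoding for a policy model
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str) -> str:
    """Send the prompt to a running adapter server and return the raw text."""
    with urllib.request.urlopen(build_request(prompt), timeout=30) as resp:
        body = json.load(resp)
    # Assumed OpenAI-style response layout.
    return body["choices"][0]["text"]
```

The returned text should still be parsed and validated by the harness before anything is executed.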
Safety
BusyBeaver predicts tool calls; it does not execute them. Production harnesses should validate schema, reject unsafe shell commands, sandbox execution, cap repeated identical actions, and log state/action pairs for trajectory analysis.
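Two of these gates, rejecting unsafe shell commands and capping repeated identical actions, can be sketched as follows. Everything here is illustrative: `ActionGate`, the `command` argument name, and the deny-list are assumptions, and a token deny-list is far weaker than the allow-listing and sandboxing a production harness needs.

```python
import json
import posixpath
import shlex
from collections import deque

# Illustrative deny-list only; not a complete safety policy.
DENIED_PROGRAMS = {"rm", "sudo", "mkfs", "dd", "shutdown"}


class ActionGate:
    """Checks an action before execution and logs every state/action pair."""

    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        self.recent = deque(maxlen=max_repeats)  # keys of recently allowed actions
        self.log = []                            # (state, action, allowed) tuples

    def _shell_ok(self, action: dict) -> bool:
        if action["tool"] != "run_shell":
            return True
        # "command" is an assumed argument name for the shell tool.
        lexer = shlex.shlex(action["args"].get("command", ""),
                            posix=True, punctuation_chars=True)
        lexer.whitespace_split = True
        try:
            tokens = list(lexer)  # splits on ; | & so chained commands are visible
        except ValueError:        # e.g. unbalanced quotes
            return False
        return not any(posixpath.basename(t) in DENIED_PROGRAMS for t in tokens)

    def _repeated(self, key: str) -> bool:
        # True once the same action has already been allowed max_repeats times in a row.
        return len(self.recent) == self.max_repeats and all(k == key for k in self.recent)

    def check(self, state: str, action: dict) -> bool:
        key = json.dumps({"tool": action["tool"], "args": action["args"]}, sort_keys=True)
        allowed = self._shell_ok(action) and not self._repeated(key)
        if allowed:
            self.recent.append(key)
        self.log.append((state, action, allowed))
        return allowed
```

Logging rejected actions alongside allowed ones keeps the trajectory record complete for later analysis.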
Limitations
- Specialized policy model, not a general assistant.
- Depends on BusyBeaver/Hermes compact state formatting.
- Concrete argument reliability depends on the harness argument resolver.
- Browser-agent data has not yet been a main training target.
- Custom architecture requires the BusyBeaver loader/adapter unless exported through a compatible runtime wrapper.
Provenance
- Internal run label: V12 path-grounding
- Training hardware: RunPod GPU pod
- Promoted checkpoint: 250
- Full checkpoint archive: GestaltLabs/BusyBeaver-50M-v12-path-grounding-runpod
- Training payload: DJLougen/busybeaver-training-payload-v12-path-grounding
