byok-custom-model

BYOK — register a custom LLM endpoint (Anthropic, OpenAI, xAI Grok, Qwen, DeepSeek, Venice, etc.) with your own API key

Skill file

Preview skill file
---
name: byok-custom-model
version: 2.3.2
description: |
  Register a custom LLM endpoint with your own API key for chat in Starchild.

  Use when adding a personal Anthropic, OpenAI, Grok, Qwen, DeepSeek, NEAR AI, or Venice key as a chat model (e.g. add my Claude key, register DeepSeek).
author: starchild
delivery: script
protected: true
tags: [byok, custom-model, llm, openrouter, anthropic, openai, xai, grok, deepseek, qwen, kimi, mimo, gemini, venice, near-ai, tee, confidential-inference]

---

# 🔑 BYOK — Custom LLM Models

Register a custom LLM endpoint to the model selector. Bypasses the platform proxy — the user supplies their own API key, the agent hits the vendor / aggregator directly (OpenRouter, DashScope, Anthropic native, NEAR AI Cloud TEE, self-hosted, etc.).

This is a **script-mode skill** — no tools registered. Read this file, then call the exports from a `bash` block.

## See also
- `config/context/references/model-onboarding.md` — broader model selection / OAuth context
- `chatgpt-codex-onboarding` skill — for ChatGPT/Codex OAuth (different mechanism, NOT BYOK)

---

## Curated vendors (always check this first)

The skill ships with 11 pre-configured vendors. **Always match the user's intent against this list before asking for any URL, model name, or API example** — base_url / wire / thinking / capabilities are all pre-filled, so a curated match goes straight to `add_template(vendor=...)`.

| Vendor id | Use when user mentions… |
|---|---|
| `anthropic` | Claude, Anthropic |
| `openai` | GPT-4o, GPT-5, OpenAI direct |
| `xai` | Grok, xAI |
| `qwen` | Qwen, 通义千问, DashScope |
| `deepseek` | DeepSeek |
| `kimi` | Kimi, Moonshot |
| `mimo` | MiMo, 小米 |
| `gemini` | Gemini |
| `gemma` | Gemma |
| `near-ai` | **privacy, TEE, confidential inference, "don't log my data", Web3-native** |
| `venice` | Venice (only if user names it; see Privacy-first tier below) |

---

## Onboarding flow — templates first

1. **Check the curated vendors table above.** If the user's intent matches one, go straight to `add_template(vendor=...)` and skip to step 5. Do NOT ask for a URL.
2. Only if no curated vendor matches: ask the user to paste the provider's official API example from their docs (curl / requests / fetch sample). Tell them **not to include a real API key** — placeholders or fake keys are fine.
3. Run `parse_example` to auto-detect base_url, upstream_model, wire (openai vs anthropic), thinking params, and vendor-specific request fields.
4. Review the draft with the user, then call `add(...)` — the entry is written to `custom_models.yaml`.
5. **If the result contains `need_env_input`, immediately call the `request_env_input` tool** with `env_vars` and `reason` from that payload. This pops the secure-input UI; the user enters the key; it lands in `workspace/.env`. **This step is mandatory — the script cannot pop the UI itself.**

**Privacy-first tier:** `near-ai` and `venice` both target privacy-sensitive users, but NEAR AI is the cleaner integration — Venice's TEE story is itself built on top of NEAR AI + Phala, so going direct to NEAR AI yields a shorter trust chain (Intel + NVIDIA silicon + NEAR's reproducible enclave image; no product-layer proxy in between). Curated NEAR model list is **open-weight TEE-protected only** — NEAR's catalog also proxies Claude / GPT-5 / Gemini Pro under "Anonymized, not TEE-protected" mode, which we deliberately exclude since the entire privacy value-prop here is the hardware enclave.

**Whenever NEAR AI is in scope, always recommend a TEE-protected (privacy) model** — that's the entire reason a user picks NEAR over OpenAI/Anthropic direct. The curated list is already TEE-only, so `add_template(vendor='near-ai')` defaults are safe. If the user asks to register a non-TEE model on NEAR (e.g. NEAR's anonymized Claude passthrough), warn them it weakens the privacy guarantee and recommend they either stay on a curated TEE model or register the upstream vendor directly.

**NEAR AI reasoning protocol:** NEAR uses `chat_template_kwargs` nested under `extra_body` instead of the top-level `reasoning_effort`/`thinking`/`enable_thinking` that other vendors use. The provider handles this automatically via the `nearai_chat_template` thinking_capability rule. Per-model parameter names vary (GLM/Qwen3.5/Qwen3.6 use `enable_thinking`, DeepSeek-V3 uses `thinking`, gpt-oss is always-on). Full spec: [docs.near.ai/cloud/reasoning-models](https://docs.near.ai/cloud/reasoning-models). Default model `Qwen/Qwen3.6-35B-A3B-FP8` works out of the box; `Qwen3.5-122B-A10B` ships with `thinking_mode='disabled'` because its hidden-thinking pattern would otherwise cause `finish=length, content=null` on baseline calls.

---

## Script usage

```bash
python3 - <<'EOF'
import sys, json
sys.path.insert(0, "/data/workspace/skills/byok-custom-model")
from exports import (
    templates, list_models, get, parse_example,
    list_vendor_models, add, add_template, remove,
)

# Enumerate the 11 curated vendor presets
print(json.dumps(templates(), indent=2))

# One-click registration for a curated vendor
result = add_template(vendor="qwen")
print(json.dumps(result, indent=2))
EOF
```

---

## Functions

| Function | Required args | Purpose |
|---|---|---|
| `templates()` | — | List the 11 curated vendor presets |
| `list_vendor_models(vendor)` | `vendor` | Live `/models` catalog (only if the template has `model_discovery`) |
| `add_template(vendor, *, upstream_model=None, name=None)` | `vendor` | One-click registration for a curated vendor (recommended path) |
| `parse_example(api_example)` | `api_example` | Parse docs API example into a safe draft (non-curated vendors) |
| `add(upstream_model, base_url, ...)` | `upstream_model`, `base_url` | Register from custom args (use after `parse_example`) |
| `list_models()` | — | Show all registered custom entries |
| `get(model_id)` | `model_id` | Inspect one entry |
| `remove(model_id)` | `model_id` | Delete an entry |

All functions return a dict with `ok: True` on success or `ok: False, error: "..."` on failure.

### Handling `need_env_input` (mandatory two-step pattern)

`add()` and `add_template()` may include a `need_env_input` field in their result when the API key env var is not yet set. The script CANNOT pop the secure-input UI itself — it has no access to the user's open SSE stream. The calling agent must do it:

```python
# After add_template / add returns:
if result.get("need_env_input"):
    nei = result["need_env_input"]
    # Call the in-process tool — pseudocode, actual signature is tool-side:
    request_env_input(env_vars=nei["env_vars"], reason=nei["reason"])
```

The popup, the .env write, and the channel-specific UX (web popup / TG card / WeChat text prompt) are all handled by `request_env_input`. Do NOT prompt the user to paste the key in chat as a fallback — just call the tool.

---

## After registration

- The model appears in the selector prefixed with `custom/`.
- User switches via `/model custom/<name>` (e.g. `/model custom/qwen-plus-e3f4`) or the model picker UI.
- Subsequent calls bypass the platform proxy — vendor pricing applies directly to the user's BYOK quota.

---

## Critical rules

- **Never accept an API key pasted in chat.** If the user pastes one, ignore it, refuse to register, and tell them the secure popup is the only safe channel.
- **Never re-issue the secure-input popup automatically** if the user hasn't responded — wait.
- **If `need_env_input` is returned, always call `request_env_input`.** Do not skip, do not ask the user to paste the key, do not retry `add_template` hoping it will pop the UI — it won't.
- **Never write to `workspace/config/custom_models.yaml` or `workspace/.env` by hand.** Always go through the exports above.
- The 11 curated vendors **always** use `add_template`. Only use `parse_example` + `add` for self-hosted or rare providers.

---

## xAI Grok — note on the subscription confusion

Users frequently mix up two unrelated xAI products:

- **X Premium / SuperGrok subscription** ($30/mo on x.com) — chat UI access only. **Does not include API access.**
- **console.x.ai** — independent developer account, separate billing. Generates API keys, $25 in promo credits for new accounts, then pay-per-token.

If a user wants to add Grok via BYOK, point them at **https://console.x.ai/** — not x.com / Premium / SuperGrok. The `xai` template's `homepage` field already deep-links to the right place. Hermes / Grok-CLI's OAuth-to-subscription flow relies on a first-party client_id whitelist that xAI does not extend to third-party cloud agents, so the BYOK API-key path is the only realistic integration for hosted products.

Source

Creator's repository · starchild-ai-agent/official-skills

View on GitHub

Security

Security checks in progress
Results will appear here once audits complete
What this skill can do
Reads your filesConnects to the internetRuns code on your machine
Checked by 3 independent security firms
Does it try to trick the AI?Not yet checkedPending · Gen Agent Trust Hub
Does it sneak in hidden code?Not yet checkedPending · Socket
Does it have known bugs?Not yet checkedPending · Snyk