49 skills
tao-train-centerpose
Passed all 3 security checks
CenterPose for keypoint / pose estimation. Detects object centers and regresses keypoint locations for 6-DoF
nemo-mbridge-perf-tp-dp-comm-overlap
Passed all 3 security checks
Operational guide for enabling TP, DP, and PP communication overlap in Megatron-Bridge, including config knobs, code anchors, pitfalls, and verification.
cudaq-guide
Passed all 3 security checks
CUDA-Q onboarding guide for installation, test programs, GPU simulation, QPU hardware, and quantum applications.
cuopt-developer
Passed all 3 security checks
Modify, build, test, debug, and contribute to NVIDIA cuOpt (C++/CUDA, Python, server, CI). Use for solver internals, PRs, DCO, and code conventions.
tao-finetune-cosmos-reason
Passed all 3 security checks
Cosmos-Reason2-8B video QA supervised fine-tuning with FSDP parallelism. Use when training or evaluating video
vss-query-analytics
Passed all 3 security checks
Use this skill when reading video-analytics metrics, incidents, alerts, and sensor data via the VA-MCP server (port 9901). Not for live VLM or incident-range narrative reports.
tao-train-grounding-dino
Passed all 3 security checks
Grounding DINO for open-set object detection. Combines DINO-style detection with a BERT text encoder for
omniverse-realtime-viewer
Passed all 3 security checks
Use as the top-level router for Omniverse Realtime Viewer USD app requests and focused viewer reference documents.
tao-train-nvpanoptix3d
Passed all 3 security checks
NVPanoptix3D for panoptic 3D scene reconstruction from posed RGB images. Produces 3D panoptic segmentation
tao-run-on-kubernetes
Passed all 3 security checks
Kubernetes execution platform: submits TAO container jobs as single-pod k8s Jobs with NVIDIA GPU scheduling.
hsb-test
Passed all 3 security checks
Execute QA test plans on Holoscan Sensor Bridge hardware. Reads a user-provided test document, filters tests by the user's setup, determines which tests can run automatically, executes them with pass/fail evaluation, and produces a structured test results report.
tao-train-optical-inspection
Passed all 3 security checks
Optical Inspection for defect detection using Siamese networks. Compares image pairs to detect manufacturing
physical-ai-video-data-augmentation
Passed all 3 security checks
nemo-mbridge-perf-moe-dispatcher-selection
Passed all 3 security checks
Choose the right MoE token dispatcher (`alltoall`, DeepEP, or HybridEP) for the hardware, EP degree, and optimization stage. Summarizes patterns from DSV3, Qwen3, Qwen3-Next, and VLM bring-up work.
tao-generate-video-reasoning-annotations
Passed all 3 security checks
nemo-mbridge-perf-megatron-fsdp
Passed all 3 security checks
Operational guide for enabling Megatron FSDP in Megatron-Bridge, including config knobs, code anchors, pitfalls, and verification.
digital-health-clinical-asr-setup
Passed all 3 security checks
Stage 1 of Clinical ASR Flywheel. Use when bootstrapping a cycle: NVCF+MW disclosure, NVIDIA_API_KEY check, deps install, TTS+ASR smoke test.
tao-train-mask-grounding-dino
Passed all 3 security checks
Mask Grounding DINO for grounded instance segmentation. Extends Grounding DINO with a mask-prediction head for
nv-reason-cxr
Passed all 3 security checks
Used for command-shape or live NV-Reason-CXR chest X-ray reasoning smoke tests. Not for diagnosis or clinical reporting.
cuopt-user-rules
Passed all 3 security checks
Base rules for end users calling NVIDIA cuOpt (routing/LP/MILP/QP/install/server). Not for cuOpt internals; use cuopt-developer for those.
omniverse-usd-performance-tuning
Passed all 3 security checks
Top-level workflow skill for USD performance diagnosis and optimization. Use for slow loading, high memory, low FPS, or 'optimize my scene' requests; delegates auth/runtime setup to Phase 0 owners.
tao-analyze-changenet-rca
Passed all 3 security checks
Performs deep Root Cause Analysis (RCA) on NVIDIA TAO Visual ChangeNet classification experiments with
nemoclaw-user-configure-security
Passed all 3 security checks
Presents a risk framework for every configurable security control in NemoClaw. Use when evaluating security posture, reviewing sandbox security defaults, or assessing control trade-offs. Trigger keywords - nemoclaw security best practices, sandbox security controls risk framework, nemoclaw credential storage, openshell provider, api key security, openclaw security controls, nemoclaw security boundary, prompt injection, tool access control.
nv-segment-ctmr
Passed all 3 security checks
Used for running NV-Segment-CTMR on CT or MRI NIfTI volumes and recording label-map evidence. Not for clinical interpretation.
vss-generate-video-calibration
Passed all 3 security checks
Use to run AutoMagicCalib on local MP4s, RTSP, or the bundled sample dataset, and to deploy vss-auto-calibration when needed. Do not use for non-AMC calibration or runtime analytics.
tao-run-platform
Passed all 3 security checks
TAO Execution SDK for submitting and monitoring GPU training jobs on supported platforms (Lepton, Brev, SLURM,
tao-train-reid
Passed all 3 security checks
Person re-identification (ReID). Learns discriminative embeddings to match the same person across different
rag-blueprint
Passed all 3 security checks
NVIDIA RAG Blueprint: deploy, configure, troubleshoot, and manage. Handles any RAG action: deploy, install, start, enable, disable, toggle, change, configure, troubleshoot, debug, fix, shutdown, stop, or tear down any RAG feature or service (Agentic RAG, VLM, guardrails, query rewriting, models, search, ingestion, observability, summarization, reasoning, and more).
nemo-mbridge-perf-activation-recompute
Passed all 3 security checks
Validate and use selective and full activation recompute in Megatron Bridge to reduce GPU memory usage at the cost of extra compute.
physical-ai-defect-image-generation
Passed all 3 security checks
tao-train-segformer
Passed all 3 security checks
SegFormer for semantic segmentation. Lightweight transformer-based architecture with hierarchical feature
nemo-mbridge-perf-sequence-packing
Passed all 3 security checks
Validate and use packed sequences and long-context training in Megatron-Bridge, distinguishing offline packed SFT for LLMs from in-batch packing for VLMs, and applying the right CP constraints.
dicom-series-preflight
Passed all 3 security checks
Used for header-only preflight of one DICOM series folder before conversion or inference. Not for de-identification or clinical clearance.
nemoclaw-user-monitor-sandbox
Passed all 3 security checks
Inspects sandbox health, traces agent behavior, and diagnoses problems. Use when monitoring a running sandbox, debugging agent issues, or checking sandbox logs. Trigger keywords - monitor nemoclaw sandbox, debug nemoclaw agent issues.
tilegym-improve-cutile-kernel-perf
Passed all 3 security checks
nemo-mbridge-multi-node-slurm
Passed all 3 security checks
Convert single-node scripts to multi-node Slurm sbatch jobs and debug common multi-node failures. Covers srun-native vs uv run torch.distributed approaches, container setup, NCCL timeouts, OOM sizing for MoE models, and interactive allocation.
tao-train-single-step
Passed all 3 security checks
Standard single-step train/eval/export workflow for any TAO model. Use when training a TAO model on a dataset
tilegym-cutile-autotuning
Passed all 3 security checks
Use when adding, modifying, optimizing, or debugging CuTile autotuning code. Trigger signals: `exhaustive_search` / `replace_hints` / `hints_fn` / `cuda.tile.tune` in code, `autotune` in filenames, or correctness/performance issues in autotuned CuTile kernels. Covers: tune-once/cache/launch pattern, per-architecture configs (sm80–sm120), parameter space design (tile sizes, occupancy, num_ctas), and 7 common pitfalls with solutions.
tao-train-mask-auto-label
Passed all 3 security checks
MAL (Mask Auto-Label) for weakly-supervised segmentation. Produces segmentation masks from minimal annotations
nemo-mbridge-perf-parallelism-strategies
Passed all 3 security checks
Operational guide for choosing and combining parallelism strategies in Megatron Bridge, including sizing rules, hardware topology mapping, and combined parallelism configuration.
cupynumeric-migration-readiness
Passed all 3 security checks
Pre-migration readiness assessor for porting NumPy to cuPyNumeric. Use BEFORE substantial porting work begins when the user asks whether code will scale on GPU, whether they should migrate to cuPyNumeric, which NumPy patterns transfer cleanly, what must be refactored before porting, or mentions pre-port assessment, scaling analysis, or refactor planning. Inspect the user's source code, look up NumPy usage, cross-reference the cuPyNumeric API support manifest, and distinguish distributed-scaling-friendly patterns from blockers such as unsupported APIs, scalar synchronization, host round-trips, Python/object-heavy control flow, shape/data-dependent branching, and in-place mutation hazards. Produce a verdict of READY, LIGHT REFACTOR, SIGNIFICANT REFACTOR, or NOT RECOMMENDED, with concrete refactor pointers.
nemo-retriever
Passed all 3 security checks
Use this to pull a specific figure, fact, quote, or table value out of a collection of documents and cite the exact source file and page. Built for question-answering over a folder of reports: annual reports, 10-Ks and financial filings, research PDFs, scanned forms / images (`.jpg` `.png` `.tiff`), Office (`.docx` `.pptx`), HTML / TXT, audio, and video. It indexes the whole corpus once, then finds the right document among many and returns the value with its page number, which is useful when several documents or figures look alike and you need the correct one, not a near-match. Reach for it instead of reading or grepping PDFs one at a time. Not for: editing files, web browsing, a single plain-text file, fine-tuning.
vss-deploy-detection-tracking-2d
Passed all 3 security checks
Use this skill when the user wants to deploy, run, debug, tear down, or call the REST API of the RTVI-CV 2D detection / tracking microservice. Trigger when the user says things like 'deploy rtvi-cv', 'start warehouse 2d', 'add a stream', 'check rtvi-cv health', or 'stop the perception container'. Not for VLM, embedding, or analytics; use the matching vss-* skill.
tilegym-adding-cutile-kernel
Passed all 3 security checks
Add a new cuTile GPU kernel operator to TileGym. Covers dispatch registration in ops.py, cuTile backend implementation, __init__.py exports, test creation, and benchmark in tests/benchmark. Use when adding, creating, or implementing a new cuTile operator/kernel in TileGym, or when asking how to register a new cuTile op.
tao-run-automl
Passed all 3 security checks
Run AutoML / hyperparameter optimization (HPO) for NVIDIA TAO networks using AutoMLRunner. Handles algorithm
tao-run-deft-aoi
Passed all 3 security checks
tao-train-rtdetr
Passed all 3 security checks
RT-DETR (Real-Time DEtection TRansformer) for 2D object detection. Designed for real-time inference with
nemo-mbridge-perf-cuda-graphs
Passed all 3 security checks
Validate and use CUDA graph capture in Megatron Bridge, including local full-iteration graphs and Transformer Engine scoped graphs for attention, MLP, and MoE modules.
tao-train-oneformer
Passed all 3 security checks
OneFormer for universal image segmentation. Unifies panoptic, instance, and semantic segmentation with a