AgentHub

deployment

Skill · SkillsMP

io.github.NVIDIA/Model-Optimizer/deployment · vmain

Serve a quantized or unquantized LLM checkpoint as an OpenAI-compatible API endpoint using vLLM, SGLang, or TRT-LLM. Use when user says "deploy model", "serve model", "start vLLM server", "launch SGLang", "TRT-LLM deploy", "AutoDeploy", "benchmark throughput", "serve checkpoint", or needs an inference endpoint from a HuggingFace or ModelOpt-quantized checkpoint. Do NOT use for quantizing models (use ptq) or evaluating accuracy (use evaluation).

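Overview

deployment is an Agent Skill indexed from SkillsMP. This page provides installation and configuration snippets for clients such as Cursor and Claude Code.

An Agent Skill is an instruction package that includes a SKILL.md file. Once installed, the AI loads it automatically when a task matches its description, so you do not need to paste the prompt manually each time.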

Installation

Choose your platform to view installation instructions

# Universal CLI (supported by Cursor, Claude Code, Codex, and others)
npx skills add NVIDIA/Model-Optimizer@deployment

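Usage

After installation, describe your task directly in the conversation (or mention the skill by name). The agent first reads the description in SKILL.md to decide whether to enable the skill, then follows the steps it contains. To see which Skills are loaded, use /skills (Claude Code) or check the settings.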

Related resources

Unified Manifest

{
  "id": "io.github.NVIDIA/Model-Optimizer/deployment",
  "type": "skill",
  "version": "main",
  "displayName": "deployment",
  "description": "Serve a quantized or unquantized LLM checkpoint as an OpenAI-compatible API endpoint using vLLM, SGLang, or TRT-LLM. Use when user says \"deploy model\", \"serve model\", \"start vLLM server\", \"launch SGLang\", \"TRT-LLM deploy\", \"AutoDeploy\", \"benchmark throughput\", \"serve checkpoint\", or needs an inference endpoint from a HuggingFace or ModelOpt-quantized checkpoint. Do NOT use for quantizing models (use ptq) or evaluating accuracy (use evaluation).",
  "author": {
    "name": "NVIDIA",
    "url": "https://github.com/NVIDIA"
  },
  "repository": {
    "url": "https://github.com/NVIDIA/Model-Optimizer",
    "source": "github",
    "subfolder": ".agents/skills/deployment"
  },
  "homepage": "https://skillsmp.com/skills/nvidia-model-optimizer-agents-skills-deployment-skill-md",
  "distribution": {
    "packages": [
      {
        "registryType": "source",
        "identifier": "NVIDIA/Model-Optimizer@deployment",
        "version": "main",
        "runtimeHint": "npx skills add"
      }
    ],
    "remotes": []
  },
  "dependencies": [],
  "installTargets": [
    "claude-code",
    "claude-desktop",
    "cursor",
    "codex",
    "vscode"
  ],
  "keywords": [
    "stars:2897"
  ],
  "provenance": {
    "origin": "skillsmp",
    "originalId": "nvidia-model-optimizer-agents-skills-deployment-skill-md",
    "originalUrl": "https://skillsmp.com/skills/nvidia-model-optimizer-agents-skills-deployment-skill-md",
    "isOfficial": false,
    "status": "active"
  }
}
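
The manifest's `distribution.packages` entries appear to carry everything needed to reproduce the install command shown under Installation. The following is a minimal sketch of that reading, not AgentHub's actual tooling: it assumes `runtimeHint` is a command prefix and `identifier` is its argument, and `install_commands` is a hypothetical helper name introduced here for illustration.

```python
import json

# A trimmed copy of the manifest above; only the fields used here are kept.
MANIFEST_JSON = """
{
  "id": "io.github.NVIDIA/Model-Optimizer/deployment",
  "type": "skill",
  "distribution": {
    "packages": [
      {
        "registryType": "source",
        "identifier": "NVIDIA/Model-Optimizer@deployment",
        "version": "main",
        "runtimeHint": "npx skills add"
      }
    ],
    "remotes": []
  }
}
"""


def install_commands(manifest: dict) -> list[str]:
    """Build one install command per package entry (hypothetical helper).

    Assumption: runtimeHint is a command prefix and identifier is its
    argument. The manifest itself does not document this relationship.
    """
    commands = []
    packages = manifest.get("distribution", {}).get("packages", [])
    for package in packages:
        hint = package.get("runtimeHint")
        identifier = package.get("identifier")
        # Skip entries that lack either half of the command.
        if hint and identifier:
            commands.append(f"{hint} {identifier}")
    return commands


manifest = json.loads(MANIFEST_JSON)
commands = install_commands(manifest)
print(commands[0])
```

Under that assumption, the single package entry yields the same command as the universal CLI line in the Installation section.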