Building an MCP Server for VergeOS: AI-Powered Infrastructure Management
How I built a Model Context Protocol server to manage my VergeOS virtualization cluster using natural language through AI assistants like Claude and Windsurf.
The Problem: Managing Infrastructure with Natural Language
I run a homelab with VergeOS, a powerful hyperconverged infrastructure platform. It's great for managing VMs, networks, and storage—but like most infrastructure tools, it requires either the web UI or direct API calls.
What if I could just ask my AI assistant to "list all running VMs", "power off the test server", or even "add 4GB of RAM to that VM"?
Enter the Model Context Protocol (MCP)—Anthropic's open standard for connecting AI assistants to external tools and data sources.
What is MCP?
The Model Context Protocol is a standardized way for AI applications to interact with external systems. Think of it as a universal adapter that lets AI assistants like Claude, Windsurf, and others access your tools, APIs, and data.
MCP servers expose:
- Tools: Actions the AI can perform (e.g., power_on_vm, modify_vm, resize_drive)
- Resources: Data the AI can read (e.g., cluster status, VM lists)
- Prompts: Pre-defined conversation starters
The Architecture
Here's my setup:
- VergeOS cluster running on my local network (your-vergeos-host)
- Windsurf IDE on my MacBook
- Direct network access from my Mac to VergeOS
Simple and Direct
Since my Mac can reach VergeOS directly, the architecture is straightforward:
┌─────────────┐  stdio   ┌─────────────┐  HTTPS   ┌───────────────────┐
│  Windsurf   │◄────────►│ MCP Server  │◄────────►│      VergeOS      │
│  (MacBook)  │          │  (MacBook)  │          │ your-vergeos-host │
└─────────────┘          └─────────────┘          └───────────────────┘
The MCP server runs locally on my Mac, communicates with Windsurf via stdio (standard input/output), and makes HTTPS calls directly to the VergeOS API.
Alternative: Remote Access via Kubernetes
If your AI client can't reach VergeOS directly (e.g., you're on a different network), you can deploy the MCP server to Kubernetes and use a local proxy:
┌─────────────┐      ┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│  Windsurf   │─────►│ Local Proxy │─────►│   K8s MCP   │─────►│   VergeOS   │
│  (Laptop)   │stdio │  (Laptop)   │HTTPS │   Server    │HTTPS │     API     │
└─────────────┘      └─────────────┘      └─────────────┘      └─────────────┘
I've included both options in the project, but for most homelab setups where your machine can reach VergeOS, the direct approach is simpler.
Building the MCP Server
The VergeOS API
VergeOS has a comprehensive REST API (Swagger 2.0). Key endpoints I needed:
| Endpoint | Purpose |
|---|---|
| /api/v4/vms | List/manage virtual machines |
| /api/v4/machine_drives | VM disk management |
| /api/v4/machine_status | Real-time VM power state |
| /api/v4/vnets | Virtual networks |
| /api/v4/tenants | Multi-tenant management |
| /api/v4/nodes | Cluster nodes |
| /api/v4/cluster_status | Cluster health |
| /api/v4/vm_actions | Power operations (poweron, poweroff, kill, reset) |
Authentication uses cookie-based tokens:
# Get a token
TOKEN=$(curl -sk -X POST "https://vergeos/api/sys/tokens" \
-u "admin:password" \
-H "Content-Type: application/json" \
-d '{"login":"admin","password":"password"}' | jq -r '."$key"')
# Use it
curl -sk "https://vergeos/api/v4/vms" -b "token=$TOKEN"
The MCP Server (Node.js)
I used the official @modelcontextprotocol/sdk to build the server:
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { CallToolRequestSchema, ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";
// vergeosApi is the thin REST wrapper sketched in the authentication section above
const server = new Server(
{ name: "vergeos", version: "1.0.0" },
{ capabilities: { tools: {} } }
);
// Define tools
const TOOLS = [
{
name: "list_vms",
description: "List all VMs in VergeOS",
inputSchema: { type: "object", properties: {} }
},
{
name: "power_on_vm",
description: "Power on a VM",
inputSchema: {
type: "object",
properties: { id: { type: "number" } },
required: ["id"]
}
},
// ... more tools
];
// Advertise the tools to clients
server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools: TOOLS }));

// Handle tool calls
server.setRequestHandler(CallToolRequestSchema, async (request) => {
const { name, arguments: args } = request.params;
switch (name) {
case "list_vms":
const vms = await vergeosApi.listVMs();
return { content: [{ type: "text", text: JSON.stringify(vms, null, 2) }] };
case "power_on_vm":
await vergeosApi.powerOnVM(args.id);
return { content: [{ type: "text", text: `VM ${args.id} powered on` }] };
// ... more handlers
}
});
// Start
const transport = new StdioServerTransport();
await server.connect(transport);
Windsurf Configuration
Configure Windsurf to use the MCP server:
{
"mcpServers": {
"vergeos": {
"command": "node",
"args": ["/Users/dvincent/.mcp/vergeos/index.js"],
"env": {
"VERGEOS_HOST": "your-vergeos-host",
"VERGEOS_USER": "admin",
"VERGEOS_PASS": "your-password"
}
}
}
}
The Result
Now I can manage my entire VergeOS infrastructure through natural language:
Me: "List all running VMs"
Windsurf:
Here are the running VMs in your VergeOS cluster:
| Name | Status | CPU | RAM |
|------|--------|-----|-----|
| vergeos-host1 | running | 8 cores | 16 GB |
| rke2-node1 | running | 14 cores | 32 GB |
| rke2-node2 | running | 8 cores | 16 GB |
| nfs-for-kubernetes | running | 4 cores | 8 GB |
Me: "Power off the test-vm and wait for it to shut down"
Windsurf: "VM 'test-vm' shut down gracefully in 23 seconds."
Me: "Add 2 more CPU cores and 4GB RAM to test-vm"
Windsurf: "VM 'test-vm' is currently stopped. Modified successfully: CPU 2→4 cores, RAM 2048→6144 MB. Use power_on_vm to start it with the new settings."
Me: "Add a 100GB data disk to test-vm"
Windsurf: "Drive 'data-disk' (100 GB) added to test-vm. The VM may need to be restarted to detect the new drive."
Me: "What alarms are active?"
Windsurf: "No active alarms in your VergeOS cluster. Everything looks healthy! ✅"
Available Tools
The MCP server exposes 29 tools for comprehensive infrastructure management:
VM Power Control
| Tool | Description |
|---|---|
| power_on_vm | Start a VM |
| power_off_vm | Graceful shutdown with optional wait and auto-force |
| force_off_vm | Hard power off (kill) |
| reset_vm | Reboot a VM |
The power_off_vm tool is smart—it can wait for the VM to shut down and automatically force power off if the graceful shutdown times out:
power_off_vm({ id: 40, wait_timeout: 60, force_after_timeout: true })
// Waits up to 60s for graceful shutdown, then forces if needed
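Conceptually, the flow is: request a graceful shutdown, poll the power state until the timeout, then fall back to a hard kill if allowed. A rough sketch of that logic, using hypothetical helper names (vmAction, getVMStatus) rather than the project's actual internals:
// Sketch of the wait-then-force flow; vmAction/getVMStatus are assumed helper names.
async function powerOffVM(id, { waitTimeout = 60, forceAfterTimeout = false } = {}) {
  await vmAction(id, "poweroff");                       // request graceful shutdown
  const deadline = Date.now() + waitTimeout * 1000;
  while (Date.now() < deadline) {
    const status = await getVMStatus(id);               // polls /api/v4/machine_status
    if (!status.running) return { success: true, forced: false };
    await new Promise((resolve) => setTimeout(resolve, 2000)); // re-check every 2s
  }
  if (forceAfterTimeout) {
    await vmAction(id, "kill");                         // hard power off
    return { success: true, forced: true };
  }
  return { success: false, error: "Timed out waiting for graceful shutdown" };
}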
VM Configuration
| Tool | Description |
|---|---|
| modify_vm | Change CPU cores and/or RAM (auto-shutdown if running) |
| add_drive | Add a new disk to a VM |
| resize_drive | Expand an existing disk |
| get_vm_drives | List VM disks with sizes |
| get_vm_nics | List VM network interfaces with MAC, IP, and network info |
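The drive tools follow the same call style as the snippets below; note that these parameter names are illustrative guesses, not taken from the project:
// Hypothetical call shapes for the disk tools (parameter names are guesses)
add_drive({ vm_id: 40, name: "data-disk", size_gb: 100 })
resize_drive({ vm_id: 40, drive_name: "data-disk", new_size_gb: 200 })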
Snapshot Management
| Tool | Description |
|---|---|
| list_vm_snapshots | List all snapshots for a VM |
| create_vm_snapshot | Create a snapshot with name, description, and expiration |
| delete_vm_snapshot | Delete a snapshot by ID |
| restore_vm_snapshot | Restore a VM from a snapshot |
The create_vm_snapshot tool supports flexible options:
create_vm_snapshot({
vm_id: 40,
name: "before-upgrade",
description: "Pre-upgrade backup",
expires_days: 7, // 0 for never
quiesce: true // Requires guest agent
})
// Creates a consistent snapshot that expires in 7 days
The get_vm_nics tool returns detailed NIC information including network association:
get_vm_nics({ id: 11 }) // Pass VM ID, not machine ID
// Returns:
// [
// {
// "id": 50,
// "name": "eth0",
// "mac": "f0:db:30:7a:48:83",
// "network_id": 17,
// "ip": "10.0.6.100",
// "interface": "virtio",
// "enabled": true
// }
// ]
The modify_vm tool handles running VMs gracefully:
modify_vm({ id: 40, cpu_cores: 8, ram_mb: 16384, shutdown_if_running: true })
// If VM is running: shuts down → applies changes → tells you to restart
// If VM is stopped: applies changes immediately
Infrastructure
| Tool | Description |
|---|---|
| list_vms | List VMs with power state (filter by running/name) |
| get_vm | Detailed VM info with human-readable power state |
| get_vm_status | Quick power state check |
| list_networks | Virtual networks |
| network_action | Power on/off/reset networks |
| list_tenants | Multi-tenant management |
| list_nodes | Cluster nodes |
| get_cluster_status | Cluster health |
| get_alarms | Active alerts |
| get_logs | System logs with filtering |
The get_logs tool supports filtering by level and object type:
get_logs({ limit: 20, level: "error" })
// Returns only error-level logs
get_logs({ limit: 50, object_type: "vm" })
// Returns only VM-related logs (power changes, edits, etc.)
Available log levels: audit, message, warning, error, critical, summary, debug
Available object types: vm, vnet, tenant, node, cluster, user, system, task
Lessons Learned
1. Keep Responses Small
My first version returned the full VM objects from the API—including all metadata, recipe configurations, and nested objects. This resulted in 100KB+ responses that got truncated.
Fix: Map responses to only essential fields:
// "status" is the VM's matching record from /api/v4/machine_status (see Lesson 2 below)
return vms.map(vm => ({
id: vm.$key,
name: vm.name,
running: status.running,
status: status.status, // "running", "stopped", etc.
cpu_cores: vm.cpu_cores,
ram: vm.ram,
}));
2. Human-Readable Status
The VergeOS API returns numeric status codes like console_status: 40. These mean nothing to an AI (or a human). I added a separate call to /machine_status to get readable states like "running" or "stopped".
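A sketch of that join, using the api helper from the authentication sketch; the assumption here is that each VM row links to its machine_status record through a machine field, which I haven't verified against the API:
// Join /vms with /machine_status to get human-readable power states.
// Assumption: vm.machine matches the $key of a machine_status record.
const vms = await api("/api/v4/vms");
const statuses = await api("/api/v4/machine_status");
const statusByMachine = new Map(statuses.map((s) => [s.$key, s]));

const readable = vms
  .filter((vm) => !vm.is_snapshot)              // snapshots show up as VMs (see quirks below)
  .map((vm) => {
    const status = statusByMachine.get(vm.machine) ?? {};
    return {
      id: vm.$key,
      name: vm.name,
      status: status.status ?? "unknown",       // "running", "stopped", ...
      running: Boolean(status.running),
    };
  });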
3. Smart Error Messages
Instead of cryptic API errors, the tools return actionable hints:
{
"success": false,
"error": "VM 'test-vm' is currently running. CPU/RAM changes require the VM to be powered off.",
"hint": "Set shutdown_if_running=true to automatically shut down the VM, apply changes, and restart it"
}
4. VergeOS API Quirks
- Snapshots are VMs: VMs with is_snapshot: true are templates, not running machines. Always filter them out.
- NIC filtering is broken: /machine_nics?machine=<ID> returns NICs from other machines too. Always filter by machine ID in your code (see the sketch after this list).
- NIC field names differ: The API uses macaddress (not mac) and ipaddress (not ip_address). The vnet field contains the network ID.
- Token in cookie: Despite the Swagger spec mentioning headers, tokens must be sent as cookies.
- Power state needs /machine_status: The /vms endpoint doesn't reliably show if a VM is running.
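To work around the NIC quirks, the server fetches the NICs and does the filtering and field renaming itself. A sketch using the api helper from earlier; the nic.machine field name is my assumption based on the query parameter:
// Fetch NICs, filter client-side, and normalize the quirky field names.
async function getVMNics(machineId) {
  const nics = await api(`/api/v4/machine_nics?machine=${machineId}`);
  return nics
    .filter((nic) => nic.machine === machineId)  // the server-side filter is unreliable
    .map((nic) => ({
      id: nic.$key,
      name: nic.name,
      mac: nic.macaddress,                       // API field is macaddress, not mac
      ip: nic.ipaddress,                         // API field is ipaddress, not ip_address
      network_id: nic.vnet,                      // vnet holds the network ID
      interface: nic.interface,
      enabled: nic.enabled,
    }));
}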
5. Direct is Best (When Possible)
Initially, I built a complex two-tier architecture with a Kubernetes-hosted HTTP server and a local proxy. But then I realized my Mac could reach VergeOS directly—so I simplified to a single local MCP server.
Lesson: Start simple. Only add complexity (proxies, K8s deployments) when your network topology requires it.
6. The Proxy Pattern (When You Need It)
If your AI client can't reach your infrastructure directly, the proxy pattern works great:
- Deploy an HTTP MCP server to Kubernetes (or any reachable host)
- Run a lightweight local proxy that forwards stdio ↔ HTTP
- Keep credentials centralized on the server side
I've included both options in the project for flexibility.
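For illustration, the bridge itself can be very small. This sketch assumes Node 18+ (for the global fetch) and that the remote MCP server accepts one JSON-RPC message per HTTP POST and returns the response in the body; the actual proxy in the repo may differ:
// Minimal stdio <-> HTTP bridge (assumptions noted above).
import readline from "node:readline";

const MCP_URL = process.env.VERGEOS_MCP_URL;
const rl = readline.createInterface({ input: process.stdin });

rl.on("line", async (line) => {
  if (!line.trim()) return;                         // skip blank lines
  const res = await fetch(MCP_URL, {                // forward the JSON-RPC message
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: line,
  });
  const body = await res.text();
  if (body.trim()) process.stdout.write(body.trim() + "\n"); // reply back over stdio
});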
What's Next?
- VM Creation: Add tools to create VMs from templates
- Snapshot Profiles: Automated snapshot scheduling
- NIC Management: Add/remove/modify network interfaces
- Monitoring Dashboard: Build a web UI on top of the same API
- Multi-Cluster: Support multiple VergeOS instances
Quick Start
Want to try it yourself? Here's the 5-minute setup:
Option 1: Direct Connection (Recommended)
Connect directly to your VergeOS server—no intermediate server needed:
# 1. Create the directory
mkdir -p ~/.mcp/vergeos && cd ~/.mcp/vergeos
# 2. Initialize and install dependencies
cat > package.json << 'EOF'
{
"name": "vergeos-mcp",
"version": "1.0.0",
"type": "module",
"dependencies": {
"@modelcontextprotocol/sdk": "^0.5.0",
"node-fetch": "^3.3.2"
}
}
EOF
npm install
# 3. Download the direct-connection MCP server
curl -o index.js https://raw.githubusercontent.com/dvvincent/vergeos-mcp-server/main/local-proxy/index-direct.js
# 4. Add to Windsurf config (~/.codeium/windsurf/mcp_config.json), using an absolute path to index.js (some MCP clients don't expand ~)
{
"mcpServers": {
"vergeos": {
"command": "node",
"args": ["~/.mcp/vergeos/index.js"],
"env": {
"VERGEOS_HOST": "your-vergeos-ip",
"VERGEOS_USER": "admin",
"VERGEOS_PASS": "your-password"
}
}
}
}
This connects directly to your VergeOS API with automatic token management and self-signed certificate handling.
Claude Desktop Configuration
For Claude Desktop, add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or the equivalent config file:
{
"mcpServers": {
"vergeos": {
"command": "node",
"args": [
"/Users/yourusername/.mcp/vergeos/index-direct.js"
],
"env": {
"VERGEOS_HOST": "your-vergeos-ip",
"VERGEOS_USER": "admin",
"VERGEOS_PASS": "your-password"
}
}
}
}
Note: The user account can be any VergeOS account with appropriate permissions; it does not have to be admin.
Option 2: Via HTTP Proxy Server
If you want to run a centralized MCP server (e.g., in Kubernetes) that multiple clients can connect to:
Important: Before deploying to Kubernetes, edit k8s-deployment.yaml and update the credentials:
stringData:
VERGEOS_HOST: "your-vergeos-ip" # Your VergeOS server IP or hostname
VERGEOS_USER: "admin" # Your VergeOS username
VERGEOS_PASS: "your-actual-password" # Your VergeOS password
Then deploy with kubectl apply -f k8s-deployment.yaml or use the deploy.sh script.
# Download the proxy client
curl -o index.js https://raw.githubusercontent.com/dvvincent/vergeos-mcp-server/main/local-proxy/index.js
Then point the local proxy at your remote MCP server in the Windsurf config:
{
"mcpServers": {
"vergeos": {
"command": "node",
"args": ["~/.mcp/vergeos/index.js"],
"env": {
"VERGEOS_MCP_URL": "https://your-mcp-server.example.com"
}
}
}
}
Restart Windsurf and start chatting with your infrastructure!
The Full Project
The complete source code includes:
- 29 MCP tools for comprehensive VM and infrastructure management
- Smart power control with wait timeouts and auto-force
- VM modification (CPU, RAM, disks) with running VM handling
- Snapshot management - create, list, delete, and restore VM snapshots
- Log retrieval with level and object type filtering
- Direct connection mode - connect straight to VergeOS with auto token management
- HTTP proxy mode - centralized server for multi-client deployments
- Self-signed cert handling - works with homelab SSL certificates
- Kubernetes manifests and deployment scripts
- Comprehensive documentation
Available at: github.com/dvvincent/vergeos-mcp-server
If you're running VergeOS (or any infrastructure with a REST API), building an MCP server is a great way to add AI-powered management. The MCP SDK makes it straightforward—I went from zero to managing VMs with natural language in an afternoon.
Have questions or built your own MCP server? I'd love to hear about it!