Hosted Kubernetes for AI

Run AI agents on real K8s.
Pay by the minute.

Liveness checks, TLS certs, service mesh, ingress routing. All the things Kubernetes gives you for free, without the six weeks of setup. Built for builders shipping AI agents to production.

$ lattice deploy --gpu a100 --replicas 3
Provisioning 3x A100 pods...
✓ agent-worker-01 running (12s)
✓ agent-worker-02 running (14s)
✓ agent-worker-03 running (13s)
TLS cert issued, mesh route configured
https://my-agent.lattice.run

The Problem

Hosting AI agents shouldn't require a platform team

You've got 5 machines sitting idle 75% of the day. Your agents crash and nobody notices. Cert renewal is a cron job and a prayer. There's a better way.

Setup
  • Before: SSH keys, VPS configs, 40-page runbooks
  • With Lattice: One command. Live in seconds.
Health
  • Before: Agent crashed at 3am, nobody knew
  • With Lattice: K8s liveness probes restart automatically.
Networking
  • Before: Manual nginx, Let's Encrypt, iptables
  • With Lattice: TLS, ingress, DNS. Handled.
Billing
  • Before: Reserved instances burning cash overnight
  • With Lattice: Per-minute billing. Scale to zero.
Scale
  • Before: SSH into each box, deploy one by one
  • With Lattice: Horizontal pod autoscaler. Done.
Orchestration
  • Before: Custom queues, brittle webhook chains
  • With Lattice: Service mesh routes agent-to-agent traffic.

Built for Agents

Kubernetes primitives, zero Kubernetes pain

Service Mesh for Agent Orchestration

Agents that talk to other agents need more than HTTP. Lattice's built-in service mesh handles routing, retries, circuit breaking, and observability between your agents. mTLS by default. Trace every request from ingress to response.

agent-planner → mesh → agent-researcher
agent-planner → mesh → agent-writer
agent-writer  → mesh → agent-reviewer

latency p99: 12ms
mTLS: enabled
traces: 100%
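
If circuit breaking is new to you, the idea fits in a few lines of Python. This is a generic sketch of the pattern, not Lattice's implementation: the class name, the failure threshold, the cool-down, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after `max_failures` consecutive failures."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock              # injectable so the logic is testable
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail fast instead of hitting the upstream.
                raise CircuitOpenError("upstream marked unhealthy")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit again
        return result
```

A mesh applies this kind of logic at the network layer, so a failing agent stops receiving traffic without each caller having to carry code like this.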

GPU Pods On Demand

Run your own models when API latency or cost doesn't cut it. Provision A100s, H100s, or consumer GPUs by the minute. Scale to zero when idle. No reserved instances, no commitments, no waste.

$ lattice gpu list

A100 80GB   $0.018/min   available
H100 80GB   $0.042/min   available
RTX 4090    $0.008/min   available
T4 16GB     $0.004/min   available
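
To put those per-minute rates in more familiar terms, here is a small Python sketch that converts them to hourly prices and costs out a short job. The rates are copied from the listing above; the function name and the 90-minute, 3-pod job are illustrative assumptions, and the sketch assumes billing is exactly rate × minutes × pods with no minimums or rounding.

```python
from decimal import Decimal

# Per-minute rates copied from the `lattice gpu list` output above.
GPU_RATES_PER_MIN = {
    "A100 80GB": Decimal("0.018"),
    "H100 80GB": Decimal("0.042"),
    "RTX 4090": Decimal("0.008"),
    "T4 16GB": Decimal("0.004"),
}


def gpu_cost(gpu: str, minutes: int, pods: int = 1) -> Decimal:
    """Dollars for running `pods` pods of `gpu` for `minutes` minutes.

    Assumes simple linear billing (an assumption, not a documented rule).
    """
    return GPU_RATES_PER_MIN[gpu] * minutes * pods


if __name__ == "__main__":
    for name, rate in GPU_RATES_PER_MIN.items():
        print(f"{name}: ${rate * 60:.2f}/hour")
    print(f"3x A100 for 90 minutes: ${gpu_cost('A100 80GB', 90, 3):.2f}")
```

At these rates an A100 works out to $1.08 per hour, and three of them for 90 minutes comes to $4.86.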

Real Kubernetes, No Guardrails

This isn't a PaaS pretending to be K8s. You get real pods, services, configmaps, secrets, CRDs. Bring your own Helm charts. Run kubectl. Full control with none of the provisioning headache.

$ kubectl get pods

NAME              STATUS    AGE
planner-7d4f      Running   2h
researcher-a91c   Running   2h
writer-3b8e       Running   45m
gpu-infer-f2d1    Running   12m

Pricing

Pay for minutes, not months

No reserved instances. No annual commits. Your agents scale up, you pay more. They scale down, you pay less. Scale to zero, pay nothing.

CPU Agents
$0.002/min
Perfect for agents calling LLM APIs. Most agent workloads start here.
  • 2 vCPU, 4GB RAM per pod
  • Auto-scaling included
  • Service mesh + TLS
  • Scale to zero
Dedicated Cluster
Custom
Your own K8s cluster. Custom networking, compliance, SLAs.
  • Isolated control plane
  • Custom node pools
  • VPC peering
  • 99.9% SLA
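
To make "pay for minutes, not months" concrete, here is a rough Python sketch using the CPU Agents rate above. It compares five pods that run around the clock with five pods that are busy 25% of the day and scaled to zero otherwise. The 30-day month, the pod count, and the assumption that idle time is billed at exactly zero are illustrative, not a quote.

```python
from decimal import Decimal

CPU_RATE_PER_MIN = Decimal("0.002")  # "CPU Agents" rate from the pricing above
MINUTES_PER_DAY = 24 * 60


def monthly_cost(pods: int, busy_fraction: Decimal, days: int = 30) -> Decimal:
    """Dollars billed when each pod runs `busy_fraction` of every day.

    Assumes pods are scaled to zero (and billed nothing) the rest of the time.
    """
    billed_minutes = MINUTES_PER_DAY * days * busy_fraction
    return CPU_RATE_PER_MIN * billed_minutes * pods


if __name__ == "__main__":
    always_on = monthly_cost(5, Decimal("1"))
    scaled = monthly_cost(5, Decimal("0.25"))
    print(f"always on: ${always_on:.2f}, 25% busy: ${scaled:.2f}")
```

Under these assumptions the always-on setup costs $432.00 for the month and the scale-to-zero setup costs $108.00.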

Your agents deserve infrastructure that works.

Lattice is Kubernetes done right for the agentic era. No more duct-taping VPS boxes together. No more paying for idle compute. Just deploy and build.