Weekly • Technical • Practitioner-Focused

The Pulse of LLMOps, FinOps & AI Infrastructure

Intelligence for engineers building and operating AI infrastructure at scale. LLMOps, FinOps, Kubernetes, and the tools that keep production AI running.

Deep Technical Guides

Benchmarks run on real infrastructure. Config files you can copy-paste. No vendor fluff.

Cost Optimization Playbooks

Datadog to Grafana migrations. GPU budget triage. Reserved instance strategy. Real savings.

Production Incident Frameworks

Postmortem templates for AI failures. Runbooks your on-call team will actually use.

Publication Cadence: Weekly

Free, always.

Latest Articles

AI Infrastructure

Mesh Inference on iroh: GPUs in Three Offices and a Closet

Mesh LLM turns scattered GPUs into one OpenAI-compatible API. Skippy split-mode layer-pipeline inference behind localhost:9337/v1 — and the OTel gap.

July 15, 2026 • 11 min read

Observability

OpenAI Just Made Your Agent a Black Box (and What to Do)

Codex multi-agent-v2 encrypts the parent→subagent payload. An agent-side OTel proxy keeps your plaintext copy before Responses encrypts it.

July 15, 2026 • 12 min read

AI Infrastructure

Custom AI Silicon 2026: Meta MTIA, Trainium2, TPU, Maia

Meta MTIA 300/450/Iris, AWS Trainium2/Inferentia2, Google TPU v5e/v6, Microsoft Maia 100 vs NVIDIA H100/B100 — vendor-neutral, dollar-per-token.

July 11, 2026 • 14 min read

LLMOps

Per-Engineer AI Observability 2026: Beat Reflection

OTel + LangSmith + ClickHouse reference schema for per-engineer Claude observability — session cost, four-signal dashboard, 1/3/6/12-month retention.

July 11, 2026 • 12 min read

Observability

Agent Observability at 1,200+ Agents: OTel Stack 2026

AWS AgentCore vs. OSS for 1,200-agent fleets: OpenTelemetry gen_ai.* conventions and a Tempo / ClickHouse / OpenSearch trace-store comparison.

July 8, 2026 • 13 min read

AI Infrastructure

Agent Sandbox vs. Agent Substrate: CNCF Runtime 2026

Two CNCF SIG Apps projects — agent-sandbox and agent-substrate — and the OTel pattern that makes both debuggable in production.

July 8, 2026 • 11 min read

Stay ahead of the stack.

Weekly intelligence on LLMOps, FinOps, and AI infrastructure. No fluff, no vendor pitches. Written by practitioners, for practitioners.
