Hyperstack

Hyperstack Hyperstack is a global, UK-owned, European-compliant full stack AI cloud built for teams that need powerful, flexible infrastructure at scale.

07/05/2026

One model. Video, audio, images, and documents - from a single endpoint.

We deployed NVIDIA Nemotron 3 Nano Omni on Hyperstack and put its multimodal pipeline to work.

In this tutorial:
→ vLLM serving on a single NVIDIA H100 80GB (62 GB BF16 checkpoint)
→ 256K token context window with native reasoning mode
→ PDF extraction - structured JSON from complex financial documents
→ Hour-long audio transcription with word-level timestamps and action-item extraction
→ Video summarisation and temporal Q&A from a single prompt
→ Disabling thinking mode for latency-sensitive tasks

67.04 on OCRBenchV2. 89.39 on VoiceBench. 72.2 on Video-MME. One deployment.

Full tutorial on the blog: https://bit.ly/3QMhCmf

06/05/2026

1.6 trillion parameters. 49B active per token. Too large for a single node.

We deployed DeepSeek-V4-Pro on Hyperstack using multi-node Kubernetes - 16 NVIDIA H100s across two worker nodes, hybrid Data + Expert Parallelism, and a 960 GB FP4+FP8 checkpoint loaded from local NVMe.

In this tutorial:
→ Multi-node Kubernetes cluster on Hyperstack (2x 8x NVIDIA H100-80G PCIe-NVLink)
→ LeaderWorkerSet API for coordinated 2-node inference
→ vLLM with hybrid DEP topology and MTP speculative decoding
→ 1M token context window with three reasoning tiers
→ Long-horizon autonomous code refactoring with self-correction
→ Plugging into Claude Code, OpenClaw, and OpenCode as a local backend

80.6 on SWE-Bench Verified. 93.5 on LiveCodeBench v6.

Full tutorial on the blog: https://bit.ly/4ucugd7

Running Kubernetes or SLURM in-house is a full-time job.Hyperstack Managed Cluster Platform hands you a fully managed cl...
05/05/2026

Running Kubernetes or SLURM in-house is a full-time job.

Hyperstack Managed Cluster Platform hands you a fully managed cluster environment - delivered at the orchestrator layer, so your team focuses on models, not maintenance.

GPU infrastructure. Fully managed. Ready to scale.

Enquire now 👉 https://bit.ly/4tSGoiY

29/04/2026

1 trillion parameters on 8 GPUs. Here's what that looks like.

We deployed Kimi K2.6 on Hyperstack - Moonshot AI's open-weight agentic model. In this video:

→ vLLM serving on 8x NVIDIA H100-80G PCIe
→ 595 GB of INT4 weights loaded from ephemeral NVMe in ~6 minutes
→ Autonomous multi-step refactoring with self-correction
→ Coding-driven design - single prompt to working website
→ Local backend for Claude Code, OpenClaw and Kimi Code CLI

32B active parameters per token. 256K context window. 300 sub-agents in a single run.

Full tutorial on our blog: https://bit.ly/49jibdo

27/04/2026

We deployed Qwen 3.6 on Hyperstack and turned it into a fully autonomous coding agent.

What you'll see in this video:

→ vLLM server running on 8x NVIDIA H100 PCIe GPUs
→ 262K token context window
→ The model organising files on its own using MCP tools
→ Building and saving a website autonomously
→ Plugging into Claude Code, OpenClaw and Qwen Code as a local backend

35B total parameters, 3B active per token. That's what Mixture-of-Experts buys you - large-model intelligence at small-model speed.

Full step-by-step tutorial on the blog: https://bit.ly/499XpwI

Qwen3

The teams shipping production AI right now aren't just picking GPUs.They're asking: how do I manage infrastructure witho...
15/04/2026

The teams shipping production AI right now aren't just picking GPUs.

They're asking: how do I manage infrastructure without slowing down engineers? How do I enforce security at the cluster level without adding friction? How do I give the right access to the right people without over-provisioning?

That's what we focused on in March.

We shipped an MCP Server that lets you manage Hyperstack through natural language - no raw API calls, no dashboard hopping. We added a Firewall API so security rules propagate automatically across Kubernetes clusters. We published a full deployment guide for NVIDIA's NemoClaw following its GTC 2026 announcement. And we improved cluster deployment defaults so things work correctly out of the box.

Plus role management improvements, better error handling and VM lifecycle fixes.

Infrastructure should get out of your way, not create more work.

Full March update here: https://bit.ly/4tbxRHv

27/03/2026

Your infrastructure, one prompt away ⚡

We shipped the Hyperstack MCP server. Here's what you get:

✔️ One-line deployment - spin it up with Docker and you're ready to go

✔️ Talk to your infrastructure - ask about your VMs, storage, clusters, billing, and more

✔️ No dashboards, no CLI commands, just plain English

✔️ AI Studio Inference integration - connect with your favourite models from Hyperstack AI Studio

✔️ 37+ tools - all from a single conversational interface

Watch what it can do ↓

👉 Get started now: https://bit.ly/485VqJh

Address

TechSpace. 9-13 Street Andrew St, 3rd Floor
London
EC4A3AF

Alerts

Be the first to know and let us send you an email when Hyperstack posts news and promotions. Your email address will not be used for any other purpose, and you can unsubscribe at any time.

Contact The Business

Send a message to Hyperstack:

Share