
NVIDIA HGX B200

The NVIDIA Blackwell platform has arrived on Together AI.

High-performance clusters designed for teams running large-scale LLM and mixture-of-experts (MoE) models.

Why NVIDIA HGX B200 on Together GPU Clusters?

The world’s most powerful AI infrastructure. Delivered faster. Tuned smarter.

  • Train faster on 8-GPU HGX nodes

    Each HGX B200 system includes 8 Blackwell GPUs with 2nd-gen Transformer Engine and FP8 precision. We tune communication libraries and parallelism strategies for maximum throughput.

  • Optimized NVLink topologies

    We configure 1.8TB/s NVSwitch fabrics per node and extend to spine-leaf or Clos topologies — optimized for dense LLMs or sparsely activated MoE workloads.

  • High-throughput storage options

    Support for AI-native shared storage systems like VAST and Weka, enabling efficient multi-node checkpointing and large-scale dataset streaming.

  • Deploy and orchestrate at scale

    Together delivers HGX B200 clusters at 1K–10K+ GPU scale with your preferred scheduler (Slurm, Kubernetes, Ray), fully validated across network and storage layout.

  • Delivery in 4–6 weeks, no NVIDIA lottery required

    We ship full-rack NVIDIA HGX B200 clusters — not just dev kits — with thousands of GPUs available now. You don’t wait on backorders. You start training.

  • Run by researchers who train models

    Our research team actively runs and tunes training workloads on NVIDIA GB200 systems. You're not just getting hardware — you’re working with experts at the edge of what's possible.
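As a sketch of what the scheduler-validated delivery described above can look like in practice, here is a minimal Slurm batch script for a multi-node training job on 8-GPU HGX nodes. The job name, node count, port, and `train.py` entry point are placeholders for illustration, not Together AI defaults:

```shell
#!/bin/bash
#SBATCH --job-name=llm-pretrain
#SBATCH --nodes=4                # four HGX B200 nodes = 32 GPUs
#SBATCH --ntasks-per-node=1      # one launcher per node; torchrun spawns 8 workers
#SBATCH --gpus-per-node=8
#SBATCH --exclusive

# Rendezvous on the first node of the allocation.
head_node=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)

srun torchrun \
    --nnodes="$SLURM_NNODES" \
    --nproc_per_node=8 \
    --rdzv_backend=c10d \
    --rdzv_endpoint="${head_node}:29500" \
    train.py  # hypothetical training entry point
```

The same pattern maps onto Kubernetes or Ray launchers; the key invariant is one worker process per GPU, eight per node.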

“What truly elevates Together AI to ClusterMax™ Gold is their exceptional support and technical expertise. Together AI’s team, led by Tri Dao — the inventor of FlashAttention — and their Together Kernel Collection (TKC), significantly boost customer performance. We don’t believe the value created by Together AI can be replicated elsewhere without cloning Tri Dao.”

— Dylan Patel, Chief Analyst, SemiAnalysis

Powering reasoning models and AI agents

The NVIDIA HGX B200 is designed for the most demanding AI, data analytics, and high-performance computing (HPC) workloads.

Accelerated Compute

Purpose-built for AI: NVIDIA Blackwell SXMs integrated with a high-speed interconnect to accelerate AI performance at scale.

Advanced networking: Using NVIDIA Quantum-X800 InfiniBand and Spectrum™-X Ethernet, NVIDIA HGX delivers networking speeds of up to 800 Gb/s.

Baseboard powerhouse: Each single baseboard joins 8 Blackwell GPUs via fifth-generation NVLink, delivering GPU-to-GPU bandwidth of 1.8TB/s.

LLM TRAINING SPEED
RELATIVE TO H100

3x FASTER

LLM INFERENCE SPEED
RELATIVE TO H100

15x FASTER

ENERGY EFFICIENCY
 & TCO RELATIVE TO H100

12x BETTER

Technical Specs

NVIDIA HGX B200

Blackwell GPUs: 8 GPUs

Total FP4 Tensor Core: 144 PFLOPS

Total FP8/FP6 Tensor Core: 72 PFLOPS

Total Fast Memory: Up to 1.4TB

Total Memory Bandwidth: Up to 62TB/s

Total NVLink Bandwidth: 14.4TB/s

FP4 Tensor Core (per GPU): 18 PFLOPS

FP8/FP6 Tensor Core (per GPU): 9 PFLOPS

INT8 Tensor Core (per GPU): 9 POPS

FP16/BF16 Tensor Core (per GPU): 4.5 PFLOPS

TF32 Tensor Core (per GPU): 2.2 PFLOPS

FP32 (per GPU): 75 TFLOPS

FP64/FP64 Tensor Core (per GPU): 37 TFLOPS

Multi-Instance GPU (MIG): 7

Decompression Engine: Yes

Decoders: 7 NVDEC, 7 nvJPEG

Max Thermal Design Power (TDP): Configurable up to 1,000W

Interconnect: 5th Gen NVLink: 1.8TB/s; PCIe Gen5: 128GB/s

Server Options: NVIDIA HGX B200 partner and Certified Systems with 8 GPUs
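The aggregate figures above are simply the per-GPU numbers scaled by the eight GPUs on the baseboard. A quick sanity check in Python, using only values from the spec table:

```python
# Sanity-check HGX B200 aggregate specs against per-GPU figures x 8 GPUs.
NUM_GPUS = 8

per_gpu = {
    "fp4_pflops": 18,    # FP4 Tensor Core per GPU
    "fp8_pflops": 9,     # FP8/FP6 Tensor Core per GPU
    "nvlink_tb_s": 1.8,  # 5th-gen NVLink bandwidth per GPU
}

totals = {name: value * NUM_GPUS for name, value in per_gpu.items()}

print(totals["fp4_pflops"])             # 144 PFLOPS total FP4
print(totals["fp8_pflops"])             # 72 PFLOPS total FP8/FP6
print(round(totals["nvlink_tb_s"], 1))  # 14.4 TB/s total NVLink
```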

Forging the AI Frontier with NVIDIA Reference Architecture

As an NVIDIA Cloud Partner, we’re at the forefront of deploying, optimizing, and operating NVIDIA GB200 NVL72 GPU clusters.

Learn more

United States

AI Data Centers and Power across the US

Data Center Portfolio

2GW+ in the portfolio, with 600MW of near-term capacity.

Europe

Expansion Capability in Europe and Beyond

Data Center Portfolio

150MW+ available across Europe, including the UK, Spain, France, Portugal, and Iceland.

Asia / Middle East

Next Frontiers – Asia and the Middle East

Data Center Portfolio

Options available in Asia and the Middle East based on project scale.

Powering AI Pioneers

Leading AI companies are ramping up with NVIDIA Blackwell running on Together AI.

Zoom partnered with Together AI to leverage our research and deliver accelerated performance when training the models powering various Zoom AI Companion features.

With Together GPU Clusters accelerated by NVIDIA HGX B200, Zoom experienced a 1.9x improvement in training speeds out of the box over previous-generation NVIDIA Hopper GPUs.

Salesforce leverages Together AI across the entire AI journey — from training to fine-tuning to inference of their models — to deliver Agentforce.

Training a Mistral-24B model, Salesforce saw a 2x improvement in training speeds upgrading from NVIDIA HGX H200 to HGX B200. This is accelerating how Salesforce trains models and integrates research results into Agentforce.

During initial tests with the NVIDIA HGX B200, InVideo immediately saw a 25% improvement when moving a training job from NVIDIA HGX H200.

Then, in partnership with our researchers, the team made further optimizations and more than doubled this improvement – making the step up to the NVIDIA Blackwell platform even more appealing.

Go live with purpose-built NVIDIA HGX B200 GPU clusters

Reserve a cluster