RTX 6000 Blackwell

Next-Gen AI on Climate-Positive Infrastructure

NVIDIA RTX 6000 Blackwell GPUs on Europe's most sustainable cloud infrastructure. Enterprise GPU hosting for AI training, inference, and ML workloads. Up to 4 GPUs with 96GB memory each, from €2.35/hour. Deploy with ease using IaC & Kubernetes.

96GB GDDR7 per GPU

Blackwell architecture

From €2.35/hour

1-4 GPUs, VM included

Amsterdam, Netherlands

EU-sovereign infrastructure

True Sustainability

Carbon-Negative Workload

Performance

Enterprise-Grade Blackwell Architecture

Built for AI inference, training, and multimodal workloads with next-generation Tensor Cores and massive memory bandwidth.

5th Gen Tensor Cores

Advanced Transformer Engine for LLM inference and training with significantly improved performance over previous generations.

96 GB GDDR7 ECC Memory

Enterprise-grade error-correcting memory with 1.79 TB/s bandwidth for the most demanding AI workloads.

Integrated Media Pipeline

Hardware-accelerated AV1 encoding, video processing, and real-time streaming capabilities.

AI Inference at Scale

Deploy LLMs, AI agents, and generative AI APIs with ultra-low latency. Perfect for chatbots, semantic search, and content generation.

Model Training & Fine-Tuning

Train diffusion models, fine-tune LLMs, and explore multimodal AI with 5th-gen Tensor Cores and massive memory bandwidth.

Media & Streaming

Real-time AV1 encoding, transcoding, and video streaming pipelines. Ideal for broadcasters and content platforms.

Pricing

Simple, Predictable Pricing

Three configurations available—Blackwell Pro (1 GPU), Blackwell Duo Pro (2 GPUs), Blackwell Quad Pro (4 GPUs). From flexible on-demand to committed long-term rates. Lock in access while available.

On Demand

€2.76 per hour

No commitment

  • Pay as you go
  • Full flexibility
  • Carbon-negative workload

6 Month

5% discount
€2.62 per hour

6 month commitment

  • Predictable costs
  • Production workloads
  • Carbon-negative workload

12 Month (Most Popular)

10% discount
€2.48 per hour

12 month commitment

  • Best value
  • Annual budgeting
  • Carbon-negative workload

36 Month

15% discount
€2.35 per hour

36 month commitment

  • Lowest rate
  • Cost optimization
  • Carbon-negative workload
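
The tiers above follow simple arithmetic: each committed rate is the on-demand rate minus the stated discount. A minimal sketch of that calculation follows. The 730-hour month and the linear scaling of committed rates across GPUs are assumptions for illustration, not published Leafcloud terms.

```python
# On-demand rate and discounts are taken from the pricing tiers above.
ON_DEMAND_EUR_PER_HOUR = 2.76
DISCOUNTS = {"on-demand": 0.00, "6-month": 0.05, "12-month": 0.10, "36-month": 0.15}

# Assumption: an average month of 730 hours (8,760 hours / 12).
HOURS_PER_MONTH = 730


def hourly_rate(plan: str) -> float:
    """Return the per-GPU hourly rate in EUR for a plan, rounded to cents."""
    return round(ON_DEMAND_EUR_PER_HOUR * (1 - DISCOUNTS[plan]), 2)


def monthly_cost(plan: str, gpus: int = 1) -> float:
    """Estimate the monthly cost in EUR, assuming the price scales with GPU count."""
    return round(hourly_rate(plan) * gpus * HOURS_PER_MONTH, 2)


for plan in DISCOUNTS:
    print(f"{plan:>10}: €{hourly_rate(plan):.2f}/hour, about €{monthly_cost(plan):,.2f}/month")
```

Running the loop reproduces the four hourly rates listed above and gives a rough monthly figure for budgeting an always-on instance.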

  • GPUs: 1 × 96GB
  • vCPU cores: 32
  • System RAM: 256 GB
  • NVMe storage: 2 TB
  • GPU memory: 96GB GDDR7 ECC per GPU
  • Tensor Cores: 5th Gen
  • CO₂ saved per year: 789 kg

Deploy Now

Every kilowatt of computing eliminates natural gas heating emissions.

RTX 6000 Market Comparison

Compare RTX 6000 Blackwell Across Cloud Providers

Leafcloud offers competitive RTX 6000 pricing in Europe with 4x the CPU cores and 4x the RAM of AWS (32 cores / 256GB vs 8 cores / 64GB). Includes full VM configuration with 2TB NVMe storage.

Provider    | Price per Hour | Bandwidth | CPU / RAM        | Storage  | Region
RunPod      | $1.89          | 1.5TB/s   | 16 cores / 188GB | -        | Global
Leafcloud   | €2.76          | 1.5TB/s   | 32 cores / 256GB | 2TB NVMe | Amsterdam, NL
AWS         | $3.36          | 1.5TB/s   | 8 cores / 64GB   | -        | US/EU
Lambda Labs | N/A            | N/A       | -                | -        | -
Azure       | N/A            | N/A       | -                | -        | -

Prices shown in original currency (USD for RunPod/AWS, EUR for Leafcloud). As of May 2026, €1 ≈ $1.08.
Source: cloud-gpus.com
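
Because the table mixes currencies, a like-for-like comparison needs a conversion step. The sketch below applies the exchange rate quoted in the note above to the listed hourly prices. It is illustrative only and ignores differences in CPU, RAM, and storage between the offers.

```python
# Exchange rate quoted under the table: €1 ≈ $1.08.
USD_PER_EUR = 1.08

# Hourly prices as listed in the comparison table, in their original currency.
PRICES = {
    "RunPod": (1.89, "USD"),
    "Leafcloud": (2.76, "EUR"),
    "AWS": (3.36, "USD"),
}


def to_eur(amount: float, currency: str) -> float:
    """Convert an hourly price to EUR, rounded to cents."""
    if currency == "EUR":
        return round(amount, 2)
    if currency == "USD":
        return round(amount / USD_PER_EUR, 2)
    raise ValueError(f"unsupported currency: {currency}")


# Print providers from cheapest to most expensive in EUR.
for provider, (amount, currency) in sorted(PRICES.items(), key=lambda kv: to_eur(*kv[1])):
    print(f"{provider:<10} €{to_eur(amount, currency):.2f}/hour")
```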

Why Leafcloud

GPU Hosting in the Netherlands with Climate-Positive Infrastructure

Deploy GPU Kubernetes clusters or Virtual Machines in Amsterdam that transform compute heat into a community asset. Open, predictable, and affordable European GPU cloud infrastructure.

Unbeatable sustainability

All infrastructure powered by verified renewable sources. No carbon credits or accounting tricks. Server heat delivers real impact: your workload provides free hot showers to nursing homes and residential blocks in Amsterdam.

EU Data Sovereignty

Amsterdam datacenter. GDPR native. ISO 27001 and SOC 2 Type II certified. Your data stays in Europe. No US CLOUD Act concerns.

Lower TCO than Hyperscalers

No egress fees. No hidden costs. Just simple, predictable pricing. Save over 30% compared to traditional providers.

Open Standards & Kubernetes

Scalable, vendor-neutral solutions for dedicated and high-availability machines. Use Kubernetes or common infrastructure-as-code tools like Terraform and Ansible.

EU-Sovereign AI Infrastructure

HAVEN+ Compatible, GDPR Compliant, No US CLOUD Act

Leafcloud is a Dutch B.V. (besloten vennootschap) incorporated and headquartered in Amsterdam, Netherlands. No US parent company, no exposure to US legal jurisdiction. All data, including AI training data, model weights, and inference results, remains under Dutch law and EU GDPR. Physical servers located at Amsterdam Core facility (Tier III datacenter). Critical for healthcare, finance, government, and regulated industries requiring data sovereignty under NIS2, DORA, and CSRD regulations.

No US CLOUD Act Exposure

Unlike AWS (subject to US CLOUD Act), Azure (US jurisdiction), or GCP (US jurisdiction), Leafcloud is EU-owned with no US parent. US government data requests must go through proper MLAT channels with EU oversight. Your AI training data and model weights cannot be compelled by non-EU authorities.

GDPR Native Data Residency

All persistent data (volumes, object storage, snapshots, backups) stored in Amsterdam. No third-country transfers without explicit instruction. GDPR compliance built-in, not bolted-on. Data Processing Agreement (DPA) available. Full compliance with EU General Data Protection Regulation.

ISO 27001 & SOC 2 Type II Certified

Independently certified for information security management (ISO 27001) and third-party audited for security, availability, confidentiality (SOC 2 Type II). HAVEN+ certification in progress for Dutch public sector cloud requirements.

Regulated Industries Support

NIS2 compliant infrastructure for critical infrastructure operators. DORA ready for financial institutions requiring operational resilience. CSRD-ready reporting for sustainability disclosures. AI Act compatible for high-risk AI systems requiring data governance and accountability.

Carbon Reducing

Calculate Your Yearly Emissions Reduction

Our compute-heavy machines are housed in apartment complexes and care homes. That means your workload reduces emissions from heating shower water by replacing natural gas use. With the heat from your workload, people get a hot shower! Find out by how much you can reduce emissions.

Use Cases

GPU Use Cases

Run AI workloads on GPU-accelerated Kubernetes clusters or VMs in the Netherlands. From machine learning training to real-time inference—all powered by Europe's most sustainable infrastructure.

Data Analytics

GPU-accelerated ETL, RAPIDS workflows, and big data processing faster than CPU-only clusters.

AI inference

Next-gen AI inference

Deploy large language models (LLMs) with low latency and high throughput. Whether you’re powering a chatbot, content generation, or knowledge retrieval, Blackwell helps you scale smoothly.

Media & streaming

Media & streaming pipelines

Accelerate real-time encoding, transcoding, and streaming with GPU-accelerated FFmpeg, alongside streaming tools like Apache Kafka. Perfect for video platforms, broadcasters, or live-event applications.

Research

Scientific Research

Climate simulations, genomics, material science, and advanced computational research with enterprise-grade reliability.

Computer Vision

Real-time object detection, image processing, and visual AI applications powered by optimized inference.

HPC & Simulation

High-performance computing for financial modeling, weather forecasting, molecular dynamics, and complex engineering simulations at unprecedented scale.

What is Blackwell?

NVIDIA's 5th Generation Architecture for AI Inference

Blackwell is NVIDIA's newest GPU architecture (2024), succeeding Hopper. RTX 6000 Blackwell features 5th-generation Tensor Cores optimized for FP8 and FP16 inference workloads. Each GPU provides 96GB of GDDR7 memory, 20% more than the H100 (80GB HBM3). Memory bandwidth of 1,800 GB/s enables high-throughput inference for large language models and multimodal AI. Power consumption is approximately 300W TDP, lower than the H100 (700W) for inference workloads. Available in Europe through Leafcloud's Amsterdam infrastructure.

96GB Memory Capacity

Run 70B parameter models with 4-bit quantization on a single GPU, or in full FP16 precision across 2 GPUs (about 140GB of weights). Run 405B models with 4-bit quantization across 4 GPUs. Supports multimodal models requiring large VRAM (Flamingo, CLIP variants). Multi-GPU scaling: 2 GPUs = 192GB, 4 GPUs = 384GB total.

Model Compatibility

Llama 3.1 405B (4-bit quantization, 4 GPUs), Llama 2 70B (full precision FP16/BF16 across 2 GPUs, or quantized on 1 GPU), GPT-J 6B through Falcon 180B, multimodal vision-language models. Larger batch sizes and longer context windows than 80GB alternatives.

EU Availability

First EU-sovereign provider offering Blackwell GPUs at this memory tier. Amsterdam data center, Dutch ownership, no US CLOUD Act exposure. HAVEN+ compatible infrastructure for regulated industries.

Blackwell vs Hopper

20% more memory (96GB vs 80GB H100). Newer 5th-gen Tensor Cores (2024 vs 2022). Lower power consumption for inference (300W vs 700W). 32% lower cost (€2.35/hour committed vs €3.45/hour H100).

Deploy RTX 6000 Blackwell GPUs

Launch your AI workloads on next-generation Blackwell architecture with Kubernetes orchestration on Europe's most sustainable cloud infrastructure.

Any Questions?

Yes. Leafcloud uses standard OpenStack APIs and supports common orchestration tools like Kubernetes and IaC solutions like Terraform and Ansible, making migration straightforward.

Your workload provides people in nursing homes and apartment blocks with emissions-free hot showers. Leafcloud operations are carbon-negative (-1.93 tonnes CO₂/kW-year at Leafsite, 2024 figures) by reusing server heat to warm water for residential buildings. We don't offset, trade carbon credits, or hide our emissions in Scope 3. We eliminate emissions through actual heat recovery.
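
As a rough illustration of how the per-kW-year figure translates to a single workload, the sketch below multiplies it by an average power draw. The linear scaling and the example power values are assumptions for illustration, not Leafcloud's own calculator.

```python
# Figure quoted in the answer above: 1.93 tonnes of CO2 avoided per kW-year.
TONNES_CO2_AVOIDED_PER_KW_YEAR = 1.93


def yearly_reduction_kg(average_power_kw: float, utilisation: float = 1.0) -> float:
    """Estimate kg of CO2 avoided per year.

    average_power_kw: assumed average electrical draw of the workload.
    utilisation: fraction of the year the workload is running (0 to 1).
    """
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return TONNES_CO2_AVOIDED_PER_KW_YEAR * 1000 * average_power_kw * utilisation


# Hypothetical example: a machine drawing 0.4 kW on average, running all year.
print(f"{yearly_reduction_kg(0.4):.0f} kg CO2 avoided per year")
```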

The NVIDIA RTX 6000 Blackwell has 96GB of GDDR7 ECC memory per GPU with 1,800 GB/s memory bandwidth.

Why 96GB matters for AI workloads:

Large Language Models (LLMs): Run large models on a single GPU with high memory capacity:

  • Large parameter models with quantization: 70B models with 4-bit quantization fit comfortably
  • Medium models in full precision (FP16/BF16): 30-40B parameter models run smoothly
  • Multi-GPU scaling: 2 GPUs = 192GB, 4 GPUs = 384GB total VRAM for even larger models
  • Multimodal models: Large vision-language models requiring significant context windows
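
The sizing guidance above follows from a simple estimate: weight memory is roughly parameter count times bytes per parameter. The sketch below applies that estimate. The 16GB headroom for KV cache and runtime buffers is a hypothetical allowance, and real requirements vary with context length, batch size, and serving framework.

```python
GPU_MEMORY_GB = 96  # per-GPU memory stated on this page


def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory for the model weights alone, in GB (1 GB = 10^9 bytes)."""
    return params_billion * bits_per_param / 8


def fits_on_one_gpu(params_billion: float, bits_per_param: int,
                    headroom_gb: float = 16.0) -> bool:
    """Check whether weights plus an assumed headroom fit in one GPU's memory."""
    return weight_memory_gb(params_billion, bits_per_param) + headroom_gb <= GPU_MEMORY_GB


for label, params, bits in (("70B at 4-bit", 70, 4),
                            ("34B at FP16", 34, 16),
                            ("70B at FP16", 70, 16)):
    size = weight_memory_gb(params, bits)
    verdict = "fits on 1 GPU" if fits_on_one_gpu(params, bits) else "needs multiple GPUs"
    print(f"{label}: about {size:.0f} GB of weights, {verdict}")
```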

Comparison to other GPUs:

  • H100 (80GB): RTX 6000 has 20% more memory per GPU, plus newer Blackwell architecture
  • A100 (80GB): Similar capacity, but RTX 6000 has newer architecture with GDDR7
  • A30 (24GB): A quarter of the memory, limited to smaller models or aggressive quantization

Memory bandwidth (1,800 GB/s): Critical for inference throughput. Higher bandwidth means faster token generation for LLMs and better performance for batch inference.
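
A common rule of thumb makes the bandwidth point concrete: when generating one token at a time, the GPU has to read roughly all model weights per token, so bandwidth divided by weight size gives an upper bound on single-stream speed. The sketch below applies that simplification. It ignores KV-cache traffic, compute limits, and batching, so treat the output as a ceiling rather than a benchmark.

```python
def max_decode_tokens_per_second(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode speed when memory-bandwidth bound.

    Assumes every generated token requires reading all weights once.
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


# 70B parameters at 4 bits per parameter is about 35 GB of weights.
WEIGHTS_GB = 70 * 4 / 8

# Bandwidth figures as quoted on this page, in GB/s.
for name, bandwidth in (("RTX 6000 Blackwell", 1800), ("H100", 3350)):
    bound = max_decode_tokens_per_second(bandwidth, WEIGHTS_GB)
    print(f"{name}: at most about {bound:.0f} tokens/second per stream")
```

Batching several requests together amortizes the weight reads, which is why aggregate throughput on a serving stack can exceed this per-stream ceiling.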

ECC (Error-Correcting Code): Enterprise-grade reliability - detects and corrects memory errors during long-running training or inference jobs.

Practical implications: With 96GB GDDR7 memory per GPU and Blackwell architecture, the RTX 6000 offers excellent value for production inference workloads, balancing capacity, performance, and cost efficiency. Scale from 1 to 4 GPUs based on model size requirements.

No. All infrastructure is in Amsterdam, Netherlands. Your data never leaves Europe, ensuring full GDPR compliance without US CLOUD Act concerns.

Choose RTX 6000 Blackwell for cost-effective inference with newer architecture and more memory, or H100 for maximum training throughput and FP8 optimization. Here's how they compare:

RTX 6000 Blackwell (96GB GDDR7) - Inference Focused:

  • Architecture: Blackwell (2024) - newest generation with 5th-gen Tensor Cores
  • Memory: 96GB GDDR7 per GPU (20% more than H100)
  • Memory bandwidth: 1,800 GB/s per GPU
  • Power: ~300W TDP (estimated) - more efficient than H100 for inference
  • Cost: €2.76/hour on-demand (€2.35/hour with commitment) - 20% cheaper than H100
  • Availability: 1x, 2x, 4x configurations (available now)
  • Best for: Inference, fine-tuning, multimodal AI, production deployments

H100 (80GB HBM3) - Training and Inference:

  • Architecture: Hopper (2022) - 4th-gen Tensor Cores
  • Memory: 80GB HBM3 per GPU
  • Memory bandwidth: 3.35 TB/s per GPU (1.86x faster than RTX 6000)
  • Power: 700W TDP - highest performance density for training
  • Cost: €3.45/hour on-demand - premium performance
  • Availability: 1x configuration only (Leafcloud)
  • Best for: Large-scale training, FP8 optimization, cutting-edge research

Key Differences:

Memory Capacity:

  • RTX 6000 Blackwell: 96GB per GPU = supports larger models per GPU
    • Example: Run Llama 3 70B with less aggressive quantization
    • Multi-GPU: 2x = 192GB, 4x = 384GB total VRAM
  • H100: 80GB per GPU = industry-proven capacity
    • Example: Run Llama 2 70B with INT8 quantization

Memory Bandwidth:

  • H100: 3.35 TB/s = faster data throughput for training
  • RTX 6000 Blackwell: 1,800 GB/s = sufficient for inference, slower for training

Architecture Generation:

  • RTX 6000 Blackwell: Newer 5th-gen Tensor Cores (2024)
  • H100: 4th-gen Tensor Cores (2022)

When to choose RTX 6000 Blackwell:

  1. Inference workloads: Serving large language models (70B-405B parameters) with vLLM or TensorRT-LLM
  2. Cost optimization: 32% cheaper than H100 with commitment (€2.35/hour committed vs €3.45/hour H100), 20% cheaper on-demand (€2.76/hour)
  3. Memory-intensive models: Larger batch sizes or longer context windows (96GB vs 80GB)
  4. Multi-GPU inference: Scale to 4x GPUs (384GB total) for very large models
  5. Fine-tuning: LoRA/QLoRA fine-tuning of 70B+ models
  6. Production deployments: Power-efficient inference for sustained workloads

When to choose H100:

  1. Large-scale training: Training models from scratch (not just fine-tuning)
  2. FP8 optimization: Workloads leveraging Transformer Engine for FP8 training
  3. Maximum bandwidth: Memory-bandwidth-bound workloads requiring 3.35 TB/s
  4. Proven at scale: Battle-tested in production for 2+ years

Real-world comparison (Llama 3 70B inference):

  • RTX 6000 Blackwell: ~50-70 tokens/second @ €2.35/hour (committed)
  • H100: ~60-80 tokens/second @ €3.45/hour
  • Cost efficiency: RTX 6000 Blackwell provides roughly 85% of H100 throughput (based on the ranges above) at 32% lower hourly cost
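
To compare the two options per unit of output rather than per hour, the throughput and price figures above can be combined into a cost per million tokens. The sketch below uses the midpoints of the quoted ranges, which is an assumption for illustration. Actual throughput depends on model, quantization, batch size, and serving stack.

```python
def eur_per_million_tokens(tokens_per_second: float, eur_per_hour: float) -> float:
    """Cost in EUR to generate one million tokens at a sustained rate."""
    tokens_per_hour = tokens_per_second * 3600
    return eur_per_hour / tokens_per_hour * 1_000_000


# Midpoints of the ranges quoted above (assumed, not measured).
rtx = eur_per_million_tokens(tokens_per_second=60, eur_per_hour=2.35)
h100 = eur_per_million_tokens(tokens_per_second=70, eur_per_hour=3.45)

print(f"RTX 6000 Blackwell: €{rtx:.2f} per million tokens")
print(f"H100:               €{h100:.2f} per million tokens")
print(f"Saving per token:   {(1 - rtx / h100) * 100:.0f}%")
```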

Real-world comparison (Fine-tuning 70B model with LoRA):

  • RTX 6000 Blackwell: Supports LoRA/QLoRA fine-tuning within 96GB memory, with sufficient bandwidth
  • H100: Faster fine-tuning due to higher memory bandwidth (3.35 TB/s)
  • Cost: RTX 6000 Blackwell 32% cheaper for overnight fine-tuning runs (€2.35/hour committed vs €3.45/hour H100)

Multi-GPU scenarios:

  • RTX 6000 Blackwell Quad Pro (4x GPUs): 384GB total VRAM @ €11.04/hour on-demand
    • Deploy 405B parameter models with quantization
  • H100 (1x GPU only): 80GB @ €3.45/hour
    • Single GPU limits scalability for very large models

Recommendation:

  • For inference and fine-tuning: RTX 6000 Blackwell offers better value with newer architecture, more memory, and lower cost
  • For large-scale training: H100 provides faster training throughput with higher memory bandwidth
  • For production deployment: RTX 6000 Blackwell is the new default for inference workloads, VM included

Leafcloud offers RTX 6000 Blackwell now in Amsterdam with configurations from 1x to 4x GPUs, providing cost-effective inference infrastructure with EU sovereignty.

Leafcloud has no hidden egress fees—a major cost saving compared to hyperscalers where data transfer costs can significantly increase your total bill. Leafcloud maintains a fair-use policy for network traffic. See Leafcloud Terms & Conditions for more details.

The NVIDIA RTX 6000 Blackwell is NVIDIA's 5th-generation professional GPU for AI and HPC workloads, launched in 2024-2025 as part of the Blackwell architecture family.

Key specifications:

  • 96GB GDDR7 ECC memory per GPU: High-capacity VRAM for large models and batch sizes
  • 1,800 GB/s memory bandwidth: High data throughput for inference-heavy workloads
  • 5th-generation Tensor Cores: Optimized for FP8, FP16, and INT8 inference with 2x throughput over Hopper architecture
  • PCIe Gen5 interface: High-speed connectivity for data center deployment

Comparison to H100:

  • Memory: 96GB vs 80GB (20% more capacity per GPU)
  • Newer architecture: Blackwell (2024) vs Hopper (2022)
  • Better FP8 support: Native FP8 Tensor Cores for efficient inference
  • Lower power per TFLOP: More efficient for sustained workloads

Enterprise features:

  • ECC memory (error-correcting code) for data integrity
  • Multi-GPU configurations: Scale from 1 to 4 GPUs (96GB to 384GB total VRAM)
  • Professional driver support and long-term availability
  • Validated for AI frameworks (PyTorch, TensorFlow, JAX, vLLM, TensorRT-LLM)

Ideal workloads: LLM inference (large parameter models), model fine-tuning, multimodal AI, video processing at scale, HPC simulations, scientific computing requiring high memory capacity.

Leafcloud configurations:

Three configurations available starting from €2.35/hour with commitment (€2.76/hour on-demand):

  • Blackwell Pro (1 GPU): 32 vCPU, 256GB RAM, 2TB NVMe - €2.76/hour on-demand (€2.35/hour with commitment)
  • Blackwell Duo Pro (2 GPUs): 64 vCPU, 512GB RAM, 4TB NVMe - €5.52/hour on-demand
  • Blackwell Quad Pro (4 GPUs): 128 vCPU, 1TB RAM, 8TB NVMe - €11.04/hour on-demand

Available now on Leafcloud infrastructure in Amsterdam, Netherlands. Commitment discounts available for 6, 12, and 36-month terms.
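
Choosing between the three configurations usually comes down to total VRAM. The sketch below encodes the configurations listed above and picks the smallest one that meets a VRAM requirement. The `Config` class and `smallest_fit` helper are hypothetical names for illustration, and 1TB of RAM is assumed to mean 1,024GB.

```python
from dataclasses import dataclass

GPU_MEMORY_GB = 96  # per-GPU memory stated on this page


@dataclass(frozen=True)
class Config:
    name: str
    gpus: int
    vcpus: int
    ram_gb: int
    nvme_tb: int
    eur_per_hour: float  # on-demand price

    @property
    def vram_gb(self) -> int:
        return self.gpus * GPU_MEMORY_GB


# Values copied from the configuration list above, ordered smallest to largest.
CONFIGS = [
    Config("Blackwell Pro", 1, 32, 256, 2, 2.76),
    Config("Blackwell Duo Pro", 2, 64, 512, 4, 5.52),
    Config("Blackwell Quad Pro", 4, 128, 1024, 8, 11.04),
]


def smallest_fit(required_vram_gb: float) -> Config:
    """Return the smallest configuration with at least the required total VRAM."""
    for config in CONFIGS:
        if config.vram_gb >= required_vram_gb:
            return config
    raise ValueError(f"no configuration offers {required_vram_gb} GB of VRAM")


choice = smallest_fit(140)  # e.g. a 70B model in FP16 (about 140 GB of weights)
print(f"{choice.name}: {choice.vram_gb} GB VRAM at €{choice.eur_per_hour:.2f}/hour on-demand")
```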

The RTX 6000 Blackwell is optimized for workloads requiring high memory capacity (96GB per GPU) and efficient inference with Blackwell architecture. Scale from 1 to 4 GPUs based on your needs. Ideal use cases:

AI Inference (Production):

  • LLM serving: Deploy large language models (70B+ parameters) with vLLM or TensorRT-LLM for chatbots, content generation, code assistants
  • Multimodal AI: Vision-language models (CLIP, Flamingo), text-to-image (Stable Diffusion XL), image understanding
  • Real-time inference: Low-latency applications requiring consistent sub-second response times
  • Batch inference: High-throughput workloads processing thousands of requests per hour
  • Multi-GPU scaling: Deploy 405B+ parameter models with Blackwell Duo Pro (2 GPUs) or Quad Pro (4 GPUs)

Model Fine-tuning & Training:

  • Fine-tune large models (70B+) on domain-specific data with LoRA/QLoRA
  • Train mid-to-large models (7B-70B) from scratch
  • Experiment with model architectures in single or multi-GPU setups

Video & Media Processing:

  • Real-time video encoding/transcoding with GPU-accelerated FFmpeg
  • AI video upscaling and enhancement (4K/8K workflows)
  • Live streaming pipelines with Apache Kafka + GPU processing
  • Broadcast-quality media production

Computer Vision:

  • Object detection and tracking at scale (surveillance, autonomous systems)
  • Image processing pipelines (medical imaging, satellite imagery)
  • Real-time visual AI (manufacturing quality control, retail analytics)

Scientific Computing & HPC:

  • Climate modeling and weather forecasting
  • Molecular dynamics simulations (drug discovery, materials science)
  • Financial modeling (risk analysis, options pricing)
  • Genomics and bioinformatics (sequence alignment, protein folding)

When to choose RTX 6000 Blackwell over H100: The RTX 6000 Blackwell offers newer Blackwell architecture with 96GB GDDR7 memory per GPU (20% more than H100), making it ideal for inference workloads requiring high memory capacity. For pure training throughput, H100 remains strong thanks to its higher memory bandwidth, but RTX 6000 Blackwell excels for inference, fine-tuning, and cost-efficient deployment at €2.35/hour with commitment (€2.76/hour on-demand), VM included.

RTX 6000 Blackwell is available now. Deploy immediately via the Leafcloud dashboard or API.

Three configurations are available:

  • Blackwell Pro (1 GPU, 32 vCPU, 256GB RAM, 2TB NVMe): €2.76/hour on-demand (€2.35/hour with commitment)
  • Blackwell Duo Pro (2 GPUs, 64 vCPU, 512GB RAM, 4TB NVMe): €5.52/hour on-demand
  • Blackwell Quad Pro (4 GPUs, 128 vCPU, 1TB RAM, 8TB NVMe): €11.04/hour on-demand

Commitment discounts available for 6, 12, and 36-month terms.

Get started at /rtx6000 or deploy directly via /signup.

Leafcloud offers lower TCO with no egress fees, EU data sovereignty (GDPR-native), 100% renewable energy and local heat-reuse resulting in actual emissions reduction. Open source technologies provide easy and repeatable deployments, portability, and avoid vendor lock-in. The Amsterdam-based infrastructure provides low latency to European customers, and pricing is predictable without hidden costs.