Question 1

Can I migrate existing GPU workloads from AWS, Azure, or GCP to Leafcloud?

Accepted Answer

Yes. Leafcloud uses standard OpenStack APIs and supports common orchestration tools such as Kubernetes and IaC solutions such as Terraform and Ansible, which makes migration straightforward.
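As a rough illustration of what "standard OpenStack APIs" means in practice, the sketch below lists compute flavors through the community OpenStack SDK for Python (`openstacksdk`). The cloud name passed to `list_flavor_names` refers to an entry in your own `clouds.yaml`, and the keyword used for filtering is hypothetical; neither reflects actual Leafcloud naming.

```python
from typing import Iterable, List


def pick_flavors(flavor_names: Iterable[str], keyword: str) -> List[str]:
    """Return the flavor names that contain `keyword` (case-insensitive), sorted."""
    needle = keyword.lower()
    return sorted(name for name in flavor_names if needle in name.lower())


def list_flavor_names(cloud: str) -> List[str]:
    """List compute flavor names on an OpenStack cloud.

    `cloud` is the name of an entry in your clouds.yaml (an assumption here;
    use whatever name you configured for your provider).
    """
    import openstack  # from the `openstacksdk` package, imported lazily

    conn = openstack.connect(cloud=cloud)
    return [flavor.name for flavor in conn.compute.flavors()]
```

With credentials configured, something like `pick_flavors(list_flavor_names("my-cloud"), "gpu")` would narrow the list down to GPU flavors, assuming the flavor names contain that keyword. The same script should work unchanged against any OpenStack-based provider.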

Question 2

How does Leafcloud's sustainability differ from hyperscalers?

Accepted Answer

Your workload provides people in nursing homes and apartment blocks with emissions-free hot showers. Leafcloud operations are carbon-negative (-1.93 tonnes CO₂/kW-year at Leafsite, 2024 figures) because server heat is reused to warm water for residential buildings. We don't offset, trade carbon credits, or hide our emissions in Scope 3; we eliminate emissions through actual heat recovery.

Question 3

How much memory does the RTX 6000 Blackwell have?

Accepted Answer

The NVIDIA RTX 6000 Blackwell has 96GB of GDDR7 ECC memory per GPU with 1,800 GB/s memory bandwidth.

Why 96GB matters for AI workloads:

Large Language Models (LLMs): High memory capacity lets you run large models on a single GPU:

Large parameter models with quantization: 70B models with 4-bit quantization fit comfortably
Medium models in unquantized half precision (FP16/BF16): 30-40B parameter models run smoothly
Multi-GPU scaling: 2 GPUs = 192GB, 4 GPUs = 384GB total VRAM for even larger models
Multimodal models: Large vision-language models requiring significant context windows
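The sizing claims above can be sanity-checked with simple arithmetic: weight memory is roughly the parameter count times the bytes per parameter. The sketch below is an estimate only. It counts weights alone and ignores the KV cache, activations, and framework overhead, so real deployments need extra headroom. The helper names are illustrative.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: only model weights are counted (no KV cache, activations,
# or framework overhead), and 1 GB = 1e9 bytes.

GPU_MEMORY_GB = 96  # per-GPU memory quoted above

BYTES_PER_PARAM = {
    "fp16": 2.0,  # same size as BF16
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Approximate size of the weights in GB."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits(params_billion: float, precision: str, gpus: int = 1) -> bool:
    """True if the weights alone fit in the combined memory of `gpus` GPUs."""
    return weight_memory_gb(params_billion, precision) <= GPU_MEMORY_GB * gpus


for params, precision, gpus in [(70, "int4", 1), (40, "fp16", 1),
                                (70, "fp16", 1), (70, "fp16", 2)]:
    size = weight_memory_gb(params, precision)
    print(f"{params}B {precision} on {gpus} GPU(s): "
          f"{size:.0f} GB of weights, fits={fits(params, precision, gpus)}")
```

By this estimate a 70B model at 4-bit needs about 35 GB for weights and a 40B model at FP16 about 80 GB, which leaves little room for the KV cache on a single GPU in the second case.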

Comparison to other GPUs:

H100 (80GB): RTX 6000 has 20% more memory per GPU, plus the newer Blackwell architecture
A100 (80GB): RTX 6000 has 20% more memory per GPU, plus a newer architecture with GDDR7
A30 (24GB): A quarter of the memory - limited to smaller models or aggressive quantization

Memory bandwidth (1,800 GB/s): Critical for inference throughput. Higher bandwidth means faster token generation for LLMs and better performance for batch inference.
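To see why bandwidth drives token generation, a common back-of-the-envelope bound assumes that producing one token requires reading every model weight from GPU memory once. The sketch below applies that simplification to the 1,800 GB/s figure. It is an upper bound for a single request at batch size 1, not a benchmark, and the weight sizes are illustrative estimates.

```python
# Back-of-the-envelope bound for single-stream LLM decoding.
# Assumption: generating one token reads every model weight from GPU
# memory once, so tokens/s <= bandwidth / weight size. Real throughput
# also depends on KV-cache reads, kernel efficiency, and batching.

MEMORY_BANDWIDTH_GB_S = 1800  # figure quoted above


def max_decode_tokens_per_second(
    weights_gb: float, bandwidth_gb_s: float = MEMORY_BANDWIDTH_GB_S
) -> float:
    """Upper bound on tokens/s for one request at batch size 1."""
    return bandwidth_gb_s / weights_gb


# 70B parameters at 4-bit is about 35 GB; 40B at FP16 is about 80 GB.
for label, weights_gb in [("70B, 4-bit", 35.0), ("40B, FP16", 80.0)]:
    bound = max_decode_tokens_per_second(weights_gb)
    print(f"{label}: at most ~{bound:.1f} tokens/s per request")
```

Batching several requests together reuses each weight read across requests, which is why aggregate throughput in batch inference can be far higher than this per-request bound.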

ECC (Error-Correcting Code): Enterprise-grade reliability - detects and corrects memory errors during long-running training or inference jobs.

Practical implications: With 96GB GDDR7 memory per GPU and Blackwell architecture, the RTX 6000 offers excellent value for production inference workloads, balancing capacity, performance, and cost efficiency. Scale from 1 to 4 GPUs based on model size requirements.

Question 4

Is my data subject to US jurisdiction on Leafcloud?

Accepted Answer

No. All infrastructure is in Amsterdam, Netherlands. Your data never leaves Europe, ensuring full GDPR compliance without US CLOUD Act concerns.

Question 5

RTX 6000 Blackwell vs H100: Which GPU for inference?

Accepted Answer

Choose RTX 6000 Blackwell for cost-effective inference with newer architecture and more memory, or H100 for maximum training throughput and FP8 optimization. Here's how they compare:

RTX 6000 Blackwell (96GB GDDR7) - Inference Focused:

Architecture: Blackwell (2024) - newest generation with 5th-gen Tensor Cores
Memory: 96GB GDDR7 per GPU (20% more than H100)
Memory bandwidth: 1,800 GB/s per GPU
Power: ~300W TDP (estimated) - more efficient than H100 for inference
Cost: €2.76/hour on-demand (€2.35/hour with commitment) - 20% cheaper than H100 on-demand, 32% cheaper with commitment
Availability: 1x, 2x, 4x configurations (available now)
Best for: Inference, fine-tuning, multimodal AI, production deployments

H100 (80GB HBM3) - Training and Inference:

Architecture: Hopper (2022) - 4th-gen Tensor Cores
Memory: 80GB HBM3 per GPU
Memory bandwidth: 3.35 TB/s per GPU (1.86x faster than RTX 6000)
Power: 700W TDP - highest performance density for training
Cost: €3.45/hour on-demand - premium performance
Availability: 1x configuration only (Leafcloud)
Best for: Large-scale training, FP8 optimization, cutting-edge research

Key Differences:

Memory Capacity:

RTX 6000 Blackwell: 96GB per GPU = supports larger models per GPU
- Example: Run Llama 3 70B with less aggressive quantization
- Multi-GPU: 2x = 192GB, 4x = 384GB total VRAM
H100: 80GB per GPU = industry-proven capacity
- Example: Run Llama 2 70B with INT8 quantization

Memory Bandwidth:

H100: 3.35 TB/s = faster data throughput for training
RTX 6000 Blackwell: 1,800 GB/s = sufficient for inference, slower for training

Architecture Generation:

RTX 6000 Blackwell: Newer 5th-gen Tensor Cores (2024)
H100: 4th-gen Tensor Cores (2022)

When to choose RTX 6000 Blackwell:

Inference workloads: Serving large language models (70B-405B parameters) with vLLM or TensorRT-LLM
Cost optimization: 32% cheaper than H100 with commitment (€2.35/hour vs €3.45/hour), or 20% cheaper on-demand (€2.76/hour vs €3.45/hour)
Memory-intensive models: Larger batch sizes or longer context windows (96GB vs 80GB)
Multi-GPU inference: Scale to 4x GPUs (384GB total) for very large models
Fine-tuning: LoRA/QLoRA fine-tuning of 70B+ models
Production deployments: Power-efficient inference for sustained workloads

When to choose H100:

Large-scale training: Training models from scratch (not just fine-tuning)
FP8 optimization: Workloads leveraging Transformer Engine for FP8 training
Maximum bandwidth: Memory-bandwidth-bound workloads requiring 3.35 TB/s
Proven at scale: Battle-tested in production for 2+ years

Real-world comparison (Llama 3 70B inference):

RTX 6000 Blackwell: ~50-70 tokens/second @ €2.35/hour (committed)
H100: ~60-80 tokens/second @ €3.45/hour
Cost efficiency: Based on the ranges above, RTX 6000 Blackwell delivers roughly 83-88% of H100 throughput at a 32% lower hourly rate (committed pricing)
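The hourly prices and throughput ranges above can be combined into a cost per million generated tokens. The sketch below uses the midpoint of each quoted range, which is an assumption made for illustration. Actual throughput depends on the model, quantization, batch size, and serving stack.

```python
# Cost per million tokens from an hourly price and a sustained token rate.
# Assumption: the midpoint of each quoted tokens/second range is used,
# and the GPU is busy for the whole billed hour.

SECONDS_PER_HOUR = 3600


def cost_per_million_tokens(eur_per_hour: float, tokens_per_second: float) -> float:
    """Euros spent per one million generated tokens."""
    tokens_per_hour = tokens_per_second * SECONDS_PER_HOUR
    return eur_per_hour / tokens_per_hour * 1_000_000


def midpoint(low: float, high: float) -> float:
    return (low + high) / 2


rtx_tps = midpoint(50, 70)   # RTX 6000 Blackwell range quoted above
h100_tps = midpoint(60, 80)  # H100 range quoted above

rtx_cost = cost_per_million_tokens(2.35, rtx_tps)    # committed price
h100_cost = cost_per_million_tokens(3.45, h100_tps)  # on-demand price

print(f"RTX 6000 Blackwell: EUR {rtx_cost:.2f} per million tokens")
print(f"H100:               EUR {h100_cost:.2f} per million tokens")
print(f"Relative throughput: {rtx_tps / h100_tps:.0%}")
```

Under these assumptions the RTX 6000 Blackwell comes out at roughly €10.9 per million tokens versus €13.7 for the H100, about 20% less per token despite the lower raw throughput.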

Real-world comparison (Fine-tuning 70B model with LoRA):

RTX 6000 Blackwell: Supports LoRA/QLoRA fine-tuning within 96GB memory, with sufficient bandwidth
H100: Faster fine-tuning due to higher memory bandwidth (3.35 TB/s)
Cost: RTX 6000 Blackwell 32% cheaper for overnight fine-tuning runs (€2.35/hour committed vs €3.45/hour H100)

Multi-GPU scenarios:

RTX 6000 Blackwell Quad Pro (4x GPUs): 384GB total VRAM @ €11.04/hour on-demand
- Deploy 405B parameter models with quantization
H100 (1x GPU only): 80GB @ €3.45/hour
- Single GPU limits scalability for very large models

Recommendation:

For inference and fine-tuning: RTX 6000 Blackwell offers better value with newer architecture, more memory, and lower cost
For large-scale training: H100 provides faster training throughput with higher memory bandwidth
For production deployment: RTX 6000 Blackwell is the new default for inference workloads, VM included

Leafcloud offers RTX 6000 Blackwell now in Amsterdam with configurations from 1x to 4x GPUs, providing cost-effective inference infrastructure with EU sovereignty.

Question 6

What are the networking egress fees on Leafcloud?

Accepted Answer

Leafcloud has no hidden egress fees—a major cost saving compared to hyperscalers where data transfer costs can significantly increase your total bill. Leafcloud maintains a fair-use policy for network traffic. See Leafcloud Terms & Conditions for more details.

Question 7

What is the NVIDIA RTX 6000 Blackwell?

Accepted Answer

The NVIDIA RTX 6000 Blackwell is NVIDIA's 5th-generation professional GPU for AI and HPC workloads, launched in 2024-2025 as part of the Blackwell architecture family.

Key specifications:

96GB GDDR7 ECC memory per GPU: High-capacity VRAM for large models and batch sizes
1,800 GB/s memory bandwidth: High data throughput for inference-heavy workloads
5th-generation Tensor Cores: Optimized for FP8, FP16, and INT8 inference with 2x throughput over Hopper architecture
PCIe Gen5 interface: High-speed connectivity for data center deployment

Comparison to H100:

Memory: 96GB vs 80GB (20% more capacity per GPU)
Newer architecture: Blackwell (2024) vs Hopper (2022)
Better FP8 support: Native FP8 Tensor Cores for efficient inference
Lower power per TFLOP: More efficient for sustained workloads

Enterprise features:

ECC memory (error-correcting code) for data integrity
Multi-GPU configurations: Scale from 1 to 4 GPUs (96GB to 384GB total VRAM)
Professional driver support and long-term availability
Validated for AI frameworks (PyTorch, TensorFlow, JAX, vLLM, TensorRT-LLM)

Ideal workloads: LLM inference (large parameter models), model fine-tuning, multimodal AI, video processing at scale, HPC simulations, scientific computing requiring high memory capacity.

Leafcloud configurations:

Three configurations available starting from €2.35/hour with commitment (€2.76/hour on-demand):

Blackwell Pro (1 GPU): 32 vCPU, 256GB RAM, 2TB NVMe - €2.76/hour on-demand (€2.35/hour with commitment)
Blackwell Duo Pro (2 GPUs): 64 vCPU, 512GB RAM, 4TB NVMe - €5.52/hour on-demand
Blackwell Quad Pro (4 GPUs): 128 vCPU, 1TB RAM, 8TB NVMe - €11.04/hour on-demand

Available now on Leafcloud infrastructure in Amsterdam, Netherlands. Commitment discounts available for 6, 12, and 36-month terms.
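Choosing between the three configurations mostly comes down to how much GPU memory a model needs. The sketch below picks the smallest configuration whose combined VRAM covers a given requirement. The configuration names and on-demand prices are taken from the list above, while the 20% headroom for KV cache and runtime overhead is an assumed default, not a Leafcloud recommendation.

```python
from dataclasses import dataclass
from typing import Optional

GPU_MEMORY_GB = 96  # per-GPU memory quoted above


@dataclass(frozen=True)
class Config:
    name: str
    gpus: int
    on_demand_eur_per_hour: float

    @property
    def vram_gb(self) -> int:
        return self.gpus * GPU_MEMORY_GB


# Ordered from smallest to largest, prices as listed above.
CONFIGS = [
    Config("Blackwell Pro", 1, 2.76),
    Config("Blackwell Duo Pro", 2, 5.52),
    Config("Blackwell Quad Pro", 4, 11.04),
]


def smallest_config(weights_gb: float, headroom: float = 0.2) -> Optional[Config]:
    """Smallest configuration that fits the weights plus headroom.

    `headroom` (assumed 20%) reserves memory for KV cache and overhead.
    Returns None when even the largest configuration is too small.
    """
    needed_gb = weights_gb * (1 + headroom)
    for config in CONFIGS:
        if config.vram_gb >= needed_gb:
            return config
    return None


for label, weights_gb in [("70B at 4-bit", 35.0), ("70B at FP16", 140.0),
                          ("405B at 4-bit", 202.5)]:
    choice = smallest_config(weights_gb)
    print(label, "->", choice.name if choice else "does not fit")
```

With these inputs a 4-bit 70B model lands on Blackwell Pro, an FP16 70B model on Duo Pro, and a 4-bit 405B model on Quad Pro.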

Question 8

What workloads are best suited for the RTX 6000 Blackwell?

Accepted Answer

The RTX 6000 Blackwell is optimized for workloads requiring high memory capacity (96GB per GPU) and efficient inference with Blackwell architecture. Scale from 1 to 4 GPUs based on your needs. Ideal use cases:

AI Inference (Production):

LLM serving: Deploy large language models (70B+ parameters) with vLLM or TensorRT-LLM for chatbots, content generation, code assistants
Multimodal AI: Vision-language models (CLIP, Flamingo), text-to-image (Stable Diffusion XL), image understanding
Real-time inference: Low-latency applications requiring consistent sub-second response times
Batch inference: High-throughput workloads processing thousands of requests per hour
Multi-GPU scaling: Deploy 405B+ parameter models with Blackwell Duo Pro (2 GPUs) or Quad Pro (4 GPUs)

Model Fine-tuning & Training:

Fine-tune large models (70B+) on domain-specific data with LoRA/QLoRA
Train mid-to-large models (7B-70B) from scratch
Experiment with model architectures in single or multi-GPU setups

Video & Media Processing:

Real-time video encoding/transcoding with GPU-accelerated FFmpeg
AI video upscaling and enhancement (4K/8K workflows)
Live streaming pipelines with Apache Kafka + GPU processing
Broadcast-quality media production

Computer Vision:

Object detection and tracking at scale (surveillance, autonomous systems)
Image processing pipelines (medical imaging, satellite imagery)
Real-time visual AI (manufacturing quality control, retail analytics)

Scientific Computing & HPC:

Climate modeling and weather forecasting
Molecular dynamics simulations (drug discovery, materials science)
Financial modeling (risk analysis, options pricing)
Genomics and bioinformatics (sequence alignment, protein folding)

When to choose RTX 6000 Blackwell over H100: The RTX 6000 Blackwell pairs the newer Blackwell architecture with 96GB of GDDR7 memory per GPU (20% more than H100), making it ideal for inference workloads that need high memory capacity. For pure training throughput and memory bandwidth, H100 remains the stronger choice, but RTX 6000 Blackwell excels at inference, fine-tuning, and cost-efficient deployment at €2.35/hour with commitment (€2.76/hour on-demand), VM included.

Question 9

When will the RTX 6000 Blackwell be available on Leafcloud?

Accepted Answer

RTX 6000 Blackwell is available now. Deploy immediately via the Leafcloud dashboard or API.

Three configurations are available:

Blackwell Pro (1 GPU, 32 vCPU, 256GB RAM, 2TB NVMe): €2.76/hour on-demand (€2.35/hour with commitment)
Blackwell Duo Pro (2 GPUs, 64 vCPU, 512GB RAM, 4TB NVMe): €5.52/hour on-demand
Blackwell Quad Pro (4 GPUs, 128 vCPU, 1TB RAM, 8TB NVMe): €11.04/hour on-demand

Commitment discounts available for 6, 12, and 36-month terms.
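Whether a commitment pays off depends on how many hours the instance actually runs. The sketch below compares the two Blackwell Pro rates under stated assumptions: a committed instance is billed for every hour of the month, an on-demand instance only while it runs, and a month averages 730 hours. Which commitment term the €2.35/hour rate corresponds to is not stated above, so treat the result as indicative.

```python
# On-demand vs committed pricing for Blackwell Pro (1 GPU).
# Assumptions: committed capacity is billed for all hours, on-demand only
# for hours actually used, and an average month has 730 hours.

ON_DEMAND_EUR_PER_HOUR = 2.76
COMMITTED_EUR_PER_HOUR = 2.35
HOURS_PER_MONTH = 730


def monthly_cost_on_demand(utilization: float) -> float:
    """Monthly cost when the instance runs `utilization` (0..1) of the time."""
    return ON_DEMAND_EUR_PER_HOUR * HOURS_PER_MONTH * utilization


def monthly_cost_committed() -> float:
    """Monthly cost of a committed instance, billed around the clock."""
    return COMMITTED_EUR_PER_HOUR * HOURS_PER_MONTH


def break_even_utilization() -> float:
    """Utilization above which the committed rate is cheaper."""
    return COMMITTED_EUR_PER_HOUR / ON_DEMAND_EUR_PER_HOUR


print(f"Committed:        EUR {monthly_cost_committed():.2f} per month")
print(f"On-demand (100%): EUR {monthly_cost_on_demand(1.0):.2f} per month")
print(f"On-demand (50%):  EUR {monthly_cost_on_demand(0.5):.2f} per month")
print(f"Break-even utilization: {break_even_utilization():.1%}")
```

Under these assumptions the break-even point sits near 85% utilization: steady production inference tends to favor a commitment, while bursty or experimental workloads tend to favor on-demand.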

Get started at /rtx6000 or deploy directly via /signup.

Question 10

Why choose Leafcloud over AWS, Azure, or Google Cloud for GPU computing?

Accepted Answer

Leafcloud offers lower TCO with no egress fees, EU data sovereignty (GDPR-native), and 100% renewable energy combined with local heat reuse, resulting in actual emissions reduction. Open source technologies provide easy and repeatable deployments, offer portability, and avoid vendor lock-in. The Amsterdam-based infrastructure provides low latency to European customers, and pricing is predictable without hidden costs.

Provider	Price per Hour	Bandwidth	CPU / RAM	Storage	Region
RunPod	$1.89	1.5TB/s	16 cores / 188GB	—	Global
Leafcloud	€2.76	1.5TB/s	32 cores / 256GB	2TB NVMe	Amsterdam, NL
AWS	$3.36	1.5TB/s	8 cores / 64GB	—	US/EU
Lambda Labs	N/A	N/A	—	—	—
Azure	N/A	N/A	—	—	—

Unbeatable sustainability

EU Data Sovereignty

Lower TCO than Hyperscalers

Open Standards & Kubernetes

No US CLOUD Act Exposure

GDPR Native Data Residency

ISO 27001 & SOC 2 Type II Certified

Regulated Industries Support
