Selecting the appropriate Tensor One Cluster configuration is critical to maximizing your deployment’s performance and efficiency. Factors to consider include:
- GPU type
- RAM and VRAM capacity
- vCPU count
- Storage options (permanent and temporary)
Summary: Know Your Model’s Needs
Before choosing a Cluster, make sure you understand your model’s requirements. These can often be found in:
- The model card (e.g., on Hugging Face)
- The model’s config.json file
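When a model card does not state a parameter count, the architecture fields in config.json can give a ballpark figure. The sketch below is a rough approximation that assumes a GPT-style decoder-only transformer and Hugging Face-style keys (hidden_size, num_hidden_layers, vocab_size). The helper name and the example values are hypothetical, and other architectures use different keys and ratios.

```python
import json


def estimate_params_from_config(config: dict) -> int:
    """Rough parameter estimate for a GPT-style decoder-only transformer.

    Assumption: each layer holds about 12 * hidden_size^2 weights
    (about 4 * h^2 for attention projections plus 8 * h^2 for a 4x-wide MLP).
    Biases and layer norms are ignored.
    """
    h = config["hidden_size"]
    layers = config["num_hidden_layers"]
    vocab = config["vocab_size"]
    per_layer = 12 * h * h
    embeddings = vocab * h
    return layers * per_layer + embeddings


# Hypothetical config contents; in practice, load the model's config.json file.
example_config = json.loads(
    '{"hidden_size": 4096, "num_hidden_layers": 32, "vocab_size": 32000}'
)
estimated = estimate_params_from_config(example_config)
print(f"Roughly {estimated / 1e9:.1f}B parameters")
```

Treat the result as an order-of-magnitude check; the model card remains the better source when it lists an exact count.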
Finding Memory Requirements
- Check the model’s README or model card on Hugging Face
- Look for config.json files, which list architecture details (such as layer count and hidden size) from which the parameter count can be estimated
- Search for the model name + “VRAM requirements” or “memory usage”
- Check community discussions on Reddit, Discord, or GitHub issues
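Once a parameter count is known, a first-order VRAM estimate for inference is the parameter count multiplied by the bytes per parameter, plus some overhead. The sketch below is illustrative only: the function name is hypothetical, and the 1.2 overhead factor (for activations, runtime buffers, and fragmentation) is an assumption, not a measured value.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1, "int4": 0.5}


def estimate_inference_vram_gb(num_params: float, dtype: str = "fp16",
                               overhead: float = 1.2) -> float:
    """Estimate inference VRAM in GB (1 GB = 1e9 bytes).

    overhead is an assumed multiplier; real usage depends on the
    framework, batch size, and context length.
    """
    weights_bytes = num_params * BYTES_PER_PARAM[dtype]
    return weights_bytes * overhead / 1e9


for dtype in ("fp32", "fp16", "int4"):
    print(f"7B model, {dtype}: ~{estimate_inference_vram_gb(7e9, dtype):.1f} GB")
```

Training typically needs several times more memory than this, because gradients and optimizer state are stored alongside the weights.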
Key Selection Factors
Focus on the following core aspects when choosing a Cluster:
GPU
The GPU plays a vital role in computational tasks, especially for graphics-heavy or machine learning applications.
Why It Matters
- Enables fast parallel processing
- Reduces compute time for complex workloads
- Essential for AI/ML, image processing, and scientific computing
What to Consider
- Task Requirements: How compute-intensive is your workload?
- Compatibility: Does your software support the GPU type (e.g., CUDA)?
- Efficiency: Consider power usage for long-running processes
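To confirm which GPU a running environment actually exposes, one option is to query nvidia-smi, the command-line tool that ships with the NVIDIA driver. The sketch below assumes an NVIDIA GPU; the function names are hypothetical, and the code returns an empty list when the tool is not installed.

```python
import shutil
import subprocess


def parse_gpu_csv(text: str) -> list:
    """Parse 'name, memory' CSV lines into dicts; skip lines that do not parse."""
    gpus = []
    for line in text.strip().splitlines():
        parts = [field.strip() for field in line.rsplit(",", 1)]
        if len(parts) != 2:
            continue
        try:
            gpus.append({"name": parts[0], "vram_mib": int(parts[1])})
        except ValueError:
            continue  # e.g. memory reported as "[N/A]"
    return gpus


def list_nvidia_gpus() -> list:
    """Return visible NVIDIA GPUs, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


gpus = list_nvidia_gpus()
print(gpus if gpus else "No NVIDIA GPU detected")
```

Frameworks offer their own checks as well; this approach simply avoids depending on any particular one.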
VRAM (Video RAM)
VRAM is GPU memory used to store and process large datasets, models, and graphical assets.
Why It Matters
- Supports larger models and datasets
- Allows larger batch sizes and more work in parallel
- Crucial for training, inference, and rendering
What to Consider
- Graphics/AI Intensity: High-performance models and rendering tasks need more VRAM
- Concurrent Processing: More VRAM helps with multiple active jobs
- Future-Proofing: Choose higher VRAM for upcoming, larger workloads
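For transformer inference, one concrete reason concurrent jobs need more VRAM is the key/value cache, which grows with both batch size and context length. The sketch below uses the common back-of-the-envelope formula for that cache; the function name and the example dimensions are hypothetical, and real servers may use paging or quantization that changes the numbers.

```python
def kv_cache_gb(num_layers: int, num_kv_heads: int, head_dim: int,
                seq_len: int, batch_size: int, bytes_per_value: int = 2) -> float:
    """Approximate KV-cache size in GB (1 GB = 1e9 bytes).

    The leading factor of 2 accounts for one key tensor and one value
    tensor per layer. bytes_per_value=2 assumes fp16 or bf16 storage.
    """
    total_bytes = (2 * num_layers * num_kv_heads * head_dim
                   * seq_len * batch_size * bytes_per_value)
    return total_bytes / 1e9


# Hypothetical 32-layer model with 32 KV heads of dimension 128.
for batch in (1, 8):
    size = kv_cache_gb(32, 32, 128, seq_len=4096, batch_size=batch)
    print(f"batch={batch}: ~{size:.1f} GB of KV cache")
```

This memory comes on top of the model weights, so a GPU that barely fits the weights leaves little room for concurrent requests.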
Storage (Disk Size)
Storage determines how much data your Cluster can process, retain, and serve during operation.
Why It Matters
- Supports smooth runtime performance
- Helps with caching, data persistence, and checkpointing
What to Consider
- Data Volume: Estimate how much raw and processed data you’ll handle
- Speed: Faster storage shortens read/write times, which speeds up data loading and checkpointing
- Volatility: Use the disk volume for data that must persist and the container volume for temporary files
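A simple way to size the disk is to add up the main consumers (model weights, datasets, and checkpoints) and leave some headroom. The sketch below is illustrative: the function name, the example sizes, and the 1.2 headroom factor are all assumptions, and the free-space check inspects the current directory rather than any specific Tensor One mount point.

```python
import shutil


def required_disk_gb(model_gb: float, dataset_gb: float, checkpoint_gb: float,
                     num_checkpoints: int, headroom: float = 1.2) -> float:
    """Sum the main storage consumers and add headroom for caches and logs."""
    base = model_gb + dataset_gb + checkpoint_gb * num_checkpoints
    return base * headroom


# Hypothetical workload: 14 GB of weights, 50 GB of data, three 14 GB checkpoints.
needed = required_disk_gb(model_gb=14, dataset_gb=50,
                          checkpoint_gb=14, num_checkpoints=3)

# shutil.disk_usage reports total, used, and free bytes for the given path.
free_gb = shutil.disk_usage(".").free / 1e9
print(f"Estimated need: {needed:.1f} GB, free here: {free_gb:.1f} GB")
if free_gb < needed:
    print("Consider a larger disk volume.")
```

Checkpoints and datasets belong in the persistent total; scratch files that can be regenerated do not need to be counted against it.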