Create Cluster

Overview

The Create Cluster endpoint provisions new GPU clusters with configurable GPU types, storage, networking, and security settings. It is suited to ML training, development environments, and production AI workloads.

Endpoint

POST https://api.tensorone.ai/v1/clusters

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Cluster name (3-64 characters, alphanumeric and hyphens) |
| description | string | No | Optional cluster description |
| gpu_type | string | Yes | GPU type: A100, H100, RTX4090, V100, T4, RTX3090 |
| gpu_count | integer | Yes | Number of GPUs (1-8 depending on GPU type) |
| cpu_cores | integer | No | CPU cores (auto-calculated if not specified) |
| memory_gb | integer | No | RAM in GB (auto-calculated if not specified) |
| storage_gb | integer | Yes | Persistent storage in GB (minimum 50GB) |
| region | string | Yes | Deployment region |
| project_id | string | Yes | Project ID for organization |
| template_id | string | No | Template ID for pre-configured environments |
| docker_image | string | No | Custom Docker image (if not using template) |
| environment_variables | object | No | Environment variables for the cluster |
| ssh_enabled | boolean | No | Enable SSH access (default: true) |
| ssh_public_keys | array | No | SSH public keys for access |
| port_mappings | array | No | Port forwarding configuration (external_port 0 requests an auto-assigned port) |
| auto_start | boolean | No | Start cluster immediately (default: true) |
| auto_terminate | object | No | Auto-termination settings |
| network_config | object | No | Advanced networking configuration |
| security_groups | array | No | Security group IDs |
| tags | object | No | Resource tags for organization |
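Six of the parameters above are required. A minimal request body, plus a small client-side validator, can be sketched in Python; validate_cluster_config and minimal_body are illustrative helpers, not part of the API:

```python
# Required fields per the Request Body table above
REQUIRED_FIELDS = {"name", "gpu_type", "gpu_count", "storage_gb", "region", "project_id"}

def validate_cluster_config(config: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the body looks sendable."""
    errors = [f"missing required field: {f}" for f in sorted(REQUIRED_FIELDS - config.keys())]
    name = config.get("name", "")
    if name and not (3 <= len(name) <= 64):
        errors.append("name must be 3-64 characters")
    if config.get("storage_gb", 50) < 50:
        errors.append("storage_gb minimum is 50")
    return errors

# Smallest body that satisfies the required columns above
minimal_body = {
    "name": "my-cluster",
    "gpu_type": "A100",
    "gpu_count": 1,
    "storage_gb": 100,
    "region": "us-west-2",
    "project_id": "proj_123",
}
```

Running the validator before POSTing catches the same errors the API would return as a VALIDATION_ERROR.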

Request Examples

# Create basic ML training cluster
curl -X POST "https://api.tensorone.ai/v1/clusters" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "llm-training-cluster",
    "description": "Large language model training environment",
    "gpu_type": "A100",
    "gpu_count": 4,
    "storage_gb": 1000,
    "region": "us-west-2",
    "project_id": "proj_123",
    "template_id": "tmpl_pytorch_latest",
    "ssh_enabled": true,
    "auto_terminate": {
      "enabled": true,
      "idle_minutes": 60,
      "max_runtime_hours": 24
    }
  }'

# Create development cluster with custom Docker image
curl -X POST "https://api.tensorone.ai/v1/clusters" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "dev-environment",
    "gpu_type": "RTX4090",
    "gpu_count": 1,
    "cpu_cores": 16,
    "memory_gb": 64,
    "storage_gb": 500,
    "region": "us-east-1",
    "project_id": "proj_456",
    "docker_image": "pytorch/pytorch:2.1-cuda11.8-devel",
    "environment_variables": {
      "CUDA_VISIBLE_DEVICES": "0",
      "PYTHONPATH": "/workspace",
      "WANDB_API_KEY": "$WANDB_KEY"
    },
    "port_mappings": [
      {
        "internal_port": 8888,
        "external_port": 0,
        "protocol": "tcp",
        "description": "Jupyter Lab"
      }
    ]
  }'

Response Schema

{
  "success": true,
  "data": {
    "id": "cluster_abc123",
    "name": "llm-training-cluster",
    "description": "Large language model training environment",
    "status": "starting",
    "gpu_type": "A100",
    "gpu_count": 4,
    "cpu_cores": 32,
    "memory_gb": 256,
    "storage_gb": 1000,
    "region": "us-west-2",
    "project_id": "proj_123",
    "template_id": "tmpl_pytorch_latest",
    "docker_image": "tensorone/pytorch:2.1-cuda11.8",
    "ssh_enabled": true,
    "ssh_connection": {
      "host": "ssh-abc123.tensorone.ai",
      "port": 22,
      "username": "root",
      "status": "pending"
    },
    "port_mappings": [
      {
        "internal_port": 8888,
        "external_port": 32001,
        "protocol": "tcp",
        "description": "Jupyter Lab",
        "url": "https://cluster-abc123.tensorone.ai:32001"
      }
    ],
    "proxy_url": "https://cluster-abc123.tensorone.ai",
    "environment_variables": {
      "CUDA_VISIBLE_DEVICES": "0,1,2,3",
      "NCCL_SOCKET_IFNAME": "eth0"
    },
    "cost": {
      "hourly_rate": 8.50,
      "estimated_monthly": 6120.00,
      "currency": "USD"
    },
    "auto_terminate": {
      "enabled": true,
      "idle_minutes": 60,
      "max_runtime_hours": 24,
      "estimated_termination": "2024-01-16T14:30:00Z"
    },
    "network_config": {
      "private_ip": "10.0.1.15",
      "public_ip": "203.0.113.42",
      "bandwidth_limit_mbps": 1000
    },
    "security_groups": ["sg_default_ml"],
    "tags": {
      "team": "ml-research",
      "environment": "training"
    },
    "created_at": "2024-01-15T14:30:00Z",
    "updated_at": "2024-01-15T14:30:00Z",
    "estimated_ready_at": "2024-01-15T14:35:00Z"
  },
  "meta": {
    "request_id": "req_create_789",
    "estimated_setup_time_minutes": 5
  }
}
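A newly created cluster starts in "starting" status and becomes usable around estimated_ready_at. A polling sketch follows; the fetch callable stands in for a GET request on the cluster, since this page does not document a retrieval endpoint, so treat that shape as an assumption:

```python
import time

def wait_until_running(cluster_id: str, fetch, timeout_s: int = 600,
                       interval_s: int = 10) -> dict:
    """Poll until the cluster leaves the 'starting' status.

    fetch(cluster_id) must return the parsed response body, i.e. a dict with
    a "data" key shaped like the Response Schema above (hypothetical wrapper
    around a GET-cluster call).
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        data = fetch(cluster_id)["data"]
        if data["status"] != "starting":
            return data
        time.sleep(interval_s)
    raise TimeoutError(f"cluster {cluster_id} still starting after {timeout_s}s")
```

Keeping the transport behind a callable also makes the wait loop easy to test without network access.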

Configuration Options

GPU Types and Availability

| GPU Type | Memory | Cores | Max Count | Hourly Rate | Best For |
|---|---|---|---|---|---|
| A100 | 80GB | 6912 | 8 | $2.50+ | Large model training, inference |
| H100 | 80GB | 16896 | 8 | $4.00+ | Latest generation, fastest training |
| RTX4090 | 24GB | 16384 | 4 | $0.80+ | Development, medium models |
| V100 | 32GB | 5120 | 8 | $1.20+ | Legacy support, cost-effective |
| T4 | 16GB | 2560 | 4 | $0.50+ | Inference, light training |
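The memory and rate columns above can drive a simple selection helper. A sketch, where GPU_SPECS mirrors the table (rates are the "+" starting prices) and cheapest_gpu_for is an illustrative helper:

```python
# Memory, max count, and starting hourly rate per the GPU table above
GPU_SPECS = {
    "A100":    {"memory_gb": 80, "max_count": 8, "hourly_rate": 2.50},
    "H100":    {"memory_gb": 80, "max_count": 8, "hourly_rate": 4.00},
    "RTX4090": {"memory_gb": 24, "max_count": 4, "hourly_rate": 0.80},
    "V100":    {"memory_gb": 32, "max_count": 8, "hourly_rate": 1.20},
    "T4":      {"memory_gb": 16, "max_count": 4, "hourly_rate": 0.50},
}

def cheapest_gpu_for(memory_gb_needed: int) -> str:
    """Pick the lowest starting-rate GPU whose memory fits the workload."""
    candidates = [(spec["hourly_rate"], name)
                  for name, spec in GPU_SPECS.items()
                  if spec["memory_gb"] >= memory_gb_needed]
    if not candidates:
        raise ValueError(f"no single GPU type has {memory_gb_needed}GB of memory")
    return min(candidates)[1]
```

For example, a workload needing 24GB of GPU memory resolves to RTX4090 rather than the pricier 80GB cards.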

Storage Options

| Type | Min Size | Max Size | Performance | Use Case |
|---|---|---|---|---|
| ssd | 50GB | 10TB | High IOPS | OS, applications, fast data access |
| nvme | 100GB | 5TB | Ultra-high IOPS | Training data, checkpoints |
| hdd | 100GB | 50TB | Standard | Archives, large datasets |

Auto-termination Settings

{
  "auto_terminate": {
    "enabled": true,
    "idle_minutes": 30,           // Terminate after idle time
    "max_runtime_hours": 24,      // Maximum runtime limit
    "cost_limit_usd": 100.0,      // Cost-based termination
    "schedule": {                 // Scheduled termination
      "type": "cron",
      "expression": "0 18 * * 5"  // Every Friday at 6 PM
    }
  }
}
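When both max_runtime_hours and cost_limit_usd are set, whichever threshold is hit first terminates the cluster. A sketch for sizing the cost limit against the worst-case runtime cost, taking the hourly rate from the create response's cost.hourly_rate; the headroom margin is an illustrative choice, not an API parameter:

```python
def worst_case_cost(hourly_rate: float, max_runtime_hours: int) -> float:
    """Upper bound on spend if the cluster runs for the full runtime limit."""
    return round(hourly_rate * max_runtime_hours, 2)

def auto_terminate_config(hourly_rate: float, max_runtime_hours: int = 24,
                          idle_minutes: int = 30, headroom: float = 1.1) -> dict:
    """Build an auto_terminate block whose cost limit slightly exceeds the
    worst-case runtime cost, so the runtime limit normally fires first."""
    return {
        "enabled": True,
        "idle_minutes": idle_minutes,
        "max_runtime_hours": max_runtime_hours,
        "cost_limit_usd": round(
            worst_case_cost(hourly_rate, max_runtime_hours) * headroom, 2),
    }
```

At the $8.50/hour rate shown in the response example, a 24-hour limit bounds spend at $204 before headroom.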

Use Cases

ML Model Training

Create powerful multi-GPU clusters for training large language models and computer vision models.
import time

import requests

API_KEY = "YOUR_API_KEY"  # replace with your TensorOne API key

def create_training_cluster(model_size="large"):
    config = {
        "name": f"training-{model_size}-{int(time.time())}",
        "gpu_type": "A100" if model_size == "large" else "RTX4090",
        "gpu_count": 8 if model_size == "large" else 2,
        "storage_gb": 2000,
        "region": "us-west-2",
        "template_id": "tmpl_pytorch_distributed",
        "environment_variables": {
            "MODEL_SIZE": model_size,
            "BATCH_SIZE": "32" if model_size == "large" else "64"
        },
        "auto_terminate": {
            "enabled": True,
            "cost_limit_usd": 500.0
        }
    }
    
    response = requests.post(
        "https://api.tensorone.ai/v1/clusters",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json=config
    )
    
    return response.json()["data"]

Development Environment

Set up interactive development environments with Jupyter, VSCode, and debugging tools.
const API_KEY = process.env.TENSORONE_API_KEY;  // read the key from the environment

async function createDevEnvironment(teamMember) {
  const config = {
    name: `dev-${teamMember.username}`,
    description: `Development environment for ${teamMember.name}`,
    gpu_type: 'RTX4090',
    gpu_count: 1,
    storage_gb: 500,
    region: 'us-east-1',
    project_id: teamMember.project_id,
    template_id: 'tmpl_jupyter_vscode',
    ssh_public_keys: [teamMember.ssh_key],
    port_mappings: [
      { internal_port: 8888, external_port: 0, description: 'Jupyter' },
      { internal_port: 8080, external_port: 0, description: 'VSCode' }
    ],
    auto_terminate: {
      enabled: true,
      idle_minutes: 120  // 2 hours idle timeout
    }
  };
  
  const response = await fetch('https://api.tensorone.ai/v1/clusters', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer ' + API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(config)
  });
  
  return await response.json();
}

Production Inference

Deploy production-ready inference clusters with load balancing and auto-scaling.
import requests

API_KEY = "YOUR_API_KEY"  # replace with your TensorOne API key

def create_inference_cluster(model_name, replicas=3):
    config = {
        "name": f"inference-{model_name}",
        "description": f"Production inference cluster for {model_name}",
        "gpu_type": "T4",
        "gpu_count": 1,
        "cpu_cores": 8,
        "memory_gb": 32,
        "storage_gb": 200,
        "region": "us-east-1",
        "docker_image": f"myregistry/models:{model_name}",
        "environment_variables": {
            "MODEL_NAME": model_name,
            "BATCH_SIZE": "8",
            "MAX_CONCURRENT": "10"
        },
        "port_mappings": [
            {
                "internal_port": 8000,
                "external_port": 80,
                "protocol": "tcp",
                "description": "API Endpoint"
            }
        ],
        "network_config": {
            "enable_load_balancer": True,
            "health_check_path": "/health"
        },
        "auto_terminate": {
            "enabled": False  # Keep running for production
        }
    }
    
    # Create multiple replicas
    clusters = []
    for i in range(replicas):
        config["name"] = f"inference-{model_name}-{i+1}"
        response = requests.post(
            "https://api.tensorone.ai/v1/clusters",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json=config
        )
        clusters.append(response.json()["data"])
    
    return clusters

Error Handling

{
  "success": false,
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid cluster configuration",
    "details": {
      "gpu_count": "Maximum 4 GPUs allowed for RTX4090",
      "storage_gb": "Minimum storage is 50GB",
      "region": "Region 'invalid-region' is not available"
    }
  }
}
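Clients should branch on the success flag and surface the per-field details map rather than just the top-level message. A sketch assuming the error envelope above; ClusterCreateError and unwrap_create_response are illustrative helpers:

```python
class ClusterCreateError(Exception):
    """Raised when the API reports success=false; carries field-level details."""
    def __init__(self, code: str, message: str, details: dict):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.details = details

def unwrap_create_response(payload: dict) -> dict:
    """Return the cluster data on success, or raise with validation details."""
    if payload.get("success"):
        return payload["data"]
    err = payload.get("error", {})
    raise ClusterCreateError(err.get("code", "UNKNOWN"),
                             err.get("message", "request failed"),
                             err.get("details", {}))
```

For a VALIDATION_ERROR, the details dict maps each rejected parameter (gpu_count, storage_gb, region in the example above) to a human-readable reason, which can be echoed back to the user field by field.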

Security Considerations

  • SSH Keys: Always use strong SSH key pairs and rotate them regularly
  • Network Security: Configure security groups and firewall rules appropriately
  • Environment Variables: Never store secrets in plain text; use encrypted secrets
  • Access Control: Ensure proper project-based access controls
  • Cost Monitoring: Implement cost alerts to prevent unexpected charges

Best Practices

  1. Resource Planning: Choose GPU types based on your specific workload requirements
  2. Cost Optimization: Use auto-termination to prevent runaway costs
  3. Data Management: Plan storage requirements and backup strategies
  4. Security: Implement proper access controls and network security
  5. Monitoring: Set up alerts for cluster status and performance metrics
  6. Template Usage: Use templates for consistent, repeatable deployments

Authorizations

Authorization (string, header, required): API key authentication, sent as 'Bearer YOUR_API_KEY'.

Body

application/json: the cluster configuration object described under Request Body.

Response

Cluster created successfully; the response body is the object described under Response Schema.