List Clusters
curl --request GET \
  --url https://api.tensorone.ai/v2/clusters \
  --header 'Authorization: <api-key>'
[
  {
    "id": "<string>",
    "name": "<string>",
    "status": "running",
    "gpuType": "<string>",
    "containerDiskSize": 123,
    "volumeSize": 123,
    "createdAt": "2023-11-07T05:31:56Z"
  }
]

Overview

The List Clusters endpoint returns the GPU clusters associated with your account and supports filtering, pagination, and sorting through query parameters. Use it to keep track of large fleets of GPU resources across projects and environments.

Endpoint

GET https://api.tensorone.ai/v1/clusters

Query Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| page | integer | No | Page number for pagination (default: 1) |
| limit | integer | No | Number of clusters per page (default: 20, max: 100) |
| status | string | No | Filter by cluster status: running, stopped, starting, stopping, error, pending |
| gpu_type | string | No | Filter by GPU type: A100, H100, RTX4090, V100, T4 |
| region | string | No | Filter by region: us-east-1, us-west-2, eu-west-1, ap-southeast-1 |
| project_id | string | No | Filter by project ID |
| template_id | string | No | Filter by template ID |
| sort_by | string | No | Sort field: created_at, name, status, gpu_count, cost |
| sort_order | string | No | Sort order: asc, desc (default: desc) |
| search | string | No | Search clusters by name or description |
| min_gpu_count | integer | No | Minimum number of GPUs |
| max_gpu_count | integer | No | Maximum number of GPUs |
| created_after | string | No | Filter clusters created after date (ISO 8601) |
| created_before | string | No | Filter clusters created before date (ISO 8601) |
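
The parameters above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: `build_clusters_url` and `ALLOWED_VALUES` are hypothetical names, and the allowed values are copied from the parameter list above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.tensorone.ai/v1/clusters"

# Allowed values copied from the query parameter list above.
ALLOWED_VALUES = {
    "status": {"running", "stopped", "starting", "stopping", "error", "pending"},
    "gpu_type": {"A100", "H100", "RTX4090", "V100", "T4"},
    "region": {"us-east-1", "us-west-2", "eu-west-1", "ap-southeast-1"},
    "sort_by": {"created_at", "name", "status", "gpu_count", "cost"},
    "sort_order": {"asc", "desc"},
}


def build_clusters_url(**params):
    """Hypothetical helper: validate query parameters and return the full URL."""
    # Drop parameters that were passed as None so they are not sent at all.
    query = {key: value for key, value in params.items() if value is not None}

    limit = query.get("limit")
    if limit is not None and not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")

    for name, allowed in ALLOWED_VALUES.items():
        if name in query and query[name] not in allowed:
            raise ValueError(f"{name} must be one of: {', '.join(sorted(allowed))}")

    return f"{BASE_URL}?{urlencode(query)}" if query else BASE_URL
```

Calling `build_clusters_url(status="running", gpu_type="A100", region="us-east-1")` yields the same URL as the second curl request in the Request Examples section. Validating locally only saves a round trip; the server remains the authority on which values are accepted.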

Request Examples

# List all clusters with basic pagination
curl -X GET "https://api.tensorone.ai/v1/clusters?page=1&limit=20" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"

# Filter running A100 clusters in us-east-1
curl -X GET "https://api.tensorone.ai/v1/clusters?status=running&gpu_type=A100&region=us-east-1" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"

# Search and sort clusters by cost
curl -X GET "https://api.tensorone.ai/v1/clusters?search=training&sort_by=cost&sort_order=desc" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"

Response Schema

{
  "success": true,
  "data": {
    "clusters": [
      {
        "id": "cluster_abc123",
        "name": "ml-training-cluster",
        "description": "High-performance cluster for LLM training",
        "status": "running",
        "gpu_type": "A100",
        "gpu_count": 8,
        "cpu_cores": 64,
        "memory_gb": 512,
        "storage_gb": 2000,
        "region": "us-west-2",
        "project_id": "proj_456",
        "template_id": "tmpl_789",
        "ssh_enabled": true,
        "port_mappings": [
          {
            "internal_port": 8080,
            "external_port": 32001,
            "protocol": "tcp"
          }
        ],
        "proxy_url": "https://cluster-abc123.tensorone.ai",
        "ssh_connection": {
          "host": "ssh-abc123.tensorone.ai",
          "port": 22,
          "username": "root"
        },
        "metrics": {
          "gpu_utilization": 85.2,
          "memory_utilization": 67.8,
          "cpu_utilization": 45.1
        },
        "cost": {
          "hourly_rate": 12.50,
          "current_session_cost": 45.75,
          "total_cost": 234.80
        },
        "created_at": "2024-01-15T10:30:00Z",
        "updated_at": "2024-01-15T14:45:00Z",
        "expires_at": "2024-01-16T10:30:00Z"
      }
    ],
    "pagination": {
      "current_page": 1,
      "total_pages": 5,
      "total_count": 89,
      "per_page": 20,
      "has_next": true,
      "has_previous": false
    },
    "filters_applied": {
      "status": "running",
      "gpu_type": "A100",
      "region": "us-west-2"
    }
  },
  "meta": {
    "request_id": "req_xyz789",
    "response_time_ms": 156
  }
}
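
The `pagination` object in the response above can drive a loop that walks every page. The sketch below assumes the response shape shown in this section; `iter_clusters` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever HTTP call you use to request a single page.

```python
from typing import Callable, Iterator


def iter_clusters(fetch_page: Callable[[int], dict], start_page: int = 1) -> Iterator[dict]:
    """Yield every cluster by following pagination.has_next.

    fetch_page(page) must return the decoded JSON body for that page,
    in the shape shown in the Response Schema section.
    """
    page = start_page
    while True:
        data = fetch_page(page)["data"]
        yield from data["clusters"]
        # Stop as soon as the API reports that no further page exists.
        if not data["pagination"]["has_next"]:
            break
        page += 1


# Example wiring with the `requests` library (shown as comments, not executed):
# def fetch_page(page):
#     response = requests.get(
#         "https://api.tensorone.ai/v1/clusters",
#         headers={"Authorization": "Bearer YOUR_API_KEY"},
#         params={"page": page, "limit": 100},
#     )
#     response.raise_for_status()
#     return response.json()
```

Passing the page fetcher in as a function keeps the loop independent of the HTTP client and makes it straightforward to exercise with canned responses.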

Response Fields

Cluster Object

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique cluster identifier |
| name | string | Human-readable cluster name |
| description | string | Optional cluster description |
| status | string | Current cluster status |
| gpu_type | string | GPU model (A100, H100, RTX4090, etc.) |
| gpu_count | integer | Number of GPUs allocated |
| cpu_cores | integer | Number of CPU cores |
| memory_gb | integer | RAM in gigabytes |
| storage_gb | integer | Persistent storage in gigabytes |
| region | string | Deployment region |
| project_id | string | Associated project ID |
| template_id | string | Template used for cluster creation |
| ssh_enabled | boolean | SSH access availability |
| port_mappings | array | External port mappings |
| proxy_url | string | HTTPS proxy URL for web services |
| ssh_connection | object | SSH connection details |
| metrics | object | Real-time performance metrics |
| cost | object | Cost information and billing |
| created_at | string | Creation timestamp (ISO 8601) |
| updated_at | string | Last update timestamp (ISO 8601) |
| expires_at | string | Auto-termination time (if set) |
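
When working with these fields in application code, it can help to convert each cluster into a typed object. This is a sketch covering only a subset of the fields; `ClusterSummary` and `parse_timestamp` are hypothetical names, and `expires_at` is treated as optional because the field list describes it as present only "if set".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as 2024-01-15T10:30:00Z, or return None."""
    if not value:
        return None
    # Replace the trailing "Z" so older Python versions accept the string too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class ClusterSummary:
    """Hypothetical typed view over a subset of the cluster object fields."""

    id: str
    name: str
    status: str
    gpu_type: str
    gpu_count: int
    hourly_rate: float
    created_at: Optional[datetime]
    expires_at: Optional[datetime]

    @classmethod
    def from_api(cls, raw: dict) -> "ClusterSummary":
        return cls(
            id=raw["id"],
            name=raw["name"],
            status=raw["status"],
            gpu_type=raw["gpu_type"],
            gpu_count=raw["gpu_count"],
            hourly_rate=raw["cost"]["hourly_rate"],
            created_at=parse_timestamp(raw.get("created_at")),
            # expires_at is only present when auto-termination is configured.
            expires_at=parse_timestamp(raw.get("expires_at")),
        )
```

Parsing timestamps into timezone-aware `datetime` values lets later code compare `expires_at` against the current time without string handling.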

Use Cases

Fleet Management

Monitor and manage large numbers of GPU clusters across different projects and environments.
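# Get overview of all running clusters
import requests

API_KEY = "YOUR_API_KEY"  # replace with your API key

def get_cluster_overview():
    # Fetches a single page of up to 100 clusters; larger fleets need pagination
    response = requests.get(
        "https://api.tensorone.ai/v1/clusters",
        headers={"Authorization": f"Bearer {API_KEY}"},
        params={
            "status": "running",
            "sort_by": "cost",
            "sort_order": "desc",
            "limit": 100
        }
    )
    # Surface HTTP errors instead of failing later on a missing "data" key
    response.raise_for_status()

    clusters = response.json()["data"]["clusters"]

    # Sum hourly rates and average GPU utilization across the returned clusters
    total_cost = sum(c["cost"]["hourly_rate"] for c in clusters)
    # Avoid dividing by zero when no clusters are running
    avg_gpu_util = (
        sum(c["metrics"]["gpu_utilization"] for c in clusters) / len(clusters)
        if clusters else 0.0
    )

    return {
        "total_clusters": len(clusters),
        "total_hourly_cost": total_cost,
        "average_gpu_utilization": avg_gpu_util
    }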

Development Environment Discovery

Find available development clusters for team members.
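// Find available development clusters
// API_KEY is assumed to be defined elsewhere, e.g. loaded from your configuration
async function findAvailableDevClusters(teamProject) {
  const params = new URLSearchParams({
    project_id: teamProject,
    status: 'stopped',
    gpu_type: 'RTX4090',
    sort_by: 'created_at'
  });

  const response = await fetch(`https://api.tensorone.ai/v1/clusters?${params}`, {
    headers: {
      'Authorization': 'Bearer ' + API_KEY,
      'Content-Type': 'application/json'
    }
  });

  // Fail fast on HTTP errors rather than reading fields from an error body
  if (!response.ok) {
    throw new Error(`List Clusters request failed with status ${response.status}`);
  }

  const data = await response.json();
  // Keep only clusters whose name marks them as dev or sandbox environments
  return data.data.clusters.filter(cluster =>
    cluster.name.includes('dev') || cluster.name.includes('sandbox')
  );
}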

Cost Optimization

Identify expensive or underutilized clusters for optimization.
# Find clusters for cost optimization
import requests

API_KEY = "YOUR_API_KEY"  # replace with your API key

def find_optimization_candidates():
    # No limit is set, so only the first page (default: 20 clusters) is inspected
    response = requests.get(
        "https://api.tensorone.ai/v1/clusters",
        headers={"Authorization": f"Bearer {API_KEY}"},
        params={
            "status": "running",
            "sort_by": "cost",
            "sort_order": "desc"
        }
    )
    # Surface HTTP errors instead of failing later on a missing "data" key
    response.raise_for_status()

    clusters = response.json()["data"]["clusters"]

    # Flag clusters that cost more than 10.0 per hour but use less than 30% of their GPUs
    candidates = []
    for cluster in clusters:
        if (cluster["cost"]["hourly_rate"] > 10.0 and
            cluster["metrics"]["gpu_utilization"] < 30.0):
            candidates.append({
                "id": cluster["id"],
                "name": cluster["name"],
                "cost": cluster["cost"]["hourly_rate"],
                "utilization": cluster["metrics"]["gpu_utilization"]
            })

    return candidates

Error Handling

{
  "success": false,
  "error": {
    "code": "INVALID_PARAMETERS",
    "message": "Invalid query parameters provided",
    "details": {
      "limit": "Must be between 1 and 100",
      "gpu_type": "Must be one of: A100, H100, RTX4090, V100, T4"
    }
  }
}
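
A client can turn this error envelope into an exception so that callers never read `data` from a failed response. The sketch below assumes the `success` / `error` shape shown above; `ClusterApiError` and `unwrap_response` are hypothetical names.

```python
class ClusterApiError(Exception):
    """Hypothetical exception carrying the fields of the error object."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        # details maps a parameter name to the reason it was rejected.
        self.details = details or {}


def unwrap_response(body: dict) -> dict:
    """Return body["data"] on success, otherwise raise ClusterApiError."""
    if body.get("success"):
        return body["data"]
    error = body.get("error", {})
    raise ClusterApiError(
        error.get("code", "UNKNOWN"),
        error.get("message", "No error message provided"),
        error.get("details"),
    )
```

With the `INVALID_PARAMETERS` example above, `unwrap_response` raises an error whose `details` dictionary names the offending parameters (`limit` and `gpu_type`), which can be logged or shown to the user directly.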

Security Considerations

  • Authentication: Always use secure API keys with appropriate scopes
  • Data Privacy: Cluster lists may contain sensitive project information
  • Rate Limiting: Throttle automated cluster monitoring on the client side so that polling stays within the API's rate limits
  • Permissions: Ensure users have appropriate permissions to view cluster information

Best Practices

  1. Pagination: Always use pagination for large cluster fleets to avoid timeouts
  2. Filtering: Use specific filters to reduce API response times and data transfer
  3. Caching: Cache cluster lists for dashboard applications with appropriate TTL
  4. Monitoring: Regularly check cluster status and metrics for proactive management
  5. Cost Control: Monitor expensive clusters and set up alerts for cost thresholds
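
The caching recommendation above can be sketched as a small time-to-live wrapper around whatever function loads the cluster list. This is an illustration under stated assumptions, not a prescribed implementation: `TTLCache` is a hypothetical class, and the 60-second default is an arbitrary placeholder to tune for your dashboard.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Hypothetical cache that reloads a value once it is older than ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._value = None
        self._expires_at = None

    def get(self) -> Any:
        now = self._clock()
        # Reload on first use or once the cached value has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value
```

A dashboard could wrap its cluster-listing function, for example `TTLCache(get_cluster_overview, ttl_seconds=30)`, and call `.get()` on every render. A shorter TTL gives fresher status and metrics at the cost of more API calls.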

Authorizations

Authorization (string, header, required)

API key authentication. Use 'Bearer YOUR_API_KEY' format.

Response

List of clusters

The response is of type object[].