Overview
Cluster Proxy and Port Management provides secure access to cluster services through HTTPS proxy URLs, port forwarding configurations, and network access controls. It is essential for accessing web applications, APIs, and development tools running on GPU clusters.
Endpoints
Get Proxy Configuration
GET https://api.tensorone.ai/v1/clusters/{cluster_id}/proxy
Update Proxy Settings
PUT https://api.tensorone.ai/v1/clusters/{cluster_id}/proxy
Manage Port Mappings
POST https://api.tensorone.ai/v1/clusters/{cluster_id}/proxy/ports
PUT https://api.tensorone.ai/v1/clusters/{cluster_id}/proxy/ports/{port_id}
DELETE https://api.tensorone.ai/v1/clusters/{cluster_id}/proxy/ports/{port_id}
Configure Access Control
PUT https://api.tensorone.ai/v1/clusters/{cluster_id}/proxy/access
Get Proxy Configuration
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| include_health | boolean | No | Include health check status for mapped ports |
| include_access_logs | boolean | No | Include recent access logs |
| include_ssl_info | boolean | No | Include SSL certificate information |
Request Examples
# Get complete proxy configuration
curl -X GET "https://api.tensorone.ai/v1/clusters/cluster_abc123/proxy?include_health=true" \
-H "Authorization: Bearer YOUR_API_KEY"
# Get proxy config with access logs
curl -X GET "https://api.tensorone.ai/v1/clusters/cluster_abc123/proxy?include_access_logs=true" \
-H "Authorization: Bearer YOUR_API_KEY"
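The same request in Python, using the requests library. This is a sketch; the helper names are illustrative, and only the base URL, header format, and query flags shown above are taken from the documentation:

```python
import requests

API_BASE = "https://api.tensorone.ai/v1"

def build_proxy_config_params(include_health=False, include_access_logs=False,
                              include_ssl_info=False):
    """Assemble only the query flags that are explicitly enabled."""
    flags = {
        "include_health": include_health,
        "include_access_logs": include_access_logs,
        "include_ssl_info": include_ssl_info,
    }
    # The API treats absent flags as false, so omit anything not requested.
    return {name: "true" for name, enabled in flags.items() if enabled}

def get_proxy_config(cluster_id, api_key, **flags):
    """Fetch the proxy configuration for a cluster."""
    response = requests.get(
        f"{API_BASE}/clusters/{cluster_id}/proxy",
        headers={"Authorization": f"Bearer {api_key}"},
        params=build_proxy_config_params(**flags),
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```

Splitting parameter assembly from the HTTP call keeps the flag handling independently testable.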
Response Schema
{
"success": true,
"data": {
"cluster_id": "cluster_abc123",
"proxy_url": "https://cluster-abc123.tensorone.ai",
"proxy_status": "active",
"ssl_certificate": {
"issuer": "TensorOne CA",
"valid_from": "2024-01-01T00:00:00Z",
"valid_until": "2025-01-01T00:00:00Z",
"status": "valid"
},
"port_mappings": [
{
"id": "port_123",
"internal_port": 8888,
"external_port": 32001,
"protocol": "tcp",
"description": "Jupyter Lab",
"url": "https://cluster-abc123.tensorone.ai:32001",
"status": "active",
"health_status": "healthy",
"health_check": {
"enabled": true,
"path": "/lab",
"interval_seconds": 30,
"timeout_seconds": 10,
"last_check": "2024-01-15T16:45:00Z",
"consecutive_failures": 0
},
"access_control": {
"authentication_required": true,
"allowed_users": ["ml_engineer", "data_scientist"],
"ip_whitelist": ["203.0.113.0/24"],
"rate_limit": {
"requests_per_minute": 100,
"burst_size": 20
}
},
"usage_statistics": {
"total_requests": 1247,
"unique_visitors": 8,
"last_accessed": "2024-01-15T16:30:00Z",
"avg_response_time_ms": 245
},
"created_at": "2024-01-15T14:30:00Z",
"updated_at": "2024-01-15T16:00:00Z"
},
{
"id": "port_124",
"internal_port": 6006,
"external_port": 32002,
"protocol": "tcp",
"description": "TensorBoard",
"url": "https://cluster-abc123.tensorone.ai:32002",
"status": "active",
"health_status": "healthy",
"health_check": {
"enabled": true,
"path": "/",
"interval_seconds": 60,
"timeout_seconds": 15,
"last_check": "2024-01-15T16:44:00Z",
"consecutive_failures": 0
},
"access_control": {
"authentication_required": false,
"public_access": true
},
"usage_statistics": {
"total_requests": 456,
"unique_visitors": 12,
"last_accessed": "2024-01-15T16:25:00Z",
"avg_response_time_ms": 180
},
"created_at": "2024-01-15T14:35:00Z"
}
],
"access_control": {
"default_authentication": true,
"session_timeout_minutes": 480,
"max_concurrent_sessions": 10,
"ip_whitelist": ["203.0.113.0/24", "198.51.100.0/24"],
"blocked_ips": [],
"rate_limiting": {
"global_limit": {
"requests_per_minute": 1000,
"burst_size": 100
},
"per_user_limit": {
"requests_per_minute": 200,
"burst_size": 50
}
}
},
"network_configuration": {
"load_balancer": {
"enabled": true,
"algorithm": "round_robin",
"health_check_enabled": true
},
"ssl_termination": "proxy",
"websocket_support": true,
"compression": {
"enabled": true,
"algorithms": ["gzip", "br"]
}
},
"access_logs": [
{
"timestamp": "2024-01-15T16:30:15Z",
"client_ip": "203.0.113.42",
"user": "ml_engineer",
"method": "GET",
"path": "/lab/tree",
"status_code": 200,
"response_time_ms": 156,
"user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
}
]
},
"meta": {
"last_updated": "2024-01-15T16:45:00Z",
"total_port_mappings": 2,
"active_port_mappings": 2,
"proxy_version": "2.1.0"
}
}
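A small helper for working with this response, for example listing each mapped service and flagging unhealthy ports. This is a sketch written against the schema above; the function name is illustrative:

```python
def summarize_port_mappings(proxy_config):
    """Reduce a proxy-config response to a {description: url} map plus a list
    of mappings whose health check is not reporting healthy."""
    summary = {"services": {}, "unhealthy": []}
    for mapping in proxy_config.get("data", {}).get("port_mappings", []):
        summary["services"][mapping["description"]] = mapping["url"]
        # health_status is absent when health checks are disabled
        if mapping.get("health_status") not in (None, "healthy"):
            summary["unhealthy"].append(mapping["description"])
    return summary
```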
Create Port Mapping
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| internal_port | integer | Yes | Internal port number (1-65535) |
| external_port | integer | No | External port number (auto-assigned if not specified) |
| protocol | string | No | Protocol: tcp, udp (default: tcp) |
| description | string | Yes | Human-readable description of the service |
| health_check | object | No | Health check configuration |
| access_control | object | No | Access control settings |
| custom_domain | string | No | Custom domain for the service |
Request Examples
# Create Jupyter Lab port mapping
curl -X POST "https://api.tensorone.ai/v1/clusters/cluster_abc123/proxy/ports" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"internal_port": 8888,
"description": "Jupyter Lab Interface",
"health_check": {
"enabled": true,
"path": "/lab",
"interval_seconds": 30
},
"access_control": {
"authentication_required": true,
"allowed_users": ["ml_engineer", "data_scientist"]
}
}'
# Create API endpoint with custom domain
curl -X POST "https://api.tensorone.ai/v1/clusters/cluster_abc123/proxy/ports" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"internal_port": 8000,
"external_port": 443,
"description": "ML Model API",
"custom_domain": "api.ml-project.com",
"health_check": {
"enabled": true,
"path": "/health",
"interval_seconds": 15,
"timeout_seconds": 5
},
"access_control": {
"authentication_required": false,
"rate_limit": {
"requests_per_minute": 500,
"burst_size": 100
}
}
}'
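The use-case scripts later on this page call a create_port_mapping helper. One possible Python implementation, sketched from the request body and response envelope documented above (the payload-builder name and validation rules beyond the 1-65535 port range are assumptions):

```python
import requests

API_BASE = "https://api.tensorone.ai/v1"

ALLOWED_OPTIONS = {"external_port", "protocol", "health_check",
                   "access_control", "custom_domain"}

def build_port_mapping_payload(internal_port, description, **options):
    """Validate required fields and assemble the request body."""
    if not 1 <= internal_port <= 65535:
        raise ValueError(f"internal_port must be 1-65535, got {internal_port}")
    if not description:
        raise ValueError("description is required")
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    payload = {"internal_port": internal_port, "description": description}
    payload.update(options)
    return payload

def create_port_mapping(cluster_id, config, api_key="YOUR_API_KEY"):
    """POST a new port mapping; returns the parsed JSON envelope."""
    payload = build_port_mapping_payload(**config)
    response = requests.post(
        f"{API_BASE}/clusters/{cluster_id}/proxy/ports",
        headers={"Authorization": f"Bearer {api_key}"},
        json=payload,
        timeout=30,
    )
    return response.json()
```

Validating locally before the POST surfaces configuration mistakes without spending a round trip.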
Update Port Mapping
# Update port mapping configuration
curl -X PUT "https://api.tensorone.ai/v1/clusters/cluster_abc123/proxy/ports/port_123" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"description": "Updated Jupyter Lab Environment",
"health_check": {
"enabled": true,
"path": "/lab/tree",
"interval_seconds": 15,
"timeout_seconds": 8
},
"access_control": {
"authentication_required": true,
"allowed_users": ["ml_engineer", "data_scientist", "researcher"],
"rate_limit": {
"requests_per_minute": 150,
"burst_size": 30
}
}
}'
Access Control Configuration
# Configure cluster-wide access control
curl -X PUT "https://api.tensorone.ai/v1/clusters/cluster_abc123/proxy/access" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"default_authentication": true,
"session_timeout_minutes": 240,
"max_concurrent_sessions": 20,
"ip_whitelist": ["203.0.113.0/24", "198.51.100.0/24"],
"rate_limiting": {
"global_limit": {
"requests_per_minute": 2000,
"burst_size": 500
},
"per_user_limit": {
"requests_per_minute": 300,
"burst_size": 100
}
},
"security_headers": {
"hsts_enabled": true,
"content_security_policy": "default-src 'self'; script-src 'self' 'unsafe-inline'",
"x_frame_options": "SAMEORIGIN"
}
}'
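Since ip_whitelist entries must be valid CIDR blocks, it can be worth validating them client-side before sending the request. A sketch using the standard-library ipaddress module (the helper name is illustrative):

```python
import ipaddress

def validate_ip_whitelist(cidrs):
    """Return normalized CIDR strings, raising ValueError on any invalid entry."""
    normalized = []
    for cidr in cidrs:
        try:
            # strict=True rejects entries with host bits set, e.g. 203.0.113.5/24
            network = ipaddress.ip_network(cidr, strict=True)
        except ValueError as exc:
            raise ValueError(f"invalid CIDR block {cidr!r}: {exc}") from exc
        normalized.append(str(network))
    return normalized
```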
Use Cases
ML Development Environment
Set up a comprehensive development environment with multiple tools.
def setup_comprehensive_ml_environment(cluster_id, team_config):
"""Set up a complete ML development environment"""
environment_services = {
"core_services": [
{
"name": "Jupyter Lab",
"port": 8888,
"path": "/lab",
"priority": "high",
"auth_required": True
},
{
"name": "TensorBoard",
"port": 6006,
"path": "/",
"priority": "high",
"auth_required": False
}
],
"development_tools": [
{
"name": "VSCode Server",
"port": 8080,
"path": "/",
"priority": "medium",
"auth_required": True
},
{
"name": "MLflow UI",
"port": 5000,
"path": "/",
"priority": "medium",
"auth_required": True
}
],
"monitoring_services": [
{
"name": "Prometheus Metrics",
"port": 9090,
"path": "/metrics",
"priority": "low",
"auth_required": True
},
{
"name": "Grafana Dashboard",
"port": 3000,
"path": "/",
"priority": "low",
"auth_required": True
}
]
}
setup_results = {
"cluster_id": cluster_id,
"team": team_config["team_name"],
"services_created": [],
"services_failed": [],
"access_summary": {}
}
# Create services by priority
for category, services in environment_services.items():
print(f"\nSetting up {category}...")
for service in services:
config = {
"internal_port": service["port"],
"description": f"{service['name']} for {team_config['team_name']} team",
"health_check": {
"enabled": True,
"path": service["path"],
"interval_seconds": 30 if service["priority"] == "high" else 60,
"timeout_seconds": 10
},
"access_control": {
"authentication_required": service["auth_required"],
"allowed_users": team_config.get("team_members", []),
"session_timeout_minutes": 480, # 8 hours
"rate_limit": {
"requests_per_minute": 200 if service["priority"] == "high" else 100,
"burst_size": 50 if service["priority"] == "high" else 25
}
}
}
result = create_port_mapping(cluster_id, config)
if result["success"]:
service_info = {
"name": service["name"],
"category": category,
"url": result["data"]["url"],
"port": result["data"]["external_port"],
"priority": service["priority"],
"auth_required": service["auth_required"]
}
setup_results["services_created"].append(service_info)
print(f"✅ {service['name']}: {result['data']['url']}")
else:
setup_results["services_failed"].append({
"name": service["name"],
"error": result["error"]["message"]
})
print(f"❌ Failed to create {service['name']}: {result['error']['message']}")
# Generate access summary for team
setup_results["access_summary"] = {
"total_services": len(setup_results["services_created"]),
"authenticated_services": len([s for s in setup_results["services_created"] if s["auth_required"]]),
"public_services": len([s for s in setup_results["services_created"] if not s["auth_required"]]),
"team_access_urls": {s["name"]: s["url"] for s in setup_results["services_created"]}
}
return setup_results
# Example team configuration
team_config = {
"team_name": "ML Research Team",
"team_members": ["alice_researcher", "bob_engineer", "carol_scientist"],
"project_focus": "NLP Research",
"security_level": "standard"
}
# Set up environment
ml_environment = setup_comprehensive_ml_environment("cluster_research_001", team_config)
print(f"\n🎉 ML Environment Setup Complete!")
print(f"Created {ml_environment['access_summary']['total_services']} services for {team_config['team_name']}")
print("\nQuick Access URLs:")
for service_name, url in ml_environment["access_summary"]["team_access_urls"].items():
print(f" {service_name}: {url}")
Production Model Serving
Set up production-ready model serving infrastructure.
class ProductionModelServer {
constructor(clusterId, modelConfig) {
this.clusterId = clusterId;
this.modelConfig = modelConfig;
this.services = new Map();
}
async deployModel() {
console.log(`Deploying ${this.modelConfig.name} model to production...`);
// Main API endpoint
const apiConfig = {
internal_port: this.modelConfig.apiPort || 8000,
description: `${this.modelConfig.name} Model API`,
health_check: {
enabled: true,
path: '/health',
interval_seconds: 10,
timeout_seconds: 5,
expected_status: 200,
retries: 3
},
access_control: {
authentication_required: this.modelConfig.requireAuth !== false,
rate_limit: {
requests_per_minute: this.modelConfig.rateLimit || 1000,
burst_size: this.modelConfig.burstSize || 200
},
ip_whitelist: this.modelConfig.allowedIPs || [],
cors: {
enabled: true,
allowed_origins: this.modelConfig.corsOrigins || ['*'],
allowed_methods: ['GET', 'POST'],
allowed_headers: ['Content-Type', 'Authorization']
}
},
custom_domain: this.modelConfig.customDomain,
ssl_redirect: true
};
const apiResult = await createPortMapping(this.clusterId, apiConfig);
if (!apiResult.success) {
throw new Error(`Failed to create API endpoint: ${apiResult.error.message}`);
}
this.services.set('api', {
name: 'Model API',
url: apiResult.data.url,
port: apiResult.data.external_port,
type: 'api'
});
// Metrics endpoint
if (this.modelConfig.enableMetrics !== false) {
const metricsConfig = {
internal_port: this.modelConfig.metricsPort || 9090,
description: `${this.modelConfig.name} Model Metrics`,
health_check: {
enabled: true,
path: '/metrics',
interval_seconds: 30
},
access_control: {
authentication_required: true,
allowed_users: ['ops_team', 'ml_engineer'],
ip_whitelist: this.modelConfig.monitoringIPs || []
}
};
const metricsResult = await createPortMapping(this.clusterId, metricsConfig);
if (metricsResult.success) {
this.services.set('metrics', {
name: 'Model Metrics',
url: metricsResult.data.url,
port: metricsResult.data.external_port,
type: 'monitoring'
});
}
}
// Admin dashboard (if enabled)
if (this.modelConfig.enableDashboard) {
const dashboardConfig = {
internal_port: this.modelConfig.dashboardPort || 8080,
description: `${this.modelConfig.name} Admin Dashboard`,
health_check: {
enabled: true,
path: '/dashboard',
interval_seconds: 60
},
access_control: {
authentication_required: true,
allowed_users: ['admin', 'ml_engineer'],
session_timeout_minutes: 120
}
};
const dashboardResult = await createPortMapping(this.clusterId, dashboardConfig);
if (dashboardResult.success) {
this.services.set('dashboard', {
name: 'Admin Dashboard',
url: dashboardResult.data.url,
port: dashboardResult.data.external_port,
type: 'admin'
});
}
}
console.log(`✅ Model deployment complete for ${this.modelConfig.name}`);
return this.getDeploymentSummary();
}
async setupLoadBalancing() {
if (this.modelConfig.replicas && this.modelConfig.replicas > 1) {
console.log(`Setting up load balancing for ${this.modelConfig.replicas} replicas...`);
// Configure load balancer settings
const lbConfig = {
algorithm: 'round_robin',
health_check_enabled: true,
health_check_path: '/health',
session_affinity: false,
timeout_seconds: 30,
retries: 3
};
// Update the main API service with load balancing
const apiService = this.services.get('api');
if (apiService) {
const updateConfig = {
load_balancer: lbConfig,
scaling: {
min_replicas: 1,
max_replicas: this.modelConfig.replicas,
target_cpu_utilization: 70,
target_memory_utilization: 80
}
};
// Note: This would update the existing port mapping
// Implementation depends on the specific load balancing API
console.log('Load balancing configured for API endpoint');
}
}
}
async setupMonitoring() {
console.log('Setting up comprehensive monitoring...');
const monitoringServices = [];
// Health check monitoring
const healthMonitor = setInterval(async () => {
for (const [key, service] of this.services) {
if (service.type === 'api') {
try {
const response = await fetch(`${service.url}/health`, {
method: 'GET',
// fetch has no `timeout` option; use an AbortSignal instead (Node 17.3+)
signal: AbortSignal.timeout(5000)
});
const isHealthy = response.ok;
const timestamp = new Date().toISOString();
if (!isHealthy) {
console.warn(`⚠️ API health check failed: ${service.url} (${response.status})`);
// Could trigger alerts here
}
service.lastHealthCheck = { timestamp, healthy: isHealthy, status: response.status };
} catch (error) {
console.error(`❌ Health check error for ${service.name}:`, error.message);
service.lastHealthCheck = { timestamp: new Date().toISOString(), healthy: false, error: error.message };
}
}
}
}, 30000); // Check every 30 seconds
monitoringServices.push(healthMonitor);
// Performance monitoring (if metrics endpoint exists)
const metricsService = this.services.get('metrics');
if (metricsService) {
const performanceMonitor = setInterval(async () => {
try {
const response = await fetch(`${metricsService.url}/metrics`);
if (response.ok) {
const metrics = await response.text();
// Parse and analyze metrics
this.analyzePerformanceMetrics(metrics);
}
} catch (error) {
console.error('Error fetching performance metrics:', error.message);
}
}, 60000); // Check every minute
monitoringServices.push(performanceMonitor);
}
this.monitoringServices = monitoringServices;
return monitoringServices.length;
}
analyzePerformanceMetrics(metricsText) {
// Simple metrics analysis
const lines = metricsText.split('\n');
const metrics = {};
lines.forEach(line => {
if (line.startsWith('http_requests_total')) {
const match = line.match(/http_requests_total{.*?} (\d+)/);
if (match) metrics.totalRequests = parseInt(match[1]);
} else if (line.startsWith('http_request_duration_seconds')) {
const match = line.match(/http_request_duration_seconds{.*?} ([0-9.]+)/);
if (match) metrics.avgResponseTime = parseFloat(match[1]);
}
});
// Simple alerting logic
if (metrics.avgResponseTime && metrics.avgResponseTime > 2.0) {
console.warn(`⚠️ High response time detected: ${metrics.avgResponseTime}s`);
}
if (metrics.totalRequests && this.lastRequestCount) {
const requestRate = (metrics.totalRequests - this.lastRequestCount) / 60; // per second
if (requestRate > (this.modelConfig.alertThresholds?.maxRequestsPerSecond ?? 100)) {
console.warn(`⚠️ High request rate: ${requestRate} req/s`);
}
}
this.lastRequestCount = metrics.totalRequests;
}
getDeploymentSummary() {
const summary = {
modelName: this.modelConfig.name,
clusterId: this.clusterId,
deploymentTime: new Date().toISOString(),
services: {},
endpoints: {},
configuration: {
authentication: this.modelConfig.requireAuth !== false,
rateLimit: this.modelConfig.rateLimit || 1000,
customDomain: this.modelConfig.customDomain || null,
replicas: this.modelConfig.replicas || 1
}
};
for (const [key, service] of this.services) {
summary.services[key] = {
name: service.name,
type: service.type,
url: service.url,
port: service.port
};
summary.endpoints[service.name] = service.url;
}
return summary;
}
async cleanup() {
console.log('Cleaning up monitoring services...');
if (this.monitoringServices) {
this.monitoringServices.forEach(service => {
if (typeof service === 'number') {
clearInterval(service);
}
});
}
this.services.clear();
}
}
// Usage example
async function deployProductionModel() {
const modelConfig = {
name: 'GPT-3.5 Fine-tuned',
apiPort: 8000,
metricsPort: 9090,
dashboardPort: 8080,
requireAuth: true,
rateLimit: 500,
burstSize: 100,
enableMetrics: true,
enableDashboard: true,
replicas: 3,
customDomain: 'api.gpt-model.company.com',
allowedIPs: ['203.0.113.0/24'],
corsOrigins: ['https://app.company.com'],
monitoringIPs: ['198.51.100.0/24'],
alertThresholds: {
maxRequestsPerSecond: 50,
maxResponseTimeSeconds: 2.0
}
};
const modelServer = new ProductionModelServer('cluster_prod_001', modelConfig);
try {
// Deploy the model services
const deployment = await modelServer.deployModel();
console.log('Deployment Summary:', deployment);
// Set up load balancing
await modelServer.setupLoadBalancing();
// Start monitoring
const monitoringCount = await modelServer.setupMonitoring();
console.log(`Started ${monitoringCount} monitoring services`);
return modelServer;
} catch (error) {
console.error('Deployment failed:', error);
await modelServer.cleanup();
throw error;
}
}
// Deploy the model (top-level await requires an ES module, or wrap in an async main())
const productionModel = await deployProductionModel();
// Clear monitoring intervals on exit (exit handlers run synchronously)
process.on('exit', () => {
productionModel.cleanup();
});
Error Handling
{
"success": false,
"error": {
"code": "PORT_ALREADY_MAPPED",
"message": "Internal port 8888 is already mapped",
"details": {
"internal_port": 8888,
"existing_mapping_id": "port_456",
"existing_description": "Existing Jupyter Lab",
"suggestion": "Use a different internal port or update the existing mapping"
}
}
}
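Client code can branch on the error code field. For example, a helper that reuses the existing mapping when the internal port is already taken (the error shape comes from the response above; the recovery strategy and function name are illustrative):

```python
def resolve_port_conflict(response_body):
    """Return a usable mapping id from a create-port-mapping response.

    On success, return the new mapping's id. On PORT_ALREADY_MAPPED, fall
    back to the mapping that already covers this internal port. Any other
    error is raised.
    """
    if response_body.get("success"):
        return response_body["data"]["id"]
    error = response_body.get("error", {})
    if error.get("code") == "PORT_ALREADY_MAPPED":
        return error["details"]["existing_mapping_id"]
    raise RuntimeError(f"{error.get('code')}: {error.get('message')}")
```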
Security Considerations
- HTTPS Only: All proxy connections use HTTPS with valid SSL certificates
- Authentication: Implement proper authentication for sensitive services
- Rate Limiting: Configure appropriate rate limits to prevent abuse
- IP Whitelisting: Restrict access to trusted IP ranges when possible
- Session Management: Use secure session timeouts and proper session handling
Best Practices
- Service Organization: Use descriptive names and organize services logically
- Health Monitoring: Implement comprehensive health checks for all services
- Access Control: Apply principle of least privilege for service access
- Performance Monitoring: Monitor service performance and response times
- SSL/TLS: Always use HTTPS for production services
- Load Balancing: Implement load balancing for high-availability services
Authorizations
API key authentication. All endpoints require an Authorization header in the format Bearer YOUR_API_KEY.