Fine-tune Training Job
curl --request POST \
  --url https://api.tensorone.ai/v2/training/jobs/{id}/tune \
  --header 'Authorization: Bearer <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "objective": "minimize",
  "metric": "validation_loss",
  "algorithm": "bayesian",
  "max_trials": 10,
  "search_space": {
    "learning_rate": {
      "type": "log_uniform",
      "min": 1e-5,
      "max": 1e-2
    }
  }
}'
{
  "tuningJobId": "<string>",
  "status": "started",
  "estimatedDuration": "<string>"
}
Hyperparameter tuning is crucial for achieving optimal model performance. TensorOne's automated hyperparameter optimization supports grid, random, Bayesian, TPE, and Hyperband search strategies to find strong parameter combinations while minimizing computational cost.

Start Hyperparameter Tuning

Launch a hyperparameter optimization job for an existing training configuration.

Required Parameters

  • objective: Optimization direction for the target metric (minimize or maximize)
  • metric: Target metric name (e.g., loss, accuracy, f1_score)
  • search_space: Parameter search space configuration
  • algorithm: Optimization algorithm (grid, random, bayesian, tpe, hyperband)

Optional Parameters

  • max_trials: Maximum number of trials (default: 50)
  • max_concurrent: Maximum concurrent trials (default: 4)
  • early_stopping: Early stopping configuration
  • budget: Resource budget constraints

Example Usage

Bayesian Optimization for Language Model

curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "objective": "minimize",
    "metric": "validation_loss",
    "algorithm": "bayesian",
    "max_trials": 50,
    "max_concurrent": 4,
    "search_space": {
      "learning_rate": {
        "type": "log_uniform",
        "min": 1e-5,
        "max": 1e-2
      },
      "batch_size": {
        "type": "choice",
        "values": [4, 8, 16, 32]
      },
      "weight_decay": {
        "type": "uniform",
        "min": 0.001,
        "max": 0.1
      },
      "warmup_steps": {
        "type": "int_uniform",
        "min": 50,
        "max": 500
      },
      "lora_rank": {
        "type": "choice",
        "values": [8, 16, 32, 64]
      },
      "lora_alpha": {
        "type": "choice",
        "values": [16, 32, 64, 128]
      }
    },
    "early_stopping": {
      "metric": "validation_loss",
      "patience": 3,
      "min_delta": 0.001
    },
    "budget": {
      "max_gpu_hours": 100,
      "max_cost": 500
    }
  }'

Grid Search for Computer Vision Model

curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_vision_123/tune" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "objective": "maximize",
    "metric": "accuracy",
    "algorithm": "grid",
    "max_trials": 24,
    "max_concurrent": 6,
    "search_space": {
      "learning_rate": {
        "type": "choice",
        "values": [0.001, 0.01, 0.1]
      },
      "batch_size": {
        "type": "choice",
        "values": [16, 32, 64]
      },
      "optimizer": {
        "type": "choice",
        "values": ["adam", "sgd", "adamw"]
      },
      "dropout_rate": {
        "type": "choice",
        "values": [0.1, 0.2, 0.3]
      }
    },
    "early_stopping": {
      "metric": "val_accuracy",
      "patience": 5,
      "min_delta": 0.005
    }
  }'

Hyperband Optimization

curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_multimodal_456/tune" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "objective": "maximize",
    "metric": "f1_score",
    "algorithm": "hyperband",
    "max_trials": 100,
    "max_concurrent": 8,
    "search_space": {
      "learning_rate": {
        "type": "log_uniform",
        "min": 1e-5,
        "max": 1e-1
      },
      "hidden_size": {
        "type": "choice",
        "values": [256, 512, 768, 1024]
      },
      "num_layers": {
        "type": "int_uniform",
        "min": 6,
        "max": 24
      },
      "attention_heads": {
        "type": "choice",
        "values": [8, 12, 16, 24]
      }
    },
    "hyperband_config": {
      "max_epochs": 50,
      "reduction_factor": 3,
      "min_epochs": 5
    }
  }'

Response

Returns the hyperparameter tuning job object:
{
  "id": "tune_1234567890abcdef",
  "trainingJobId": "job_1234567890abcdef",
  "status": "running",
  "algorithm": "bayesian",
  "objective": "minimize",
  "metric": "validation_loss",
  "progress": {
    "completedTrials": 12,
    "totalTrials": 50,
    "runningTrials": 4,
    "bestValue": 0.234,
    "percentage": 24.0
  },
  "searchSpace": {
    "learning_rate": {
      "type": "log_uniform",
      "min": 1e-5,
      "max": 1e-2
    },
    "batch_size": {
      "type": "choice",
      "values": [4, 8, 16, 32]
    }
  },
  "bestTrial": {
    "id": "trial_0008",
    "parameters": {
      "learning_rate": 3.2e-4,
      "batch_size": 16,
      "weight_decay": 0.045,
      "warmup_steps": 125,
      "lora_rank": 32,
      "lora_alpha": 64
    },
    "metrics": {
      "validation_loss": 0.234,
      "accuracy": 0.891,
      "f1_score": 0.887
    },
    "duration": 3420,
    "cost": 24.50
  },
  "estimatedCompletion": "2024-01-16T08:30:00Z",
  "createdAt": "2024-01-15T18:00:00Z",
  "updatedAt": "2024-01-15T20:15:00Z"
}

Get Tuning Job Status

Retrieve the current status and results of a hyperparameter tuning job.
curl -X GET "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune/tune_1234567890abcdef" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "id": "tune_1234567890abcdef",
  "trainingJobId": "job_1234567890abcdef",
  "status": "completed",
  "algorithm": "bayesian",
  "objective": "minimize",
  "metric": "validation_loss",
  "results": {
    "totalTrials": 50,
    "completedTrials": 47,
    "failedTrials": 3,
    "bestValue": 0.187,
    "improvementOverBaseline": 0.047,
    "totalDuration": 18000,
    "totalCost": 245.30
  },
  "bestTrial": {
    "id": "trial_0034",
    "parameters": {
      "learning_rate": 2.8e-4,
      "batch_size": 8,
      "weight_decay": 0.032,
      "warmup_steps": 180,
      "lora_rank": 16,
      "lora_alpha": 32
    },
    "metrics": {
      "validation_loss": 0.187,
      "accuracy": 0.924,
      "f1_score": 0.918,
      "perplexity": 8.92
    },
    "duration": 3840,
    "cost": 27.20,
    "modelId": "model_best_tune_034"
  },
  "trials": [
    {
      "id": "trial_0034",
      "parameters": {
        "learning_rate": 2.8e-4,
        "batch_size": 8
      },
      "metrics": {
        "validation_loss": 0.187,
        "accuracy": 0.924
      },
      "status": "completed",
      "duration": 3840,
      "cost": 27.20
    }
  ],
  "convergenceAnalysis": {
    "converged": true,
    "convergenceEpoch": 38,
    "remainingImprovement": 0.003,
    "recommendation": "Tuning has converged. Consider using best parameters for production."
  },
  "recommendations": [
    "Use trial_0034 parameters for best performance",
    "Learning rate 2.8e-4 shows optimal convergence",
    "Batch size 8 provides best memory/performance tradeoff"
  ]
}

List Tuning Jobs

Retrieve a list of hyperparameter tuning jobs for your account.
curl -X GET "https://api.tensorone.ai/v2/training/tune" \
  -H "Authorization: Bearer YOUR_API_KEY"

Query Parameters

  • status: Filter by status (pending, running, completed, failed, cancelled)
  • algorithm: Filter by algorithm (grid, random, bayesian, tpe, hyperband)
  • limit: Number of jobs to return (1-100, default: 50)
  • offset: Number of jobs to skip for pagination
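For example, to page through completed Bayesian jobs, a sketch using Python's requests library against the endpoint above (the "jobs" key in the response envelope is an assumption):

import requests

# Filtered, paginated listing via the query parameters above.
resp = requests.get(
    "https://api.tensorone.ai/v2/training/tune",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    params={"status": "completed", "algorithm": "bayesian", "limit": 10, "offset": 0},
    timeout=30,
)
resp.raise_for_status()
for job in resp.json().get("jobs", []):  # assumed response envelope
    print(job["id"], job["status"])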

Stop Tuning Job

Stop a running hyperparameter tuning job.
curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune/tune_1234567890abcdef/stop" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "reason": "User requested stop",
    "complete_running_trials": true
  }'

Create Training Job from Best Trial

Create a new training job using the best parameters found during tuning.
curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune/tune_1234567890abcdef/deploy-best" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "optimized-model-training",
    "epochs": 10,
    "use_best_trial": "trial_0034",
    "override_parameters": {
      "epochs": 10,
      "save_steps": 500
    }
  }'

Search Space Configuration

Parameter Types

Continuous Parameters

{
  "learning_rate": {
    "type": "uniform",
    "min": 0.001,
    "max": 0.1
  },
  "weight_decay": {
    "type": "log_uniform",
    "min": 1e-5,
    "max": 1e-1
  }
}

Discrete Parameters

{
  "batch_size": {
    "type": "choice",
    "values": [4, 8, 16, 32, 64]
  },
  "optimizer": {
    "type": "choice",
    "values": ["adam", "sgd", "adamw", "adagrad"]
  }
}

Integer Parameters

{
  "num_layers": {
    "type": "int_uniform",
    "min": 6,
    "max": 24
  },
  "hidden_size": {
    "type": "int_log_uniform",
    "min": 128,
    "max": 2048
  }
}
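As a rough mental model (this is an illustrative sampler, not the platform's internal one), each type corresponds to a draw like the following; log_uniform samples uniformly in log space, which is why it suits scale-like parameters such as learning rates:

import math
import random

def sample(spec):
    """Draw one value from a search-space entry (illustrative only)."""
    t = spec["type"]
    if t == "uniform":
        return random.uniform(spec["min"], spec["max"])
    if t == "log_uniform":
        # Uniform in log space: 1e-5..1e-4 is as likely as 1e-3..1e-2.
        return math.exp(random.uniform(math.log(spec["min"]), math.log(spec["max"])))
    if t == "int_uniform":
        return random.randint(spec["min"], spec["max"])
    if t == "int_log_uniform":
        return round(math.exp(random.uniform(math.log(spec["min"]), math.log(spec["max"]))))
    if t == "choice":
        return random.choice(spec["values"])
    raise ValueError(f"unknown parameter type: {t}")

print(sample({"type": "log_uniform", "min": 1e-5, "max": 1e-2}))
print(sample({"type": "choice", "values": [4, 8, 16, 32]}))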

Optimization Algorithms

Bayesian Optimization

  • Best for: Expensive function evaluations, continuous parameters
  • Pros: Sample efficient, handles noise well
  • Cons: Slower for discrete spaces, requires more memory

Tree-structured Parzen Estimator (TPE)

  • Best for: Mixed discrete/continuous spaces
  • Pros: Efficient for complex search spaces
  • Cons: May get stuck in local optima

Random Search

  • Best for: Quick exploration, baseline comparison
  • Pros: Simple, parallelizable, robust
  • Cons: Not sample efficient for complex spaces

Grid Search

  • Best for: Small search spaces, interpretable results
  • Pros: Exhaustive, deterministic
  • Cons: Exponential scaling, curse of dimensionality (see the sketch below)
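To see how quickly a grid explodes, count the Cartesian product of the grid search example above (a quick Python check using the values from that example):

from itertools import product

# Grid values from the computer vision example above.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
    "optimizer": ["adam", "sgd", "adamw"],
    "dropout_rate": [0.1, 0.2, 0.3],
}

# Every combination is one trial: 3 * 3 * 3 * 3 = 81,
# which is why that example caps the search with max_trials: 24.
print(len(list(product(*search_space.values()))))  # 81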

Hyperband

  • Best for: Neural architecture search, large budgets
  • Pros: Efficient early stopping, handles varying budgets
  • Cons: Requires configurable training epochs
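The hyperband_config in the multimodal example above determines the culling schedule. A minimal sketch of the successive-halving bracket implied by min_epochs: 5, max_epochs: 50, and reduction_factor: 3 (the platform's scheduler may differ in detail):

# One Hyperband bracket under the example config. Assumption: each rung
# keeps the top 1/reduction_factor of trials and gives survivors
# reduction_factor times the epoch budget.
min_epochs, max_epochs, eta = 5, 50, 3

epochs, trials = min_epochs, 27  # e.g. a bracket starting with 27 trials
while epochs <= max_epochs:
    print(f"{trials:>3} trials train for {epochs} epochs")
    trials = max(1, trials // eta)  # keep the best third
    epochs *= eta                   # triple the budget for survivors

# Output:
#  27 trials train for 5 epochs
#   9 trials train for 15 epochs
#   3 trials train for 45 epochs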

SDK Examples

Python SDK

import time

from tensorone import TensorOneClient

client = TensorOneClient(api_key="YOUR_API_KEY")

# Start hyperparameter tuning
tuning_job = client.training.hyperparameters.tune(
    training_job_id="job_1234567890abcdef",
    objective="minimize",
    metric="validation_loss",
    algorithm="bayesian",
    max_trials=50,
    max_concurrent=4,
    search_space={
        "learning_rate": {
            "type": "log_uniform",
            "min": 1e-5,
            "max": 1e-2
        },
        "batch_size": {
            "type": "choice",
            "values": [4, 8, 16, 32]
        },
        "weight_decay": {
            "type": "uniform",
            "min": 0.001,
            "max": 0.1
        }
    },
    early_stopping={
        "metric": "validation_loss",
        "patience": 3,
        "min_delta": 0.001
    }
)

print(f"Started tuning job: {tuning_job.id}")

# Monitor progress
while tuning_job.status in ["pending", "running"]:
    tuning_job = client.training.hyperparameters.get(tuning_job.id)
    progress = tuning_job.progress
    print(f"Progress: {progress.completed_trials}/{progress.total_trials} trials")
    print(f"Best value so far: {progress.best_value}")
    time.sleep(60)

# Get final results
final_results = client.training.hyperparameters.get(tuning_job.id)
best_trial = final_results.best_trial
print(f"Best parameters: {best_trial.parameters}")
print(f"Best score: {best_trial.metrics[final_results.metric]}")

# Create optimized training job
optimized_job = client.training.hyperparameters.deploy_best(
    tuning_job_id=tuning_job.id,
    name="optimized-model-training",
    epochs=10
)

print(f"Created optimized training job: {optimized_job.id}")

JavaScript SDK

import { TensorOneClient } from '@tensorone/sdk';

const client = new TensorOneClient({ apiKey: 'YOUR_API_KEY' });

// Start hyperparameter tuning
const tuningJob = await client.training.hyperparameters.tune({
  trainingJobId: 'job_1234567890abcdef',
  objective: 'minimize',
  metric: 'validation_loss',
  algorithm: 'bayesian',
  maxTrials: 50,
  maxConcurrent: 4,
  searchSpace: {
    learningRate: {
      type: 'log_uniform',
      min: 1e-5,
      max: 1e-2
    },
    batchSize: {
      type: 'choice',
      values: [4, 8, 16, 32]
    },
    weightDecay: {
      type: 'uniform',
      min: 0.001,
      max: 0.1
    }
  },
  earlyStopping: {
    metric: 'validation_loss',
    patience: 3,
    minDelta: 0.001
  }
});

console.log(`Started tuning job: ${tuningJob.id}`);

// Monitor progress
const monitorTuning = async (jobId) => {
  const job = await client.training.hyperparameters.get(jobId);
  const progress = job.progress;
  
  console.log(`Progress: ${progress.completedTrials}/${progress.totalTrials} trials`);
  console.log(`Best value so far: ${progress.bestValue}`);
  
  if (job.status === 'running' || job.status === 'pending') {
    setTimeout(() => monitorTuning(jobId), 60000);
  } else {
    console.log('Tuning completed!');
    console.log(`Best parameters: ${JSON.stringify(job.bestTrial.parameters)}`);
    
    // Create optimized training job
    const optimizedJob = await client.training.hyperparameters.deployBest({
      tuningJobId: jobId,
      name: 'optimized-model-training',
      epochs: 10
    });
    
    console.log(`Created optimized training job: ${optimizedJob.id}`);
  }
};

monitorTuning(tuningJob.id);

Advanced Configurations

Multi-Objective Optimization

{
  "objectives": [
    {
      "metric": "accuracy",
      "direction": "maximize",
      "weight": 0.7
    },
    {
      "metric": "inference_time",
      "direction": "minimize",
      "weight": 0.3
    }
  ]
}
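A common way to collapse weighted objectives into a single trial score is linear scalarization; a sketch (assuming metrics are normalized to comparable scales before weighting, which may not match the platform's internal handling):

# Weighted scalarization of the multi-objective config above.
objectives = [
    {"metric": "accuracy",       "direction": "maximize", "weight": 0.7},
    {"metric": "inference_time", "direction": "minimize", "weight": 0.3},
]

def score(trial_metrics):
    """Higher is better: minimized metrics contribute negatively."""
    total = 0.0
    for obj in objectives:
        sign = 1.0 if obj["direction"] == "maximize" else -1.0
        total += sign * obj["weight"] * trial_metrics[obj["metric"]]
    return total

print(score({"accuracy": 0.92, "inference_time": 0.15}))  # 0.7*0.92 - 0.3*0.15 = 0.599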

Conditional Parameters

{
  "optimizer": {
    "type": "choice",
    "values": ["adam", "sgd"]
  },
  "adam_beta1": {
    "type": "uniform",
    "min": 0.8,
    "max": 0.99,
    "condition": "optimizer == 'adam'"
  },
  "sgd_momentum": {
    "type": "uniform",
    "min": 0.5,
    "max": 0.99,
    "condition": "optimizer == 'sgd'"
  }
}
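Semantically, a conditional parameter is only sampled when its condition holds, so adam_beta1 never appears in an SGD trial. A minimal sketch of that behavior (the condition strings themselves are evaluated by the platform; this only illustrates the semantics):

import random

def sample_conditional():
    """Sample the conditional space above: each optimizer gets its own knob."""
    params = {"optimizer": random.choice(["adam", "sgd"])}
    if params["optimizer"] == "adam":
        params["adam_beta1"] = random.uniform(0.8, 0.99)
    else:
        params["sgd_momentum"] = random.uniform(0.5, 0.99)
    return params

print(sample_conditional())  # e.g. {'optimizer': 'sgd', 'sgd_momentum': 0.73}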

Budget-Aware Optimization

{
  "budget": {
    "type": "fidelity",
    "parameter": "epochs",
    "min": 5,
    "max": 50,
    "resource_multiplier": 1.0
  },
  "multifidelity": {
    "enabled": true,
    "promotion_strategy": "top_k",
    "promotion_rate": 0.5
  }
}
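With multifidelity enabled, trials first run at low fidelity (few epochs) and only a fraction is promoted upward. A sketch of top_k promotion with promotion_rate: 0.5 (illustrative, not the scheduler itself):

# top_k promotion: after each low-fidelity round, keep the better half
# of trials and rerun them at the next fidelity level (more epochs).
promotion_rate = 0.5

def promote(trials, minimize=True):
    """trials: list of (trial_id, metric_value); returns the promoted subset."""
    ranked = sorted(trials, key=lambda t: t[1], reverse=not minimize)
    k = max(1, int(len(ranked) * promotion_rate))
    return ranked[:k]

round_one = [("t1", 0.41), ("t2", 0.29), ("t3", 0.35), ("t4", 0.52)]
print(promote(round_one))  # [('t2', 0.29), ('t3', 0.35)] advance to more epochs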

Error Handling

Common Errors

{
  "error": "INVALID_SEARCH_SPACE",
  "message": "Search space configuration is invalid",
  "details": {
    "parameter": "learning_rate",
    "reason": "min value must be less than max value"
  }
}
{
  "error": "INSUFFICIENT_BUDGET",
  "message": "Insufficient budget for requested trials",
  "details": {
    "requestedTrials": 100,
    "estimatedCost": 1500,
    "availableBudget": 500
  }
}
{
  "error": "ALGORITHM_NOT_SUPPORTED",
  "message": "Algorithm not supported for this parameter type",
  "details": {
    "algorithm": "grid",
    "unsupportedParameterTypes": ["log_uniform"]
  }
}
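A sketch of catching these errors with Python's requests library (the payload shapes are the ones above; a successful start returns 202, per the Response section at the end of this page):

import requests

payload = {
    "objective": "minimize",
    "metric": "validation_loss",
    "algorithm": "bayesian",
    "max_trials": 50,
    "search_space": {
        "learning_rate": {"type": "log_uniform", "min": 1e-5, "max": 1e-2},
    },
}

resp = requests.post(
    "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=30,
)

if resp.status_code != 202:  # 202 = tuning started
    err = resp.json()
    if err["error"] == "INSUFFICIENT_BUDGET":
        # Shrink max_trials or raise the budget, then retry.
        print("over budget; available:", err["details"]["availableBudget"])
    elif err["error"] == "INVALID_SEARCH_SPACE":
        print("bad parameter:", err["details"]["parameter"], "-", err["details"]["reason"])
    else:
        resp.raise_for_status()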

Best Practices

Search Space Design

  • Start with wide ranges and narrow down based on initial results
  • Use appropriate parameter types (log_uniform for learning rates)
  • Include both architectural and training hyperparameters
  • Consider parameter interactions and dependencies

Algorithm Selection

  • Use Bayesian optimization for expensive evaluations
  • Use random search for initial exploration
  • Use grid search for final fine-tuning over a small, narrowed space
  • Use Hyperband for neural architecture search

Resource Management

  • Set appropriate budget constraints to control costs
  • Use early stopping to avoid wasted computation
  • Balance exploration vs exploitation with concurrent trials
  • Monitor convergence to avoid unnecessary trials (see the sketch below)
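For example, the convergenceAnalysis block from the status response can drive an automatic stop. A sketch built on the SDK calls shown earlier (the .stop() method is an assumption mirroring the Stop Tuning Job endpoint):

import time

from tensorone import TensorOneClient

client = TensorOneClient(api_key="YOUR_API_KEY")
job = client.training.hyperparameters.get("tune_1234567890abcdef")

while job.status == "running":
    analysis = getattr(job, "convergence_analysis", None)
    if analysis and analysis.converged:
        # Assumed SDK method; equivalently, POST to .../tune/{id}/stop.
        client.training.hyperparameters.stop(
            job.id, reason="Converged", complete_running_trials=True
        )
        break
    time.sleep(60)
    job = client.training.hyperparameters.get(job.id)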
Hyperparameter tuning jobs can be paused and resumed. Intermediate results are saved automatically for recovery.
Hyperparameter tuning can be computationally expensive. Set appropriate budget limits and use early stopping to control costs.

Authorizations

Authorization
string
header
required

API key authentication. Use 'Bearer YOUR_API_KEY' format.

Path Parameters

id
string
required

ID of the training job to tune.

Body

application/json

Response

202 - application/json

Hyperparameter tuning started

The response is of type object.