Hyperparameter tuning is crucial for getting the best performance out of a model. TensorOne's automated hyperparameter optimization supports grid, random, Bayesian, TPE, and Hyperband search strategies to find strong parameter combinations while keeping computational cost under control.
Start Hyperparameter Tuning
Launch a hyperparameter optimization job for an existing training configuration.
Required Parameters
- objective: Optimization direction for the target metric (minimize or maximize)
- metric: Target metric name (e.g., loss, accuracy, f1_score)
- search_space: Parameter search space configuration
- algorithm: Optimization algorithm (grid, random, bayesian, tpe, hyperband)
Optional Parameters
- max_trials: Maximum number of trials (default: 50)
- max_concurrent: Maximum number of concurrent trials (default: 4)
- early_stopping: Early stopping configuration
- budget: Resource budget constraints
Example Usage
Bayesian Optimization for Language Model
curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"objective": "minimize",
"metric": "validation_loss",
"algorithm": "bayesian",
"max_trials": 50,
"max_concurrent": 4,
"search_space": {
"learning_rate": {
"type": "log_uniform",
"min": 1e-5,
"max": 1e-2
},
"batch_size": {
"type": "choice",
"values": [4, 8, 16, 32]
},
"weight_decay": {
"type": "uniform",
"min": 0.001,
"max": 0.1
},
"warmup_steps": {
"type": "int_uniform",
"min": 50,
"max": 500
},
"lora_rank": {
"type": "choice",
"values": [8, 16, 32, 64]
},
"lora_alpha": {
"type": "choice",
"values": [16, 32, 64, 128]
}
},
"early_stopping": {
"metric": "validation_loss",
"patience": 3,
"min_delta": 0.001
},
"budget": {
"max_gpu_hours": 100,
"max_cost": 500
}
}'
Grid Search for Computer Vision Model
curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_vision_123/tune" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"objective": "maximize",
"metric": "accuracy",
"algorithm": "grid",
"max_trials": 24,
"max_concurrent": 6,
"search_space": {
"learning_rate": {
"type": "choice",
"values": [0.001, 0.01, 0.1]
},
"batch_size": {
"type": "choice",
"values": [16, 32, 64]
},
"optimizer": {
"type": "choice",
"values": ["adam", "sgd", "adamw"]
},
"dropout_rate": {
"type": "choice",
"values": [0.1, 0.2, 0.3]
}
},
"early_stopping": {
"metric": "val_accuracy",
"patience": 5,
"min_delta": 0.005
}
}'
Hyperband Optimization
curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_multimodal_456/tune" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"objective": "maximize",
"metric": "f1_score",
"algorithm": "hyperband",
"max_trials": 100,
"max_concurrent": 8,
"search_space": {
"learning_rate": {
"type": "log_uniform",
"min": 1e-5,
"max": 1e-1
},
"hidden_size": {
"type": "choice",
"values": [256, 512, 768, 1024]
},
"num_layers": {
"type": "int_uniform",
"min": 6,
"max": 24
},
"attention_heads": {
"type": "choice",
"values": [8, 12, 16, 24]
}
},
"hyperband_config": {
"max_epochs": 50,
"reduction_factor": 3,
"min_epochs": 5
}
}'
Response
Returns the hyperparameter tuning job object:
{
"id": "tune_1234567890abcdef",
"trainingJobId": "job_1234567890abcdef",
"status": "running",
"algorithm": "bayesian",
"objective": "minimize",
"metric": "validation_loss",
"progress": {
"completedTrials": 12,
"totalTrials": 50,
"runningTrials": 4,
"bestValue": 0.234,
"percentage": 24.0
},
"searchSpace": {
"learning_rate": {
"type": "log_uniform",
"min": 1e-5,
"max": 1e-2
},
"batch_size": {
"type": "choice",
"values": [4, 8, 16, 32]
}
},
"bestTrial": {
"id": "trial_0008",
"parameters": {
"learning_rate": 3.2e-4,
"batch_size": 16,
"weight_decay": 0.045,
"warmup_steps": 125,
"lora_rank": 32,
"lora_alpha": 64
},
"metrics": {
"validation_loss": 0.234,
"accuracy": 0.891,
"f1_score": 0.887
},
"duration": 3420,
"cost": 24.50
},
"estimatedCompletion": "2024-01-16T08:30:00Z",
"createdAt": "2024-01-15T18:00:00Z",
"updatedAt": "2024-01-15T20:15:00Z"
}
Get Tuning Job Status
Retrieve the current status and results of a hyperparameter tuning job.
curl -X GET "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune/tune_1234567890abcdef" \
-H "Authorization: Bearer YOUR_API_KEY"
Response
{
"id": "tune_1234567890abcdef",
"trainingJobId": "job_1234567890abcdef",
"status": "completed",
"algorithm": "bayesian",
"objective": "minimize",
"metric": "validation_loss",
"results": {
"totalTrials": 50,
"completedTrials": 47,
"failedTrials": 3,
"bestValue": 0.187,
"improvementOverBaseline": 0.047,
"totalDuration": 18000,
"totalCost": 245.30
},
"bestTrial": {
"id": "trial_0034",
"parameters": {
"learning_rate": 2.8e-4,
"batch_size": 8,
"weight_decay": 0.032,
"warmup_steps": 180,
"lora_rank": 16,
"lora_alpha": 32
},
"metrics": {
"validation_loss": 0.187,
"accuracy": 0.924,
"f1_score": 0.918,
"perplexity": 8.92
},
"duration": 3840,
"cost": 27.20,
"modelId": "model_best_tune_034"
},
"trials": [
{
"id": "trial_0034",
"parameters": {
"learning_rate": 2.8e-4,
"batch_size": 8
},
"metrics": {
"validation_loss": 0.187,
"accuracy": 0.924
},
"status": "completed",
"duration": 3840,
"cost": 27.20
}
],
"convergenceAnalysis": {
"converged": true,
"convergenceEpoch": 38,
"remainingImprovement": 0.003,
"recommendation": "Tuning has converged. Consider using best parameters for production."
},
"recommendations": [
"Use trial_0034 parameters for best performance",
"Learning rate 2.8e-4 shows optimal convergence",
"Batch size 8 provides best memory/performance tradeoff"
]
}
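The trials array in this response can be post-processed locally, for example to rank completed trials by the target metric. Below is a minimal sketch using Python's requests library against the documented endpoint and response fields; the ranking logic itself is illustrative, not part of the API.

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.tensorone.ai/v2"
JOB_ID = "job_1234567890abcdef"
TUNE_ID = "tune_1234567890abcdef"

# Fetch the tuning job, including per-trial parameters and metrics
resp = requests.get(
    f"{BASE_URL}/training/jobs/{JOB_ID}/tune/{TUNE_ID}",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
resp.raise_for_status()
tuning_job = resp.json()

metric = tuning_job["metric"]                      # e.g. "validation_loss"
ascending = tuning_job["objective"] == "minimize"  # sort direction follows the objective

# Keep only completed trials that actually reported the target metric
completed = [
    t for t in tuning_job.get("trials", [])
    if t["status"] == "completed" and metric in t.get("metrics", {})
]
ranked = sorted(completed, key=lambda t: t["metrics"][metric], reverse=not ascending)

for trial in ranked[:5]:
    print(trial["id"], trial["metrics"][metric], trial["parameters"])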
List Tuning Jobs
Retrieve a list of hyperparameter tuning jobs for your account.
curl -X GET "https://api.tensorone.ai/v2/training/tune" \
-H "Authorization: Bearer YOUR_API_KEY"
Query Parameters
- status: Filter by status (pending, running, completed, failed, cancelled)
- algorithm: Filter by algorithm (grid, random, bayesian, tpe, hyperband)
- limit: Number of jobs to return (1-100, default: 50)
- offset: Number of jobs to skip for pagination
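For example, to page through completed Bayesian tuning jobs, the query parameters above can be passed as a query string. A minimal sketch in Python; the filter values come from the lists above, while the top-level "jobs" key in the response is an assumption since the list envelope is not shown on this page.

import requests

API_KEY = "YOUR_API_KEY"

# List completed Bayesian tuning jobs, 20 at a time
resp = requests.get(
    "https://api.tensorone.ai/v2/training/tune",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"status": "completed", "algorithm": "bayesian", "limit": 20, "offset": 0},
)
resp.raise_for_status()

# NOTE: the exact response envelope is not documented here; a top-level "jobs" array is assumed
for job in resp.json().get("jobs", []):
    print(job["id"], job["algorithm"], job["status"])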
Stop Tuning Job
Stop a running hyperparameter tuning job.
curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune/tune_1234567890abcdef/stop" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"reason": "User requested stop",
"complete_running_trials": true
}'
Create Training Job from Best Trial
Create a new training job using the best parameters found during tuning.
curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune/tune_1234567890abcdef/deploy-best" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "optimized-model-training",
"epochs": 10,
"use_best_trial": "trial_0034",
"override_parameters": {
"epochs": 10,
"save_steps": 500
}
}'
Search Space Configuration
Parameter Types
Continuous Parameters
{
"learning_rate": {
"type": "uniform",
"min": 0.001,
"max": 0.1
},
"weight_decay": {
"type": "log_uniform",
"min": 1e-5,
"max": 1e-1
}
}
Discrete Parameters
{
"batch_size": {
"type": "choice",
"values": [4, 8, 16, 32, 64]
},
"optimizer": {
"type": "choice",
"values": ["adam", "sgd", "adamw", "adagrad"]
}
}
Integer Parameters
{
"num_layers": {
"type": "int_uniform",
"min": 6,
"max": 24
},
"hidden_size": {
"type": "int_log_uniform",
"min": 128,
"max": 2048
}
}
Optimization Algorithms
Bayesian Optimization
- Best for: Expensive function evaluations, continuous parameters
- Pros: Sample efficient, handles noise well
- Cons: Slower for discrete spaces, requires more memory
Tree-structured Parzen Estimator (TPE)
- Best for: Mixed discrete/continuous spaces
- Pros: Efficient for complex search spaces
- Cons: May get stuck in local optima
Random Search
- Best for: Quick exploration, baseline comparison
- Pros: Simple, parallelizable, robust
- Cons: Not sample efficient for complex spaces
Grid Search
- Best for: Small search spaces, interpretable results
- Pros: Exhaustive, deterministic
- Cons: Exponential scaling, curse of dimensionality
Hyperband
- Best for: Neural architecture search, large budgets
- Pros: Efficient early stopping, handles varying budgets
- Cons: Requires configurable training epochs
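In practice these algorithms can be combined: a cheap random pass to map out the space, followed by Bayesian optimization over a narrowed range. The sketch below uses the Python SDK calls shown later on this page; the two-stage narrowing heuristic itself is illustrative, not a built-in TensorOne feature.

from tensorone import TensorOneClient

client = TensorOneClient(api_key="YOUR_API_KEY")

# Stage 1: cheap random exploration over a wide learning-rate range
exploration = client.training.hyperparameters.tune(
    training_job_id="job_1234567890abcdef",
    objective="minimize",
    metric="validation_loss",
    algorithm="random",
    max_trials=20,
    search_space={"learning_rate": {"type": "log_uniform", "min": 1e-6, "max": 1e-1}},
)

# ... poll until exploration completes (see the Python SDK example below) ...
best_lr = client.training.hyperparameters.get(exploration.id).best_trial.parameters["learning_rate"]

# Stage 2: Bayesian refinement in a narrowed band around the best random result
refinement = client.training.hyperparameters.tune(
    training_job_id="job_1234567890abcdef",
    objective="minimize",
    metric="validation_loss",
    algorithm="bayesian",
    max_trials=30,
    search_space={"learning_rate": {"type": "log_uniform", "min": best_lr / 10, "max": best_lr * 10}},
)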
SDK Examples
Python SDK
import time

from tensorone import TensorOneClient
client = TensorOneClient(api_key="YOUR_API_KEY")
# Start hyperparameter tuning
tuning_job = client.training.hyperparameters.tune(
training_job_id="job_1234567890abcdef",
objective="minimize",
metric="validation_loss",
algorithm="bayesian",
max_trials=50,
max_concurrent=4,
search_space={
"learning_rate": {
"type": "log_uniform",
"min": 1e-5,
"max": 1e-2
},
"batch_size": {
"type": "choice",
"values": [4, 8, 16, 32]
},
"weight_decay": {
"type": "uniform",
"min": 0.001,
"max": 0.1
}
},
early_stopping={
"metric": "validation_loss",
"patience": 3,
"min_delta": 0.001
}
)
print(f"Started tuning job: {tuning_job.id}")
# Monitor progress
while tuning_job.status in ["pending", "running"]:
tuning_job = client.training.hyperparameters.get(tuning_job.id)
progress = tuning_job.progress
print(f"Progress: {progress.completed_trials}/{progress.total_trials} trials")
print(f"Best value so far: {progress.best_value}")
time.sleep(60)
# Get final results
final_results = client.training.hyperparameters.get(tuning_job.id)
best_trial = final_results.best_trial
print(f"Best parameters: {best_trial.parameters}")
print(f"Best score: {best_trial.metrics[final_results.metric]}")
# Create optimized training job
optimized_job = client.training.hyperparameters.deploy_best(
tuning_job_id=tuning_job.id,
name="optimized-model-training",
epochs=10
)
print(f"Created optimized training job: {optimized_job.id}")
JavaScript SDK
import { TensorOneClient } from '@tensorone/sdk';
const client = new TensorOneClient({ apiKey: 'YOUR_API_KEY' });
// Start hyperparameter tuning
const tuningJob = await client.training.hyperparameters.tune({
trainingJobId: 'job_1234567890abcdef',
objective: 'minimize',
metric: 'validation_loss',
algorithm: 'bayesian',
maxTrials: 50,
maxConcurrent: 4,
searchSpace: {
learning_rate: {
type: 'log_uniform',
min: 1e-5,
max: 1e-2
},
batch_size: {
type: 'choice',
values: [4, 8, 16, 32]
},
weight_decay: {
type: 'uniform',
min: 0.001,
max: 0.1
}
},
earlyStopping: {
metric: 'validation_loss',
patience: 3,
minDelta: 0.001
}
});
console.log(`Started tuning job: ${tuningJob.id}`);
// Monitor progress
const monitorTuning = async (jobId) => {
const job = await client.training.hyperparameters.get(jobId);
const progress = job.progress;
console.log(`Progress: ${progress.completedTrials}/${progress.totalTrials} trials`);
console.log(`Best value so far: ${progress.bestValue}`);
if (job.status === 'running' || job.status === 'pending') {
setTimeout(() => monitorTuning(jobId), 60000);
} else {
console.log('Tuning completed!');
console.log(`Best parameters: ${JSON.stringify(job.bestTrial.parameters)}`);
// Create optimized training job
const optimizedJob = await client.training.hyperparameters.deployBest({
tuningJobId: jobId,
name: 'optimized-model-training',
epochs: 10
});
console.log(`Created optimized training job: ${optimizedJob.id}`);
}
};
monitorTuning(tuningJob.id);
Advanced Configurations
Multi-Objective Optimization
{
"objectives": [
{
"metric": "accuracy",
"direction": "maximize",
"weight": 0.7
},
{
"metric": "inference_time",
"direction": "minimize",
"weight": 0.3
}
]
}
Conditional Parameters
{
"optimizer": {
"type": "choice",
"values": ["adam", "sgd"]
},
"adam_beta1": {
"type": "uniform",
"min": 0.8,
"max": 0.99,
"condition": "optimizer == 'adam'"
},
"sgd_momentum": {
"type": "uniform",
"min": 0.5,
"max": 0.99,
"condition": "optimizer == 'sgd'"
}
}
Budget-Aware Optimization
{
"budget": {
"type": "fidelity",
"parameter": "epochs",
"min": 5,
"max": 50,
"resource_multiplier": 1.0
},
"multifidelity": {
"enabled": true,
"promotion_strategy": "top_k",
"promotion_rate": 0.5
}
}
Error Handling
Common Errors
{
"error": "INVALID_SEARCH_SPACE",
"message": "Search space configuration is invalid",
"details": {
"parameter": "learning_rate",
"reason": "min value must be less than max value"
}
}
{
"error": "INSUFFICIENT_BUDGET",
"message": "Insufficient budget for requested trials",
"details": {
"requestedTrials": 100,
"estimatedCost": 1500,
"availableBudget": 500
}
}
{
"error": "ALGORITHM_NOT_SUPPORTED",
"message": "Algorithm not supported for this parameter type",
"details": {
"algorithm": "grid",
"unsupportedParameterTypes": ["log_uniform"]
}
}
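These error payloads can be handled programmatically by inspecting the error code before falling back to a generic failure path. A minimal sketch in Python; the HTTP status codes are not specified on this page, so the logic keys off the documented error field.

import requests

API_KEY = "YOUR_API_KEY"
url = "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune"

payload = {
    "objective": "minimize",
    "metric": "validation_loss",
    "algorithm": "bayesian",
    "max_trials": 100,
    "search_space": {"learning_rate": {"type": "log_uniform", "min": 1e-5, "max": 1e-2}},
}

resp = requests.post(url, json=payload, headers={"Authorization": f"Bearer {API_KEY}"})
if not resp.ok:
    body = resp.json()
    code = body.get("error")
    details = body.get("details", {})
    if code == "INSUFFICIENT_BUDGET":
        # Scale the request down to fit the reported budget
        print(f"Budget {details.get('availableBudget')} < estimated {details.get('estimatedCost')}; reduce max_trials")
    elif code == "INVALID_SEARCH_SPACE":
        print(f"Fix search space: {details}")
    else:
        resp.raise_for_status()
else:
    print("Tuning job started:", resp.json()["id"])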
Best Practices
Search Space Design
- Start with wide ranges and narrow down based on initial results
- Use appropriate parameter types (log_uniform for learning rates)
- Include both architectural and training hyperparameters
- Consider parameter interactions and dependencies
Algorithm Selection
- Use Bayesian optimization for expensive evaluations
- Use random search for initial exploration
- Use grid search for final fine-tuning
- Use Hyperband for neural architecture search
Resource Management
- Set appropriate budget constraints to control costs
- Use early stopping to avoid wasted computation
- Balance exploration vs exploitation with concurrent trials
- Monitor convergence to avoid unnecessary trials
Hyperparameter tuning jobs can be paused and resumed. Intermediate results are saved automatically for recovery.
Hyperparameter tuning can be computationally expensive. Set appropriate budget limits and use early stopping to control costs.