Fine-tune Training Job
curl --request POST \
  --url https://api.tensorone.ai/v2/training/jobs/{id}/tune \
  --header 'Authorization: Bearer <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "objective": "minimize",
  "metric": "validation_loss",
  "algorithm": "bayesian",
  "max_trials": 10,
  "search_space": {
    "learning_rate": {
      "type": "log_uniform",
      "min": 1e-5,
      "max": 1e-2
    }
  }
}'
{
  "tuningJobId": "<string>",
  "status": "started",
  "estimatedDuration": "<string>"
}
Hyperparameter tuning is crucial for achieving optimal model performance. TensorOne's automated hyperparameter optimization supports grid, random, Bayesian, TPE, and Hyperband search strategies to find strong parameter combinations while minimizing computational cost.

Start Hyperparameter Tuning

Launch a hyperparameter optimization job for an existing training configuration.

Required Parameters

  • objective: Optimization direction for the target metric (minimize or maximize)
  • metric: Target metric name (e.g., loss, accuracy, f1_score)
  • search_space: Parameter search space configuration
  • algorithm: Optimization algorithm (grid, random, bayesian, tpe, hyperband)

Optional Parameters

  • max_trials: Maximum number of trials (default: 50)
  • max_concurrent: Maximum concurrent trials (default: 4)
  • early_stopping: Early stopping configuration
  • budget: Resource budget constraints

Example Usage

Bayesian Optimization for Language Model

curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "objective": "minimize",
    "metric": "validation_loss",
    "algorithm": "bayesian",
    "max_trials": 50,
    "max_concurrent": 4,
    "search_space": {
      "learning_rate": {
        "type": "log_uniform",
        "min": 1e-5,
        "max": 1e-2
      },
      "batch_size": {
        "type": "choice",
        "values": [4, 8, 16, 32]
      },
      "weight_decay": {
        "type": "uniform",
        "min": 0.001,
        "max": 0.1
      },
      "warmup_steps": {
        "type": "int_uniform",
        "min": 50,
        "max": 500
      },
      "lora_rank": {
        "type": "choice",
        "values": [8, 16, 32, 64]
      },
      "lora_alpha": {
        "type": "choice",
        "values": [16, 32, 64, 128]
      }
    },
    "early_stopping": {
      "metric": "validation_loss",
      "patience": 3,
      "min_delta": 0.001
    },
    "budget": {
      "max_gpu_hours": 100,
      "max_cost": 500
    }
  }'

Grid Search for Computer Vision Model

curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_vision_123/tune" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "objective": "maximize",
    "metric": "accuracy",
    "algorithm": "grid",
    "max_trials": 24,
    "max_concurrent": 6,
    "search_space": {
      "learning_rate": {
        "type": "choice",
        "values": [0.001, 0.01, 0.1]
      },
      "batch_size": {
        "type": "choice",
        "values": [16, 32, 64]
      },
      "optimizer": {
        "type": "choice",
        "values": ["adam", "sgd", "adamw"]
      },
      "dropout_rate": {
        "type": "choice",
        "values": [0.1, 0.2, 0.3]
      }
    },
    "early_stopping": {
      "metric": "val_accuracy",
      "patience": 5,
      "min_delta": 0.005
    }
  }'

Hyperband Optimization

curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_multimodal_456/tune" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "objective": "maximize",
    "metric": "f1_score",
    "algorithm": "hyperband",
    "max_trials": 100,
    "max_concurrent": 8,
    "search_space": {
      "learning_rate": {
        "type": "log_uniform",
        "min": 1e-5,
        "max": 1e-1
      },
      "hidden_size": {
        "type": "choice",
        "values": [256, 512, 768, 1024]
      },
      "num_layers": {
        "type": "int_uniform",
        "min": 6,
        "max": 24
      },
      "attention_heads": {
        "type": "choice",
        "values": [8, 12, 16, 24]
      }
    },
    "hyperband_config": {
      "max_epochs": 50,
      "reduction_factor": 3,
      "min_epochs": 5
    }
  }'

Response

Returns the hyperparameter tuning job object:
{
  "id": "tune_1234567890abcdef",
  "trainingJobId": "job_1234567890abcdef",
  "status": "running",
  "algorithm": "bayesian",
  "objective": "minimize",
  "metric": "validation_loss",
  "progress": {
    "completedTrials": 12,
    "totalTrials": 50,
    "runningTrials": 4,
    "bestValue": 0.234,
    "percentage": 24.0
  },
  "searchSpace": {
    "learning_rate": {
      "type": "log_uniform",
      "min": 1e-5,
      "max": 1e-2
    },
    "batch_size": {
      "type": "choice",
      "values": [4, 8, 16, 32]
    }
  },
  "bestTrial": {
    "id": "trial_0008",
    "parameters": {
      "learning_rate": 3.2e-4,
      "batch_size": 16,
      "weight_decay": 0.045,
      "warmup_steps": 125,
      "lora_rank": 32,
      "lora_alpha": 64
    },
    "metrics": {
      "validation_loss": 0.234,
      "accuracy": 0.891,
      "f1_score": 0.887
    },
    "duration": 3420,
    "cost": 24.50
  },
  "estimatedCompletion": "2024-01-16T08:30:00Z",
  "createdAt": "2024-01-15T18:00:00Z",
  "updatedAt": "2024-01-15T20:15:00Z"
}

Get Tuning Job Status

Retrieve the current status and results of a hyperparameter tuning job.
curl -X GET "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune/tune_1234567890abcdef" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "id": "tune_1234567890abcdef",
  "trainingJobId": "job_1234567890abcdef",
  "status": "completed",
  "algorithm": "bayesian",
  "objective": "minimize",
  "metric": "validation_loss",
  "results": {
    "totalTrials": 50,
    "completedTrials": 47,
    "failedTrials": 3,
    "bestValue": 0.187,
    "improvementOverBaseline": 0.047,
    "totalDuration": 18000,
    "totalCost": 245.30
  },
  "bestTrial": {
    "id": "trial_0034",
    "parameters": {
      "learning_rate": 2.8e-4,
      "batch_size": 8,
      "weight_decay": 0.032,
      "warmup_steps": 180,
      "lora_rank": 16,
      "lora_alpha": 32
    },
    "metrics": {
      "validation_loss": 0.187,
      "accuracy": 0.924,
      "f1_score": 0.918,
      "perplexity": 8.92
    },
    "duration": 3840,
    "cost": 27.20,
    "modelId": "model_best_tune_034"
  },
  "trials": [
    {
      "id": "trial_0034",
      "parameters": {
        "learning_rate": 2.8e-4,
        "batch_size": 8
      },
      "metrics": {
        "validation_loss": 0.187,
        "accuracy": 0.924
      },
      "status": "completed",
      "duration": 3840,
      "cost": 27.20
    }
  ],
  "convergenceAnalysis": {
    "converged": true,
    "convergenceEpoch": 38,
    "remainingImprovement": 0.003,
    "recommendation": "Tuning has converged. Consider using best parameters for production."
  },
  "recommendations": [
    "Use trial_0034 parameters for best performance",
    "Learning rate 2.8e-4 shows optimal convergence",
    "Batch size 8 provides best memory/performance tradeoff"
  ]
}

List Tuning Jobs

Retrieve a list of hyperparameter tuning jobs for your account.
curl -X GET "https://api.tensorone.ai/v2/training/tune" \
  -H "Authorization: Bearer YOUR_API_KEY"

Query Parameters

  • status: Filter by status (pending, running, completed, failed, cancelled)
  • algorithm: Filter by algorithm (grid, random, bayesian, tpe, hyperband)
  • limit: Number of jobs to return (1-100, default: 50)
  • offset: Number of jobs to skip for pagination
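For example, to page through completed Bayesian jobs, a sketch using Python's requests library against the endpoint above (the "jobs" key in the response envelope is an assumption):

import requests

# Filtered, paginated listing via the query parameters above.
resp = requests.get(
    "https://api.tensorone.ai/v2/training/tune",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    params={"status": "completed", "algorithm": "bayesian", "limit": 10, "offset": 0},
    timeout=30,
)
resp.raise_for_status()
for job in resp.json().get("jobs", []):  # assumed response envelope
    print(job["id"], job["status"])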

Stop Tuning Job

Stop a running hyperparameter tuning job.
curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune/tune_1234567890abcdef/stop" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "reason": "User requested stop",
    "complete_running_trials": true
  }'

Create Training Job from Best Trial

Create a new training job using the best parameters found during tuning.
curl -X POST "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune/tune_1234567890abcdef/deploy-best" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "optimized-model-training",
    "epochs": 10,
    "use_best_trial": "trial_0034",
    "override_parameters": {
      "epochs": 10,
      "save_steps": 500
    }
  }'

Search Space Configuration

Parameter Types

Continuous Parameters

{
  "learning_rate": {
    "type": "uniform",
    "min": 0.001,
    "max": 0.1
  },
  "weight_decay": {
    "type": "log_uniform",
    "min": 1e-5,
    "max": 1e-1
  }
}

Discrete Parameters

{
  "batch_size": {
    "type": "choice",
    "values": [4, 8, 16, 32, 64]
  },
  "optimizer": {
    "type": "choice",
    "values": ["adam", "sgd", "adamw", "adagrad"]
  }
}

Integer Parameters

{
  "num_layers": {
    "type": "int_uniform",
    "min": 6,
    "max": 24
  },
  "hidden_size": {
    "type": "int_log_uniform",
    "min": 128,
    "max": 2048
  }
}
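As a rough mental model (this is an illustrative sampler, not the platform's internal one), each type corresponds to a draw like the following; log_uniform samples uniformly in log space, which is why it suits scale-like parameters such as learning rates:

import math
import random

def sample(spec):
    """Draw one value from a search-space entry (illustrative only)."""
    t = spec["type"]
    if t == "uniform":
        return random.uniform(spec["min"], spec["max"])
    if t == "log_uniform":
        # Uniform in log space: 1e-5..1e-4 is as likely as 1e-3..1e-2.
        return math.exp(random.uniform(math.log(spec["min"]), math.log(spec["max"])))
    if t == "int_uniform":
        return random.randint(spec["min"], spec["max"])
    if t == "int_log_uniform":
        return round(math.exp(random.uniform(math.log(spec["min"]), math.log(spec["max"]))))
    if t == "choice":
        return random.choice(spec["values"])
    raise ValueError(f"unknown parameter type: {t}")

print(sample({"type": "log_uniform", "min": 1e-5, "max": 1e-2}))
print(sample({"type": "choice", "values": [4, 8, 16, 32]}))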

Optimization Algorithms

Bayesian Optimization

  • Best for: Expensive function evaluations, continuous parameters
  • Pros: Sample efficient, handles noise well
  • Cons: Slower for discrete spaces, requires more memory

Tree-structured Parzen Estimator (TPE)

  • Best for: Mixed discrete/continuous spaces
  • Pros: Efficient for complex search spaces
  • Cons: May get stuck in local optima

Random Search

  • Best for: Quick exploration, baseline comparison
  • Pros: Simple, parallelizable, robust
  • Cons: Not sample efficient for complex spaces

Grid Search

  • Best for: Small search spaces, interpretable results
  • Pros: Exhaustive, deterministic
  • Cons: Exponential scaling, curse of dimensionality (see the sketch below)
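To see how quickly a grid explodes, count the Cartesian product of the grid search example above (a quick Python check using the values from that example):

from itertools import product

# Grid values from the computer vision example above.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
    "optimizer": ["adam", "sgd", "adamw"],
    "dropout_rate": [0.1, 0.2, 0.3],
}

# Every combination is one trial: 3 * 3 * 3 * 3 = 81,
# which is why that example caps the search with max_trials: 24.
print(len(list(product(*search_space.values()))))  # 81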

Hyperband

  • Best for: Neural architecture search, large budgets
  • Pros: Efficient early stopping, handles varying budgets
  • Cons: Requires configurable training epochs
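The hyperband_config in the multimodal example above determines the culling schedule. A minimal sketch of the successive-halving bracket implied by min_epochs: 5, max_epochs: 50, and reduction_factor: 3 (the platform's scheduler may differ in detail):

# One Hyperband bracket under the example config. Assumption: each rung
# keeps the top 1/reduction_factor of trials and gives survivors
# reduction_factor times the epoch budget.
min_epochs, max_epochs, eta = 5, 50, 3

epochs, trials = min_epochs, 27  # e.g. a bracket starting with 27 trials
while epochs <= max_epochs:
    print(f"{trials:>3} trials train for {epochs} epochs")
    trials = max(1, trials // eta)  # keep the best third
    epochs *= eta                   # triple the budget for survivors

# Output:
#  27 trials train for 5 epochs
#   9 trials train for 15 epochs
#   3 trials train for 45 epochs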

SDK Examples

Python SDK

import time

from tensorone import TensorOneClient

client = TensorOneClient(api_key="YOUR_API_KEY")

# Start hyperparameter tuning
tuning_job = client.training.hyperparameters.tune(
    training_job_id="job_1234567890abcdef",
    objective="minimize",
    metric="validation_loss",
    algorithm="bayesian",
    max_trials=50,
    max_concurrent=4,
    search_space={
        "learning_rate": {
            "type": "log_uniform",
            "min": 1e-5,
            "max": 1e-2
        },
        "batch_size": {
            "type": "choice",
            "values": [4, 8, 16, 32]
        },
        "weight_decay": {
            "type": "uniform",
            "min": 0.001,
            "max": 0.1
        }
    },
    early_stopping={
        "metric": "validation_loss",
        "patience": 3,
        "min_delta": 0.001
    }
)

print(f"Started tuning job: {tuning_job.id}")

# Monitor progress
while tuning_job.status in ["pending", "running"]:
    tuning_job = client.training.hyperparameters.get(tuning_job.id)
    progress = tuning_job.progress
    print(f"Progress: {progress.completed_trials}/{progress.total_trials} trials")
    print(f"Best value so far: {progress.best_value}")
    time.sleep(60)

# Get final results
final_results = client.training.hyperparameters.get(tuning_job.id)
best_trial = final_results.best_trial
print(f"Best parameters: {best_trial.parameters}")
print(f"Best score: {best_trial.metrics[final_results.metric]}")

# Create optimized training job
optimized_job = client.training.hyperparameters.deploy_best(
    tuning_job_id=tuning_job.id,
    name="optimized-model-training",
    epochs=10
)

print(f"Created optimized training job: {optimized_job.id}")

JavaScript SDK

import { TensorOneClient } from '@tensorone/sdk';

const client = new TensorOneClient({ apiKey: 'YOUR_API_KEY' });

// Start hyperparameter tuning
const tuningJob = await client.training.hyperparameters.tune({
  trainingJobId: 'job_1234567890abcdef',
  objective: 'minimize',
  metric: 'validation_loss',
  algorithm: 'bayesian',
  maxTrials: 50,
  maxConcurrent: 4,
  searchSpace: {
    learningRate: {
      type: 'log_uniform',
      min: 1e-5,
      max: 1e-2
    },
    batchSize: {
      type: 'choice',
      values: [4, 8, 16, 32]
    },
    weightDecay: {
      type: 'uniform',
      min: 0.001,
      max: 0.1
    }
  },
  earlyStopping: {
    metric: 'validation_loss',
    patience: 3,
    minDelta: 0.001
  }
});

console.log(`Started tuning job: ${tuningJob.id}`);

// Monitor progress
const monitorTuning = async (jobId) => {
  const job = await client.training.hyperparameters.get(jobId);
  const progress = job.progress;
  
  console.log(`Progress: ${progress.completedTrials}/${progress.totalTrials} trials`);
  console.log(`Best value so far: ${progress.bestValue}`);
  
  if (job.status === 'running' || job.status === 'pending') {
    setTimeout(() => monitorTuning(jobId), 60000);
  } else {
    console.log('Tuning completed!');
    console.log(`Best parameters: ${JSON.stringify(job.bestTrial.parameters)}`);
    
    // Create optimized training job
    const optimizedJob = await client.training.hyperparameters.deployBest({
      tuningJobId: jobId,
      name: 'optimized-model-training',
      epochs: 10
    });
    
    console.log(`Created optimized training job: ${optimizedJob.id}`);
  }
};

monitorTuning(tuningJob.id);

Advanced Configurations

Multi-Objective Optimization

{
  "objectives": [
    {
      "metric": "accuracy",
      "direction": "maximize",
      "weight": 0.7
    },
    {
      "metric": "inference_time",
      "direction": "minimize",
      "weight": 0.3
    }
  ]
}
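A common way to collapse weighted objectives into a single trial score is linear scalarization; a sketch (assuming metrics are normalized to comparable scales before weighting, which may not match the platform's internal handling):

# Weighted scalarization of the multi-objective config above.
objectives = [
    {"metric": "accuracy",       "direction": "maximize", "weight": 0.7},
    {"metric": "inference_time", "direction": "minimize", "weight": 0.3},
]

def score(trial_metrics):
    """Higher is better: minimized metrics contribute negatively."""
    total = 0.0
    for obj in objectives:
        sign = 1.0 if obj["direction"] == "maximize" else -1.0
        total += sign * obj["weight"] * trial_metrics[obj["metric"]]
    return total

print(score({"accuracy": 0.92, "inference_time": 0.15}))  # 0.7*0.92 - 0.3*0.15 = 0.599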

Conditional Parameters

{
  "optimizer": {
    "type": "choice",
    "values": ["adam", "sgd"]
  },
  "adam_beta1": {
    "type": "uniform",
    "min": 0.8,
    "max": 0.99,
    "condition": "optimizer == 'adam'"
  },
  "sgd_momentum": {
    "type": "uniform",
    "min": 0.5,
    "max": 0.99,
    "condition": "optimizer == 'sgd'"
  }
}
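Semantically, a conditional parameter is only sampled when its condition holds, so adam_beta1 never appears in an SGD trial. A minimal sketch of that behavior (the condition strings themselves are evaluated by the platform; this only illustrates the semantics):

import random

def sample_conditional():
    """Sample the conditional space above: each optimizer gets its own knob."""
    params = {"optimizer": random.choice(["adam", "sgd"])}
    if params["optimizer"] == "adam":
        params["adam_beta1"] = random.uniform(0.8, 0.99)
    else:
        params["sgd_momentum"] = random.uniform(0.5, 0.99)
    return params

print(sample_conditional())  # e.g. {'optimizer': 'sgd', 'sgd_momentum': 0.73}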

Budget-Aware Optimization

{
  "budget": {
    "type": "fidelity",
    "parameter": "epochs",
    "min": 5,
    "max": 50,
    "resource_multiplier": 1.0
  },
  "multifidelity": {
    "enabled": true,
    "promotion_strategy": "top_k",
    "promotion_rate": 0.5
  }
}
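With multifidelity enabled, trials first run at low fidelity (few epochs) and only a fraction is promoted upward. A sketch of top_k promotion with promotion_rate: 0.5 (illustrative, not the scheduler itself):

# top_k promotion: after each low-fidelity round, keep the better half
# of trials and rerun them at the next fidelity level (more epochs).
promotion_rate = 0.5

def promote(trials, minimize=True):
    """trials: list of (trial_id, metric_value); returns the promoted subset."""
    ranked = sorted(trials, key=lambda t: t[1], reverse=not minimize)
    k = max(1, int(len(ranked) * promotion_rate))
    return ranked[:k]

round_one = [("t1", 0.41), ("t2", 0.29), ("t3", 0.35), ("t4", 0.52)]
print(promote(round_one))  # [('t2', 0.29), ('t3', 0.35)] advance to more epochs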

Error Handling

Common Errors

{
  "error": "INVALID_SEARCH_SPACE",
  "message": "Search space configuration is invalid",
  "details": {
    "parameter": "learning_rate",
    "reason": "min value must be less than max value"
  }
}
{
  "error": "INSUFFICIENT_BUDGET",
  "message": "Insufficient budget for requested trials",
  "details": {
    "requestedTrials": 100,
    "estimatedCost": 1500,
    "availableBudget": 500
  }
}
{
  "error": "ALGORITHM_NOT_SUPPORTED",
  "message": "Algorithm not supported for this parameter type",
  "details": {
    "algorithm": "grid",
    "unsupportedParameterTypes": ["log_uniform"]
  }
}
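A sketch of catching these errors with Python's requests library (the payload shapes are the ones above; a successful start returns 202, per the Response section at the end of this page):

import requests

payload = {
    "objective": "minimize",
    "metric": "validation_loss",
    "algorithm": "bayesian",
    "max_trials": 50,
    "search_space": {
        "learning_rate": {"type": "log_uniform", "min": 1e-5, "max": 1e-2},
    },
}

resp = requests.post(
    "https://api.tensorone.ai/v2/training/jobs/job_1234567890abcdef/tune",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=30,
)

if resp.status_code != 202:  # 202 = tuning started
    err = resp.json()
    if err["error"] == "INSUFFICIENT_BUDGET":
        # Shrink max_trials or raise the budget, then retry.
        print("over budget; available:", err["details"]["availableBudget"])
    elif err["error"] == "INVALID_SEARCH_SPACE":
        print("bad parameter:", err["details"]["parameter"], "-", err["details"]["reason"])
    else:
        resp.raise_for_status()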

Best Practices

Search Space Design

  • Start with wide ranges and narrow down based on initial results
  • Use appropriate parameter types (log_uniform for learning rates)
  • Include both architectural and training hyperparameters
  • Consider parameter interactions and dependencies

Algorithm Selection

  • Use Bayesian optimization for expensive evaluations
  • Use random search for initial exploration
  • Use grid search for final fine-tuning over a small, narrowed space
  • Use Hyperband for neural architecture search

Resource Management

  • Set appropriate budget constraints to control costs
  • Use early stopping to avoid wasted computation
  • Balance exploration vs exploitation with concurrent trials
  • Monitor convergence to avoid unnecessary trials (see the sketch below)
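For example, the convergenceAnalysis block from the status response can drive an automatic stop. A sketch built on the SDK calls shown earlier (the .stop() method is an assumption mirroring the Stop Tuning Job endpoint):

import time

from tensorone import TensorOneClient

client = TensorOneClient(api_key="YOUR_API_KEY")
job = client.training.hyperparameters.get("tune_1234567890abcdef")

while job.status == "running":
    analysis = getattr(job, "convergence_analysis", None)
    if analysis and analysis.converged:
        # Assumed SDK method; equivalently, POST to .../tune/{id}/stop.
        client.training.hyperparameters.stop(
            job.id, reason="Converged", complete_running_trials=True
        )
        break
    time.sleep(60)
    job = client.training.hyperparameters.get(job.id)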
Hyperparameter tuning jobs can be paused and resumed. Intermediate results are saved automatically for recovery.
Hyperparameter tuning can be computationally expensive. Set appropriate budget limits and use early stopping to control costs.

Authorizations

Authorization
string
header
required

API key authentication. Use 'Bearer YOUR_API_KEY' format.

Path Parameters

id
string
required

ID of the training job to tune.

Body

application/json

Response

202 - application/json

Hyperparameter tuning started

The response is of type object.