Experiment tracking lets you compare training approaches, hyperparameters, and model architectures systematically. TensorOne’s experiment management API groups related training runs into experiments, records their parameters and metrics, and supports comparing, archiving, cloning, and exporting the results.
Create Experiment
Create a new experiment to group related training runs and track their progress.
Required Parameters
name: Human-readable name for the experiment (1-100 characters)
description: Description of the experiment’s purpose
Optional Parameters
tags: Array of tags for organization
metadata: Additional metadata object
hypothesis: Hypothesis being tested
baseline: Baseline model or experiment for comparison
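The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the TensorOne SDK: `build_experiment_payload` is a hypothetical helper, and it assumes only what the parameter list states (both required fields present, name between 1 and 100 characters).

```python
from typing import Any, Optional


def build_experiment_payload(
    name: str,
    description: str,
    tags: Optional[list[str]] = None,
    metadata: Optional[dict[str, Any]] = None,
    hypothesis: Optional[str] = None,
    baseline: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate the fields and build the JSON request body."""
    if not 1 <= len(name) <= 100:
        raise ValueError("name must be 1-100 characters")
    if not description:
        raise ValueError("description is required")

    payload: dict[str, Any] = {"name": name, "description": description}
    # Optional fields are left out of the body entirely when not supplied.
    optional = {
        "tags": tags,
        "metadata": metadata,
        "hypothesis": hypothesis,
        "baseline": baseline,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


payload = build_experiment_payload(
    name="llama-7b-lora-comparison",
    description="Compare different LoRA configurations for LLaMA 7B fine-tuning",
    tags=["lora", "llama"],
)
print(payload)
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of the POST request shown in the examples.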
Example Usage
Create A/B Testing Experiment
curl -X POST "https://api.tensorone.ai/v2/training/experiments" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "llama-7b-lora-comparison",
"description": "Compare different LoRA configurations for LLaMA 7B fine-tuning",
"hypothesis": "Lower rank LoRA adapters with higher alpha values will achieve better performance/efficiency tradeoff",
"tags": ["lora", "llama", "efficiency", "comparison"],
"metadata": {
"model_family": "llama",
"base_model": "meta-llama/Llama-2-7b-hf",
"dataset": "custom-instruction-tuning",
"objective": "minimize_validation_loss"
},
"baseline": {
"type": "external",
"name": "Original LLaMA 7B",
"metrics": {
"validation_loss": 0.452,
"accuracy": 0.821
}
}
}'
Create Hyperparameter Search Experiment
curl -X POST "https://api.tensorone.ai/v2/training/experiments" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "vision-transformer-optimization",
"description": "Systematic hyperparameter optimization for Vision Transformer on ImageNet",
"hypothesis": "Larger patch sizes with adaptive learning rates will improve convergence speed",
"tags": ["vision-transformer", "imagenet", "hyperparameter-search"],
"metadata": {
"architecture": "vision-transformer",
"dataset": "imagenet-1k",
"optimization_method": "bayesian",
"budget": "100_trials"
}
}'
Response
Returns the created experiment object:
{
"id": "exp_1234567890abcdef",
"name": "llama-7b-lora-comparison",
"description": "Compare different LoRA configurations for LLaMA 7B fine-tuning",
"status": "active",
"hypothesis": "Lower rank LoRA adapters with higher alpha values will achieve better performance/efficiency tradeoff",
"tags": ["lora", "llama", "efficiency", "comparison"],
"metadata": {
"model_family": "llama",
"base_model": "meta-llama/Llama-2-7b-hf",
"dataset": "custom-instruction-tuning"
},
"baseline": {
"type": "external",
"name": "Original LLaMA 7B",
"metrics": {
"validation_loss": 0.452,
"accuracy": 0.821
}
},
"runs": [],
"statistics": {
"total_runs": 0,
"completed_runs": 0,
"best_metric": null,
"total_cost": 0.0
},
"createdAt": "2024-01-15T10:30:00Z",
"updatedAt": "2024-01-15T10:30:00Z"
}
Add Training Run to Experiment
Associate a training job with an experiment for tracking and comparison.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/runs" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"trainingJobId": "job_1234567890abcdef",
"name": "lora-rank-16-alpha-32",
"parameters": {
"lora_rank": 16,
"lora_alpha": 32,
"learning_rate": 3e-4,
"batch_size": 8,
"weight_decay": 0.01
},
"tags": ["rank-16", "alpha-32", "baseline"],
"notes": "Baseline configuration with standard LoRA parameters"
}'
Response
{
"id": "run_1234567890abcdef",
"experimentId": "exp_1234567890abcdef",
"trainingJobId": "job_1234567890abcdef",
"name": "lora-rank-16-alpha-32",
"status": "running",
"parameters": {
"lora_rank": 16,
"lora_alpha": 32,
"learning_rate": 3e-4,
"batch_size": 8,
"weight_decay": 0.01
},
"metrics": {
"current": {
"training_loss": 0.234,
"validation_loss": 0.267,
"accuracy": 0.892
},
"best": {
"validation_loss": 0.241,
"accuracy": 0.904
}
},
"artifacts": [
{
"type": "checkpoint",
"name": "best_model.pt",
"path": "/checkpoints/run_1234567890abcdef/best_model.pt"
}
],
"tags": ["rank-16", "alpha-32", "baseline"],
"notes": "Baseline configuration with standard LoRA parameters",
"createdAt": "2024-01-15T11:00:00Z",
"updatedAt": "2024-01-15T12:30:00Z"
}
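Run names and tags in the example follow a pattern derived from the LoRA parameters. A small sketch of generating them consistently is shown here; `build_run_payload` is a hypothetical helper, and the naming pattern is inferred from the example request rather than required by the API.

```python
from typing import Any


def build_run_payload(training_job_id: str, parameters: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: derive a consistent run name and tags from LoRA parameters."""
    rank = parameters["lora_rank"]
    alpha = parameters["lora_alpha"]
    return {
        "trainingJobId": training_job_id,
        "name": f"lora-rank-{rank}-alpha-{alpha}",
        "parameters": parameters,
        "tags": [f"rank-{rank}", f"alpha-{alpha}"],
    }


run_payload = build_run_payload(
    "job_1234567890abcdef",
    {"lora_rank": 16, "lora_alpha": 32, "learning_rate": 3e-4},
)
print(run_payload["name"])
print(run_payload["tags"])
```

Deriving the name from the parameters keeps runs sortable and makes it harder for a name and its configuration to drift apart.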
Get Experiment Details
Retrieve comprehensive information about an experiment and its runs.
curl -X GET "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef" \
-H "Authorization: Bearer YOUR_API_KEY"
Response
{
"id": "exp_1234567890abcdef",
"name": "llama-7b-lora-comparison",
"description": "Compare different LoRA configurations for LLaMA 7B fine-tuning",
"status": "active",
"hypothesis": "Lower rank LoRA adapters with higher alpha values will achieve better performance/efficiency tradeoff",
"runs": [
{
"id": "run_1234567890abcdef",
"name": "lora-rank-16-alpha-32",
"status": "completed",
"parameters": {
"lora_rank": 16,
"lora_alpha": 32,
"learning_rate": 3e-4
},
"metrics": {
"final": {
"validation_loss": 0.241,
"accuracy": 0.904,
"f1_score": 0.897,
"training_time": 3600,
"cost": 28.50
}
},
"completedAt": "2024-01-15T13:00:00Z"
},
{
"id": "run_2345678901bcdefg",
"name": "lora-rank-8-alpha-64",
"status": "completed",
"parameters": {
"lora_rank": 8,
"lora_alpha": 64,
"learning_rate": 3e-4
},
"metrics": {
"final": {
"validation_loss": 0.228,
"accuracy": 0.912,
"f1_score": 0.905,
"training_time": 3200,
"cost": 25.20
}
},
"completedAt": "2024-01-15T14:15:00Z"
}
],
"statistics": {
"total_runs": 5,
"completed_runs": 2,
"running_runs": 2,
"failed_runs": 1,
"best_run": {
"id": "run_2345678901bcdefg",
"metric": "validation_loss",
"value": 0.228
},
"total_cost": 142.80,
"total_training_time": 16800
},
"insights": [
{
"type": "parameter_correlation",
"message": "Lower LoRA rank with higher alpha shows better performance",
"confidence": 0.87,
"evidence": "2/2 runs with rank < 16 achieved validation_loss < 0.25"
},
{
"type": "cost_efficiency",
"message": "rank-8 configuration provides best cost/performance ratio",
"savings": "11.6% cost reduction with 5.4% performance improvement"
}
],
"createdAt": "2024-01-15T10:30:00Z",
"updatedAt": "2024-01-15T14:15:00Z"
}
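The figures in the `cost_efficiency` insight can be reproduced from the two completed runs. The sketch below assumes that "performance improvement" means the relative reduction in `validation_loss`; the response does not say so explicitly, but the numbers agree with that reading.

```python
# Final metrics of the two completed runs, copied from the response above.
runs = [
    {"id": "run_1234567890abcdef", "validation_loss": 0.241, "cost": 28.50},
    {"id": "run_2345678901bcdefg", "validation_loss": 0.228, "cost": 25.20},
]

best = min(runs, key=lambda r: r["validation_loss"])
other = max(runs, key=lambda r: r["validation_loss"])

# Relative differences, expressed as percentages of the weaker run's values.
cost_reduction = (other["cost"] - best["cost"]) / other["cost"] * 100
loss_improvement = (
    (other["validation_loss"] - best["validation_loss"]) / other["validation_loss"] * 100
)

print(best["id"])
print(f"{cost_reduction:.1f}% cost reduction")
print(f"{loss_improvement:.1f}% lower validation loss")
```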
Compare Experiment Runs
Compare metrics and parameters across multiple runs within an experiment.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/compare" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"runs": ["run_1234567890abcdef", "run_2345678901bcdefg", "run_3456789012cdefgh"],
"metrics": ["validation_loss", "accuracy", "f1_score", "training_time", "cost"],
"analysis": {
"statistical_tests": true,
"correlation_analysis": true,
"performance_pareto": true
}
}'
Response
{
"comparison": {
"runs": [
{
"id": "run_1234567890abcdef",
"name": "lora-rank-16-alpha-32",
"parameters": {"lora_rank": 16, "lora_alpha": 32},
"metrics": {"validation_loss": 0.241, "accuracy": 0.904, "cost": 28.50}
},
{
"id": "run_2345678901bcdefg",
"name": "lora-rank-8-alpha-64",
"parameters": {"lora_rank": 8, "lora_alpha": 64},
"metrics": {"validation_loss": 0.228, "accuracy": 0.912, "cost": 25.20}
}
],
"rankings": {
"validation_loss": ["run_2345678901bcdefg", "run_1234567890abcdef"],
"accuracy": ["run_2345678901bcdefg", "run_1234567890abcdef"],
"cost": ["run_2345678901bcdefg", "run_1234567890abcdef"]
},
"statistical_analysis": {
"validation_loss": {
"mean": 0.235,
"std": 0.009,
"significant_difference": true,
"p_value": 0.023
}
},
"correlations": [
{
"parameters": ["lora_rank", "lora_alpha"],
"metric": "validation_loss",
"correlation": -0.78,
"strength": "strong"
}
],
"pareto_frontier": [
{
"runId": "run_2345678901bcdefg",
"reason": "Best accuracy/cost ratio"
}
],
"recommendations": [
"run_2345678901bcdefg provides the best overall performance",
"Lower LoRA rank with higher alpha consistently outperforms",
"Consider further reducing rank to 4 with alpha 128"
]
}
}
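To make the `statistical_analysis` and `pareto_frontier` fields more concrete, here is a rough local recomputation from the two runs in the response. How the service computes these values is not documented here, so this is only a sketch: it assumes a sample standard deviation and a Pareto frontier over accuracy (higher is better) and cost (lower is better).

```python
import statistics

# Metrics of the two runs returned in the comparison response.
runs = [
    {"id": "run_1234567890abcdef", "validation_loss": 0.241, "accuracy": 0.904, "cost": 28.50},
    {"id": "run_2345678901bcdefg", "validation_loss": 0.228, "accuracy": 0.912, "cost": 25.20},
]

losses = [r["validation_loss"] for r in runs]
mean_loss = statistics.mean(losses)
std_loss = statistics.stdev(losses)  # sample standard deviation (n - 1 denominator)


def dominates(a: dict, b: dict) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a["accuracy"] >= b["accuracy"] and a["cost"] <= b["cost"]
    better = a["accuracy"] > b["accuracy"] or a["cost"] < b["cost"]
    return no_worse and better


# A run is on the frontier if no other run dominates it.
pareto = [
    r["id"] for r in runs
    if not any(dominates(other, r) for other in runs if other is not r)
]

print(f"mean={mean_loss:.4f} std={std_loss:.4f}")
print(pareto)
```

Under these assumptions the recomputed mean and standard deviation round to the values in the response, and only the rank-8 run remains on the frontier because it is both more accurate and cheaper.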
List Experiments
Retrieve a list of experiments for your account.
curl -X GET "https://api.tensorone.ai/v2/training/experiments" \
-H "Authorization: Bearer YOUR_API_KEY"
Query Parameters
status: Filter by status (active, completed, archived)
tags: Filter by tags (comma-separated)
limit: Number of experiments to return (1-100, default: 50)
offset: Number of experiments to skip for pagination
sort: Field to sort by (created_at, updated_at, name)
order: Sort direction (asc, desc; default: desc)
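The query parameters above can be assembled into a request URL as follows. This is a sketch using only the Python standard library; `build_list_url` is a hypothetical helper, and parameters whose defaults are not documented (offset, sort) are simply omitted unless given.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.tensorone.ai/v2/training/experiments"
STATUSES = {"active", "completed", "archived"}
SORT_FIELDS = {"created_at", "updated_at", "name"}


def build_list_url(
    status: Optional[str] = None,
    tags: Optional[list[str]] = None,
    limit: int = 50,
    offset: Optional[int] = None,
    sort: Optional[str] = None,
    order: str = "desc",
) -> str:
    """Hypothetical helper: validate query parameters and build the list URL."""
    if status is not None and status not in STATUSES:
        raise ValueError(f"status must be one of {sorted(STATUSES)}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset is not None and offset < 0:
        raise ValueError("offset must be non-negative")
    if sort is not None and sort not in SORT_FIELDS:
        raise ValueError(f"sort must be one of {sorted(SORT_FIELDS)}")
    if order not in {"asc", "desc"}:
        raise ValueError("order must be 'asc' or 'desc'")

    params: dict[str, object] = {"limit": limit, "order": order}
    if status is not None:
        params["status"] = status
    if tags:
        params["tags"] = ",".join(tags)  # tags are sent comma-separated
    if offset is not None:
        params["offset"] = offset
    if sort is not None:
        params["sort"] = sort
    return f"{BASE_URL}?{urlencode(params)}"


url = build_list_url(status="active", tags=["lora", "llama"], limit=10)
print(url)
```

To page through results, repeat the request while increasing `offset` by `limit` until fewer than `limit` experiments come back.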
Update Experiment
Update experiment metadata, hypothesis, or status.
curl -X PATCH "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "llama-7b-lora-optimization-study",
"description": "Updated: Comprehensive study of LoRA configurations for optimal performance/efficiency",
"status": "completed",
"conclusion": "Rank-8 Alpha-64 configuration provides optimal balance of performance and efficiency",
"tags": ["lora", "llama", "efficiency", "comparison", "completed"],
"metadata": {
"model_family": "llama",
"base_model": "meta-llama/Llama-2-7b-hf",
"dataset": "custom-instruction-tuning",
"optimal_config": {
"lora_rank": 8,
"lora_alpha": 64,
"learning_rate": 3e-4
}
}
}'
Archive Experiment
Archive a completed experiment to reduce clutter while preserving data.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/archive" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"reason": "Experiment completed, results documented",
"preserve_artifacts": true
}'
Clone Experiment
Create a new experiment based on an existing one.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/clone" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "llama-13b-lora-comparison",
"description": "Apply optimal LoRA configurations to LLaMA 13B model",
"modifications": {
"base_model": "meta-llama/Llama-2-13b-hf",
"parameters": {
"batch_size": 4
}
},
"copy_runs": false
}'
Export Experiment Results
Export experiment data for external analysis or reporting.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/export" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"format": "csv",
"include": ["parameters", "metrics", "artifacts", "metadata"],
"destination": {
"type": "s3",
"bucket": "ml-experiment-results",
"path": "llama-lora-study/"
}
}'
SDK Examples
Python SDK
import time

import matplotlib.pyplot as plt
import pandas as pd
from tensorone import TensorOneClient

client = TensorOneClient(api_key="YOUR_API_KEY")

# Create experiment
experiment = client.training.experiments.create(
    name="transformer-architecture-study",
    description="Compare different transformer architectures for text classification",
    hypothesis="Deeper models with fewer attention heads will achieve better accuracy",
    tags=["transformer", "classification", "architecture"]
)
print(f"Created experiment: {experiment.id}")

# Add multiple training runs
configs = [
    {"layers": 12, "attention_heads": 8, "hidden_size": 768},
    {"layers": 24, "attention_heads": 4, "hidden_size": 512},
    {"layers": 6, "attention_heads": 16, "hidden_size": 1024}
]
for i, config in enumerate(configs):
    # Start training job (pseudo-code: a real job needs a full training configuration)
    job = client.training.jobs.create(
        name=f"transformer-config-{i}",
        config=config
    )
    # Add to experiment
    run = client.training.experiments.add_run(
        experiment_id=experiment.id,
        training_job_id=job.id,
        name=f"config-{i}-layers-{config['layers']}",
        parameters=config
    )
    print(f"Added run: {run.id}")


def finished_runs(stats):
    # Count failed runs as finished so the loop below cannot wait forever on a failure
    return stats.completed_runs + getattr(stats, "failed_runs", 0)


# Monitor experiment progress, polling once a minute
experiment = client.training.experiments.get(experiment.id)
while finished_runs(experiment.statistics) < len(configs):
    print(f"Progress: {experiment.statistics.completed_runs}/{experiment.statistics.total_runs} runs completed")
    time.sleep(60)
    experiment = client.training.experiments.get(experiment.id)

# Compare results (performance_pareto is requested because pareto_frontier is read below)
comparison = client.training.experiments.compare(
    experiment_id=experiment.id,
    metrics=["accuracy", "f1_score", "training_time"],
    analysis={"statistical_tests": True, "correlation_analysis": True, "performance_pareto": True}
)

# Plot results for completed runs only; failed runs have no final metrics
runs_df = pd.DataFrame([
    {
        "name": run.name,
        "layers": run.parameters["layers"],
        "accuracy": run.metrics["final"]["accuracy"]
    }
    for run in experiment.runs
    if run.status == "completed"
])
plt.figure(figsize=(10, 6))
plt.scatter(runs_df["layers"], runs_df["accuracy"])
plt.xlabel("Number of Layers")
plt.ylabel("Accuracy")
plt.title("Model Performance vs Architecture")
plt.show()

print("Best performing configuration:")
best_run = comparison.pareto_frontier[0]
print(f"Run: {best_run['runId']}")
print(f"Reason: {best_run['reason']}")
JavaScript SDK
import { TensorOneClient } from '@tensorone/sdk';

const client = new TensorOneClient({ apiKey: 'YOUR_API_KEY' });

// Create experiment
const experiment = await client.training.experiments.create({
  name: 'llm-fine-tuning-comparison',
  description: 'Compare different fine-tuning strategies for language models',
  hypothesis: 'LoRA with lower rank will achieve better efficiency without sacrificing performance',
  tags: ['llm', 'fine-tuning', 'lora', 'efficiency']
});
console.log(`Created experiment: ${experiment.id}`);

// Add training runs
const strategies = [
  { name: 'full-fine-tuning', strategy: 'full', parameters: {} },
  { name: 'lora-rank-16', strategy: 'lora', parameters: { rank: 16, alpha: 32 } },
  { name: 'lora-rank-8', strategy: 'lora', parameters: { rank: 8, alpha: 64 } }
];
const runs = [];
for (const strategy of strategies) {
  // Create training job
  const job = await client.training.jobs.create({
    name: strategy.name,
    modelType: 'llm',
    config: {
      strategy: strategy.strategy,
      parameters: strategy.parameters
    }
  });
  // Add to experiment
  const run = await client.training.experiments.addRun({
    experimentId: experiment.id,
    trainingJobId: job.id,
    name: strategy.name,
    parameters: strategy.parameters
  });
  runs.push(run);
  console.log(`Added run: ${run.id}`);
}

// Monitor and compare results
const monitorExperiment = async (experimentId) => {
  const exp = await client.training.experiments.get(experimentId);
  if (exp.statistics.completedRuns === exp.statistics.totalRuns) {
    console.log('All runs completed!');
    // Compare results
    const comparison = await client.training.experiments.compare({
      experimentId: experimentId,
      metrics: ['validation_loss', 'accuracy', 'training_time', 'cost'],
      analysis: {
        statisticalTests: true,
        correlationAnalysis: true,
        performancePareto: true
      }
    });
    console.log('Experiment Results:');
    comparison.recommendations.forEach(rec => {
      console.log(`- ${rec}`);
    });
    // Find best run
    const bestRun = comparison.comparison.runs.find(
      run => run.id === comparison.pareto_frontier[0].runId
    );
    console.log(`Best configuration: ${bestRun.name}`);
    console.log(`Parameters: ${JSON.stringify(bestRun.parameters)}`);
  } else {
    console.log(`Progress: ${exp.statistics.completedRuns}/${exp.statistics.totalRuns} runs completed`);
    // Poll again in 60 seconds
    setTimeout(() => monitorExperiment(experimentId), 60000);
  }
};
monitorExperiment(experiment.id);
Experiment Templates
A/B Testing Template
{
"template": "ab_testing",
"parameters": {
"variants": ["A", "B"],
"traffic_split": [0.5, 0.5],
"success_metric": "accuracy",
"minimum_effect_size": 0.05,
"statistical_power": 0.8,
"alpha": 0.05
}
}
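The `minimum_effect_size`, `statistical_power`, and `alpha` fields together determine how much evaluation data an A/B comparison needs. The sketch below uses the standard normal-approximation formula for a two-sided two-proportion z-test. It rests on assumptions that the template does not confirm: that `minimum_effect_size` is an absolute difference in accuracy, and that the baseline accuracy is 0.821 (borrowed from the baseline in the earlier example). `samples_per_variant` is a hypothetical helper, not a TensorOne function.

```python
import math
from statistics import NormalDist


def samples_per_variant(
    baseline_rate: float,
    minimum_effect_size: float,
    alpha: float = 0.05,
    power: float = 0.8,
) -> int:
    """Approximate evaluation samples needed per variant (two-proportion z-test)."""
    p1 = baseline_rate
    p2 = baseline_rate + minimum_effect_size
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / minimum_effect_size ** 2
    return math.ceil(n)


n = samples_per_variant(baseline_rate=0.821, minimum_effect_size=0.05)
print(f"About {n} evaluation samples per variant")
```

Halving the effect size roughly quadruples the required sample count, which is why the minimum effect size is worth fixing before the experiment starts.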
Hyperparameter Grid Search
{
"template": "grid_search",
"parameters": {
"search_space": {
"learning_rate": [1e-4, 3e-4, 1e-3],
"batch_size": [16, 32, 64],
"weight_decay": [0.01, 0.1]
},
"objective": "minimize",
"metric": "validation_loss"
}
}
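A grid search evaluates every combination in the search space, so the template above corresponds to 3 × 3 × 2 = 18 trials. A short sketch of the expansion follows; how the service schedules these trials is not specified here.

```python
import itertools

# Search space copied from the grid_search template above.
search_space = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "batch_size": [16, 32, 64],
    "weight_decay": [0.01, 0.1],
}

names = list(search_space)
# Cartesian product of all value lists, one dict per trial.
trials = [
    dict(zip(names, values))
    for values in itertools.product(*search_space.values())
]

print(len(trials))
print(trials[0])
```

Because the trial count is the product of the list lengths, adding one more three-value parameter would triple it to 54.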
Progressive Model Scaling
{
"template": "progressive_scaling",
"parameters": {
"model_sizes": ["small", "medium", "large"],
"scaling_law_analysis": true,
"efficiency_metrics": ["flops", "parameters", "memory"],
"performance_targets": {"accuracy": 0.9}
}
}
Error Handling
Common Errors
{
"error": "EXPERIMENT_NOT_FOUND",
"message": "Experiment with specified ID does not exist",
"details": {
"experimentId": "exp_invalid_id"
}
}
{
"error": "RUN_ALREADY_EXISTS",
"message": "Training job is already associated with another experiment",
"details": {
"trainingJobId": "job_1234567890abcdef",
"existingExperimentId": "exp_other_experiment"
}
}
{
"error": "INSUFFICIENT_RUNS",
"message": "Not enough completed runs for statistical comparison",
"details": {
"completedRuns": 1,
"minimumRequired": 2
}
}
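Client code can map the `error` codes above to exception types so that callers handle each case separately. The class names and the `error_from_body` helper below are hypothetical; only the error codes and the `details` fields come from the examples above.

```python
import json


class TensorOneAPIError(Exception):
    """Hypothetical base class for errors parsed from an API error body."""

    def __init__(self, code: str, message: str, details: dict):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.details = details


class ExperimentNotFound(TensorOneAPIError):
    pass


class RunAlreadyExists(TensorOneAPIError):
    pass


class InsufficientRuns(TensorOneAPIError):
    pass


ERROR_CLASSES = {
    "EXPERIMENT_NOT_FOUND": ExperimentNotFound,
    "RUN_ALREADY_EXISTS": RunAlreadyExists,
    "INSUFFICIENT_RUNS": InsufficientRuns,
}


def error_from_body(body: str) -> TensorOneAPIError:
    """Build the matching exception; unknown codes fall back to the base class."""
    data = json.loads(body)
    cls = ERROR_CLASSES.get(data["error"], TensorOneAPIError)
    return cls(data["error"], data.get("message", ""), data.get("details", {}))


body = json.dumps({
    "error": "INSUFFICIENT_RUNS",
    "message": "Not enough completed runs for statistical comparison",
    "details": {"completedRuns": 1, "minimumRequired": 2},
})

try:
    raise error_from_body(body)
except InsufficientRuns as exc:
    missing = exc.details["minimumRequired"] - exc.details["completedRuns"]
    print(f"Need {missing} more completed run(s) before comparing")
```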
Best Practices
Experiment Design
- Clearly define hypotheses before starting experiments
- Use appropriate statistical methods for comparison
- Include baseline models for meaningful comparisons
- Document assumptions and limitations
Organization
- Use consistent naming conventions across experiments
- Tag experiments with relevant metadata
- Group related experiments into collections
- Archive completed experiments to reduce clutter
Analysis
- Wait for sufficient runs before drawing conclusions
- Use statistical significance testing for comparisons
- Consider multiple metrics beyond primary objective
- Document insights and recommendations for future reference
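To illustrate why the number of runs matters for significance testing, here is a sketch of an exact two-sided permutation test on the difference in mean validation loss. The loss values are hypothetical (three seeds per configuration), and this is not necessarily the test the service applies.

```python
from itertools import combinations
from statistics import fmean


def permutation_p_value(a: list[float], b: list[float]) -> float:
    """Exact two-sided permutation test on the absolute difference of means."""
    pooled = list(a) + list(b)
    observed = abs(fmean(a) - fmean(b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling the pooled values into two groups.
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point noise in the comparison.
        if abs(fmean(group_a) - fmean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical validation losses from three seeds per configuration.
rank_16 = [0.241, 0.239, 0.244]
rank_8 = [0.228, 0.231, 0.226]

p = permutation_p_value(rank_16, rank_8)
print(p)
```

With three runs per configuration there are only 20 possible relabellings, so the smallest attainable two-sided p-value is 2/20 = 0.1. Even perfectly separated results like these cannot reach the 0.05 level, which is a concrete reason to wait for more runs.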
Experiments can contain up to 1000 runs. For larger studies, consider creating multiple related experiments.
Deleting an experiment will also delete all associated run metadata. Archive experiments instead of deleting them.
Authentication: every request uses API key authentication. Send the key in the Authorization header in the form 'Bearer YOUR_API_KEY'.