Experiment Tracking - TensorOne

Experiment tracking enables systematic comparison of different training approaches, hyperparameters, and model architectures. TensorOne’s experiment management API lets you group training runs into experiments, compare their parameters and metrics, and export the results for further analysis.

Create Experiment

Create a new experiment to group related training runs and track their progress.

Required Parameters

name: Human-readable name for the experiment (1-100 characters)
description: Description of the experiment’s purpose

Optional Parameters

tags: Array of tags for organization
metadata: Additional metadata object
hypothesis: Hypothesis being tested
baseline: Baseline model or experiment for comparison
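
The parameter rules above can be checked on the client before sending a request. The following is a minimal sketch, not part of the TensorOne SDK: `build_experiment_payload` is a hypothetical helper, and treating an empty description as invalid is an assumption, since the reference only states that the field is required.

```python
import json


def build_experiment_payload(name, description, tags=None, metadata=None,
                             hypothesis=None, baseline=None):
    """Assemble a create-experiment request body (hypothetical helper)."""
    # Documented constraint: name must be 1-100 characters.
    if not isinstance(name, str) or not 1 <= len(name) <= 100:
        raise ValueError("name must be a string of 1-100 characters")
    # description is listed as required; rejecting an empty string is an assumption.
    if not isinstance(description, str) or not description:
        raise ValueError("description is required")

    payload = {"name": name, "description": description}
    optional = {
        "tags": tags,
        "metadata": metadata,
        "hypothesis": hypothesis,
        "baseline": baseline,
    }
    # Leave out optional fields the caller did not set.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


payload = build_experiment_payload(
    name="llama-7b-lora-comparison",
    description="Compare different LoRA configurations for LLaMA 7B fine-tuning",
    tags=["lora", "llama"],
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized and passed as the request body of the POST call shown in the examples that follow.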

Example Usage

Create A/B Testing Experiment

curl -X POST "https://api.tensorone.ai/v2/training/experiments" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "llama-7b-lora-comparison",
    "description": "Compare different LoRA configurations for LLaMA 7B fine-tuning",
    "hypothesis": "Lower rank LoRA adapters with higher alpha values will achieve better performance/efficiency tradeoff",
    "tags": ["lora", "llama", "efficiency", "comparison"],
    "metadata": {
      "model_family": "llama",
      "base_model": "meta-llama/Llama-2-7b-hf",
      "dataset": "custom-instruction-tuning",
      "objective": "minimize_validation_loss"
    },
    "baseline": {
      "type": "external",
      "name": "Original LLaMA 7B",
      "metrics": {
        "validation_loss": 0.452,
        "accuracy": 0.821
      }
    }
  }'

Create Hyperparameter Search Experiment

curl -X POST "https://api.tensorone.ai/v2/training/experiments" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "vision-transformer-optimization",
    "description": "Systematic hyperparameter optimization for Vision Transformer on ImageNet",
    "hypothesis": "Larger patch sizes with adaptive learning rates will improve convergence speed",
    "tags": ["vision-transformer", "imagenet", "hyperparameter-search"],
    "metadata": {
      "architecture": "vision-transformer",
      "dataset": "imagenet-1k",
      "optimization_method": "bayesian",
      "budget": "100_trials"
    }
  }'

Response

Returns the created experiment object:

{
  "id": "exp_1234567890abcdef",
  "name": "llama-7b-lora-comparison",
  "description": "Compare different LoRA configurations for LLaMA 7B fine-tuning",
  "status": "active",
  "hypothesis": "Lower rank LoRA adapters with higher alpha values will achieve better performance/efficiency tradeoff",
  "tags": ["lora", "llama", "efficiency", "comparison"],
  "metadata": {
    "model_family": "llama",
    "base_model": "meta-llama/Llama-2-7b-hf",
    "dataset": "custom-instruction-tuning",
    "objective": "minimize_validation_loss"
  },
  "baseline": {
    "type": "external",
    "name": "Original LLaMA 7B",
    "metrics": {
      "validation_loss": 0.452,
      "accuracy": 0.821
    }
  },
  "runs": [],
  "statistics": {
    "total_runs": 0,
    "completed_runs": 0,
    "best_metric": null,
    "total_cost": 0.0
  },
  "createdAt": "2024-01-15T10:30:00Z",
  "updatedAt": "2024-01-15T10:30:00Z"
}

Add Training Run to Experiment

Associate a training job with an experiment for tracking and comparison.

curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/runs" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "trainingJobId": "job_1234567890abcdef",
    "name": "lora-rank-16-alpha-32",
    "parameters": {
      "lora_rank": 16,
      "lora_alpha": 32,
      "learning_rate": 3e-4,
      "batch_size": 8,
      "weight_decay": 0.01
    },
    "tags": ["rank-16", "alpha-32", "baseline"],
    "notes": "Baseline configuration with standard LoRA parameters"
  }'

Response

{
  "id": "run_1234567890abcdef",
  "experimentId": "exp_1234567890abcdef",
  "trainingJobId": "job_1234567890abcdef",
  "name": "lora-rank-16-alpha-32",
  "status": "running",
  "parameters": {
    "lora_rank": 16,
    "lora_alpha": 32,
    "learning_rate": 3e-4,
    "batch_size": 8,
    "weight_decay": 0.01
  },
  "metrics": {
    "current": {
      "training_loss": 0.234,
      "validation_loss": 0.267,
      "accuracy": 0.892
    },
    "best": {
      "validation_loss": 0.241,
      "accuracy": 0.904
    }
  },
  "artifacts": [
    {
      "type": "checkpoint",
      "name": "best_model.pt",
      "path": "/checkpoints/run_1234567890abcdef/best_model.pt"
    }
  ],
  "tags": ["rank-16", "alpha-32", "baseline"],
  "notes": "Baseline configuration with standard LoRA parameters",
  "createdAt": "2024-01-15T11:00:00Z",
  "updatedAt": "2024-01-15T12:30:00Z"
}

Get Experiment Details

Retrieve comprehensive information about an experiment and its runs.

curl -X GET "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "id": "exp_1234567890abcdef",
  "name": "llama-7b-lora-comparison",
  "description": "Compare different LoRA configurations for LLaMA 7B fine-tuning",
  "status": "active",
  "hypothesis": "Lower rank LoRA adapters with higher alpha values will achieve better performance/efficiency tradeoff",
  "runs": [
    {
      "id": "run_1234567890abcdef",
      "name": "lora-rank-16-alpha-32",
      "status": "completed",
      "parameters": {
        "lora_rank": 16,
        "lora_alpha": 32,
        "learning_rate": 3e-4
      },
      "metrics": {
        "final": {
          "validation_loss": 0.241,
          "accuracy": 0.904,
          "f1_score": 0.897,
          "training_time": 3600,
          "cost": 28.50
        }
      },
      "completedAt": "2024-01-15T13:00:00Z"
    },
    {
      "id": "run_2345678901bcdefg",
      "name": "lora-rank-8-alpha-64",
      "status": "completed",
      "parameters": {
        "lora_rank": 8,
        "lora_alpha": 64,
        "learning_rate": 3e-4
      },
      "metrics": {
        "final": {
          "validation_loss": 0.228,
          "accuracy": 0.912,
          "f1_score": 0.905,
          "training_time": 3200,
          "cost": 25.20
        }
      },
      "completedAt": "2024-01-15T14:15:00Z"
    }
  ],
  "statistics": {
    "total_runs": 5,
    "completed_runs": 2,
    "running_runs": 2,
    "failed_runs": 1,
    "best_run": {
      "id": "run_2345678901bcdefg",
      "metric": "validation_loss",
      "value": 0.228
    },
    "total_cost": 142.80,
    "total_training_time": 16800
  },
  "insights": [
    {
      "type": "parameter_correlation",
      "message": "Lower LoRA rank with higher alpha shows better performance",
      "confidence": 0.87,
      "evidence": "2/2 runs with rank < 16 achieved validation_loss < 0.25"
    },
    {
      "type": "cost_efficiency",
      "message": "rank-8 configuration provides best cost/performance ratio",
      "savings": "11.6% cost reduction with 5.4% performance improvement"
    }
  ],
  "createdAt": "2024-01-15T10:30:00Z",
  "updatedAt": "2024-01-15T14:15:00Z"
}
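
The `best_run` and `cost_efficiency` values in this response can be reproduced from the per-run metrics. The sketch below copies the two completed runs from the sample response and recomputes the figures locally. The helper names are illustrative, and taking the rank-16 run as the reference point for the percentages is an assumption about how the service derives them.

```python
# Final metrics of the two completed runs, copied from the sample response.
runs = [
    {"id": "run_1234567890abcdef", "name": "lora-rank-16-alpha-32",
     "final": {"validation_loss": 0.241, "cost": 28.50}},
    {"id": "run_2345678901bcdefg", "name": "lora-rank-8-alpha-64",
     "final": {"validation_loss": 0.228, "cost": 25.20}},
]


def best_run(runs, metric="validation_loss"):
    """Return the run with the lowest value of a lower-is-better metric."""
    return min(runs, key=lambda run: run["final"][metric])


def relative_reduction(reference, candidate):
    """Percentage by which candidate is lower than reference."""
    return (reference - candidate) / reference * 100


best = best_run(runs)
reference = runs[0]["final"]  # assumption: the rank-16 run is the reference
cost_saving = relative_reduction(reference["cost"], best["final"]["cost"])
loss_improvement = relative_reduction(
    reference["validation_loss"], best["final"]["validation_loss"]
)

print(best["id"])
print(f"cost reduction: {cost_saving:.1f}%")
print(f"validation loss improvement: {loss_improvement:.1f}%")
```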

Compare Experiment Runs

Compare metrics and parameters across multiple runs within an experiment.

curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/compare" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "runs": ["run_1234567890abcdef", "run_2345678901bcdefg", "run_3456789012cdefgh"],
    "metrics": ["validation_loss", "accuracy", "f1_score", "training_time", "cost"],
    "analysis": {
      "statistical_tests": true,
      "correlation_analysis": true,
      "performance_pareto": true
    }
  }'

Response

{
  "comparison": {
    "runs": [
      {
        "id": "run_1234567890abcdef",
        "name": "lora-rank-16-alpha-32",
        "parameters": {"lora_rank": 16, "lora_alpha": 32},
        "metrics": {"validation_loss": 0.241, "accuracy": 0.904, "cost": 28.50}
      },
      {
        "id": "run_2345678901bcdefg",
        "name": "lora-rank-8-alpha-64",
        "parameters": {"lora_rank": 8, "lora_alpha": 64},
        "metrics": {"validation_loss": 0.228, "accuracy": 0.912, "cost": 25.20}
      }
    ],
    "rankings": {
      "validation_loss": ["run_2345678901bcdefg", "run_1234567890abcdef"],
      "accuracy": ["run_2345678901bcdefg", "run_1234567890abcdef"],
      "cost": ["run_2345678901bcdefg", "run_1234567890abcdef"]
    },
    "statistical_analysis": {
      "validation_loss": {
        "mean": 0.235,
        "std": 0.009,
        "significant_difference": true,
        "p_value": 0.023
      }
    },
    "correlations": [
      {
        "parameters": ["lora_rank", "lora_alpha"],
        "metric": "validation_loss",
        "correlation": -0.78,
        "strength": "strong"
      }
    ],
    "pareto_frontier": [
      {
        "runId": "run_2345678901bcdefg",
        "reason": "Best accuracy/cost ratio"
      }
    ],
    "recommendations": [
      "run_2345678901bcdefg provides the best overall performance",
      "Lower LoRA rank with higher alpha consistently outperforms",
      "Consider further reducing rank to 4 with alpha 128"
    ]
  }
}
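
To make the `rankings` and `pareto_frontier` fields concrete, here is a minimal local sketch that derives both from the run metrics in the sample response. It is not the service's implementation. Which metrics count as higher-is-better, and the use of accuracy and cost as the Pareto objectives, are assumptions.

```python
runs = [
    {"id": "run_1234567890abcdef",
     "metrics": {"validation_loss": 0.241, "accuracy": 0.904, "cost": 28.50}},
    {"id": "run_2345678901bcdefg",
     "metrics": {"validation_loss": 0.228, "accuracy": 0.912, "cost": 25.20}},
]

# Assumption: accuracy is maximized, every other metric is minimized.
HIGHER_IS_BETTER = {"accuracy"}


def score(run, metric):
    """Map a metric to a value where larger always means better."""
    value = run["metrics"][metric]
    return value if metric in HIGHER_IS_BETTER else -value


def rank_runs(runs, metric):
    """Run IDs ordered from best to worst on one metric."""
    ordered = sorted(runs, key=lambda run: score(run, metric), reverse=True)
    return [run["id"] for run in ordered]


def dominates(a, b, metrics):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    at_least_as_good = all(score(a, m) >= score(b, m) for m in metrics)
    strictly_better = any(score(a, m) > score(b, m) for m in metrics)
    return at_least_as_good and strictly_better


def pareto_frontier(runs, metrics):
    """IDs of runs that no other run dominates."""
    return [
        run["id"] for run in runs
        if not any(dominates(other, run, metrics) for other in runs if other is not run)
    ]


rankings = {m: rank_runs(runs, m) for m in ("validation_loss", "accuracy", "cost")}
frontier = pareto_frontier(runs, ("accuracy", "cost"))
print(rankings)
print(frontier)
```

With only two runs the frontier collapses to a single entry because the rank-8 run is both cheaper and more accurate. With more runs, several non-dominated configurations can remain on the frontier.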

List Experiments

Retrieve a list of experiments for your account.

curl -X GET "https://api.tensorone.ai/v2/training/experiments" \
  -H "Authorization: Bearer YOUR_API_KEY"

Query Parameters

status: Filter by status (active, completed, archived)
tags: Filter by tags (comma-separated)
limit: Number of experiments to return (1-100, default: 50)
offset: Number of experiments to skip for pagination
sort: Field to sort by (created_at, updated_at, name)
order: Sort direction (asc, desc, default: desc)
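
The query parameters above combine into a URL as in the following sketch. It uses only the Python standard library. `list_experiments_url` and `page_offsets` are hypothetical helpers, and the default of `created_at` for `sort` is an assumption, since the reference does not state one.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.tensorone.ai/v2/training/experiments"


def list_experiments_url(status=None, tags=None, limit=50, offset=0,
                         sort="created_at", order="desc"):
    """Build a list-experiments URL from the documented query parameters."""
    # Documented range for limit is 1-100.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"limit": limit, "offset": offset, "sort": sort, "order": order}
    if status:
        params["status"] = status
    if tags:
        # Tags are sent as one comma-separated value.
        params["tags"] = ",".join(tags)
    return f"{BASE_URL}?{urlencode(params)}"


def page_offsets(total, limit):
    """Offsets needed to page through `total` experiments, `limit` at a time."""
    return list(range(0, total, limit))


url = list_experiments_url(status="active", tags=["lora", "llama"], limit=25)
print(url)
print(page_offsets(120, 50))
```

Note that `urlencode` percent-encodes the comma between tags, which a server should decode back to the comma-separated form.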

Update Experiment

Update experiment metadata, hypothesis, or status.

curl -X PATCH "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "llama-7b-lora-optimization-study",
    "description": "Updated: Comprehensive study of LoRA configurations for optimal performance/efficiency",
    "status": "completed",
    "conclusion": "Rank-8 Alpha-64 configuration provides optimal balance of performance and efficiency",
    "tags": ["lora", "llama", "efficiency", "comparison", "completed"],
    "metadata": {
      "model_family": "llama",
      "base_model": "meta-llama/Llama-2-7b-hf",
      "dataset": "custom-instruction-tuning",
      "optimal_config": {
        "lora_rank": 8,
        "lora_alpha": 64,
        "learning_rate": 3e-4
      }
    }
  }'

Archive Experiment

Archive a completed experiment to reduce clutter while preserving data.

curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/archive" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "reason": "Experiment completed, results documented",
    "preserve_artifacts": true
  }'

Clone Experiment

Create a new experiment based on an existing one.

curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/clone" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "llama-13b-lora-comparison",
    "description": "Apply optimal LoRA configurations to LLaMA 13B model",
    "modifications": {
      "base_model": "meta-llama/Llama-2-13b-hf",
      "parameters": {
        "batch_size": 4
      }
    },
    "copy_runs": false
  }'

Export Experiment Results

Export experiment data for external analysis or reporting.

curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/export" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "csv",
    "include": ["parameters", "metrics", "artifacts", "metadata"],
    "destination": {
      "type": "s3",
      "bucket": "ml-experiment-results",
      "path": "llama-lora-study/"
    }
  }'

SDK Examples

Python SDK

import time  # needed for time.sleep() in the monitoring loop below

from tensorone import TensorOneClient
import matplotlib.pyplot as plt
import pandas as pd

client = TensorOneClient(api_key="YOUR_API_KEY")

# Create experiment
experiment = client.training.experiments.create(
    name="transformer-architecture-study",
    description="Compare different transformer architectures for text classification",
    hypothesis="Deeper models with fewer attention heads will achieve better accuracy",
    tags=["transformer", "classification", "architecture"]
)

print(f"Created experiment: {experiment.id}")

# Add multiple training runs
configs = [
    {"layers": 12, "attention_heads": 8, "hidden_size": 768},
    {"layers": 24, "attention_heads": 4, "hidden_size": 512},
    {"layers": 6, "attention_heads": 16, "hidden_size": 1024}
]

for i, config in enumerate(configs):
    # Start training job (pseudo-code)
    job = client.training.jobs.create(
        name=f"transformer-config-{i}",
        config=config
    )
    
    # Add to experiment
    run = client.training.experiments.add_run(
        experiment_id=experiment.id,
        training_job_id=job.id,
        name=f"config-{i}-layers-{config['layers']}",
        parameters=config
    )
    
    print(f"Added run: {run.id}")

# Monitor experiment progress; stop once every run has completed or failed,
# so a failed run cannot keep the loop waiting forever
experiment = client.training.experiments.get(experiment.id)
stats = experiment.statistics
while stats.completed_runs + getattr(stats, "failed_runs", 0) < len(configs):
    print(f"Progress: {stats.completed_runs}/{stats.total_runs} runs completed")
    time.sleep(60)
    experiment = client.training.experiments.get(experiment.id)
    stats = experiment.statistics

# Compare results
comparison = client.training.experiments.compare(
    experiment_id=experiment.id,
    metrics=["accuracy", "f1_score", "training_time"],
    analysis={"statistical_tests": True, "correlation_analysis": True, "performance_pareto": True}
)

# Plot results
runs_df = pd.DataFrame([
    {
        "name": run.name,
        "layers": run.parameters["layers"],
        "accuracy": run.metrics["final"]["accuracy"]
    }
    for run in experiment.runs
])

plt.figure(figsize=(10, 6))
plt.scatter(runs_df["layers"], runs_df["accuracy"])
plt.xlabel("Number of Layers")
plt.ylabel("Accuracy")
plt.title("Model Performance vs Architecture")
plt.show()

print("Best performing configuration:")
best_run = comparison.pareto_frontier[0]
print(f"Run: {best_run['runId']}")
print(f"Reason: {best_run['reason']}")

JavaScript SDK

import { TensorOneClient } from '@tensorone/sdk';

const client = new TensorOneClient({ apiKey: 'YOUR_API_KEY' });

// Create experiment
const experiment = await client.training.experiments.create({
  name: 'llm-fine-tuning-comparison',
  description: 'Compare different fine-tuning strategies for language models',
  hypothesis: 'LoRA with lower rank will achieve better efficiency without sacrificing performance',
  tags: ['llm', 'fine-tuning', 'lora', 'efficiency']
});

console.log(`Created experiment: ${experiment.id}`);

// Add training runs
const strategies = [
  { name: 'full-fine-tuning', strategy: 'full', parameters: {} },
  { name: 'lora-rank-16', strategy: 'lora', parameters: { rank: 16, alpha: 32 } },
  { name: 'lora-rank-8', strategy: 'lora', parameters: { rank: 8, alpha: 64 } }
];

const runs = [];
for (const strategy of strategies) {
  // Create training job
  const job = await client.training.jobs.create({
    name: strategy.name,
    modelType: 'llm',
    config: {
      strategy: strategy.strategy,
      parameters: strategy.parameters
    }
  });
  
  // Add to experiment
  const run = await client.training.experiments.addRun({
    experimentId: experiment.id,
    trainingJobId: job.id,
    name: strategy.name,
    parameters: strategy.parameters
  });
  
  runs.push(run);
  console.log(`Added run: ${run.id}`);
}

// Monitor and compare results
const monitorExperiment = async (experimentId) => {
  const exp = await client.training.experiments.get(experimentId);
  
  if (exp.statistics.completedRuns === exp.statistics.totalRuns) {
    console.log('All runs completed!');
    
    // Compare results
    const comparison = await client.training.experiments.compare({
      experimentId: experimentId,
      metrics: ['validation_loss', 'accuracy', 'training_time', 'cost'],
      analysis: {
        statisticalTests: true,
        correlationAnalysis: true,
        performancePareto: true
      }
    });
    
    console.log('Experiment Results:');
    comparison.recommendations.forEach(rec => {
      console.log(`- ${rec}`);
    });
    
    // Find best run
    const bestRun = comparison.runs.find(
      run => run.id === comparison.pareto_frontier[0].runId
    );
    
    console.log(`Best configuration: ${bestRun.name}`);
    console.log(`Parameters: ${JSON.stringify(bestRun.parameters)}`);
    
  } else {
    console.log(`Progress: ${exp.statistics.completedRuns}/${exp.statistics.totalRuns} runs completed`);
    setTimeout(() => monitorExperiment(experimentId), 60000);
  }
};

monitorExperiment(experiment.id);

Experiment Templates

A/B Testing Template

{
  "template": "ab_testing",
  "parameters": {
    "variants": ["A", "B"],
    "traffic_split": [0.5, 0.5],
    "success_metric": "accuracy",
    "minimum_effect_size": 0.05,
    "statistical_power": 0.8,
    "alpha": 0.05
  }
}
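
The `minimum_effect_size`, `statistical_power`, and `alpha` fields together imply how many evaluation examples each variant needs. As a rough illustration, the sketch below applies the standard normal-approximation formula for comparing two proportions. It assumes that `minimum_effect_size` is an absolute difference in accuracy and uses a hypothetical baseline accuracy of 0.85. How the template itself uses these fields is not documented here.

```python
import math
from statistics import NormalDist


def samples_per_variant(baseline_rate, minimum_effect_size,
                        alpha=0.05, statistical_power=0.8):
    """Approximate per-variant sample size for a two-sided two-proportion z-test."""
    p1 = baseline_rate
    p2 = baseline_rate + minimum_effect_size
    if not 0 < p1 < p2 < 1:
        raise ValueError("rates must satisfy 0 < baseline < baseline + effect < 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value, two-sided test
    z_beta = NormalDist().inv_cdf(statistical_power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical baseline accuracy; the other values come from the template.
n = samples_per_variant(baseline_rate=0.85, minimum_effect_size=0.05)
print(f"evaluation examples needed per variant: {n}")
```

Halving the detectable effect roughly quadruples the required sample size, which is worth checking before committing compute to both variants.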

Hyperparameter Grid Search

{
  "template": "grid_search",
  "parameters": {
    "search_space": {
      "learning_rate": [1e-4, 3e-4, 1e-3],
      "batch_size": [16, 32, 64],
      "weight_decay": [0.01, 0.1]
    },
    "objective": "minimize",
    "metric": "validation_loss"
  }
}
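
A grid search evaluates every combination of the values in `search_space`, so this template describes 3 x 3 x 2 = 18 runs. The sketch below expands the grid locally with `itertools.product`. `expand_grid` is an illustrative helper, and the order in which the service schedules combinations is not specified here.

```python
from itertools import product

search_space = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "batch_size": [16, 32, 64],
    "weight_decay": [0.01, 0.1],
}


def expand_grid(search_space):
    """Return one parameter dict per combination of values in the search space."""
    names = list(search_space)
    return [dict(zip(names, values)) for values in product(*search_space.values())]


configs = expand_grid(search_space)
print(f"{len(configs)} configurations")
for config in configs[:3]:
    print(config)
```

Each resulting dictionary could be passed as the `parameters` of one training run added to the experiment.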

Progressive Model Scaling

{
  "template": "progressive_scaling",
  "parameters": {
    "model_sizes": ["small", "medium", "large"],
    "scaling_law_analysis": true,
    "efficiency_metrics": ["flops", "parameters", "memory"],
    "performance_targets": {"accuracy": 0.9}
  }
}

Error Handling

Common Errors

{
  "error": "EXPERIMENT_NOT_FOUND",
  "message": "Experiment with specified ID does not exist",
  "details": {
    "experimentId": "exp_invalid_id"
  }
}

{
  "error": "RUN_ALREADY_EXISTS",
  "message": "Training job is already associated with another experiment",
  "details": {
    "trainingJobId": "job_1234567890abcdef",
    "existingExperimentId": "exp_other_experiment"
  }
}

{
  "error": "INSUFFICIENT_RUNS",
  "message": "Not enough completed runs for statistical comparison",
  "details": {
    "completedRuns": 1,
    "minimumRequired": 2
  }
}
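
Client code can turn these error bodies into typed exceptions so that callers handle each case separately. The following is a minimal sketch. The exception classes and `parse_error` are hypothetical and not part of the TensorOne SDK. Only the three error codes and the `error`, `message`, and `details` fields come from the examples above.

```python
import json


class TensorOneError(Exception):
    """Base class for API errors (hypothetical, for illustration)."""

    def __init__(self, code, message, details):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details


class ExperimentNotFound(TensorOneError):
    pass


class RunAlreadyExists(TensorOneError):
    pass


class InsufficientRuns(TensorOneError):
    pass


ERROR_CLASSES = {
    "EXPERIMENT_NOT_FOUND": ExperimentNotFound,
    "RUN_ALREADY_EXISTS": RunAlreadyExists,
    "INSUFFICIENT_RUNS": InsufficientRuns,
}


def parse_error(body):
    """Convert a JSON error body into the matching exception instance."""
    data = json.loads(body)
    code = data.get("error", "UNKNOWN")
    # Unrecognized codes fall back to the base class.
    error_class = ERROR_CLASSES.get(code, TensorOneError)
    return error_class(code, data.get("message", ""), data.get("details", {}))


body = """{
  "error": "INSUFFICIENT_RUNS",
  "message": "Not enough completed runs for statistical comparison",
  "details": {"completedRuns": 1, "minimumRequired": 2}
}"""

error = parse_error(body)
missing = error.details["minimumRequired"] - error.details["completedRuns"]
print(f"{error.code}: wait for {missing} more completed run(s) before comparing")
```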

Best Practices

Experiment Design

Clearly define hypotheses before starting experiments
Use appropriate statistical methods for comparison
Include baseline models for meaningful comparisons
Document assumptions and limitations

Organization

Use consistent naming conventions across experiments
Tag experiments with relevant metadata
Group related experiments into collections
Archive completed experiments to reduce clutter

Analysis

Wait for sufficient runs before drawing conclusions
Use statistical significance testing for comparisons
Consider multiple metrics beyond primary objective
Document insights and recommendations for future reference

Experiments can contain up to 1000 runs. For larger studies, consider creating multiple related experiments.
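
One way to stay under the run limit is to plan the split before creating anything. The sketch below partitions a list of run configurations into groups of at most 1000 and gives each group a numbered experiment name. `plan_experiments` and the `-part-N` naming scheme are illustrative assumptions, not an API feature.

```python
MAX_RUNS_PER_EXPERIMENT = 1000  # limit stated in the note above


def plan_experiments(base_name, run_configs, max_runs=MAX_RUNS_PER_EXPERIMENT):
    """Split run configs into (experiment_name, configs) groups within the limit."""
    plan = []
    for start in range(0, len(run_configs), max_runs):
        part = start // max_runs + 1
        plan.append((f"{base_name}-part-{part}", run_configs[start:start + max_runs]))
    return plan


# Hypothetical study with 2500 trials.
plan = plan_experiments("vit-sweep", [{"trial": i} for i in range(2500)])
for name, configs in plan:
    print(name, len(configs))
```

Giving the resulting experiments a shared tag keeps them easy to list together later.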

Deleting an experiment will also delete all associated run metadata. Archive experiments instead of deleting them.

Authorizations

Authorization: string, header, required. API key authentication. Use 'Bearer YOUR_API_KEY' format.

Body

application/json

Response

201 - application/json: Experiment created successfully. The response is of type object.
