Create Training Experiment
curl --request POST \
  --url https://api.tensorone.ai/v2/training/experiments \
  --header 'Authorization: Bearer <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "name": "<string>",
  "description": "<string>",
  "jobs": [
    {
      "name": "<string>",
      "description": "<string>",
      "framework": "pytorch",
      "modelConfig": {
        "modelType": "language_model",
        "baseModel": "<string>",
        "customCode": "<string>"
      },
      "datasetConfig": {
        "datasetId": "<string>",
        "datasetUrl": "<string>",
        "format": "json"
      },
      "hyperparameters": {
        "learningRate": 0.001,
        "batchSize": 32,
        "epochs": 10,
        "optimizer": "adam"
      },
      "infrastructure": {
        "gpuType": "rtx-4090",
        "gpuCount": 1,
        "memory": "32GB",
        "storage": "100GB"
      }
    }
  ],
  "tags": [
    "<string>"
  ]
}'
{
  "experimentId": "<string>",
  "name": "<string>",
  "status": "created",
  "jobIds": [
    "<string>"
  ],
  "createdAt": "2023-11-07T05:31:56Z"
}
Experiment tracking enables systematic comparison of different training approaches, hyperparameters, and model architectures. TensorOne’s experiment management system provides comprehensive tools for organizing and analyzing your ML experiments.

Create Experiment

Create a new experiment to group related training runs and track their progress.

Required Parameters

  • name: Human-readable name for the experiment (1-100 characters)
  • description: Description of the experiment’s purpose

Optional Parameters

  • tags: Array of tags for organization
  • metadata: Additional metadata object
  • hypothesis: Hypothesis being tested
  • baseline: Baseline model or experiment for comparison

Example Usage

Create Model Comparison Experiment

curl -X POST "https://api.tensorone.ai/v2/training/experiments" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "llama-7b-lora-comparison",
    "description": "Compare different LoRA configurations for LLaMA 7B fine-tuning",
    "hypothesis": "Lower rank LoRA adapters with higher alpha values will achieve better performance/efficiency tradeoff",
    "tags": ["lora", "llama", "efficiency", "comparison"],
    "metadata": {
      "model_family": "llama",
      "base_model": "meta-llama/Llama-2-7b-hf",
      "dataset": "custom-instruction-tuning",
      "objective": "minimize_validation_loss"
    },
    "baseline": {
      "type": "external",
      "name": "Original LLaMA 7B",
      "metrics": {
        "validation_loss": 0.452,
        "accuracy": 0.821
      }
    }
  }'

Create Hyperparameter Search Experiment

curl -X POST "https://api.tensorone.ai/v2/training/experiments" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "vision-transformer-optimization",
    "description": "Systematic hyperparameter optimization for Vision Transformer on ImageNet",
    "hypothesis": "Larger patch sizes with adaptive learning rates will improve convergence speed",
    "tags": ["vision-transformer", "imagenet", "hyperparameter-search"],
    "metadata": {
      "architecture": "vision-transformer",
      "dataset": "imagenet-1k",
      "optimization_method": "bayesian",
      "budget": "100_trials"
    }
  }'

Response

Returns the created experiment object:
{
  "id": "exp_1234567890abcdef",
  "name": "llama-7b-lora-comparison",
  "description": "Compare different LoRA configurations for LLaMA 7B fine-tuning",
  "status": "active",
  "hypothesis": "Lower rank LoRA adapters with higher alpha values will achieve better performance/efficiency tradeoff",
  "tags": ["lora", "llama", "efficiency", "comparison"],
  "metadata": {
    "model_family": "llama",
    "base_model": "meta-llama/Llama-2-7b-hf",
    "dataset": "custom-instruction-tuning"
  },
  "baseline": {
    "type": "external",
    "name": "Original LLaMA 7B",
    "metrics": {
      "validation_loss": 0.452,
      "accuracy": 0.821
    }
  },
  "runs": [],
  "statistics": {
    "total_runs": 0,
    "completed_runs": 0,
    "best_metric": null,
    "total_cost": 0.0
  },
  "createdAt": "2024-01-15T10:30:00Z",
  "updatedAt": "2024-01-15T10:30:00Z"
}

Add Training Run to Experiment

Associate a training job with an experiment for tracking and comparison.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/runs" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "trainingJobId": "job_1234567890abcdef",
    "name": "lora-rank-16-alpha-32",
    "parameters": {
      "lora_rank": 16,
      "lora_alpha": 32,
      "learning_rate": 3e-4,
      "batch_size": 8,
      "weight_decay": 0.01
    },
    "tags": ["rank-16", "alpha-32", "baseline"],
    "notes": "Baseline configuration with standard LoRA parameters"
  }'

Response

{
  "id": "run_1234567890abcdef",
  "experimentId": "exp_1234567890abcdef",
  "trainingJobId": "job_1234567890abcdef",
  "name": "lora-rank-16-alpha-32",
  "status": "running",
  "parameters": {
    "lora_rank": 16,
    "lora_alpha": 32,
    "learning_rate": 3e-4,
    "batch_size": 8,
    "weight_decay": 0.01
  },
  "metrics": {
    "current": {
      "training_loss": 0.234,
      "validation_loss": 0.267,
      "accuracy": 0.892
    },
    "best": {
      "validation_loss": 0.241,
      "accuracy": 0.904
    }
  },
  "artifacts": [
    {
      "type": "checkpoint",
      "name": "best_model.pt",
      "path": "/checkpoints/run_1234567890abcdef/best_model.pt"
    }
  ],
  "tags": ["rank-16", "alpha-32", "baseline"],
  "notes": "Baseline configuration with standard LoRA parameters",
  "createdAt": "2024-01-15T11:00:00Z",
  "updatedAt": "2024-01-15T12:30:00Z"
}

Get Experiment Details

Retrieve comprehensive information about an experiment and its runs.
curl -X GET "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "id": "exp_1234567890abcdef",
  "name": "llama-7b-lora-comparison",
  "description": "Compare different LoRA configurations for LLaMA 7B fine-tuning",
  "status": "active",
  "hypothesis": "Lower rank LoRA adapters with higher alpha values will achieve better performance/efficiency tradeoff",
  "runs": [
    {
      "id": "run_1234567890abcdef",
      "name": "lora-rank-16-alpha-32",
      "status": "completed",
      "parameters": {
        "lora_rank": 16,
        "lora_alpha": 32,
        "learning_rate": 3e-4
      },
      "metrics": {
        "final": {
          "validation_loss": 0.241,
          "accuracy": 0.904,
          "f1_score": 0.897,
          "training_time": 3600,
          "cost": 28.50
        }
      },
      "completedAt": "2024-01-15T13:00:00Z"
    },
    {
      "id": "run_2345678901bcdefg",
      "name": "lora-rank-8-alpha-64",
      "status": "completed",
      "parameters": {
        "lora_rank": 8,
        "lora_alpha": 64,
        "learning_rate": 3e-4
      },
      "metrics": {
        "final": {
          "validation_loss": 0.228,
          "accuracy": 0.912,
          "f1_score": 0.905,
          "training_time": 3200,
          "cost": 25.20
        }
      },
      "completedAt": "2024-01-15T14:15:00Z"
    }
  ],
  "statistics": {
    "total_runs": 5,
    "completed_runs": 2,
    "running_runs": 2,
    "failed_runs": 1,
    "best_run": {
      "id": "run_2345678901bcdefg",
      "metric": "validation_loss",
      "value": 0.228
    },
    "total_cost": 142.80,
    "total_training_time": 16800
  },
  "insights": [
    {
      "type": "parameter_correlation",
      "message": "Lower LoRA rank with higher alpha shows better performance",
      "confidence": 0.87,
      "evidence": "2/2 runs with rank < 16 achieved validation_loss < 0.25"
    },
    {
      "type": "cost_efficiency",
      "message": "rank-8 configuration provides best cost/performance ratio",
      "savings": "11.6% cost reduction with 5.4% performance improvement"
    }
  ],
  "createdAt": "2024-01-15T10:30:00Z",
  "updatedAt": "2024-01-15T14:15:00Z"
}

Compare Experiment Runs

Compare metrics and parameters across multiple runs within an experiment.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/compare" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "runs": ["run_1234567890abcdef", "run_2345678901bcdefg", "run_3456789012cdefgh"],
    "metrics": ["validation_loss", "accuracy", "f1_score", "training_time", "cost"],
    "analysis": {
      "statistical_tests": true,
      "correlation_analysis": true,
      "performance_pareto": true
    }
  }'

Response

{
  "comparison": {
    "runs": [
      {
        "id": "run_1234567890abcdef",
        "name": "lora-rank-16-alpha-32",
        "parameters": {"lora_rank": 16, "lora_alpha": 32},
        "metrics": {"validation_loss": 0.241, "accuracy": 0.904, "cost": 28.50}
      },
      {
        "id": "run_2345678901bcdefg",
        "name": "lora-rank-8-alpha-64",
        "parameters": {"lora_rank": 8, "lora_alpha": 64},
        "metrics": {"validation_loss": 0.228, "accuracy": 0.912, "cost": 25.20}
      }
    ],
    "rankings": {
      "validation_loss": ["run_2345678901bcdefg", "run_1234567890abcdef"],
      "accuracy": ["run_2345678901bcdefg", "run_1234567890abcdef"],
      "cost": ["run_2345678901bcdefg", "run_1234567890abcdef"]
    },
    "statistical_analysis": {
      "validation_loss": {
        "mean": 0.235,
        "std": 0.009,
        "significant_difference": true,
        "p_value": 0.023
      }
    },
    "correlations": [
      {
        "parameters": ["lora_rank", "lora_alpha"],
        "metric": "validation_loss",
        "correlation": -0.78,
        "strength": "strong"
      }
    ],
    "pareto_frontier": [
      {
        "runId": "run_2345678901bcdefg",
        "reason": "Best accuracy/cost ratio"
      }
    ],
    "recommendations": [
      "run_2345678901bcdefg provides the best overall performance",
      "Lower LoRA rank with higher alpha consistently outperforms",
      "Consider further reducing rank to 4 with alpha 128"
    ]
  }
}

List Experiments

Retrieve a list of experiments for your account.
curl -X GET "https://api.tensorone.ai/v2/training/experiments" \
  -H "Authorization: Bearer YOUR_API_KEY"

Query Parameters

  • status: Filter by status (active, completed, archived)
  • tags: Filter by tags (comma-separated)
  • limit: Number of experiments to return (1-100, default: 50)
  • offset: Number of experiments to skip for pagination
  • sort: Sort order (created_at, updated_at, name)
  • order: Sort direction (asc, desc, default: desc)
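
For example, the filters above can be combined in a single request. The sketch below calls the endpoint directly with Python's requests library; the query parameter names are the ones listed above, while the response key used in the loop is an assumption for illustration.

import requests

API_KEY = "YOUR_API_KEY"

# Fetch the 10 most recently updated active experiments tagged "lora" or "llama"
resp = requests.get(
    "https://api.tensorone.ai/v2/training/experiments",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={
        "status": "active",
        "tags": "lora,llama",   # comma-separated, per the list above
        "limit": 10,
        "offset": 0,
        "sort": "updated_at",
        "order": "desc",
    },
)
resp.raise_for_status()
for exp in resp.json().get("experiments", []):  # "experiments" key assumed for illustration
    print(exp["id"], exp["name"], exp["status"])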

Update Experiment

Update experiment metadata, hypothesis, or status.
curl -X PATCH "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "llama-7b-lora-optimization-study",
    "description": "Updated: Comprehensive study of LoRA configurations for optimal performance/efficiency",
    "status": "completed",
    "conclusion": "Rank-8 Alpha-64 configuration provides optimal balance of performance and efficiency",
    "tags": ["lora", "llama", "efficiency", "comparison", "completed"],
    "metadata": {
      "model_family": "llama",
      "base_model": "meta-llama/Llama-2-7b-hf",
      "dataset": "custom-instruction-tuning",
      "optimal_config": {
        "lora_rank": 8,
        "lora_alpha": 64,
        "learning_rate": 3e-4
      }
    }
  }'

Archive Experiment

Archive a completed experiment to reduce clutter while preserving data.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/archive" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "reason": "Experiment completed, results documented",
    "preserve_artifacts": true
  }'

Clone Experiment

Create a new experiment based on an existing one.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/clone" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "llama-13b-lora-comparison",
    "description": "Apply optimal LoRA configurations to LLaMA 13B model",
    "modifications": {
      "base_model": "meta-llama/Llama-2-13b-hf",
      "parameters": {
        "batch_size": 4
      }
    },
    "copy_runs": false
  }'

Export Experiment Results

Export experiment data for external analysis or reporting.
curl -X POST "https://api.tensorone.ai/v2/training/experiments/exp_1234567890abcdef/export" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "csv",
    "include": ["parameters", "metrics", "artifacts", "metadata"],
    "destination": {
      "type": "s3",
      "bucket": "ml-experiment-results",
      "path": "llama-lora-study/"
    }
  }'

SDK Examples

Python SDK

from tensorone import TensorOneClient
import matplotlib.pyplot as plt
import pandas as pd
import time

client = TensorOneClient(api_key="YOUR_API_KEY")

# Create experiment
experiment = client.training.experiments.create(
    name="transformer-architecture-study",
    description="Compare different transformer architectures for text classification",
    hypothesis="Deeper models with fewer attention heads will achieve better accuracy",
    tags=["transformer", "classification", "architecture"]
)

print(f"Created experiment: {experiment.id}")

# Add multiple training runs
configs = [
    {"layers": 12, "attention_heads": 8, "hidden_size": 768},
    {"layers": 24, "attention_heads": 4, "hidden_size": 512},
    {"layers": 6, "attention_heads": 16, "hidden_size": 1024}
]

for i, config in enumerate(configs):
    # Start training job (pseudo-code)
    job = client.training.jobs.create(
        name=f"transformer-config-{i}",
        config=config
    )
    
    # Add to experiment
    run = client.training.experiments.add_run(
        experiment_id=experiment.id,
        training_job_id=job.id,
        name=f"config-{i}-layers-{config['layers']}",
        parameters=config
    )
    
    print(f"Added run: {run.id}")

# Monitor experiment progress
experiment = client.training.experiments.get(experiment.id)
while experiment.statistics.completed_runs < len(configs):
    print(f"Progress: {experiment.statistics.completed_runs}/{experiment.statistics.total_runs} runs completed")
    time.sleep(60)
    experiment = client.training.experiments.get(experiment.id)

# Compare results
comparison = client.training.experiments.compare(
    experiment_id=experiment.id,
    metrics=["accuracy", "f1_score", "training_time"],
    analysis={"statistical_tests": True, "correlation_analysis": True}
)

# Plot results
runs_df = pd.DataFrame([
    {
        "name": run.name,
        "layers": run.parameters["layers"],
        "accuracy": run.metrics["final"]["accuracy"]
    }
    for run in experiment.runs
])

plt.figure(figsize=(10, 6))
plt.scatter(runs_df["layers"], runs_df["accuracy"])
plt.xlabel("Number of Layers")
plt.ylabel("Accuracy")
plt.title("Model Performance vs Architecture")
plt.show()

print("Best performing configuration:")
best_run = comparison.pareto_frontier[0]
print(f"Run: {best_run['runId']}")
print(f"Reason: {best_run['reason']}")

JavaScript SDK

import { TensorOneClient } from '@tensorone/sdk';

const client = new TensorOneClient({ apiKey: 'YOUR_API_KEY' });

// Create experiment
const experiment = await client.training.experiments.create({
  name: 'llm-fine-tuning-comparison',
  description: 'Compare different fine-tuning strategies for language models',
  hypothesis: 'LoRA with lower rank will achieve better efficiency without sacrificing performance',
  tags: ['llm', 'fine-tuning', 'lora', 'efficiency']
});

console.log(`Created experiment: ${experiment.id}`);

// Add training runs
const strategies = [
  { name: 'full-fine-tuning', strategy: 'full', parameters: {} },
  { name: 'lora-rank-16', strategy: 'lora', parameters: { rank: 16, alpha: 32 } },
  { name: 'lora-rank-8', strategy: 'lora', parameters: { rank: 8, alpha: 64 } }
];

const runs = [];
for (const strategy of strategies) {
  // Create training job
  const job = await client.training.jobs.create({
    name: strategy.name,
    modelType: 'llm',
    config: {
      strategy: strategy.strategy,
      parameters: strategy.parameters
    }
  });
  
  // Add to experiment
  const run = await client.training.experiments.addRun({
    experimentId: experiment.id,
    trainingJobId: job.id,
    name: strategy.name,
    parameters: strategy.parameters
  });
  
  runs.push(run);
  console.log(`Added run: ${run.id}`);
}

// Monitor and compare results
const monitorExperiment = async (experimentId) => {
  const exp = await client.training.experiments.get(experimentId);
  
  if (exp.statistics.completedRuns === exp.statistics.totalRuns) {
    console.log('All runs completed!');
    
    // Compare results
    const comparison = await client.training.experiments.compare({
      experimentId: experimentId,
      metrics: ['validation_loss', 'accuracy', 'training_time', 'cost'],
      analysis: {
        statisticalTests: true,
        correlationAnalysis: true,
        performancePareto: true
      }
    });
    
    console.log('Experiment Results:');
    comparison.recommendations.forEach(rec => {
      console.log(`- ${rec}`);
    });
    
    // Find best run
    const bestRun = comparison.runs.find(
      run => run.id === comparison.pareto_frontier[0].runId
    );
    
    console.log(`Best configuration: ${bestRun.name}`);
    console.log(`Parameters: ${JSON.stringify(bestRun.parameters)}`);
    
  } else {
    console.log(`Progress: ${exp.statistics.completedRuns}/${exp.statistics.totalRuns} runs completed`);
    setTimeout(() => monitorExperiment(experimentId), 60000);
  }
};

monitorExperiment(experiment.id);

Experiment Templates

A/B Testing Template

{
  "template": "ab_testing",
  "parameters": {
    "variants": ["A", "B"],
    "traffic_split": [0.5, 0.5],
    "success_metric": "accuracy",
    "minimum_effect_size": 0.05,
    "statistical_power": 0.8,
    "alpha": 0.05
  }
}
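
To see what these parameters imply in practice, the sketch below estimates how many evaluation samples each variant needs, assuming the success metric is a proportion such as accuracy and using statsmodels for the power calculation. The baseline accuracy of 0.82 is purely illustrative.

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_accuracy = 0.82      # illustrative accuracy of variant A
minimum_effect_size = 0.05    # absolute lift we want to detect in variant B
alpha, power = 0.05, 0.8

# Convert the absolute accuracy lift into a standardized effect size,
# then solve for the number of evaluation samples needed per variant.
effect = proportion_effectsize(baseline_accuracy + minimum_effect_size, baseline_accuracy)
n_per_variant = NormalIndPower().solve_power(
    effect_size=effect, alpha=alpha, power=power, alternative="two-sided"
)
print(f"~{n_per_variant:.0f} evaluation samples per variant")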

Grid Search Template

{
  "template": "grid_search",
  "parameters": {
    "search_space": {
      "learning_rate": [1e-4, 3e-4, 1e-3],
      "batch_size": [16, 32, 64],
      "weight_decay": [0.01, 0.1]
    },
    "objective": "minimize",
    "metric": "validation_loss"
  }
}
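
One way to drive a grid search is to expand the search space into individual runs using the Python SDK calls shown later on this page (experiments.create, jobs.create, experiments.add_run). The sketch below is illustrative; the job configuration passed to jobs.create is an assumption.

from itertools import product

from tensorone import TensorOneClient

client = TensorOneClient(api_key="YOUR_API_KEY")

search_space = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "batch_size": [16, 32, 64],
    "weight_decay": [0.01, 0.1],
}

experiment = client.training.experiments.create(
    name="grid-search-demo",
    description="Exhaustive grid over learning rate, batch size, and weight decay",
    tags=["grid-search"],
)

# Enumerate every combination in the grid (3 x 3 x 2 = 18 runs).
keys = list(search_space)
for values in product(*(search_space[k] for k in keys)):
    params = dict(zip(keys, values))
    job = client.training.jobs.create(
        name="grid-" + "-".join(map(str, values)),
        config=params,  # illustrative; real jobs need a full training config
    )
    client.training.experiments.add_run(
        experiment_id=experiment.id,
        training_job_id=job.id,
        name="-".join(f"{k}={v}" for k, v in params.items()),
        parameters=params,
    )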

Progressive Scaling Template

{
  "template": "progressive_scaling",
  "parameters": {
    "model_sizes": ["small", "medium", "large"],
    "scaling_law_analysis": true,
    "efficiency_metrics": ["flops", "parameters", "memory"],
    "performance_targets": {"accuracy": 0.9}
  }
}
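
The scaling_law_analysis option suggests fitting a power law across model sizes. The sketch below shows one common form of that fit, loss ~ a * N^(-b), using made-up parameter counts and losses; it is not necessarily the analysis the platform performs.

import numpy as np

params = np.array([125e6, 350e6, 1.3e9])   # illustrative parameter counts (small, medium, large)
losses = np.array([0.52, 0.44, 0.37])      # illustrative final validation losses

# Fit loss ~ a * N^(-b) by linear regression in log-log space.
slope, intercept = np.polyfit(np.log(params), np.log(losses), 1)
a, b = np.exp(intercept), -slope
print(f"loss ~ {a:.3g} * N^(-{b:.3g})")
print(f"Predicted loss at 7B parameters: {a * 7e9 ** (-b):.3f}")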

Error Handling

Common Errors

{
  "error": "EXPERIMENT_NOT_FOUND",
  "message": "Experiment with specified ID does not exist",
  "details": {
    "experimentId": "exp_invalid_id"
  }
}
{
  "error": "RUN_ALREADY_EXISTS",
  "message": "Training job is already associated with another experiment",
  "details": {
    "trainingJobId": "job_1234567890abcdef",
    "existingExperimentId": "exp_other_experiment"
  }
}
{
  "error": "INSUFFICIENT_RUNS",
  "message": "Not enough completed runs for statistical comparison",
  "details": {
    "completedRuns": 1,
    "minimumRequired": 2
  }
}
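
The SDK's exception types are not documented here, so the sketch below handles the documented error codes when calling the REST API directly with Python's requests library.

import requests

API_KEY = "YOUR_API_KEY"
BASE = "https://api.tensorone.ai/v2/training/experiments"

def check(resp):
    """Translate the documented error codes into Python exceptions; defer to HTTP status otherwise."""
    if resp.ok:
        return resp.json()
    payload = resp.json()
    code, details = payload.get("error"), payload.get("details", {})
    if code == "EXPERIMENT_NOT_FOUND":
        raise LookupError(f"No experiment with id {details.get('experimentId')}")
    if code == "RUN_ALREADY_EXISTS":
        raise ValueError(f"Job {details.get('trainingJobId')} already belongs to "
                         f"experiment {details.get('existingExperimentId')}")
    if code == "INSUFFICIENT_RUNS":
        raise RuntimeError(f"Comparison needs {details.get('minimumRequired')} completed runs, "
                           f"found {details.get('completedRuns')}")
    resp.raise_for_status()

# Example: fetch an experiment and surface failures as exceptions.
experiment = check(requests.get(f"{BASE}/exp_1234567890abcdef",
                                headers={"Authorization": f"Bearer {API_KEY}"}))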

Best Practices

Experiment Design

  • Clearly define hypotheses before starting experiments
  • Use appropriate statistical methods for comparison
  • Include baseline models for meaningful comparisons
  • Document assumptions and limitations

Organization

  • Use consistent naming conventions across experiments
  • Tag experiments with relevant metadata
  • Group related experiments into collections
  • Archive completed experiments to reduce clutter

Analysis

  • Wait for sufficient runs before drawing conclusions
  • Use statistical significance testing for comparisons
  • Consider multiple metrics beyond primary objective
  • Document insights and recommendations for future reference

Experiments can contain up to 1000 runs. For larger studies, consider creating multiple related experiments.

Deleting an experiment will also delete all associated run metadata. Archive experiments instead of deleting them.

Authorizations

  • Authorization (string, header, required): API key authentication. Use 'Bearer YOUR_API_KEY' format.

Body

  • Content-Type: application/json (see the request example at the top of this page for the full schema)

Response

  • 201 - application/json: Experiment created successfully. The response is of type object.