List Training Models
curl --request GET \
  --url https://api.tensorone.ai/v2/training/models \
  --header 'Authorization: Bearer <api-key>'
{
  "models": [
    {
      "modelId": "<string>",
      "name": "<string>",
      "status": "training",
      "framework": "<string>",
      "modelType": "<string>",
      "accuracy": 123,
      "createdAt": "2023-11-07T05:31:56Z",
      "completedAt": "2023-11-07T05:31:56Z"
    }
  ],
  "total": 123
}
The Model Management API provides comprehensive control over your trained AI models, including versioning, metadata management, and deployment preparation. It serves as the central registry for all models produced by your training jobs.

List Models

Retrieve a list of trained models for your account.
curl -X GET "https://api.tensorone.ai/v2/training/models" \
  -H "Authorization: Bearer YOUR_API_KEY"

Query Parameters

  • type: Filter by model type (llm, vision, multimodal, custom)
  • status: Filter by status (training, ready, deployed, archived)
  • trainingJobId: Filter by originating training job
  • limit: Number of models to return (1-100, default: 50)
  • offset: Number of models to skip for pagination
  • sort: Sort order (created_at, updated_at, name, size)
  • order: Sort direction (asc, desc, default: desc)
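The limit and offset parameters combine into standard page-walking. A minimal sketch of iterating every model, with a stubbed fake_fetch standing in for the real GET /v2/training/models call (the stub and its data are illustrative, not part of the API):

```python
from typing import Callable, Dict, Iterator

def iter_models(fetch_page: Callable[[int, int], Dict], limit: int = 50) -> Iterator[Dict]:
    """Yield every model by walking limit/offset pages until hasMore is false."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["models"]
        if not page["pagination"]["hasMore"]:
            break
        offset += limit

# Stub standing in for GET /v2/training/models?limit=..&offset=..
def fake_fetch(limit: int, offset: int) -> Dict:
    data = [{"id": f"model_{i}"} for i in range(120)]
    chunk = data[offset:offset + limit]
    return {
        "models": chunk,
        "pagination": {"total": len(data), "limit": limit,
                       "offset": offset, "hasMore": offset + limit < len(data)},
    }

all_models = list(iter_models(fake_fetch, limit=50))
```

The same loop works against the SDK's list method by swapping fake_fetch for a real page fetcher.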

Response

{
  "models": [
    {
      "id": "model_1234567890abcdef",
      "name": "llama-7b-customer-support",
      "type": "llm",
      "status": "ready",
      "baseModel": "meta-llama/Llama-2-7b-hf",
      "version": "v1.0.0",
      "trainingJobId": "job_1234567890abcdef",
      "size": {
        "parameters": 7000000000,
        "bytes": 13421772800,
        "compressed": 6710886400
      },
      "metrics": {
        "finalLoss": 0.234,
        "accuracy": 0.892,
        "perplexity": 12.45
      },
      "deployments": {
        "active": 2,
        "endpoints": [
          "ep_prod_support_chat",
          "ep_staging_support_chat"
        ]
      },
      "createdAt": "2024-01-15T14:30:00Z",
      "updatedAt": "2024-01-15T16:45:00Z"
    }
  ],
  "pagination": {
    "total": 8,
    "limit": 50,
    "offset": 0,
    "hasMore": false
  }
}

Get Model Details

Retrieve detailed information about a specific model.
curl -X GET "https://api.tensorone.ai/v2/training/models/model_1234567890abcdef" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "id": "model_1234567890abcdef",
  "name": "llama-7b-customer-support",
  "type": "llm",
  "status": "ready",
  "description": "Fine-tuned LLaMA 7B model for customer support conversations",
  "baseModel": "meta-llama/Llama-2-7b-hf",
  "version": "v1.0.0",
  "trainingJobId": "job_1234567890abcdef",
  "datasetId": "ds_1234567890abcdef",
  "architecture": {
    "layers": 32,
    "attention_heads": 32,
    "hidden_size": 4096,
    "vocabulary_size": 32000,
    "context_length": 2048
  },
  "training": {
    "strategy": "lora",
    "parameters": {
      "rank": 16,
      "alpha": 32,
      "dropout": 0.1,
      "target_modules": ["q_proj", "v_proj", "k_proj", "o_proj"]
    },
    "epochs": 3,
    "final_learning_rate": 1.2e-5,
    "total_steps": 1875,
    "training_time": 10800
  },
  "size": {
    "parameters": 7000000000,
    "trainable_parameters": 4194304,
    "bytes": 13421772800,
    "compressed": 6710886400
  },
  "metrics": {
    "training": {
      "final_loss": 0.234,
      "best_loss": 0.198,
      "perplexity": 12.45
    },
    "validation": {
      "loss": 0.267,
      "accuracy": 0.892,
      "f1_score": 0.885,
      "bleu_score": 0.78
    },
    "custom": {
      "helpfulness_score": 8.7,
      "safety_score": 9.2,
      "coherence_score": 8.9
    }
  },
  "files": [
    {
      "name": "pytorch_model.bin",
      "size": 13421772800,
      "type": "model_weights",
      "checksum": "sha256:a1b2c3d4e5f6..."
    },
    {
      "name": "config.json",
      "size": 1024,
      "type": "configuration",
      "checksum": "sha256:b2c3d4e5f6a7..."
    },
    {
      "name": "tokenizer.json",
      "size": 2048,
      "type": "tokenizer",
      "checksum": "sha256:c3d4e5f6a7b8..."
    }
  ],
  "deployments": [
    {
      "id": "ep_prod_support_chat",
      "name": "Production Support Chat",
      "status": "active",
      "url": "https://api.tensorone.ai/v2/ep_prod_support_chat/runsync",
      "workers": 3,
      "createdAt": "2024-01-15T16:00:00Z"
    }
  ],
  "tags": ["customer-support", "production", "llama"],
  "metadata": {
    "domain": "customer_support",
    "language": "en",
    "use_case": "conversational_ai",
    "quality_gate": "passed"
  },
  "createdAt": "2024-01-15T14:30:00Z",
  "updatedAt": "2024-01-15T16:45:00Z"
}
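The size block is useful for quick sanity checks: with LoRA, only the adapter weights are trainable, and the compressed artifact is roughly half the raw weight size in this example. A small sketch using the values from the response above:

```python
# Size fields copied from the example response above.
size = {
    "parameters": 7_000_000_000,
    "trainable_parameters": 4_194_304,
    "bytes": 13_421_772_800,
    "compressed": 6_710_886_400,
}

# LoRA trains a small adapter, so only a tiny fraction of weights are updated.
trainable_fraction = size["trainable_parameters"] / size["parameters"]

# Compression ratio of the stored artifact versus the raw weights.
compression_ratio = size["bytes"] / size["compressed"]

print(f"trainable: {trainable_fraction:.4%}, compression: {compression_ratio:.1f}x")
```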

Update Model Metadata

Update model information, tags, and metadata.
curl -X PATCH "https://api.tensorone.ai/v2/training/models/model_1234567890abcdef" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "llama-7b-customer-support-v2",
    "description": "Updated fine-tuned model with improved safety",
    "tags": ["customer-support", "production", "llama", "v2"],
    "metadata": {
      "domain": "customer_support",
      "language": "en",
      "use_case": "conversational_ai",
      "quality_gate": "passed",
      "safety_review": "approved",
      "performance_tier": "premium"
    }
  }'

Create Model Version

Create a new version of an existing model from a training job.
curl -X POST "https://api.tensorone.ai/v2/training/models/model_1234567890abcdef/versions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "version": "v1.1.0",
    "trainingJobId": "job_new_training_123",
    "description": "Improved version with additional safety training",
    "changelog": [
      "Enhanced safety guidelines training",
      "Improved response coherence",
      "Reduced hallucination rate by 15%"
    ]
  }'
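Version strings in these examples follow the vMAJOR.MINOR.PATCH pattern. If you generate the version field client-side, a hypothetical bump helper (not part of the SDK) might look like:

```python
def bump_version(version: str, part: str = "patch") -> str:
    """Return the next semantic version, e.g. 'v1.0.0' -> 'v1.1.0' for a minor bump."""
    major, minor, patch = (int(x) for x in version.lstrip("v").split("."))
    if part == "major":
        return f"v{major + 1}.0.0"
    if part == "minor":
        return f"v{major}.{minor + 1}.0"
    return f"v{major}.{minor}.{patch + 1}"
```

For example, bump_version("v1.0.0", "minor") yields the "v1.1.0" used in the request above.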

Deploy Model

Deploy a model to create a new inference endpoint.
curl -X POST "https://api.tensorone.ai/v2/training/models/model_1234567890abcdef/deploy" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "customer-support-chat-v2",
    "environment": "production",
    "gpuType": "a100",
    "workers": 3,
    "scaling": {
      "minWorkers": 1,
      "maxWorkers": 10,
      "targetUtilization": 0.7
    },
    "configuration": {
      "max_tokens": 512,
      "temperature": 0.7,
      "top_p": 0.9,
      "repetition_penalty": 1.1
    }
  }'
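The scaling block above expresses a target-utilization policy: the platform adds or removes workers so that busy workers divided by total workers stays near targetUtilization, clamped to the min/max bounds. A sketch of that arithmetic (an assumption about how such a policy is typically computed, not the platform's exact algorithm):

```python
import math

def desired_workers(busy_workers: float, target_utilization: float,
                    min_workers: int, max_workers: int) -> int:
    """Scale so that busy_workers / desired is close to target_utilization, within bounds."""
    raw = math.ceil(busy_workers / target_utilization)
    return max(min_workers, min(max_workers, raw))
```

With the request's settings (target 0.7, bounds 1-10), 5 busy workers would call for 8 total, and an idle endpoint would shrink to the 1-worker floor.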

Response

{
  "deployment": {
    "id": "ep_customer_support_v2",
    "modelId": "model_1234567890abcdef",
    "name": "customer-support-chat-v2",
    "status": "deploying",
    "environment": "production",
    "url": "https://api.tensorone.ai/v2/ep_customer_support_v2/runsync",
    "configuration": {
      "gpuType": "NVIDIA A100",
      "workers": 3,
      "maxTokens": 512,
      "temperature": 0.7
    },
    "estimatedReadyTime": "2024-01-15T17:05:00Z",
    "createdAt": "2024-01-15T17:00:00Z"
  }
}
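Since the deployment returns in the deploying state with an estimatedReadyTime, callers usually poll until the status changes. A minimal sketch, with a stubbed getter standing in for re-fetching the deployment (the stub is illustrative only):

```python
import time
from typing import Callable, Dict

def wait_until_active(get_deployment: Callable[[], Dict],
                      timeout: float = 600.0, interval: float = 5.0) -> Dict:
    """Poll a deployment until it leaves the 'deploying' state or times out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        dep = get_deployment()
        if dep["status"] != "deploying":
            return dep
        time.sleep(interval)
    raise TimeoutError("deployment did not become active in time")

# Stub that becomes active on the third poll.
_states = iter(["deploying", "deploying", "active"])
result = wait_until_active(lambda: {"status": next(_states)}, interval=0)
```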

Download Model

Download model files for local deployment or analysis.
curl -X POST "https://api.tensorone.ai/v2/training/models/model_1234567890abcdef/download" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": ["pytorch_model.bin", "config.json", "tokenizer.json"],
    "format": "pytorch",
    "compression": "zip"
  }'

Response

{
  "downloadUrl": "https://download.tensorone.ai/models/model_1234567890abcdef.zip",
  "downloadToken": "tok_download_1234567890abcdef",
  "expiresAt": "2024-01-15T19:00:00Z",
  "size": 6710886400,
  "files": [
    {
      "name": "pytorch_model.bin",
      "size": 13421772800,
      "checksum": "sha256:a1b2c3d4e5f6..."
    },
    {
      "name": "config.json",
      "size": 1024,
      "checksum": "sha256:b2c3d4e5f6a7..."
    },
    {
      "name": "tokenizer.json",
      "size": 2048,
      "checksum": "sha256:c3d4e5f6a7b8..."
    }
  ]
}
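The per-file checksums in the manifest let you verify downloads before loading the weights. A sketch of checking bytes against a "sha256:&lt;hex&gt;" entry (the payload here is a toy stand-in, not a real model file):

```python
import hashlib

def verify_checksum(data: bytes, expected: str) -> bool:
    """Compare file bytes against a 'sha256:<hex>' checksum from the manifest."""
    algo, _, digest = expected.partition(":")
    actual = hashlib.new(algo, data).hexdigest()
    return actual == digest

payload = b'{"model_type": "llama"}'
manifest_entry = "sha256:" + hashlib.sha256(payload).hexdigest()
ok = verify_checksum(payload, manifest_entry)
```

For large files like pytorch_model.bin, feed the hash incrementally in chunks rather than reading the whole file into memory.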

Compare Models

Compare performance metrics between different models or versions.
curl -X POST "https://api.tensorone.ai/v2/training/models/compare" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      "model_1234567890abcdef",
      "model_2345678901bcdefg",
      "model_3456789012cdefgh"
    ],
    "metrics": ["accuracy", "f1_score", "perplexity", "bleu_score"],
    "benchmarks": ["custom_eval_suite", "hellaswag", "truthfulqa"]
  }'

Response

{
  "comparison": {
    "models": [
      {
        "id": "model_1234567890abcdef",
        "name": "llama-7b-customer-support",
        "version": "v1.0.0",
        "metrics": {
          "accuracy": 0.892,
          "f1_score": 0.885,
          "perplexity": 12.45,
          "bleu_score": 0.78
        },
        "benchmarks": {
          "custom_eval_suite": 85.2,
          "hellaswag": 76.8,
          "truthfulqa": 42.3
        }
      }
    ],
    "best_performing": {
      "accuracy": "model_1234567890abcdef",
      "f1_score": "model_2345678901bcdefg",
      "perplexity": "model_3456789012cdefgh"
    },
    "recommendations": [
      "model_1234567890abcdef shows best overall performance",
      "Consider model_2345678901bcdefg for precision-critical tasks"
    ]
  }
}
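Note that the best_performing map mixes metric directions: perplexity is better when lower, while accuracy and f1_score are better when higher. A sketch of reproducing that selection client-side (the direction set is an assumption based on standard metric conventions):

```python
# Metrics where lower is better; everything else is maximized.
LOWER_IS_BETTER = {"perplexity", "loss", "final_loss"}

def best_performing(models, metric_names):
    """Return {metric: model_id}, taking the min for loss-like metrics and max otherwise."""
    best = {}
    for metric in metric_names:
        pick = min if metric in LOWER_IS_BETTER else max
        winner = pick(models, key=lambda m: m["metrics"][metric])
        best[metric] = winner["id"]
    return best

models = [
    {"id": "model_a", "metrics": {"accuracy": 0.892, "perplexity": 12.45}},
    {"id": "model_b", "metrics": {"accuracy": 0.874, "perplexity": 10.90}},
]
result = best_performing(models, ["accuracy", "perplexity"])
```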

Archive Model

Archive a model to reduce storage costs while maintaining metadata.
curl -X POST "https://api.tensorone.ai/v2/training/models/model_1234567890abcdef/archive" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "reason": "Replaced by newer version",
    "retention_period": "90d"
  }'
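The retention_period string uses a value-plus-unit shorthand ("90d" for 90 days). If you compute the deletion date client-side, a hypothetical parser (the supported units are an assumption) might look like:

```python
from datetime import timedelta

_UNITS = {"d": "days", "h": "hours", "m": "minutes"}

def parse_retention(period: str) -> timedelta:
    """Convert a retention string like '90d' into a timedelta."""
    value, unit = int(period[:-1]), period[-1]
    if unit not in _UNITS:
        raise ValueError(f"unknown retention unit: {unit!r}")
    return timedelta(**{_UNITS[unit]: value})
```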

SDK Examples

Python SDK

from tensorone import TensorOneClient

client = TensorOneClient(api_key="YOUR_API_KEY")

# List models
models = client.training.models.list(
    type="llm",
    status="ready",
    limit=20
)

for model in models:
    print(f"{model.name} - {model.size.parameters} parameters")

# Get model details
model = client.training.models.get("model_1234567890abcdef")
print(f"Model: {model.name}")
print(f"Accuracy: {model.metrics.validation.accuracy}")
print(f"F1 Score: {model.metrics.validation.f1_score}")

# Deploy model
deployment = client.training.models.deploy(
    model_id="model_1234567890abcdef",
    name="customer-support-chat",
    environment="production",
    gpu_type="a100",
    workers=3,
    scaling={
        "min_workers": 1,
        "max_workers": 10,
        "target_utilization": 0.7
    },
    configuration={
        "max_tokens": 512,
        "temperature": 0.7,
        "top_p": 0.9
    }
)

print(f"Deployment created: {deployment.url}")

# Compare models
comparison = client.training.models.compare(
    models=[
        "model_1234567890abcdef",
        "model_2345678901bcdefg"
    ],
    metrics=["accuracy", "f1_score", "perplexity"]
)

for model_result in comparison.models:
    print(f"{model_result.name}: {model_result.metrics}")

JavaScript SDK

import { TensorOneClient } from '@tensorone/sdk';

const client = new TensorOneClient({ apiKey: 'YOUR_API_KEY' });

// List models
const models = await client.training.models.list({
  type: 'llm',
  status: 'ready',
  limit: 20
});

models.forEach(model => {
  console.log(`${model.name} - ${model.size.parameters} parameters`);
});

// Get model details
const model = await client.training.models.get('model_1234567890abcdef');
console.log(`Model: ${model.name}`);
console.log(`Accuracy: ${model.metrics.validation.accuracy}`);
console.log(`F1 Score: ${model.metrics.validation.f1Score}`);

// Deploy model
const deployment = await client.training.models.deploy('model_1234567890abcdef', {
  name: 'customer-support-chat',
  environment: 'production',
  gpuType: 'a100',
  workers: 3,
  scaling: {
    minWorkers: 1,
    maxWorkers: 10,
    targetUtilization: 0.7
  },
  configuration: {
    maxTokens: 512,
    temperature: 0.7,
    topP: 0.9
  }
});

console.log(`Deployment created: ${deployment.url}`);

// Download model
const downloadInfo = await client.training.models.download('model_1234567890abcdef', {
  files: ['pytorch_model.bin', 'config.json'],
  format: 'pytorch',
  compression: 'zip'
});

console.log(`Download URL: ${downloadInfo.downloadUrl}`);

Model Formats

PyTorch Models

  • pytorch_model.bin: Model weights in PyTorch format
  • config.json: Model architecture configuration
  • tokenizer.json: Tokenizer configuration and vocabulary

Hugging Face Compatible

  • model.safetensors: Safe tensor format for weights
  • pytorch_model.bin: PyTorch weights (legacy)
  • config.json: Transformers configuration
  • tokenizer_config.json: Tokenizer configuration

ONNX Export

  • model.onnx: ONNX format for cross-platform deployment
  • config.json: Model metadata
  • tokenizer.json: Tokenizer information

TensorRT Optimization

  • model.trt: TensorRT optimized engine
  • config.json: Optimization parameters
  • profiling_data.json: Performance profiling results

Error Handling

Common Errors

{
  "error": "MODEL_NOT_FOUND",
  "message": "Model with specified ID does not exist",
  "details": {
    "modelId": "model_invalid_id"
  }
}
{
  "error": "DEPLOYMENT_FAILED",
  "message": "Model deployment failed due to resource constraints",
  "details": {
    "reason": "Insufficient GPU capacity",
    "availableGpuTypes": ["v100", "rtx4090"],
    "requestedGpuType": "h100"
  }
}
{
  "error": "MODEL_TOO_LARGE",
  "message": "Model exceeds size limits for deployment",
  "details": {
    "modelSize": "50GB",
    "maxAllowedSize": "40GB",
    "suggestions": [
      "Use model compression",
      "Deploy on higher-tier GPU instances"
    ]
  }
}
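These error codes split naturally into retryable and permanent failures: DEPLOYMENT_FAILED may succeed later (or on a different GPU type from availableGpuTypes), while MODEL_NOT_FOUND and MODEL_TOO_LARGE will not fix themselves. A sketch of classifying them (the retryability mapping is an assumption, not platform-specified behavior):

```python
from typing import Optional

def is_retryable(error: dict) -> bool:
    """Classify errors: capacity problems are worth retrying, bad IDs and size limits are not."""
    retryable = {"DEPLOYMENT_FAILED"}
    permanent = {"MODEL_NOT_FOUND", "MODEL_TOO_LARGE"}
    code = error.get("error")
    if code in retryable:
        return True
    if code in permanent:
        return False
    raise ValueError(f"unrecognized error code: {code!r}")

def fallback_gpu(error: dict) -> Optional[str]:
    """On DEPLOYMENT_FAILED, pick the first advertised alternative GPU type."""
    if error.get("error") != "DEPLOYMENT_FAILED":
        return None
    options = error.get("details", {}).get("availableGpuTypes", [])
    return options[0] if options else None
```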

Best Practices

Model Organization

  • Use consistent naming conventions for models and versions
  • Tag models with relevant metadata (domain, use case, quality)
  • Maintain clear version histories with detailed changelogs
  • Archive outdated models to reduce storage costs
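One way to enforce a naming convention is a pre-flight check before creating or renaming models. The pattern below is a hypothetical convention (lowercase hyphenated names like "llama-7b-customer-support", versions as vMAJOR.MINOR.PATCH), matching the examples in this page but not mandated by the API:

```python
import re

# Hypothetical convention: lowercase alphanumeric words joined by hyphens;
# versions use a 'vMAJOR.MINOR.PATCH' tag.
NAME_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
VERSION_RE = re.compile(r"^v\d+\.\d+\.\d+$")

def is_valid_name(name: str) -> bool:
    return bool(NAME_RE.fullmatch(name))

def is_valid_version(version: str) -> bool:
    return bool(VERSION_RE.fullmatch(version))
```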

Performance Optimization

  • Choose appropriate deployment configurations based on latency requirements
  • Use auto-scaling to handle variable workloads efficiently
  • Monitor model performance metrics continuously
  • Implement A/B testing for model comparisons

Security and Compliance

  • Implement proper access controls for sensitive models
  • Maintain audit trails for model deployments
  • Use encryption for model files and communications
  • Run regular security scans on deployed models

Model deployments typically take 3-5 minutes to become active. Larger models may require additional time for optimization and loading.

Archived models can be restored within the retention period. After that, they are permanently deleted and cannot be recovered.

Authorizations

  • Authorization (string, header, required): API key authentication. Use 'Bearer YOUR_API_KEY' format.

Query Parameters

  • status (enum<string>): Filter by model status. Available options: training, completed, failed, cancelled
  • limit (integer, default: 20): Maximum number of models to return. Required range: 1 <= x <= 100

Response

200 - application/json

List of training models. The response is of type object.