Get Job Status
curl --request GET \
  --url https://api.tensorone.ai/v2/jobs/{jobId}/status \
  --header 'Authorization: Bearer <api-key>'
{
  "jobId": "<string>",
  "status": "queued",
  "progress": 50,
  "result": "<any>",
  "error": "<string>"
}
Get comprehensive status information for asynchronous endpoint executions, including progress tracking, resource usage, and execution metadata. This endpoint is essential for monitoring long-running tasks and batch processing operations.

Path Parameters

  • jobId: The unique identifier of the job to check status for

Query Parameters

  • include: Optional list of additional fields to include, passed as comma-separated values (logs, metrics, resources)
  • format: Response format, either json or summary; defaults to json

Example Usage

Basic Status Check

curl -X GET "https://api.tensorone.ai/v2/jobs/job_1234567890abcdef/status" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"

Detailed Status with Logs and Metrics

curl -X GET "https://api.tensorone.ai/v2/jobs/job_1234567890abcdef/status?include=logs,metrics,resources" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"

Multiple Job Status Check

curl -X POST "https://api.tensorone.ai/v2/jobs/status/batch" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "jobIds": [
      "job_1234567890abcdef",
      "job_2345678901bcdefg",
      "job_3456789012cdefgh"
    ],
    "include": ["metrics"]
  }'

Response

Successful Response

{
    "jobId": "job_1234567890abcdef",
    "status": "running",
    "progress": 75,
    "endpointId": "ep_image_processor",
    "priority": "high",
    "createdAt": "2024-01-15T14:30:00Z",
    "startedAt": "2024-01-15T14:31:15Z",
    "estimatedCompletion": "2024-01-15T14:45:30Z",
    "currentStep": "processing_images",
    "metadata": {
        "imagesProcessed": 750,
        "totalImages": 1000,
        "averageProcessingTime": 1.2,
        "currentBatch": 15,
        "totalBatches": 20
    },
    "resources": {
        "gpuType": "NVIDIA A100",
        "gpuUtilization": 85,
        "memoryUsage": "12.5GB",
        "memoryTotal": "40GB",
        "cpuUsage": 45
    },
    "execution": {
        "duration": 847,
        "costAccrued": 2.47,
        "tokensProcessed": 125000,
        "apiCalls": 1247
    },
    "tags": {
        "userId": "user_12345",
        "projectId": "proj_batch_processing",
        "category": "image_enhancement"
    }
}
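
The fields in a running response can be condensed into a one-line progress report. A minimal sketch in Python, operating on the parsed JSON shown above (no fields beyond that example are assumed):

def summarize_running_job(status: dict) -> str:
    """Build a short progress summary from a parsed status response."""
    meta = status.get("metadata", {})
    resources = status.get("resources", {})
    parts = [
        f"{status['jobId']}: {status['status']} ({status.get('progress', 0)}%)",
        f"step={status.get('currentStep', 'unknown')}",
    ]
    if meta.get("totalBatches"):
        parts.append(f"batch {meta.get('currentBatch')}/{meta['totalBatches']}")
    if resources:
        parts.append(f"GPU {resources.get('gpuUtilization')}% on {resources.get('gpuType')}")
    return ", ".join(parts)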

Completed Job Response

{
    "jobId": "job_completed_example",
    "status": "completed",
    "progress": 100,
    "endpointId": "ep_text_generator",
    "createdAt": "2024-01-15T14:00:00Z",
    "startedAt": "2024-01-15T14:01:00Z",
    "completedAt": "2024-01-15T14:15:30Z",
    "executionTime": 870.5,
    "output": {
        "result": "Generated content successfully saved to storage",
        "outputUrl": "https://storage.tensorone.ai/outputs/doc_abc123.pdf",
        "size": "2.4MB",
        "format": "pdf"
    },
    "finalMetrics": {
        "documentsGenerated": 50,
        "wordsGenerated": 45000,
        "totalCost": 5.67,
        "averageLatency": 17.4
    },
    "resources": {
        "peakGpuUtilization": 92,
        "peakMemoryUsage": "18.2GB",
        "totalComputeTime": 852.3
    }
}
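
Once a job reports completed, the file referenced by output.outputUrl can be retrieved. A minimal sketch using the requests library; whether storage URLs require the same Bearer token as the API is an assumption here:

import requests

def download_output(status: dict, dest_path: str, api_key: str) -> None:
    """Download the completed job's output file referenced by outputUrl."""
    url = status.get("output", {}).get("outputUrl")
    if not url:
        raise ValueError("No outputUrl in response; the job may not produce a file")
    # Assumption: storage URLs accept the same Bearer token as the API
    resp = requests.get(url, headers={"Authorization": f"Bearer {api_key}"}, timeout=60)
    resp.raise_for_status()
    with open(dest_path, "wb") as f:
        f.write(resp.content)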

Failed Job Response

{
    "jobId": "job_failed_example",
    "status": "failed",
    "progress": 45,
    "endpointId": "ep_video_processor",
    "createdAt": "2024-01-15T13:30:00Z",
    "startedAt": "2024-01-15T13:31:00Z",
    "failedAt": "2024-01-15T13:45:22Z",
    "error": {
        "code": "RESOURCE_EXHAUSTED",
        "message": "GPU memory limit exceeded during processing",
        "details": {
            "step": "video_encoding",
            "memoryRequired": "45GB",
            "memoryAvailable": "40GB",
            "suggestion": "Reduce input resolution or use smaller batch size"
        }
    },
    "partialResults": {
        "processedFrames": 15420,
        "totalFrames": 34200,
        "outputFiles": [
            "https://storage.tensorone.ai/temp/partial_output_1.mp4"
        ]
    },
    "retryable": true,
    "retryCount": 1,
    "nextRetryAt": "2024-01-15T14:00:00Z"
}
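
The retryable, retryCount, and nextRetryAt fields indicate whether the platform will retry the job automatically. A minimal sketch that reports the failure and the time until the next scheduled retry:

from datetime import datetime, timezone

def handle_failed_job(status: dict) -> None:
    """Report a failed job and whether an automatic retry is scheduled."""
    error = status.get("error", {})
    print(f"Job {status['jobId']} failed: {error.get('code')} - {error.get('message')}")
    suggestion = error.get("details", {}).get("suggestion")
    if suggestion:
        print(f"Suggestion: {suggestion}")
    if status.get("retryable") and status.get("nextRetryAt"):
        next_retry = datetime.fromisoformat(status["nextRetryAt"].replace("Z", "+00:00"))
        wait = max((next_retry - datetime.now(timezone.utc)).total_seconds(), 0)
        print(f"Retry #{status.get('retryCount', 0) + 1} scheduled in {wait:.0f}s")
    else:
        print("Not retryable; inspect partialResults before resubmitting")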

Status Values

Primary States

  • queued: Job submitted and waiting for available resources
  • initializing: Job is starting up and loading resources
  • running: Job is actively processing
  • completed: Job finished successfully
  • failed: Job encountered an error and stopped
  • cancelled: Job was cancelled by user or system
  • timeout: Job exceeded maximum execution time
  • paused: Job temporarily paused (manual or automatic)

Substates for Running Jobs

  • warming_up: Cold start - loading model and dependencies
  • processing: Actively processing input data
  • finalizing: Completing processing and preparing output
  • uploading: Transferring output to storage
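
Most client code only needs to distinguish the terminal states above from states that warrant further polling. A minimal sketch (substates only appear while a job is running, so they need no separate handling here):

TERMINAL_STATES = {"completed", "failed", "cancelled", "timeout"}
ACTIVE_STATES = {"queued", "initializing", "running", "paused"}

def is_terminal(status: str) -> bool:
    """Return True when no further status changes are expected."""
    return status in TERMINAL_STATES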

Progress Tracking

Progress Information

Jobs include detailed progress information:
  • progress: Percentage completion (0-100)
  • currentStep: Current processing phase description
  • estimatedCompletion: Predicted completion timestamp
  • metadata: Task-specific progress details
  • throughput: Processing rate (items/second, tokens/second, etc.)
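
These fields can be combined into a time-remaining estimate: prefer estimatedCompletion when present, otherwise extrapolate from progress and startedAt. A sketch, assuming the ISO 8601 timestamps shown in the examples:

from datetime import datetime, timezone

def seconds_remaining(status: dict) -> float | None:
    """Estimate seconds until completion from a status payload."""
    now = datetime.now(timezone.utc)
    eta = status.get("estimatedCompletion")
    if eta:
        return max((datetime.fromisoformat(eta.replace("Z", "+00:00")) - now).total_seconds(), 0)
    progress = status.get("progress") or 0
    started = status.get("startedAt")
    if progress > 0 and started:
        elapsed = (now - datetime.fromisoformat(started.replace("Z", "+00:00"))).total_seconds()
        return elapsed * (100 - progress) / progress
    return None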

Real-time Updates

# Use Server-Sent Events for real-time status updates
curl -X GET "https://api.tensorone.ai/v2/jobs/job_1234567890abcdef/status/stream" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Accept: text/event-stream"

Resource Monitoring

GPU Utilization

Track GPU usage in real-time:
{
    "resources": {
        "gpus": [
            {
                "id": "gpu_0",
                "type": "NVIDIA A100",
                "utilization": 87,
                "memoryUsed": "35.2GB",
                "memoryTotal": "40GB",
                "temperature": 72,
                "powerUsage": "280W"
            }
        ],
        "cpu": {
            "utilization": 45,
            "cores": 16,
            "memory": "64GB"
        }
    }
}
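
Figures like these lend themselves to simple threshold alerts. A sketch that flags GPUs approaching their memory limit; parsing the "GB"-suffixed strings is an assumption about the reported format:

def gpu_memory_alerts(resources: dict, threshold: float = 0.9) -> list[str]:
    """Return a warning for each GPU whose memory usage exceeds the threshold."""
    warnings = []
    for gpu in resources.get("gpus", []):
        used = float(gpu["memoryUsed"].rstrip("GB"))
        total = float(gpu["memoryTotal"].rstrip("GB"))
        if total and used / total >= threshold:
            warnings.append(
                f"{gpu['id']} ({gpu['type']}): {used:.1f}/{total:.1f} GB in use"
            )
    return warnings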

Cold Start Monitoring

Startup Performance

Track cold start metrics for optimization:
{
    "coldStart": {
        "occurred": true,
        "duration": 45.2,
        "phases": {
            "containerStart": 12.5,
            "modelLoad": 28.1,
            "dependencyLoad": 4.6
        },
        "cacheHit": false,
        "optimizationSuggestions": [
            "Consider using warm pools for frequently accessed models",
            "Optimize model size or use quantized versions"
        ]
    }
}
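
A small helper can surface the slowest phase and the platform's suggestions whenever a cold start occurred. A sketch over the payload shown above:

def report_cold_start(cold_start: dict) -> None:
    """Print a short report when a cold start occurred."""
    if not cold_start.get("occurred"):
        return
    phases = cold_start.get("phases", {})
    slowest = max(phases, key=phases.get) if phases else None
    print(f"Cold start took {cold_start.get('duration')}s (slowest phase: {slowest})")
    for tip in cold_start.get("optimizationSuggestions", []):
        print(f"  - {tip}")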

Error Handling

404 Not Found

{
    "error": "JOB_NOT_FOUND",
    "message": "Job job_invalid does not exist or has expired",
    "details": {
        "jobId": "job_invalid",
        "possibleReasons": [
            "Job ID is incorrect",
            "Job was deleted after completion",
            "Job expired (older than 30 days)"
        ]
    }
}

403 Forbidden

{
    "error": "ACCESS_DENIED",
    "message": "You don't have permission to view this job",
    "details": {
        "jobId": "job_1234567890abcdef",
        "requiredPermission": "jobs:read",
        "userPermissions": ["endpoints:execute"]
    }
}

429 Rate Limited

{
    "error": "RATE_LIMIT_EXCEEDED",
    "message": "Too many status check requests",
    "details": {
        "limit": 100,
        "window": "1m",
        "retryAfter": 30,
        "recommendation": "Use webhooks or SSE for real-time updates"
    }
}
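
When polling the REST endpoint directly, these errors are worth handling explicitly; the 429 body includes a retryAfter hint. A sketch using the requests library, with field names taken from the error examples above:

import time
import requests

def get_status_with_error_handling(job_id: str, api_key: str) -> dict:
    """Fetch job status, honoring the retryAfter hint on rate limits."""
    url = f"https://api.tensorone.ai/v2/jobs/{job_id}/status"
    headers = {"Authorization": f"Bearer {api_key}"}
    while True:
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code == 429:
            retry_after = resp.json().get("details", {}).get("retryAfter", 30)
            time.sleep(retry_after)
            continue
        if resp.status_code == 404:
            raise LookupError(f"Job {job_id} does not exist or has expired")
        if resp.status_code == 403:
            raise PermissionError(f"Missing jobs:read permission for job {job_id}")
        resp.raise_for_status()
        return resp.json()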

SDK Examples

Python SDK

from tensorone import TensorOneClient
import time
import asyncio

client = TensorOneClient(api_key="your_api_key")

# Basic status check
def check_job_status(job_id):
    status = client.jobs.get_status(job_id)
    print(f"Job {job_id}: {status.status} ({status.progress}%)")
    return status

# Real-time status monitoring
async def monitor_job_with_streaming(job_id):
    async for update in client.jobs.stream_status(job_id):
        print(f"Progress: {update.progress}% - {update.current_step}")
        
        if update.status == "completed":
            print(f"Job completed! Output: {update.output}")
            break
        elif update.status == "failed":
            print(f"Job failed: {update.error.message}")
            break

# Batch status checking
def check_multiple_jobs(job_ids):
    statuses = client.jobs.get_batch_status(
        job_ids=job_ids,
        include=["metrics", "resources"]
    )
    
    for status in statuses:
        print(f"Job {status.job_id}: {status.status}")
        if status.resources:
            print(f"  GPU Usage: {status.resources.gpu_utilization}%")
            print(f"  Memory: {status.resources.memory_usage}")

# Monitor with custom intervals
def monitor_job_with_backoff(job_id, max_wait=3600):
    intervals = [1, 2, 5, 10, 30, 60]  # Exponential backoff
    interval_index = 0
    start_time = time.time()
    
    while time.time() - start_time < max_wait:
        status = client.jobs.get_status(job_id, include=["resources"])
        print(f"Status: {status.status}, Progress: {status.progress}%")
        
        if status.status in ["completed", "failed", "cancelled"]:
            return status
            
        # Use exponential backoff
        sleep_time = intervals[min(interval_index, len(intervals) - 1)]
        time.sleep(sleep_time)
        interval_index += 1
    
    raise TimeoutError("Job monitoring timed out")

# Usage examples
if __name__ == "__main__":
    job_id = "job_1234567890abcdef"
    
    # Check current status
    current_status = check_job_status(job_id)
    
    # Monitor multiple jobs
    check_multiple_jobs([
        "job_1234567890abcdef",
        "job_2345678901bcdefg"
    ])
    
    # Stream real-time updates
    asyncio.run(monitor_job_with_streaming(job_id))

JavaScript SDK

import { TensorOneClient } from "@tensorone/sdk";

const client = new TensorOneClient({ apiKey: "your_api_key" });

// Basic status checking
async function checkJobStatus(jobId) {
    const status = await client.jobs.getStatus(jobId);
    console.log(`Job ${jobId}: ${status.status} (${status.progress}%)`);
    return status;
}

// Real-time monitoring with async iterators
async function monitorJobProgress(jobId) {
    for await (const update of client.jobs.streamStatus(jobId)) {
        console.log(`Progress: ${update.progress}% - ${update.currentStep}`);
        
        if (update.resources) {
            console.log(`GPU: ${update.resources.gpuUtilization}%, Memory: ${update.resources.memoryUsage}`);
        }
        
        if (update.status === "completed") {
            console.log("Job completed!", update.output);
            break;
        } else if (update.status === "failed") {
            console.error("Job failed:", update.error.message);
            break;
        }
    }
}

// Batch status checking
async function checkMultipleJobs(jobIds) {
    const statuses = await client.jobs.getBatchStatus({
        jobIds,
        include: ["metrics", "resources", "logs"]
    });
    
    statuses.forEach(status => {
        console.log(`Job ${status.jobId}: ${status.status}`);
        if (status.execution) {
            console.log(`  Cost: $${status.execution.costAccrued}`);
            console.log(`  Duration: ${status.execution.duration}s`);
        }
    });
}

// Monitor with promise-based polling
async function pollJobStatus(jobId, options = {}) {
    const { maxWait = 3600000, interval = 5000 } = options;
    const startTime = Date.now();
    
    while (Date.now() - startTime < maxWait) {
        const status = await client.jobs.getStatus(jobId, {
            include: ["resources", "metrics"]
        });
        
        console.log(`Status: ${status.status}, Progress: ${status.progress}%`);
        
        if (["completed", "failed", "cancelled"].includes(status.status)) {
            return status;
        }
        
        // Dynamic interval adjustment based on progress
        const dynamicInterval = status.progress > 90 ? 2000 : interval;
        await new Promise(resolve => setTimeout(resolve, dynamicInterval));
    }
    
    throw new Error("Job monitoring timed out");
}

// Usage examples
async function main() {
    const jobId = "job_1234567890abcdef";
    
    try {
        // Check current status
        const status = await checkJobStatus(jobId);
        
        // Monitor with different strategies
        if (status.status === "running") {
            // Use streaming for real-time updates
            await monitorJobProgress(jobId);
        } else if (status.status === "queued") {
            // Use polling for queued jobs
            await pollJobStatus(jobId);
        }
        
        // Check multiple jobs
        await checkMultipleJobs([
            "job_1234567890abcdef",
            "job_2345678901bcdefg"
        ]);
        
    } catch (error) {
        console.error("Error monitoring job:", error);
    }
}

main();

Use Cases

Production Monitoring

  • Job Dashboards: Build real-time dashboards showing job progress and resource usage
  • Alert Systems: Set up alerts for failed jobs or resource constraints
  • Capacity Planning: Monitor resource utilization to optimize cluster sizing

Batch Processing

  • Progress Tracking: Monitor large batch jobs processing thousands of items
  • Resource Optimization: Track GPU utilization to optimize batch sizes
  • Cost Management: Monitor execution costs in real-time

Development and Testing

  • Debugging: Monitor job execution to identify bottlenecks and errors
  • Performance Tuning: Track cold start times and resource usage patterns
  • Load Testing: Monitor system behavior under different load conditions

Best Practices

Monitoring Strategy

  • Use Webhooks: Prefer webhooks over polling for better performance (a minimal receiver sketch follows this list)
  • Exponential Backoff: Implement exponential backoff for polling to reduce API load
  • Batch Requests: Check multiple job statuses in a single request when possible
  • Real-time Streaming: Use SSE for real-time updates on critical jobs
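
A webhook receiver can be as small as a single HTTP handler. The sketch below uses only the Python standard library; the payload shape (a JSON object carrying at least jobId and status) is an assumption, since webhook delivery is not documented in this section:

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class JobWebhookHandler(BaseHTTPRequestHandler):
    """Minimal receiver for job status webhooks (payload shape assumed)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")
        print(f"Job {event.get('jobId')} is now {event.get('status')}")
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), JobWebhookHandler).serve_forever()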

Performance Optimization

  • Selective Fields: Only request additional fields (logs, metrics) when needed
  • Caching: Cache status responses for non-critical monitoring
  • Rate Limiting: Respect rate limits to avoid throttling

Error Handling

  • Graceful Degradation: Handle temporary API failures gracefully
  • Retry Logic: Implement retry logic for transient errors
  • Timeout Management: Set appropriate timeouts for long-running jobs

Cost Management

  • Monitor Costs: Track costAccrued field to prevent budget overruns
  • Resource Alerts: Set up alerts when resource usage exceeds thresholds
  • Optimization: Use status data to identify optimization opportunities

Job status is updated in real-time. Use streaming endpoints or webhooks for the most current information without overwhelming the API with polling requests.

Jobs older than 30 days are automatically deleted. Ensure you save important status information and outputs before they expire.

Use the include parameter strategically: only request additional data like logs and metrics when you actually need it, keeping responses fast and reducing bandwidth usage.

Authorizations

  • Authorization (string, header, required): API key authentication. Use 'Bearer YOUR_API_KEY' format.

Path Parameters

  • jobId (string, required): The unique identifier of the job to check status for.

Response

  • 200 - application/json: Job status. The response is of type object.