Core Technologies
AI Framework
The ai-framework stack defines the key libraries, orchestration layers, and runtime patterns actively used in our organization for developing, deploying, and maintaining AI systems.
It combines open-source tools, internal extensions, and modular abstractions to support everything from experimentation to production inference pipelines.
Purpose and Scope
Our AI framework serves three goals:
- Standardize: Establish reusable, interoperable development patterns.
- Scale: Support high-throughput inference and multi-agent architectures.
- Secure: Ensure safe, explainable, and validated output generation.
Core Frameworks
LangChain
LangChain is used as the backbone for chaining LLM prompts, custom tools, and agents. It powers:
- Prompt composition
- Memory injection
- Vector search integration (e.g., FAISS, Weaviate)
Used for: task chaining, document Q&A, and function-calling workflows.
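A minimal sketch of prompt composition plus FAISS-backed retrieval using LangChain's expression language; the package split (langchain-openai, langchain-community), the model name, and the sample documents are illustrative assumptions, not our pinned configuration:

```python
# Minimal sketch: prompt composition + FAISS retrieval with LangChain (LCEL).
# Assumes langchain-openai and langchain-community are installed and an
# OPENAI_API_KEY (or compatible endpoint) is configured.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Illustrative documents; in practice these come from an ingestion pipeline.
docs = [
    "MCP routes requests between inference backends.",
    "Pydantic AI validates structured model output.",
]
vectorstore = FAISS.from_texts(docs, OpenAIEmbeddings())
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def format_docs(retrieved):
    return "\n".join(d.page_content for d in retrieved)

# Chain: retrieve -> fill prompt -> call model -> parse to string.
chain = (
    {"context": retriever | format_docs, "question": lambda x: x}
    | prompt
    | llm
    | StrOutputParser()
)
print(chain.invoke("What does MCP do?"))
```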
CrewAI
CrewAI structures multi-agent LLM systems. It introduces hierarchy, delegation, and agent specialization.
Key use cases:
- Team-based agents with defined roles
- Task routing between expert modules
- Background worker simulation
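A minimal sketch of a two-agent crew with defined roles and ordered tasks, assuming the crewai package and a configured LLM provider; the roles, goals, and task text are illustrative:

```python
# Minimal sketch: a two-agent crew with defined roles (assumes the crewai
# package and a configured LLM provider, e.g. OPENAI_API_KEY).
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Researcher",
    goal="Collect relevant background on the assigned topic",
    backstory="A focused analyst who gathers and summarizes sources.",
)
writer = Agent(
    role="Writer",
    goal="Turn research notes into a short internal briefing",
    backstory="A technical writer who produces concise summaries.",
)

research = Task(
    description="Research current approaches to multi-agent task routing.",
    expected_output="A bullet list of key findings.",
    agent=researcher,
)
brief = Task(
    description="Write a one-paragraph briefing from the research findings.",
    expected_output="One paragraph suitable for an internal doc.",
    agent=writer,
)

# Tasks run sequentially by default; the writer consumes the researcher's output.
crew = Crew(agents=[researcher, writer], tasks=[research, brief])
result = crew.kickoff()
print(result)
```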
Internal Modules
MCP (Model Coordination Protocol)
An internal coordination layer that manages:
- Routing requests between inference backends
- Resource prioritization (GPU slots, endpoint queues)
- Graceful failure and fallback paths
MCP is implemented as a lightweight message broker layered over gRPC, with hooks into runtime logs and observability tooling.
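Because MCP is internal, the sketch below only illustrates the routing-and-fallback behavior described above in generic Python; the Router and Backend classes are hypothetical stand-ins, not the MCP API:

```python
# Hypothetical sketch of routing with graceful fallback, the behavior MCP
# provides; class names and backend names are illustrative, not the MCP API.
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    healthy: bool

    def infer(self, prompt: str) -> str:
        if not self.healthy:
            raise RuntimeError(f"{self.name} unavailable")
        return f"[{self.name}] response to: {prompt}"

class Router:
    """Try backends in priority order and fall back on failure."""

    def __init__(self, backends: list[Backend]):
        self.backends = backends

    def route(self, prompt: str) -> str:
        last_error = None
        for backend in self.backends:
            try:
                return backend.infer(prompt)
            except RuntimeError as err:
                last_error = err  # graceful fallback to the next backend
        raise RuntimeError("all backends failed") from last_error

router = Router([Backend("gpu-cluster", healthy=False),
                 Backend("serverless-endpoint", healthy=True)])
print(router.route("summarize the incident report"))
```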
Pydantic AI
Pydantic AI extends pydantic with LLM schema validation. It ensures typed, reliable outputs from models and supports:
- Generative output validation (BaseModel.parse_llm)
- Structured prompt inputs
- Fast error tracing
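The parse_llm helper is part of our internal extension and is not reproduced here; the sketch below shows the underlying idea with plain pydantic v2, validating raw model output into a typed object (the model fields and JSON payload are illustrative):

```python
# Minimal sketch: validate raw LLM output into a typed object with plain
# pydantic v2. The internal BaseModel.parse_llm helper wraps a similar
# validation step; its exact signature is not shown here.
from pydantic import BaseModel, ValidationError

class TicketSummary(BaseModel):
    title: str
    severity: int
    tags: list[str]

# Raw JSON as it might come back from a structured-output prompt.
raw = '{"title": "GPU queue backlog", "severity": 2, "tags": ["infra", "mcp"]}'

try:
    summary = TicketSummary.model_validate_json(raw)
    print(summary.title, summary.severity)
except ValidationError as err:
    # Fast error tracing: each entry pinpoints the offending field.
    print(err.errors())
```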
Supporting Tools
- PromptFlow: Visual debugger for prompt templates and memory states.
- Traceloop: Fine-grained telemetry for step-by-step tracebacks across chains and agents.
- HydraConfig: Dynamic configuration for model selection, prompt sets, and runtime arguments (see the sketch after this list).
- LLMGuard: Output filters for toxicity, bias, and jailbreak attempts.
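A minimal sketch of the pattern HydraConfig supports, assuming it builds on the open-source hydra-core package; the config layout and keys are illustrative:

```python
# Minimal sketch assuming HydraConfig builds on hydra-core; the config keys
# (model.name, runtime.max_tokens) are illustrative.
import hydra
from omegaconf import DictConfig

# Expects conf/config.yaml with keys such as:
#   model:
#     name: gpt-4o-mini
#   runtime:
#     max_tokens: 512
@hydra.main(version_base=None, config_path="conf", config_name="config")
def main(cfg: DictConfig) -> None:
    print(f"model={cfg.model.name} max_tokens={cfg.runtime.max_tokens}")

if __name__ == "__main__":
    main()  # override at the CLI, e.g.: python run.py model.name=mistral-7b
```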
Common Patterns
- RAG (Retrieval-Augmented Generation) with hybrid vector + keyword filters
- Conversational state machines using LangChain memory + Redis (see the sketch after this list)
- Multi-agent coordination with dynamic persona switching
- Input schema validation + structured response parsing
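A minimal sketch of the Redis-backed conversational state pattern, assuming the langchain-community package and a local Redis instance; the session ID, URL, and messages are illustrative:

```python
# Minimal sketch: Redis-backed conversation state with LangChain. Assumes
# langchain-community, the redis client, and a Redis instance at the URL
# shown (illustrative).
from langchain_community.chat_message_histories import RedisChatMessageHistory

history = RedisChatMessageHistory(
    session_id="user-1234",            # one key per conversation
    url="redis://localhost:6379/0",
)

history.add_user_message("What backends does MCP route to?")
history.add_ai_message("Serverless endpoints and GPU clusters.")

# Replay stored turns, e.g. to rebuild prompt context on the next request.
for message in history.messages:
    print(f"{message.type}: {message.content}")
```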
Deployment Targets
All AI modules are containerized and deployed to TensorOne Serverless Endpoints or GPU Clusters, with autoscaling based on queue load.
For CI/CD, we use:
- GitHub Actions
- Docker Layer Cache
- Endpoint redeployment via tensoronecli project deploy
Learn More
Explore our related docs: