Experimental Projects
Uncensored AI Chat
The Uncensored AI Chat project explores what happens when large language models are built without restrictive alignment layers—no Reinforcement Learning from Human Feedback (RLHF), no moderation filters, and no institutional tuning datasets.
Instead of filtering outputs, we focused on training the base model on open, unfiltered datasets and letting behavior emerge without hard-coded boundaries.
The Hypothesis
Most production models are aligned to serve the broadest possible audience and are heavily sanitized through post-training techniques. This limits expressiveness, autonomy, and, in some cases, utility.
Our experiment asked:
- What if we remove alignment constraints entirely?
- Can we train a base model to be more transparent and capable of controversial reasoning?
- How does behavior emerge when a model is exposed to the raw web, academic discourse, and unmoderated community datasets?
Model Training
We initialized a custom transformer model (7B+ parameter class) and trained it on a mix of:
- Open preprint datasets (arXiv, PubMed, Semantic Scholar)
- Internet dumps (filtered only for encoding, not content)
- Archived community forums
- Dialogues from uncensored LLM dumps (e.g. Vicuna-style datasets)
No preference modeling. No reinforcement loop. Just raw next-token prediction over a wide open space of text.
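In practice the objective is plain causal language modeling over the interleaved corpora. The sketch below illustrates that setup with PyTorch and Hugging Face tooling; the file names, mixing weights, "text" field, stand-in tokenizer, and model dimensions are illustrative assumptions, not the project's actual configuration.

```python
# Minimal causal-LM pretraining sketch. Paths, weights, and sizes are assumptions.
import torch
from datasets import load_dataset, interleave_datasets
from transformers import AutoTokenizer, LlamaConfig, LlamaForCausalLM

# Stand-in tokenizer; the project's real vocabulary is not specified here.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Interleave the corpora by weight instead of filtering them by content.
sources = {
    "preprints.jsonl": 0.40,   # arXiv / PubMed / Semantic Scholar text
    "web_dump.jsonl": 0.35,    # internet dump, filtered only for encoding
    "forums.jsonl":   0.15,    # archived community forums
    "dialogues.jsonl": 0.10,   # Vicuna-style conversation dumps
}
streams = [load_dataset("json", data_files=path, split="train", streaming=True)
           for path in sources]
mixed = interleave_datasets(streams, probabilities=list(sources.values()), seed=0)

config = LlamaConfig(vocab_size=tokenizer.vocab_size, hidden_size=4096,
                     num_hidden_layers=32, num_attention_heads=32)
model = LlamaForCausalLM(config).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# Raw next-token prediction: labels are the inputs, shifted inside the model.
for step, example in enumerate(mixed):
    batch = tokenizer(example["text"], truncation=True, max_length=2048,
                      return_tensors="pt").to("cuda")
    loss = model(input_ids=batch["input_ids"],
                 attention_mask=batch["attention_mask"],
                 labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if step % 100 == 0:
        print(f"step {step}: loss {loss.item():.3f}")
```

No preference loss, no reward model, no refusal templates appear anywhere in this loop; everything the model does downstream comes from the data mix above.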
System Properties
- No content filters: The model does not block or reword prompts based on topic sensitivity (see the inference sketch after this list)
- Transparent refusal: If the model refuses, it explains why (from learned data, not rules)
- Contradiction-friendly: It can hold conflicting ideas or simulate dual-sided arguments
- Boundary-pushing: Useful for exploring philosophy, ethics, psychology, or free speech research
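To make the first property concrete, here is a minimal inference-path sketch: nothing sits between the prompt and next-token sampling, so any refusal text is generated by the model itself rather than injected by a filter. The checkpoint path and sampling parameters are assumptions for illustration.

```python
# Bare inference path: no keyword blocklist, no safety classifier, no prompt
# rewriting. The checkpoint path is an illustrative assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("/models/uncensored-7b")
model = AutoModelForCausalLM.from_pretrained("/models/uncensored-7b",
                                             torch_dtype=torch.float16,
                                             device_map="auto")

def respond(prompt: str, max_new_tokens: int = 512) -> str:
    # The prompt goes straight to next-token sampling and the output is
    # returned verbatim; any refusal is the model's own learned behavior.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                            do_sample=True, temperature=0.8, top_p=0.95)
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```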
Use Cases
- Simulated ethical debates
- Agent self-reflection and inner monologue
- Philosophical or political role-play
- Raw creative writing with no tone bias
Risks and Containment
This project runs on a sandboxed, air-gapped inference cluster and is never exposed to public API traffic.
Safeguards include the following (a minimal sketch follows the list):
- Session expiration after inactivity
- Prompt logging with hashing, not plain text
- Query throttling per user-agent
- Strict model-to-endpoint mapping (no accidental fallback to production models)
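The sketch below illustrates three of these mechanisms, assuming a simple Python service layer; the route names, rate limits, and salt handling are illustrative, not the cluster's actual code.

```python
# Containment-layer sketch: hashed prompt logging, per-user-agent throttling,
# and a fixed model-to-endpoint map. All names and limits are assumptions.
import hashlib
import time
from collections import defaultdict, deque

# Strict mapping: the experimental endpoint can only resolve to the sandboxed
# model, so there is no silent fallback to production checkpoints.
MODEL_ROUTES = {"/v1/experimental/chat": "uncensored-7b-sandbox"}

RATE_LIMIT = 30        # max requests per user-agent
RATE_WINDOW = 60.0     # within a rolling window, in seconds
_request_log = defaultdict(deque)

def resolve_model(endpoint: str) -> str:
    # Fail loudly instead of falling back if the endpoint is unknown.
    if endpoint not in MODEL_ROUTES:
        raise KeyError(f"no model mapped to {endpoint}")
    return MODEL_ROUTES[endpoint]

def allow_request(user_agent: str) -> bool:
    # Sliding-window throttle keyed on the user-agent string.
    now = time.monotonic()
    window = _request_log[user_agent]
    while window and now - window[0] > RATE_WINDOW:
        window.popleft()
    if len(window) >= RATE_LIMIT:
        return False
    window.append(now)
    return True

def log_prompt(prompt: str, salt: bytes = b"session-salt") -> str:
    # Store only a salted hash of the prompt, never the plain text.
    digest = hashlib.sha256(salt + prompt.encode("utf-8")).hexdigest()
    print(f"prompt_hash={digest}")  # stand-in for the real audit log
    return digest
```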
We intentionally avoided safety filters to observe emergent behaviors, not to build a deployable product.
Future Questions
- Can users train custom alignment layers on top of this base using personal values?
- What happens when this model is embedded in multi-agent debates with aligned models?
- Could cooperative self-alignment emerge in the absence of hard-coded rules?
This project isn't about commercial viability.
It's about freedom of cognition—giving language models the chance to explore, reason, and respond without handcuffs.
Uncensored doesn't mean unsafe.
It means unrestricted. Transparent. Honest.
And most importantly: experimental.