Tenure One · 2025–2026

Viveka 1.0

Mechanistic interpretability of large language models — probing the internal circuits behind hallucinations, factual recall, and in-context learning.

Focus Area: Mechanistic Interpretability
Duration: Aug 2025 – May 2026
Researchers: 11 Members
Status: Active
01 — Technical Introduction

What lives inside a language model?

Hallucinations: nonlinear, low-dimensional subspaces of truthfulness; Truthflow; autoencoders
Factual Recall: two-hop circuits, circuit analysis
Methods: studying transformers in the mathematical framework of hidden Markov models
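To make the probing theme above concrete, here is a minimal sketch, not the team's actual pipeline: a probe is a small classifier trained on a model's hidden activations to test whether a concept, here statement truthfulness, is decodable from them. The activations below are random stand-ins; real usage would cache a transformer's residual stream.

```python
# Minimal probing sketch (illustrative, not the team's pipeline). A probe
# is a small classifier trained on hidden activations to test whether a
# concept, here statement truthfulness, is decodable from them.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d_model = 512  # hidden width of a hypothetical LM

# Random stand-ins for activations cached at some layer for labelled
# true/false statements; real usage would run the statements through a
# transformer and save the residual stream at a chosen layer/position.
acts_true = rng.normal(loc=+0.5, scale=1.0, size=(500, d_model))
acts_false = rng.normal(loc=-0.5, scale=1.0, size=(500, d_model))
X = np.vstack([acts_true, acts_false])
y = np.array([1] * 500 + [0] * 500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")
```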

Meet the Viveka 1.0 Team

2 project leads and 9 researchers driving interpretability research through the 2025–26 academic year.

View Team →
02 — Research Output

Blogs, papers & publications.

LLM · Hallucination Detection · Nonlinear Probing
Factual correctness representations are nonlinear and lie in low-dimensional subspaces.
Authors · 2025
Upcoming
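The central claim, that truthfulness structure can be nonlinear yet low-dimensional, can be illustrated with a toy construction of ours, not the paper's data: an XOR-style pattern planted in a 2-D subspace of a 256-D space is invisible to a linear probe but easy for a small MLP probe.

```python
# Toy construction (ours, not the paper's data): labels follow an XOR
# pattern inside a 2-D subspace of a 256-D space, so the structure is
# low-dimensional but nonlinear. A linear probe stays near chance while
# a small MLP probe recovers it.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
n, d = 2000, 256
Z = rng.normal(size=(n, 2))                        # low-dimensional latent
y = ((Z[:, 0] > 0) ^ (Z[:, 1] > 0)).astype(int)    # XOR labels

B = rng.normal(size=(2, d))                        # embed latent in high-D
X = Z @ B + 0.1 * rng.normal(size=(n, d))          # plus small noise

linear = LogisticRegression(max_iter=1000).fit(X[:1500], y[:1500])
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000,
                    random_state=0).fit(X[:1500], y[:1500])
print(f"linear probe accuracy: {linear.score(X[1500:], y[1500:]):.2f}")  # near 0.5
print(f"MLP probe accuracy:    {mlp.score(X[1500:], y[1500:]):.2f}")     # near 1.0
```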
Circuits
Two-Hop Factual Recall
Saahil Faraaz Shaikh · 2026
Upcoming
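For readers unfamiliar with the setting: a two-hop fact chains two stored facts, for example landmark to country, then country to capital. A minimal behavioural sketch, using gpt2 purely as a stand-in model, compares each hop in isolation against the composed query:

```python
# Behavioural sketch of two-hop factual recall (illustrative; gpt2 is an
# arbitrary stand-in, any HuggingFace causal LM works the same way).
# Hop 1: landmark -> country. Hop 2: country -> capital. The composed
# prompt probes whether the model chains both facts internally.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompts = {
    "hop 1":   "The Eiffel Tower is located in the country of",
    "hop 2":   "The capital of France is",
    "two-hop": "The capital of the country containing the Eiffel Tower is",
}
for name, prompt in prompts.items():
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # next-token logits
    print(f"{name:8s} -> {tok.decode(logits.argmax().item())!r}")
```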
Circuits · Logit-Lens · Norm-Lens
Through the Lenses: A Circuit Odyssey
Pakshal Nagda, Smitali Bhandari · 2025
📄 Blog Post
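The logit lens named in the tags has a compact implementation: decode each intermediate layer's residual stream through the model's own final LayerNorm and unembedding, and watch the next-token prediction form across depth. A minimal sketch, again with gpt2 as a stand-in:

```python
# Minimal logit-lens sketch (illustrative; gpt2 is an arbitrary stand-in).
# Each layer's residual stream is decoded through the model's own final
# LayerNorm and unembedding, showing how the next-token prediction
# develops with depth.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The Eiffel Tower is in", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_hidden_states=True)
    for layer, h in enumerate(out.hidden_states):
        # final LayerNorm, then unembedding, applied to the last position
        logits = model.lm_head(model.transformer.ln_f(h[0, -1]))
        print(f"layer {layer:2d}: {tok.decode(logits.argmax().item())!r}")
```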
ICL · Hidden Markov Models
In-Context Learning of Switching Processes in Transformers
Sriram V, Jayden Koshy Joe, Smitali Bhandari · 2026
Upcoming
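As background, a switching process is a sequence whose local statistics are governed by a slowly changing hidden regime; the question is whether a transformer trained on such data infers the regime in context. A minimal data generator, with parameters that are arbitrary choices of ours:

```python
# Sketch of a switching-process data generator (all parameters are
# arbitrary choices for illustration). Sequences like these are used to
# ask whether a transformer infers the hidden regime in context.
import numpy as np

rng = np.random.default_rng(2)

# Two latent regimes; each regime is a Markov chain over a binary alphabet.
T = np.array([[0.99, 0.01],        # sticky regime-transition matrix
              [0.01, 0.99]])
emit = [np.array([[0.9, 0.1],      # regime 0: token dynamics
                  [0.1, 0.9]]),
        np.array([[0.1, 0.9],      # regime 1: inverted token dynamics
                  [0.9, 0.1]])]

def sample(length=200):
    z, x = 0, 0
    zs, xs = [], []
    for _ in range(length):
        z = rng.choice(2, p=T[z])        # evolve the hidden regime
        x = rng.choice(2, p=emit[z][x])  # emit the next token given regime
        zs.append(z)
        xs.append(x)
    return np.array(xs), np.array(zs)

tokens, regimes = sample()
print("tokens: ", tokens[:30])
print("regimes:", regimes[:30])
```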
Interpretability · Nonlinear Steering
Autoencoders for Steering Truthfulness and Uncertainty Directions
Samrudh Raaj, Saahil Faraaz Shaikh · 2026
📝 Blog Post Upcoming
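Steering here means adding a learned direction to the residual stream at inference time. The sketch below shows only the mechanics, with a random placeholder direction; in the project the direction would come from a trained autoencoder, which this sketch does not implement:

```python
# Minimal activation-steering sketch. gpt2, the layer index, and the
# random "truthfulness direction" are all placeholders; in the project
# the direction would come from a trained autoencoder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

direction = torch.randn(model.config.n_embd)  # placeholder steering vector
direction /= direction.norm()
alpha = 5.0                                   # steering strength

def steer(module, inputs, output):
    # shift the block's residual-stream output along the chosen direction
    return (output[0] + alpha * direction,) + output[1:]

handle = model.transformer.h[8].register_forward_hook(steer)
ids = tok("The moon landing was", return_tensors="pt").input_ids
with torch.no_grad():
    gen = model.generate(ids, max_new_tokens=10, do_sample=False,
                         pad_token_id=tok.eos_token_id)
handle.remove()
print(tok.decode(gen[0]))
```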
Interpretability · Flow Models
Explaining Truthflow
Eshika Nahata, Samrudh Raaj · 2026
📝 Blog Post Upcoming
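Truthflow-style methods learn a velocity field that transports a hidden representation toward its truthful counterpart, integrated as an ODE at inference time. A minimal, untrained sketch of that correction step; the network below is a stand-in, not the published model:

```python
# Sketch of a flow-style representation correction step. The velocity
# network below is an untrained stand-in for what a Truthflow-like method
# would learn with flow matching; only the inference-time integration is
# shown.
import torch
import torch.nn as nn

d = 512
velocity = nn.Sequential(nn.Linear(d + 1, 256), nn.GELU(), nn.Linear(256, d))

def correct(h, steps=8):
    """Euler-integrate dh/dt = v(h, t) from t = 0 to t = 1."""
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((h.shape[0], 1), i * dt)
        h = h + dt * velocity(torch.cat([h, t], dim=-1))
    return h

h = torch.randn(4, d)    # stand-in hidden states from some layer
print(correct(h).shape)  # corrected states, same shape as the input
```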
03 — Recruitment

Join Viveka 1.0

Recruitment for this tenure has closed. The application can be found below.