AI & ML interests

NLP, Hallucination detection, AI verification


Groundlens

Geometric methods for LLM grounding verification. No second LLM. Deterministic. Same inputs, same scores, every time.

What we do

We detect LLM hallucinations using embedding geometry, not by asking another model to judge the output. Two metrics, each targeting a different failure mode:

  • SGI (Semantic Grounding Index): measures whether a response actually used the source material it was given. Built for RAG pipeline verification.
  • DGI (Directional Grounding Index): measures whether a response follows geometric patterns typical of grounded answers. Works without any source context.

Both methods require only a single embedding call. Deterministic. Auditable by design.
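To make the geometric idea concrete, here is an illustrative sketch, not the groundlens API: a ratio-style grounding score computed from embedding vectors alone, assuming the prompt, source, and response have already been embedded. The function name and the toy vectors are hypothetical.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def grounding_ratio(response: np.ndarray, source: np.ndarray, prompt: np.ndarray) -> float:
    # Illustrative only: compare how close the response sits to the
    # retrieved source versus to the prompt alone. A higher ratio
    # suggests the response actually leaned on the source material.
    return cosine(response, source) / max(cosine(response, prompt), 1e-9)

# Toy embeddings standing in for real model outputs.
rng = np.random.default_rng(0)
prompt = rng.normal(size=384)
source = rng.normal(size=384)
grounded = 0.8 * source + 0.2 * prompt          # response that reuses the source
ungrounded = 0.8 * prompt + 0.2 * rng.normal(size=384)  # response that ignores it

print(grounding_ratio(grounded, source, prompt))
print(grounding_ratio(ungrounded, source, prompt))
```

Because the score is a pure function of the embeddings, the same inputs always produce the same score, which is the determinism property the card describes.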

Research

Three peer-reviewed papers form the foundation:

  1. Semantic Grounding Index (arXiv:2512.13771): ratio-based grounding verification for RAG systems.
  2. A Geometric Taxonomy of Hallucinations (arXiv:2602.13224): three-type hallucination classification with a confabulation benchmark.
  3. Rotational Dynamics of Factual Constraint Processing (arXiv:2603.13259): transformers reject wrong answers via rotation, not rescaling. Phase transition at 1.6B parameters.

Use groundlens

| How | What |
| --- | --- |
| Python library | `pip install groundlens` (GitHub · Docs) |
| MCP server | `pip install groundlens-mcp`; works with Claude Desktop, Cursor, Windsurf (GitHub) |
| REST API | groundlens-api, hosted on this Space; Swagger docs at /docs |
| Interactive demo | groundlens-demo; try it without installing anything |

Philosophy

groundlens is verification triage, not truth detection. It tells you which responses earned the right to be trusted and which need human review. We publish our AUROC numbers even when they're unflattering. We document what we can't detect (Type III confabulations) as a theorem, not a footnote.
