Publications
White papers and technical write-ups from Xtensyon Labs. Practical notes on building production AI, governance, evaluation, and the unglamorous parts that make systems work.
Some teams cannot send sensitive data to third-party APIs, and network routes are not always predictable. This case study shares a private deployment pattern aimed at strong controls, clear cost reporting, and operational reliability.
Most RAG projects fail quietly: the demo works, then accuracy drifts after a few indexing tweaks. This brief introduces a scorecard approach with simple metrics, a golden query set, and release gates to keep enterprise RAG systems stable in production.
RAG can make LLM outputs auditable and grounded, but only if retrieval quality, policy controls, and monitoring are treated as first-class production concerns. This playbook outlines a governance pattern teams can adopt in weeks, not quarters.
RAG pipelines get slow for predictable reasons: parsing, retrieval, reranking, and long generations. This brief shows how to budget latency across steps and choose optimizations that do not reduce trust.
Permission-aware retrieval fails when group memberships drift or metadata is inconsistent. This paper covers a practical identity sync and ACL strategy that keeps retrieval correct without slowing teams down.
Public benchmarks rarely match internal workflows. This brief shows how to turn a set of real tickets into a repeatable evaluation suite that catches regressions after prompt or indexing changes.
Policy teams update documents often, but users keep bookmarking the old files. This brief shows a versioning and canonical link approach that reduces confusion and improves citations in RAG systems.
We share a pattern for automating repetitive SAP operations requests without letting the model act freely. The focus is on approvals, safe tool scopes, and clear fallbacks when data is incomplete.
Search quality drops when users mix languages, abbreviations, and technical terms in one query. This brief covers indexing, normalization, and evaluation methods that improve recall without punishing precise keyword searches.
Prompt injection is not a theoretical problem. It shows up through emails, PDFs, tickets, and chat logs. This paper lays out a hardening checklist that security teams can validate and engineers can ship.