Camilla, l'assistente AI che sfida la burocrazia nella PA

Over the past year and a half I built, from scratch, two core components of a chatbot deployed nationally for the Italian Public Administration (PA): its search engine and an automated testing-and-evaluation pipeline. The system is called Camilla, and it’s public-facing: any citizen can use it to search more than 2,200 open public tenders (around ten thousand in total, counting closed ones), from the job postings of individual comuni and province (municipalities and provinces) to national bodies, the armed forces, and any other branch of the PA.

Camilla handles real queries from real users in a regulated context, turning natural-language questions into verifiable answers tied to the tenders that actually match each question. That work forced me to get clear on a few questions about hybrid AI systems. How do you evaluate an agentic system in a way that actually catches what breaks in production? How do you build hybrid retrieval over a corpus that’s inconsistent by design? How do you ship an AI product to the Italian PA while meeting the EU AI Act’s transparency obligations?

I wrote about the architecture and the governance decisions, the ones that turned out to matter more than the ML choices, in a piece for Agenda Digitale.

Both components are described in more detail on their project pages: the two-stage search engine and the behavioral evaluation framework.

Camilla, l’assistente AI che sfida la burocrazia nella PA →