Orizzonte Sistemi Navali, a leading naval systems engineering company, is offering a thesis project on the validation of AI-based chatbots in a technical-industrial context. The project is part of a program to digitize the support and maintenance of complex naval systems. The objective is to support the consultation of technical manuals through an AI chatbot and to define sound methodologies for evaluating its quality and reliability.
The student will produce a structured survey of the state of the art in chatbot/LLM validation methodologies for settings where answers are grounded in specific documentation (manuals, procedures, knowledge bases), with particular attention to the technical-industrial domain. The work includes the following activities:
· Collect and classify approaches/metrics for evaluating chatbots (automatic, human, hybrid)
· Examine key concepts: accuracy vs. relevance, semantic similarity, robustness to ambiguous/out-of-scope queries, and reliability/confidence
· Implement tools that automatically compare chatbot responses against the original content (ground truth), using semantic similarity and retrieval accuracy techniques
· Perform a comparative analysis (pros/cons, data prerequisites, costs, repeatability)
· Produce a final report containing a proposal for a 'validation framework' applicable to the business context
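As a purely illustrative sketch of the automatic comparison activity listed above, the snippet below scores a chatbot answer against a ground-truth passage and measures retrieval accuracy as recall@k. It rests on several assumptions that are not part of the project description: bag-of-words cosine similarity stands in for a real semantic (embedding-based) similarity measure, the function names `cosine_similarity` and `recall_at_k` are hypothetical, and the sample sentences and passage IDs are made up.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors.

    Assumption: this lexical measure is only a stand-in for an
    embedding-based semantic similarity.
    """
    vec_a, vec_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text has no tokens
    return dot / (norm_a * norm_b)


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant passages found among the top-k retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)


if __name__ == "__main__":
    # Made-up example data, not taken from any real manual.
    ground_truth = "Replace the fuel filter every 500 operating hours."
    answer = "The fuel filter must be replaced every 500 operating hours."
    print(f"similarity: {cosine_similarity(answer, ground_truth):.2f}")
    print(f"recall@3: {recall_at_k(['p7', 'p2', 'p9', 'p4'], ['p2', 'p4'], 3):.2f}")
```

In a real validation framework the similarity function would probably be replaced by a sentence-embedding model, and both scores would be aggregated over a curated set of question/ground-truth pairs rather than computed on a single example.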