Bachelor's thesis project in AI: state of the art in chatbot validation methods

Genoa, Rome
Competence Center & Engineering

Orizzonte Sistemi Navali, a leading naval systems engineering company, is offering a thesis project on validating AI-based chatbots in a technical-industrial setting. The project is part of a program to digitize support and maintenance for complex naval systems. The goal is to help users consult technical manuals through an AI chatbot and to define sound methodologies for evaluating the chatbot's quality and reliability.

Activities and Responsibilities

The student will produce a structured map of the state of the art in chatbot/LLM validation methodologies for settings where answers are grounded in specific documentation (manuals, procedures, knowledge bases), with particular attention to the technical-industrial domain. The student will carry out the following activities:

· Collect and classify approaches and metrics for evaluating chatbots (automatic, human, hybrid)

· Focus on key concepts: accuracy vs. relevance, semantic similarity, robustness to ambiguous or out-of-scope queries, and reliability/confidence

· Implement tools that automatically compare chatbot responses with the original content (ground truth), using semantic similarity and retrieval accuracy techniques

· Perform a comparative analysis of the approaches (pros/cons, data prerequisites, costs, repeatability)

· Produce a final report proposing a "validation framework" applicable to the business context
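
As an illustration of the automatic comparison activity listed above, here is a minimal sketch. It is not the company's tooling. It assumes Python (listed under the requirements) and uses a bag-of-words cosine similarity as a simple stand-in for embedding-based semantic similarity, plus a hit rate at k as one possible measure of retrieval accuracy. All function names, passage identifiers and sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(answer, reference):
    """Cosine similarity between bag-of-words count vectors.

    Stand-in for embedding similarity (assumption): a real evaluation
    would likely compare sentence embeddings instead of word counts.
    """
    a, b = Counter(tokenize(answer)), Counter(tokenize(reference))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def hit_rate_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of queries whose top-k retrieved passages include a relevant one."""
    hits = sum(
        1
        for retrieved, relevant in zip(retrieved_ids, relevant_ids)
        if set(retrieved[:k]) & set(relevant)
    )
    return hits / len(retrieved_ids) if retrieved_ids else 0.0


# Hypothetical sample data, invented for illustration only.
reference = "Close the inlet valve before removing the pump filter."
good_answer = "Before removing the pump filter, close the inlet valve."
bad_answer = "The radar antenna must be cleaned every month."

score_good = cosine_similarity(good_answer, reference)
score_bad = cosine_similarity(bad_answer, reference)

# One list of retrieved passage ids per query, best match first.
retrieved = [["p3", "p7", "p1"], ["p2", "p9", "p4"]]
relevant = [["p7"], ["p5"]]
hit_rate = hit_rate_at_k(retrieved, relevant, k=3)

print(f"similarity (good answer): {score_good:.2f}")
print(f"similarity (bad answer):  {score_bad:.2f}")
print(f"hit rate at 3:            {hit_rate:.2f}")
```

Word-count similarity ignores word order and cannot recognize paraphrases that use different vocabulary, which is one reason the activity above refers to semantic similarity techniques such as embeddings.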

Requirements

  • Currently enrolled in a bachelor's degree program (Computer Engineering/Mathematics/AI/Data Science or related)
  • Interest in NLP/LLM and evaluation of AI systems
  • Ability to read and summarize scientific articles in English
  • Basic knowledge of Python, embeddings, NLP metrics
  • Familiarity with RAG/knowledge base concepts

Specific skills

Fill out the following form to apply:
