Automatic Job Safety Report Generation using RAG-based LLMs
Pecori, Riccardo
2024-01-01
Abstract
This study introduces an approach to safety report generation based on a Retrieval-Augmented Generation (RAG) framework, designed to synthesize comprehensive reports from descriptions and logs of work sessions. The core contribution is the comparison and optimization of several LLaMA-based Large Language Model (LLM) variants and embedding models, with the aim of identifying the combination that most accurately captures the intricacies of safety-related data in a given domain. The RAG-based system uses these LLaMA models and embedding techniques to process and contextualize the input data, which include detailed session descriptions and operational logs. By integrating these models, we aim to automate the generation of safety reports that are coherent and contextually relevant and that also adhere to the stringent requirements of safety documentation in professional environments. We validate the approach on an aviation safety dataset using standard metrics in the field, such as Recall@5, GLEU, METEOR, and BERTScore. Our findings demonstrate the potential of RAG-based systems to streamline safety report generation, offering significant improvements in efficiency and accuracy over traditional methods and over models not tailored to the specific domain.
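Of the metrics named above, Recall@5 is the one that evaluates the retrieval stage rather than the generated text. The abstract does not say how it is computed, so the following is only a minimal sketch under a common definition: for each query, the fraction of relevant documents that appear among the top 5 retrieved, averaged over queries. The function names and document identifiers are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the relevant items found among the first k retrieved items."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 5) -> float:
    """Average recall@k over (retrieved, relevant) pairs, one pair per query."""
    scores = [recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs]
    if not scores:
        raise ValueError("runs must contain at least one query")
    return sum(scores) / len(scores)


# Hypothetical example: ranked document IDs returned by a retriever for one query,
# and the set of IDs judged relevant to that query.
ranked_ids = ["doc3", "doc7", "doc1", "doc9", "doc4", "doc2"]
relevant_ids = {"doc3", "doc2"}
score = recall_at_k(ranked_ids, relevant_ids, k=5)  # doc2 is ranked 6th, so it is missed
```

Under this definition a score of 1.0 means every relevant document was retrieved within the top 5. When each query has exactly one relevant document, the same computation reduces to a hit rate.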