Automatic Generation of Job Safety Reports with Explainable RAG-Based LLMs

Panella, Giovanni; Pecori, Riccardo
2025-01-01

Abstract

Monitoring workplace activities is critical for ensuring job safety. Generative Artificial Intelligence (Gen-AI) and Human-centered Artificial Intelligence (Hum-AI) can offer new, trustworthy solutions for automating these monitoring procedures and thereby improve the prevention of work accidents. In this paper, we present a novel framework that combines Retrieval Augmented Generation (RAG) with explainable LLMs to automatically generate job safety reports from unstructured accident descriptions. Our method integrates embeddings such as BERT and SciBERT with explainable AI based on Layer-Wise Relevance Propagation (LRP) to highlight the root causes of accidents within the generated reports. We evaluate multiple LLMs, including LLaMA 3.1, Mixtral-8x7B, and DeepSeek v2, on the Aviation Safety Reporting System (ASRS) dataset. Results show that our best configuration (Mixtral-8x7B with SciBERT) achieves F1-scores up to 0.909 and GLEU and METEOR scores above 0.3 and 0.2, respectively. These findings demonstrate the effectiveness and interpretability of the proposed system in real-world job safety contexts and show how the approach could more directly assist safety experts and inspectors.
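As a rough illustration of the retrieval step that a RAG pipeline of this kind relies on, the sketch below ranks a small corpus of past reports by cosine similarity to a new accident description and assembles a prompt for a generator. It is not the authors' implementation: the function names (`embed`, `retrieve`, `build_prompt`), the prompt wording, and the three corpus sentences are all hypothetical, and a toy bag-of-words vector stands in for the BERT or SciBERT embeddings named in the abstract. The LLM call and the LRP explanation step are omitted.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for BERT/SciBERT embeddings."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return the k corpus entries most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(description, corpus, k=2):
    """Assemble a generation prompt from retrieved context (hypothetical wording)."""
    context = retrieve(description, corpus, k)
    lines = ["Context reports:"]
    lines += [f"- {c}" for c in context]
    lines += ["", "Accident description:", description, "", "Write a job safety report."]
    return "\n".join(lines)


# Invented example sentences, not taken from the ASRS dataset.
corpus = [
    "pilot reported engine failure during climb",
    "ground crew slipped on wet ramp near the aircraft",
    "engine fire warning after takeoff and return to airport",
]

if __name__ == "__main__":
    print(build_prompt("engine failure after takeoff", corpus))
```

In a real system the `embed` function would be replaced by a sentence-level encoder and the resulting prompt would be passed to the chosen LLM; the ranking logic itself would stay essentially the same.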
Use this identifier to cite or link to this document: https://hdl.handle.net/11389/75195