Virtual Construction Safety Assistant - based on Multimodal Large Language Model embedded with construction safety knowledge

I&T Wish

(REF: W-0571)

Summary and Challenges

This project will develop a Virtual Construction Safety Assistant (VCSA) based on a Multimodal Large Language Model (MLLM) fine-tuned for "construction works" and "safety." The VCSA will monitor construction sites in real time, using semantic reasoning to detect and record a broader range of safety rule violations that require high-level reasoning. It is intended to overcome the following limitations of conventional computer vision methods:
1. Inflexibility towards diverse regulatory safety standards: Current AI models are trained on static scenarios, so they are limited to predefined tasks and cannot adapt to evolving site conditions;
2. Inefficient use of computational resources: Cloud-only computing incurs high computational costs and latency when alerting on critical incidents, while edge-only computing lacks the analytical capacity for complex, reasoning-intensive analyses;
3. Reactive risk identification without causal reasoning: Existing systems flag safety violations superficially and lack the capability for deeper root-cause analysis and actionable mitigation planning tailored to site-specific metadata;
4. Lack of contextual awareness and refinement capability: Existing anomaly detection models excel at spotting obvious violations but struggle with ambiguous edge cases and rely heavily on manual corrections that involve tedious data labeling.

Expected Outcome

By automatically identifying safety violations and issuing timely alerts, the VCSA aims to enhance worker well-being and optimize resource allocation for safety. It will be integrated as a software plugin into CCTV monitoring systems and alerting devices, ensuring comprehensive safety management on-site.
Technologies shall include, but are not limited to, the following:
1. Large-small language model co-adapter frameworks, with a coefficient-of-variation-based difficulty grading mechanism to stratify training datasets by the samples’ degree of complexity, and with multi-stage curriculum learning strategies to progressively optimize the AI models toward advanced loss functions tailor-made for construction safety applications;
2. Hybrid edge-cloud computing paradigms with staged analyses based on the criticality-latency trade-off, for cost-effective resource utilization and timely incident reporting to safety managers;
3. Integration of chain-of-thought prompting with agentic retrieval-augmented generation frameworks, to transform risk identification into root-cause reasoning and actionable mitigation planning, dynamically augmented by site-specific metadata (e.g. historical incidents, evolving project requirements or site conditions);
4. Multi-stage time-series analytical frameworks for automated video-based anomaly detection on high-confidence tasks, integrated with continual feedback-and-correction mechanisms for low-confidence edge cases, to refine the AI models’ site-specific knowledge and improve their scene-adaptive generalization and contextual awareness.
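
As an illustration only, the difficulty grading and curriculum staging described in item 1 could be sketched as follows. This is a minimal sketch under stated assumptions, not a specification of the required solution: the listing does not say what the coefficient of variation is computed over, so the sketch assumes each training sample carries several scores (for example, repeated model predictions or annotator ratings) and treats higher dispersion as higher difficulty. The function names, tier names, thresholds and sample identifiers are all hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return population standard deviation divided by |mean| (0.0 if mean is 0)."""
    m = mean(values)
    if m == 0:
        return 0.0
    return pstdev(values) / abs(m)


def grade_samples(sample_scores, easy_max=0.1, hard_min=0.3):
    """Stratify samples into difficulty tiers by the CV of their scores.

    The thresholds are illustrative assumptions, not values from the listing.
    """
    tiers = {"easy": [], "medium": [], "hard": []}
    for sample_id, scores in sample_scores.items():
        cv = coefficient_of_variation(scores)
        if cv <= easy_max:
            tiers["easy"].append(sample_id)
        elif cv < hard_min:
            tiers["medium"].append(sample_id)
        else:
            tiers["hard"].append(sample_id)
    return tiers


def curriculum_stages(tiers):
    """Yield cumulative training pools: easy, then easy+medium, then all."""
    pool = []
    for name in ("easy", "medium", "hard"):
        pool = pool + tiers[name]
        yield name, list(pool)


# Hypothetical samples: three scores per video clip.
sample_scores = {
    "helmet_clear": [0.5, 0.5, 0.5],    # scores agree, low dispersion
    "ladder_partial": [0.4, 0.5, 0.6],  # moderate dispersion
    "crane_occluded": [0.1, 0.5, 0.9],  # scores disagree strongly
}
tiers = grade_samples(sample_scores)
stages = list(curriculum_stages(tiers))
```

Each stage widens the training pool, so a model would see low-dispersion samples first and the most ambiguous samples last. A real system would need to choose the scored quantity and the thresholds from its own data.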

Expected Trial Duration

15 months

Contact Information

I&T Wish Proposer: Drainage Services Department (DSD)
Contact Person: Wong Chi Kau, Gloria
Position: Engr/Special Duty1/2
Tel: 2594 7431
Email: ckwong11@dsd.gov.hk

Upload Date

2025-05-27

Closing Date

2025-06-10

I&T Wish - Virtual Construction Safety Assistant - based on Multimodal Large Language Model embedded with construction safety knowledge 2025-05-21
