VLM-Based Hazard Reasoning

Wed, 01 Jan 2025 00:00:00 +0000

A YOLO detector can flag a red zone, but it cannot explain why the scene is dangerous. This project tests whether general-purpose vision-language models (VLMs) can fill that explainability gap through structured, domain-aware prompting.

The prompts give the models physical context about melt shop environments: what pot haulers are, what molten metal implies, and what worker corridors mean for safety. Reasoning runs as a two-stage pipeline: the model first describes the scene neutrally, then evaluates it against safety conditions.
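The two-stage flow can be sketched roughly as follows. This is a minimal illustration, not the project's actual prompts or code: the context wording, the function names, and the `ask` callable (a stand-in for whatever VLM client is used, with the image already attached) are all assumptions, as is the choice to pass the stage-one description into stage two.

```python
from typing import Callable, Dict, List

# Placeholder domain context; the wording is illustrative only.
DOMAIN_CONTEXT = (
    "The image shows a melt shop. "
    "Pot haulers are heavy vehicles that move pots of molten material. "
    "Molten metal implies extreme heat and splash risk. "
    "Worker corridors are walkways reserved for people on foot."
)


def build_describe_prompt(context: str) -> str:
    """Stage 1: ask for a neutral description with no safety judgement."""
    return (
        f"{context}\n\n"
        "Describe the scene neutrally. List the entities you see and where "
        "they are relative to each other. Do not assess safety yet."
    )


def build_evaluate_prompt(context: str, description: str,
                          conditions: List[str]) -> str:
    """Stage 2: evaluate the stage-1 description against safety conditions."""
    rules = "\n".join(f"- {c}" for c in conditions)
    return (
        f"{context}\n\n"
        f"Scene description:\n{description}\n\n"
        f"Safety conditions:\n{rules}\n\n"
        "Evaluate the scene against each condition and explain any hazard."
    )


def run_two_stage(ask: Callable[[str], str],
                  conditions: List[str]) -> Dict[str, str]:
    """Run both stages; `ask` is a hypothetical prompt -> reply function."""
    description = ask(build_describe_prompt(DOMAIN_CONTEXT))
    assessment = ask(
        build_evaluate_prompt(DOMAIN_CONTEXT, description, conditions)
    )
    return {"description": description, "assessment": assessment}
```

Keeping the description step free of safety language means the evaluation step works from an account of the scene that was not already shaped by a verdict.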

The output covers scene description, detected entities, spatial relationships, hazard assessment, risk level, and recommended action.
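One way to hold that output is a small typed record. The sketch below assumes the model is asked to reply in JSON with one key per output field; the key names, the field types, and the `HazardReport` and `parse_report` names are hypothetical, not taken from the project.

```python
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class HazardReport:
    # One field per item in the output; names and types are assumptions.
    scene_description: str
    detected_entities: List[str]
    spatial_relationships: List[str]
    hazard_assessment: str
    risk_level: str
    recommended_action: str


def parse_report(raw: str) -> HazardReport:
    """Parse a JSON reply and fail loudly if any expected field is missing."""
    data = json.loads(raw)
    expected = [f.name for f in fields(HazardReport)]
    missing = [name for name in expected if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Ignore any extra keys the model may have added.
    return HazardReport(**{name: data[name] for name in expected})
```

Rejecting incomplete replies at parse time keeps a partial answer from being mistaken for a full hazard assessment.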

Vision-Language Models | Jay Polra
