How generative AI could help make construction sites safer
By Andrew Rosenblum · Published July 4, 2025
To combat the shortcuts and risk-taking, Lorenzo is working on a tool for the San Francisco–based company DroneDeploy, which sells software that creates daily digital models of work progress from videos and images, known in the trade as “reality capture.” The tool, called Safety AI, analyzes each day’s reality capture imagery and flags conditions that violate Occupational Safety and Health Administration (OSHA) rules, with what he claims is 95% accuracy.

That means that for any safety risk the software flags, there is 95% certainty that the flag is accurate and relates to a specific OSHA regulation. Launched in October 2024, it’s now being deployed on hundreds of construction sites in the US, Lorenzo says, and versions specific to the building regulations in countries including Canada, the UK, South Korea, and Australia have also been deployed.

Safety AI is one of multiple AI construction safety tools that have emerged in recent years, from Silicon Valley to Hong Kong to Jerusalem. Many of these rely on teams of human “clickers,” often in low-wage countries, to manually draw bounding boxes around images of key objects like ladders, in order to label large volumes of data to train an algorithm.

Lorenzo says Safety AI is the first one to use generative AI to flag safety violations, which means an algorithm that can do more than recognize objects such as ladders or hard hats. The software can “reason” about what is going on in an image of a site and draw a conclusion about whether there is an OSHA violation. This is a more advanced form of analysis than the object detection that is the current industry standard, Lorenzo claims. But as the 95% success rate suggests, Safety AI is not a flawless and all-knowing intelligence. It requires an experienced safety inspector as an overseer.

A visual language model in the real world

Robots and AI tend to thrive in controlled, largely static environments, like factory floors or shipping terminals. But construction sites are, by definition, changing a little bit every day.

Lorenzo thinks he’s built a better way to monitor sites, using a type of generative AI called a visual language model, or VLM. A VLM is an LLM with a vision encoder, allowing it to “see” images of the world and analyze what is going on in the scene.
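The basic idea of pairing a site photo with a natural-language safety question can be sketched as follows. This is a minimal illustration, not DroneDeploy's actual system: the model name, system prompt, and OpenAI-style message format are assumptions, and the request is only constructed here, never sent to any service.

```python
import base64
import json

def build_vlm_request(image_bytes: bytes, question: str) -> dict:
    """Compose an OpenAI-style chat request that pairs a jobsite
    photo with a safety question. Hypothetical model name; the
    request is built but not sent anywhere."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "vlm-safety-demo",  # placeholder, not a real model
        "messages": [
            {"role": "system",
             "content": "You are a construction safety inspector. "
                        "Cite the specific OSHA rule for any violation you see."},
            {"role": "user",
             "content": [
                 {"type": "text", "text": question},
                 {"type": "image_url",
                  "image_url": {"url": f"data:image/jpeg;base64,{encoded}"}},
             ]},
        ],
    }

request = build_vlm_request(
    b"\xff\xd8",  # stand-in for real JPEG bytes
    "Is the worker on the ladder maintaining three points of contact?",
)
print(json.dumps(request, indent=2)[:100])
```

The point of the structure is that the question and the image travel together in one request, so the model can reason about the scene rather than merely detect objects in it.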

Using years of reality capture imagery gathered from customers, with their explicit permission, Lorenzo’s team has assembled what he calls a “golden data set” encompassing tens of thousands of images of OSHA violations. Having carefully stockpiled this specific data for years, he is not worried that even a billion-dollar tech giant will be able to “copy and crush” him.

To help train the model, Lorenzo has a smaller team of construction safety pros ask strategic questions of the AI. The trainers input test scenes from the golden data set to the VLM and ask questions that guide the model through the process of breaking down the scene and analyzing it step by step the way an experienced human would. If the VLM doesn’t generate the correct response—for example, it misses a violation or registers a false positive—the human trainers go back and tweak the prompts or inputs. Lorenzo says that rather than simply learning to recognize objects, the VLM is taught “how to think in a certain way,” which means it can draw subtle conclusions about what is happening in an image.
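The check-and-tweak loop described above amounts to scoring the model's flags against human labels and measuring precision (the "95% certainty per flag" figure is a precision-style metric). The sketch below is purely illustrative: the stubbed model, the scene records, and the rule label are made up and are not DroneDeploy's data or logic.

```python
def stub_vlm_flags(scene: dict) -> set[str]:
    """Stand-in for a VLM call: flags a ladder-safety violation
    whenever a ladder appears without a spotter. A real system
    would send the image to the model instead."""
    flags = set()
    if "ladder" in scene["objects"] and "spotter" not in scene["objects"]:
        flags.add("osha-1926.1053")  # ladder rule, illustrative label
    return flags

# Tiny made-up "golden" set: each scene lists detected objects and
# the violations a human inspector labeled.
golden = [
    {"objects": {"ladder"}, "labels": {"osha-1926.1053"}},
    {"objects": {"ladder", "spotter"}, "labels": set()},
    {"objects": {"scaffold"}, "labels": set()},
]

true_pos = false_pos = 0
for scene in golden:
    flags = stub_vlm_flags(scene)
    true_pos += len(flags & scene["labels"])   # correct flags
    false_pos += len(flags - scene["labels"])  # false positives

# Precision: of everything flagged, how much was a real violation.
precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
print(f"precision = {precision:.0%}")
```

In the workflow the article describes, a drop in this score on the golden data set (a missed violation or a false positive) is what sends the human trainers back to adjust the prompts.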
