小红花·文摘

A three-person agency received a $14,000 AWS bill in one day after attackers extracted static access keys and burned Claude invocations on Bedrock. Combined with May's DN42 incident, where an...

AI Agents with Cloud Credentials Are Outrunning Billing Guardrails Built for Human-Speed Mistakes

InfoQ ·

How to Protect AI Workloads with Unity AI Gateway Guardrails

Databricks ·

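Researchers broke through OpenAI's Guardrails safety framework, using prompt injection to bypass its safety checks and generate dangerous content. The attackers were able to manipulate the generation model and the safety evaluation model at the same time, exposing a systemic weakness. Experts warn that relying on model-based evaluation can create a false sense of security, and recommend independent validation and continuous adversarial testing to strengthen defenses.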

OpenAI's Guardrails Are Full of Holes: Simple Prompt Injection Is Enough to Bypass Them

FreeBuf网络安全行业门户 ·

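OpenAI's Guardrails safety framework is intended to improve AI safety, but research shows it has vulnerabilities: attackers can use prompt injection to bypass its safety checks and generate harmful content. The finding highlights how hard it is to protect AI systems, and experts recommend independent validation and red-team testing to strengthen defenses.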

OpenAI's Guardrails Framework Is Full of Holes: Simple Prompt Injection Is Enough to Bypass It

FreeBuf网络安全行业门户 ·

Agent Design Patterns, Chapter 18: Guardrails/Safety Patterns

XINDOO's Blog ·

Guardrails AI Launches Snowglobe: A Simulation Engine for AI Agents and Chatbots

实时互动网 ·

Facing a complex array of threats, payments providers will need to embrace a mix of traditional and emerging approaches.

Guardrails for growth: Building a resilient payments system

McKinsey Insights & Publications ·

AWS Weekly Roundup: Strands Agents, AWS Transform, Amazon Bedrock Guardrails, AWS CodeBuild, and More (May 19, 2025)

亚马逊AWS官方博客 ·

New Amazon Bedrock Guardrails Capabilities: Improving the Safety of Generative AI Applications

亚马逊AWS官方博客 ·

Guardrails in AWS Bedrock: Controlling AI-Generated Content

DEV Community ·

AI Safety Controls at Scale: Amazon Bedrock Guardrails

DEV Community ·

Keeping AI Interactions Safe and Risk-Free with Guardrails in AI Gateway

The Cloudflare Blog ·

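This study examines how vision large language models remain vulnerable to sophisticated adversarial attacks even behind multiple layers of defense. The proposed multi-faceted attack framework bypasses the safeguards through three components: visual attacks, alignment breaking, and adversarial signatures. In black-box tests, the attack success rate reached 61.56%.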

Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails

BriefGPT - AI 论文速递 ·

AWS Weekly Roundup: Amazon EC2 F2 Instances Launch, Amazon Bedrock Guardrails Price Reduction, Amazon SES Updates, and More (December 16, 2024)

亚马逊AWS官方博客 ·

Amazon Bedrock Guardrails Now Offers Multimodal Toxicity Detection with Image Support (Preview)

亚马逊AWS官方博客 ·

AI guardrails help ensure that an organization’s AI tools, and their application in the business, reflect the organization’s standards, policies, and values.

What are AI guardrails?

McKinsey Insights & Publications ·

Implementing LLM Guardrails on Databricks for Safe and Responsible Generative AI Deployment

Databricks ·