BriefGPT - AI Paper Digest

PEFT-as-an-Attack! Jailbreaking Language Models during Federated Parameter-Efficient Fine-Tuning


Summary

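This study examines the security risks of federated parameter-efficient fine-tuning (FedPEFT) and shows that PEFT methods can be exploited as an attack vector to bypass a language model's safety mechanisms and generate harmful content. The PEFT-as-an-Attack (PaaA) threat we propose reaches an attack success rate of roughly 80% while training fewer than 1% of the model's parameters. This points to the need for more effective defense mechanisms that preserve both the security of federated fine-tuning and model performance.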
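To give a sense of scale for the "fewer than 1%" figure, the sketch below counts the trainable parameters a low-rank adapter would add to a transformer. Everything in it is an assumption for illustration: the summary does not say which PEFT method or model the paper uses, so the LoRA-style adapter shape, the function name `lora_trainable_fraction`, and the model dimensions (a generic 7-billion-parameter model with 32 layers and hidden size 4096) are hypothetical.

```python
def lora_trainable_fraction(
    d_model: int,
    n_layers: int,
    rank: int,
    adapted_per_layer: int,
    total_params: int,
) -> float:
    """Fraction of parameters that are trainable under a LoRA-style adapter.

    Assumes each adapted weight is a square d_model x d_model matrix that
    receives two low-rank factors: A (rank x d_model) and B (d_model x rank).
    The base model weights stay frozen and are not counted as trainable.
    """
    per_matrix = 2 * d_model * rank              # parameters in A plus B
    trainable = per_matrix * adapted_per_layer * n_layers
    return trainable / total_params


# Hypothetical configuration, not taken from the paper.
fraction = lora_trainable_fraction(
    d_model=4096,
    n_layers=32,
    rank=8,
    adapted_per_layer=2,          # e.g. two attention projections per layer
    total_params=7_000_000_000,
)
print(f"trainable share: {fraction:.4%}")
```

Under these assumed numbers the adapter holds about 4.2 million parameters, well under 1% of the model, which is why small client-side updates of this kind are cheap to train and exchange in a federated setting.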


