小红花 · Digest

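This study proposes an efficient Perceiver-based architecture, the Long LoRA Perceiver (LLP), aimed at the complexity problem Transformers face on long sequences. By introducing three structural enhancements, the architecture balances high performance with computational efficiency in autoregressive modeling, and experiments show that it outperforms recent Transformer models on several benchmarks.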

Enhanced Computationally Efficient Long LoRA Inspired Perceiver Architecture for Auto-Regressive Language Modeling

BriefGPT - AI Paper Express ·

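This work introduces Perceiver-Prompt, a method that fine-tunes the large-scale Whisper model with P-Tuning and uses a trainable Perceiver to generate fixed-length speaker prompts from variable-length inputs, with the goal of improving recognition of Chinese dysarthric speech. Experiments show that Perceiver-Prompt delivers consistent recognition improvements on a Chinese dysarthric speech dataset, with a relative CER reduction of up to 13.04%.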

Perceiver-Prompt: Flexible Speaker Adaptation for Chinese Disordered Speech Recognition

BriefGPT - AI Paper Express ·

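Perceiver AR is a new autoregressive generative model that can handle input sequences of up to 100,000 elements. It uses cross-attention to encode the input into a latent space, decoupling compute requirements from model depth and thereby markedly improving generation efficiency. On long-sequence generation tasks, Perceiver AR outperforms conventional Transformers and can generate harmonious musical pieces.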

Perceiver AR: general-purpose long-sequence autoregressive generation

Google DeepMind Blog ·

We develop Perceiver AR, an autoregressive, modality-agnostic architecture which uses cross-attention to map long-range inputs to a small number of latents while also maintaining end-to-end causal...

Perceiver AR: general-purpose, long-context autoregressive generation

Google DeepMind Blog ·

Perceiver IO: a scalable, fully-attentional model that works on any modality

Hugging Face - Blog ·