Huawei Cloud Official Blog ·

An MAE-Style Self-Supervised Method Based on Convolutional Neural Networks


📝

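Summary

This article introduces an MAE-style self-supervised method built on convolutional neural networks (CNNs). MAE masks part of the input image and trains the model to reconstruct it, which yields robust visual features. The authors propose a CNN-based, MAE-like method that uses sparse convolution and a hierarchical decoder to achieve results similar to those of ViT. Experiments show that the method matches the original MAE in performance and reaches state-of-the-art (SOTA) results on a range of downstream tasks.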
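The mask-and-reconstruct pretext task described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' code: the function names, the 2x2-pixel patch size, the 0.6 mask ratio, and the choice to compute the loss on masked pixels only (as the original MAE does) are all assumptions, since the summary does not state them.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio=0.6, seed=0):
    """Return a grid_h x grid_w grid of flags: 1 = visible patch, 0 = masked patch."""
    rng = random.Random(seed)
    n_patches = grid_h * grid_w
    n_masked = int(n_patches * mask_ratio)  # mask_ratio is an assumed value
    masked = set(rng.sample(range(n_patches), n_masked))
    return [[0 if r * grid_w + c in masked else 1 for c in range(grid_w)]
            for r in range(grid_h)]


def apply_patch_mask(image, mask, patch):
    """Zero out every pixel that falls inside a masked patch."""
    return [[pixel * mask[y // patch][x // patch] for x, pixel in enumerate(row)]
            for y, row in enumerate(image)]


def masked_mse(pred, target, mask, patch):
    """Mean squared error computed over masked pixels only."""
    total, count = 0.0, 0
    for y, row in enumerate(target):
        for x, t in enumerate(row):
            if mask[y // patch][x // patch] == 0:
                total += (pred[y][x] - t) ** 2
                count += 1
    return total / count if count else 0.0


image = [[float(x + y) for x in range(8)] for y in range(8)]  # toy 8x8 grayscale image
mask = random_patch_mask(grid_h=4, grid_w=4, mask_ratio=0.6)  # 4x4 grid of 2x2-pixel patches
visible = apply_patch_mask(image, mask, patch=2)              # what the encoder would see
loss = masked_mse(visible, image, mask, patch=2)              # loss if the model predicted zeros
```

In a real setup, `pred` would come from the encoder and decoder rather than from the zeroed input; the sketch only shows where the mask enters the input and the loss.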

🎯

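Key Points

The article introduces an MAE-style self-supervised method based on convolutional neural networks that learns robust visual features.
MAE is a self-supervised pre-training method proposed by Kaiming He and colleagues; it borrows its pre-training task from BERT.
MAE masks patches of the input image and trains the model to reconstruct them, outperforming earlier contrastive learning methods.
ViT has a complex architecture and a high computational cost, so a CNN-based MAE-like method is worth studying.
Because a CNN operates with sliding windows, the masked regions affect the features it computes, so the standard MAE recipe cannot be applied to it directly.
The authors borrow sparse convolution from the 3D point cloud field and compute only on the unmasked pixels.
They design a hierarchical decoder, modeled on the UNet structure, to learn multi-scale features.
Experiments show that the method matches the original MAE in performance and achieves SOTA results on downstream tasks.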
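The sliding-window problem and the sparse-convolution fix from the key points can be illustrated with a small pure-Python sketch. It emulates a sparse convolution by zeroing masked inputs, running an ordinary convolution, and zeroing the outputs at masked positions again, so that only unmasked pixels carry values. The function names and the 3x3 all-ones kernel are hypothetical choices for illustration; a real implementation would use a dedicated sparse convolution library and is not shown here.

```python
def conv2d_same(x, kernel):
    """Dense 2D cross-correlation with zero padding; output has the same size as x."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, xx = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= xx < w:
                        acc += x[y][xx] * kernel[di][dj]
            out[i][j] = acc
    return out


def sparse_conv2d(x, active, kernel):
    """Emulated sparse convolution: masked pixels neither contribute nor receive values.

    active[i][j] is 1 for an unmasked pixel and 0 for a masked one.
    """
    x_visible = [[v * a for v, a in zip(row, flags)] for row, flags in zip(x, active)]
    dense = conv2d_same(x_visible, kernel)
    return [[v * a for v, a in zip(row, flags)] for row, flags in zip(dense, active)]


image = [[1.0] * 4 for _ in range(4)]          # toy 4x4 image of ones
active = [[1, 1, 0, 0] for _ in range(4)]      # left half visible, right half masked
kernel = [[1.0] * 3 for _ in range(3)]         # assumed 3x3 all-ones kernel

masked_input = [[v * a for v, a in zip(r, f)] for r, f in zip(image, active)]
dense_out = conv2d_same(masked_input, kernel)   # dense_out[0][2] is 2.0: values leak into the masked area
sparse_out = sparse_conv2d(image, active, kernel)  # sparse_out[0][2] stays 0.0: mask pattern preserved
```

With the dense convolution, each layer spreads nonzero values further into the masked region, so the mask pattern degrades as the network gets deeper; the emulated sparse version keeps the mask unchanged after every layer.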

🏷️

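Tags

MAE, self-supervised method, convolutional neural network, neural network, sparse convolution, self-supervised learning, reconstruction task, robust visual features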
