蝈蝈俊

Tokens in Transformers - 蝈蝈俊

📝

Summary

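ChatGPT and the Vision Transformer (ViT) both use the token as their smallest unit. A ChatGPT token is roughly 3/4 of an English word, while a Vision Transformer splits an image into patches and treats each patch as one token to reduce computation.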
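The two figures in the summary can be turned into rough token counts. The sketch below is only an illustration: the function names are hypothetical, the 224x224 image size is an assumed example that does not come from the original post, and "16x16" is read as the pixel size of each patch. The 3/4-word figure is a rule of thumb, so the text estimate is approximate at best.

```python
import math

# Rule of thumb from the text: one token is about 3/4 of an English word.
WORDS_PER_TOKEN = 0.75


def estimate_text_tokens(word_count: int) -> int:
    """Rough token estimate for English text (hypothetical helper)."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def vit_token_count(height: int, width: int, patch_size: int = 16) -> int:
    """Number of patch tokens when an image is cut into square patches.

    Assumes patch_size is measured in pixels and divides both sides evenly.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size in this sketch")
    return (height // patch_size) * (width // patch_size)


if __name__ == "__main__":
    print(estimate_text_tokens(75))      # 100

    # Assumed example image: 224x224 pixels.
    pixels = 224 * 224
    patches = vit_token_count(224, 224)
    print(pixels, patches)               # 50176 196

    # Self-attention compares every token with every other token, so the
    # number of pairs grows with the square of the sequence length.
    print(pixels ** 2 // patches ** 2)   # 65536
```

Under these assumptions, one token per pixel would give 50,176 tokens, while 16x16 patches give 196, which cuts the number of token pairs that self-attention has to compare by a factor of 65,536.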

🎯

Key points

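Tokenization splits long text into a data structure whose units are characters or words.
In Chinese, the sentence “我很开心” (“I am very happy”) is split into three tokens: ‘我’, ‘很’, and ‘开心’.
Different tokenization strategies produce different token boundaries; a ChatGPT token is roughly 3/4 of an English word.
In the ChatGPT example, the string ‘ChatGPT is great!’ is encoded as six tokens.
A Vision Transformer (ViT) cuts an image into 16x16 patches and treats each patch as one token to reduce computation.
By cutting the image into patches, ViT lowers the number of tokens and keeps the computation from becoming excessive.
A token is the smallest unit with independent meaning: each token represents a separate unit that carries some semantic content.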
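The point that different strategies yield different tokens can be sketched with two toy tokenizers applied to the sentence “我很开心”. This is a minimal illustration, not how ChatGPT tokenizes text (OpenAI models use byte pair encoding with a learned vocabulary); the function names and the three-entry vocabulary are hypothetical and chosen only to reproduce the split given above.

```python
def char_tokenize(text: str) -> list[str]:
    """Character-level strategy: every character becomes its own token."""
    return list(text)


def max_match_tokenize(text: str, vocab: set[str]) -> list[str]:
    """Forward maximum matching.

    At each position, take the longest vocabulary entry that matches,
    falling back to a single character when nothing in the vocabulary fits.
    """
    max_len = max((len(word) for word in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy vocabulary, just large enough for the example sentence.
VOCAB = {"我", "很", "开心"}

if __name__ == "__main__":
    sentence = "我很开心"
    print(char_tokenize(sentence))               # ['我', '很', '开', '心']
    print(max_match_tokenize(sentence, VOCAB))   # ['我', '很', '开心']
```

The character-level split produces four tokens, while the dictionary-based split keeps ‘开心’ together and produces three, so the same sentence has a different length depending on the strategy.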

🏷️

Tags

ChatGPT patch token transformer Vision Transformer computation
