How much do LLMs memorize? Key definitions. Unintended memorization: the information a model contains about a specific dataset. Generalization (intended memorization): the information a model contains about the true data-generation process. Calculation...
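After reading the unsloth blog post on manual autograd, I tried dissecting the model myself and found more places that could be optimized. torchview is a great tool.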