Enhancing Language Model Capabilities: What I Learned from Testing Large Language Models with Tools

LLMs are great at creative writing and language tasks, but they often stumble on basic knowledge retrieval and math. Popular tests like counting r's in "strawberry" or doing simple arithmetic...

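Large language models (LLMs) perform well at creative writing and language tasks, but they often make mistakes in basic knowledge retrieval and mathematical calculation. Giving a model tools can improve its performance. Testing showed that although some small models calculate correctly when they use tools, they are often reluctant to use them and may need more training to recognize their own limitations.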
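To make the idea of a "tool" concrete, here is a minimal Python sketch of the two kinds of task mentioned above: counting letters in a word and doing simple arithmetic. The function names (`count_letter`, `calculate`, `run_tool`) and the `TOOLS` registry are hypothetical and chosen only for illustration; the post does not say which tools or which tool-calling framework were actually used in the tests.

```python
import ast
import operator

# Hypothetical tool: count how often a letter appears in a word.
def count_letter(word: str, letter: str) -> int:
    return word.lower().count(letter.lower())

# Arithmetic operators the calculator tool is allowed to apply.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

# Hypothetical tool: evaluate a basic arithmetic expression without eval().
def calculate(expression: str):
    def _eval(node):
        # Plain numeric literals such as 12 or 3.5.
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        # Binary operations: +, -, *, /.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        # Unary minus, e.g. -7.
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)

# Assumed registry: a model would emit a tool name plus arguments,
# and the host program would dispatch the call like this.
TOOLS = {"count_letter": count_letter, "calculate": calculate}

def run_tool(name: str, **kwargs):
    return TOOLS[name](**kwargs)

if __name__ == "__main__":
    print(run_tool("count_letter", word="strawberry", letter="r"))  # 3
    print(run_tool("calculate", expression="12 * (3 + 4)"))         # 84
```

The tools themselves are trivial. The harder part, as the summary notes, is getting a model to decide to call them rather than answering from memory.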

creative writing, large language models, tools, mathematical calculation, training, language models