AI Optim

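Organized by way of thinking, this page lists the issues and solutions in each phase of DL.

Data quantity, quality, and public availability all affect results.

New datasets

S - Need more of this kind of data

DeepMind LLM
Gopher is one of DeepMind's early large language models. Chinchilla is named after another small rodent, keeping the same "animal family" naming style as Gopher.

Model parameters vs. training data volume
Before Chinchilla, the mainstream view was that more model parameters meant better performance, even when the amount of training data stayed the same.
Core claim of the Chinchilla Scaling Laws: under a fixed compute budget, the optimal way to train is to grow model size and training data together.

T - FineWeb Dataset
Other datasets are comparatively small: the English CommonCrawl section of Matrix (1.3T tokens), English CC-100 (70B tokens), Colossal-OSCAR (850B tokens)
RedPajama ...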
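
To make the Chinchilla note above concrete, here is a minimal sketch of how a fixed compute budget might be split between model size and training tokens. It relies on two commonly quoted approximations that do not appear in these notes and should be treated as assumptions: training compute C ≈ 6 · N · D FLOPs (N parameters, D tokens), and a rule of thumb of roughly 20 training tokens per parameter. The function name `compute_optimal_split` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def compute_optimal_split(flops_budget, tokens_per_param=20.0, flops_per_param_token=6.0):
    """Split a compute budget into (params, tokens) with tokens = ratio * params.

    Assumptions (not from the notes): C ~= 6 * N * D, and D ~= 20 * N.
    """
    # C = k * N * D and D = r * N  =>  N = sqrt(C / (k * r))
    params = math.sqrt(flops_budget / (flops_per_param_token * tokens_per_param))
    tokens = tokens_per_param * params
    return params, tokens


if __name__ == "__main__":
    # Each 100x increase in compute scales both params and tokens by 10x.
    for budget in (1e21, 1e23, 1e25):
        n, d = compute_optimal_split(budget)
        print(f"C={budget:.0e} FLOPs -> params ~ {n:.2e}, tokens ~ {d:.2e}")
```

Under these assumptions both quantities grow with the square root of the compute budget, which is one way to read "model size and training data should grow together". The exact ratio depends on the fitted scaling law, so the 20 tokens per parameter figure is only a starting point.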