LLM360 Introduces K2: A Fully-Reproducible Open-Sourced Large Language Model Efficiently Surpassing Llama 2 70B with 35% Less Computational Power

K2 is a large language model (LLM) developed by LLM360 in collaboration with MBZUAI and Petuum. The model, known as K2-65B, has 65 billion parameters and is fully reproducible: all artifacts, including code, data, model checkpoints, and intermediate results, are open-sourced and publicly accessible. This level of transparency aims to demystify the training recipe used for similar models, such as Llama 2 70B, and gives clear insight into the development process and performance metrics.

K2 is the product of a collaboration among MBZUAI, Petuum, and LLM360, which pooled their expertise and resources to build a language model notable for both its performance and its transparency. The model is available under the Apache 2.0 license, which permits widespread use and further development by the community.

LLM360 has published a broad set of evaluations for K2, spanning general and domain-specific benchmarks. These evaluations cover medical, mathematical, and coding knowledge, showing how the model performs across a range of tasks and domains. A detailed analysis of K2’s performance is documented in the LLM360 Performance and Evaluation Collection and the K2 Weights and Biases project.

K2 was trained on a diverse data mix to achieve results comparable to those of the Llama 2 70B model. Training proceeded in two stages and drew on datasets such as dm-math, PubMed-abstracts, uspto, and others, totaling 1.3 trillion tokens. This data mix was chosen to give K2 broad capability across a variety of subjects and languages.
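
To give a feel for where a compute saving of this size can come from, here is a rough back-of-the-envelope sketch. It uses the common rule of thumb that training compute is about 6 × parameters × tokens. The K2 figures come from this article; the Llama 2 70B token count (about 2 trillion) is an assumption taken from public reporting on that model, not from this article, and the function name is purely illustrative. This is an approximation, not LLM360's own accounting.

```python
def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute using the 6 * N * D rule of thumb."""
    return 6.0 * params * tokens


# K2 figures as stated in this article.
K2_PARAMS = 65e9
K2_TOKENS = 1.3e12

# Assumed Llama 2 70B figures (not from this article): 70B parameters, ~2T tokens.
LLAMA2_PARAMS = 70e9
LLAMA2_TOKENS = 2.0e12

k2_flops = training_flops(K2_PARAMS, K2_TOKENS)
llama2_flops = training_flops(LLAMA2_PARAMS, LLAMA2_TOKENS)

# Fraction of compute saved relative to the assumed Llama 2 70B budget.
savings = 1.0 - k2_flops / llama2_flops

print(f"K2 estimate:      {k2_flops:.2e} FLOPs")
print(f"Llama 2 estimate: {llama2_flops:.2e} FLOPs")
print(f"Relative savings: {savings:.0%}")
```

Under these assumptions the estimate comes out at roughly 40% less compute, in the same range as the 35% quoted in the headline. The exact figure depends on the precise token counts and on hardware efficiency, neither of which this sketch models.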

LLM360 has made K2’s intermediate checkpoints available, allowing researchers and developers to track the model’s development and improvement over time. This is part of K2’s fully reproducible nature, providing transparency and facilitating further research and development. Tutorials for reproducing the pretraining and finetuning processes are also offered, catering to academic and industry researchers.

LLM360 is an open research lab working toward community-owned artificial general intelligence (AGI) through open-source large model research and development. Its goal is an open ecosystem with equitable computational resources, high-quality data, and freely shared technical knowledge, in support of ethical AGI development and universal access. The lab also seeks to empower innovators by advancing the capabilities of large language models and fostering a collaborative environment for research and development.

In conclusion, K2 by LLM360 offers transparency, performance, and a robust development framework. Through open-source collaboration and comprehensive evaluation, K2 sets a new standard for LLM development, ensuring ethical practices and broad accessibility for future innovations in AI.

The post LLM360 Introduces K2: A Fully-Reproducible Open-Sourced Large Language Model Efficiently Surpassing Llama 2 70B with 35% Less Computational Power appeared first on MarkTechPost.

[Source: AI Techpark]