
This AI Research from China Explains How Common 7B Language Models Already Possess Strong Mathematical Capabilities

Mar 13, 2024

Large Language Models (LLMs) have demonstrated impressive capabilities in almost every domain. From generating unique, human-like content and answering questions to summarizing long passages of text, completing code, and translating languages, LLMs represent one of the most significant advancements in the field of Artificial Intelligence (AI).

However, it is widely believed that for language models to develop strong mathematical capabilities, they must either be very large in scale or undergo extensive pre-training focused on mathematics. A recent research paper challenges this idea by demonstrating that the LLaMA-2 7B model already possesses strong mathematical abilities, even with only standard pre-training.

When the best answer is selected from 256 random generations, the model achieves remarkable accuracies of 97.7% and 72.0% on the GSM8K and MATH benchmarks, respectively. The main problem with the existing base model is not that it cannot produce accurate answers, but that it cannot reliably elicit its innate mathematical capability: when only the first response is considered, accuracy drops sharply to 49.5% on GSM8K and 7.9% on MATH.
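To make this gap concrete, the sketch below contrasts the two evaluation modes: scoring only the model’s first (greedy) answer versus checking whether any of N sampled answers matches the reference, which is the oracle-style "best of N" upper bound. The model checkpoint, prompt format, sampling temperature, and GSM8K-style answer extraction are illustrative assumptions, not the paper’s exact evaluation code.

```python
# Sketch: greedy "first response" accuracy vs. oracle best-of-N sampling (assumptions noted above).
import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

def extract_answer(text: str):
    # Assumes a GSM8K-style solution; fall back to the last number in the text.
    numbers = re.findall(r"-?\d+\.?\d*", text.replace(",", ""))
    return numbers[-1] if numbers else None

def solve(question: str, n_samples: int = 256, greedy: bool = False):
    prompt = f"Question: {question}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    gen_kwargs = dict(max_new_tokens=512, pad_token_id=tokenizer.eos_token_id)
    if greedy:
        gen_kwargs.update(do_sample=False, num_return_sequences=1)
    else:
        # In practice, 256 samples would be drawn in smaller batches.
        gen_kwargs.update(do_sample=True, temperature=0.7, num_return_sequences=n_samples)
    outputs = model.generate(**inputs, **gen_kwargs)
    completions = tokenizer.batch_decode(
        outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return [extract_answer(c) for c in completions]

question = ("Natalia sold 48 clips in April and half as many in May. "
            "How many clips did she sell in total?")
gold = "72"
first_answer_correct = solve(question, greedy=True)[0] == gold  # greedy accuracy (~49.5% on GSM8K)
any_of_256_correct = gold in solve(question, n_samples=256)     # oracle best-of-256 (~97.7% on GSM8K)
```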

To address this issue, the team suggests scaling up the supervised fine-tuning (SFT) data. Increasing the amount of data used for fine-tuning greatly improves the reliability of the generated answers, as illustrated in the sketch below. However, the limited supply of publicly available math problems restricts how far this scaling can go, so the team turned to synthetic data, which works almost as well as real data.
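For a rough sense of what this fine-tuning step looks like, here is a minimal SFT sketch in which each training example concatenates a math problem with its step-by-step solution and the model is trained with the standard next-token loss. The base checkpoint, data format, and hyperparameters are assumptions for illustration, not the paper’s exact recipe.

```python
# Minimal SFT sketch (illustrative): fine-tune a causal LM on problem/solution pairs.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# Each example pairs a problem with a step-by-step solution ending in "#### <answer>".
examples = [
    {
        "problem": "Natalia sold 48 clips in April and half as many in May. "
                   "How many clips did she sell in total?",
        "solution": "In May she sold 48 / 2 = 24 clips. "
                    "In total she sold 48 + 24 = 72 clips. #### 72",
    },
    # ... scaled up with many more real and synthetic problem-solution pairs
]

def collate(batch):
    texts = [
        f"Question: {ex['problem']}\nAnswer: {ex['solution']}{tokenizer.eos_token}"
        for ex in batch
    ]
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=1024)
    labels = enc["input_ids"].clone()
    labels[enc["attention_mask"] == 0] = -100  # ignore padding in the loss
    enc["labels"] = labels
    return enc

loader = DataLoader(examples, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# In practice a 7B model needs distributed or parameter-efficient training;
# this loop only shows the shape of the objective.
model.train()
for batch in loader:
    batch = {k: v.to(model.device) for k, v in batch.items()}
    loss = model(**batch).loss  # standard next-token cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```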

The team created synthetic math problems with the GPT-4 Turbo model and found that a simple generation approach, followed by verification with GPT-4 Turbo, is highly effective. Using artificially generated math problems allows the supervised fine-tuning data to be scaled up substantially, with accuracy that nearly matches what is achieved with real data.
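A minimal sketch of such a generate-then-verify loop, assuming the OpenAI Chat Completions API, might look like the following; the prompts and the "gpt-4-turbo" model name here are placeholders rather than the prompts actually used in the paper.

```python
# Sketch: generate a synthetic math problem with GPT-4 Turbo, then verify it before keeping it.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_problem(seed_problem: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{
            "role": "user",
            "content": (
                "Write a brand-new grade-school math word problem in the style of the example "
                "below, followed by a step-by-step solution ending with '#### <answer>'.\n\n"
                f"Example:\n{seed_problem}"
            ),
        }],
    )
    return resp.choices[0].message.content

def verify_problem(problem_and_solution: str) -> bool:
    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{
            "role": "user",
            "content": (
                "Check whether the following math problem is well-posed and its solution is "
                "correct. Reply with exactly 'VALID' or 'INVALID'.\n\n"
                f"{problem_and_solution}"
            ),
        }],
    )
    return resp.choices[0].message.content.strip().upper().startswith("VALID")

# Keep only candidates that pass verification; repeating this loop scales up the SFT set.
seed = ("Natalia sold 48 clips in April and half as many in May. "
        "How many clips did she sell in total?")
candidate = generate_problem(seed)
synthetic_sft_data = [candidate] if verify_problem(candidate) else []
```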

Using this simple method, the team was able to improve accuracy markedly, reaching 82.6% on GSM8K and 40.6% on MATH with LLaMA-2 7B models and exceeding earlier models by 14.2% and 20.8%, respectively.

The research also offers insights into scaling behavior across different error types and levels of reasoning difficulty. This analysis helps explain how the model’s performance changes as the data volume increases and points to ways of reducing errors during the scaling process.

In conclusion, this study demonstrates that language models can attain excellent mathematical capabilities without requiring extremely large models or math-intensive pre-training. Considerable progress in mathematical problem-solving with language models can be made by utilizing synthetic data and increasing the amount of supervised fine-tuning data.


Check out the Paper. All credit for this research goes to the researchers of this project.





