The University of Waterloo in Canada has developed a new training method called SubTrack++ that not only significantly reduces the pre-training time of large language models but also improves their accuracy. This breakthrough is expected to lower the cost and environmental burden of building artificial intelligence (AI) tools, putting powerful, convenient AI technology within reach of more people.
A large language model is an AI system built on deep neural networks that specializes in understanding and generating human natural language. Its core capability comes from pre-training on massive amounts of text data, through which it learns grammatical rules, semantic logic, and contextual associations, and produces output close to human patterns of expression. The "large" in such models refers to two things: the enormous scale of the training data and the extremely large number of model parameters. Pre-training therefore often takes months and consumes vast amounts of computing power, specialized hardware, and electricity, putting the cost beyond the reach of most enterprises and institutions.
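The next-token prediction at the heart of pre-training can be illustrated with a deliberately tiny sketch: a hypothetical bigram counting model (nothing like a real Transformer, and far from trillion-token scale) that learns which word tends to follow which, and scores a sequence by average negative log-likelihood. All names and the toy corpus here are illustrative.

```python
import math
from collections import Counter, defaultdict

# Toy corpus; real pre-training uses vastly larger text collections.
corpus = "the cat sat on the mat the cat ate".split()

# Count bigram transitions to estimate P(next word | current word).
transitions = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    transitions[cur][nxt] += 1

def predict(word):
    """Return the most likely next token under the bigram counts."""
    return transitions[word].most_common(1)[0][0]

def avg_neg_log_likelihood(seq):
    """Average negative log-probability the model assigns to a sequence."""
    total = 0.0
    for cur, nxt in zip(seq, seq[1:]):
        counts = transitions[cur]
        total += -math.log(counts[nxt] / sum(counts.values()))
    return total / (len(seq) - 1)

print(predict("the"))  # "cat" follows "the" most often in this corpus
```

A real model replaces the count table with billions of learned parameters and minimizes exactly this kind of loss over its training text.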
To address this problem, the team developed SubTrack++, which can cut pre-training time in half. The team pointed out that large language models consume enormous amounts of energy, so even a 5% reduction in training time yields significant benefits. In the long run, advances like this will enable more organizations to build their own specialized language models independently.
The team explained that a large language model is essentially a neural network built from massive matrices of numbers, which learns to predict text sequences through billions of trial-and-error attempts. Whenever a prediction is wrong, the model fine-tunes its mathematical parameters to improve accuracy. The process is akin to having the model "read an entire library" and learn from it how humans use language. SubTrack++ streamlines this calibration process by focusing on the parameters most critical to the task, achieving efficient updates and accelerating pre-training overall.
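The general idea of concentrating updates on the most important directions can be sketched with a toy example: a gradient descent loop that, at each step, applies the update only along the k largest-magnitude gradient components. This is purely illustrative of the broad principle, not the published SubTrack++ algorithm (which operates on gradient subspaces), and every function here is hypothetical.

```python
# Illustrative only: restrict each update to the k most "important"
# (largest-magnitude) gradient entries, a crude stand-in for methods
# that focus learning on the most critical parameter directions.

def loss_grad(params, target):
    """Gradient of a simple squared-error loss: sum((p - t)^2)."""
    return [2 * (p - t) for p, t in zip(params, target)]

def sparse_step(params, grad, lr=0.1, k=2):
    """Apply the update only along the k largest-magnitude gradient entries."""
    top = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    new = list(params)
    for i in top:
        new[i] -= lr * grad[i]
    return new

params = [0.0, 0.0, 0.0, 0.0]
target = [1.0, 0.1, -2.0, 0.05]
for _ in range(100):
    params = sparse_step(params, loss_grad(params, target), k=2)
```

Each step does half the update work of full gradient descent, yet the parameters still converge, because the entries with the largest errors are always prioritized; the trade-off between per-step cost and convergence speed is the kind of efficiency lever such methods exploit.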
The team expects that the pre-training time saved will allow not only large enterprises but also ordinary users to build and customize their own AI tools. Once a large language model has safely learned an individual's preferences, it can become a truly intelligent digital assistant, adapting to different users' styles, goals, and needs and serving as a powerful partner in human work and creativity.
The team will formally present the related paper at the Conference on Neural Information Processing Systems (NeurIPS) in Mexico City.