Chinese tech unicorn 01.AI admits ‘oversight’ in changing name of AI model built on Meta Platforms’ Llama system
- Beijing-based 01.AI said the company made several name changes in its open-source large language model’s code as part of experimental requirements
- The firm has decided to change the so-called tensor name of its AI model Yi-34B to reflect that it was built on Meta Platforms’ Llama system
Tensors are the data containers of the machine-learning process: they hold and arrange information in a structured manner, making it easier for large language models (LLMs) to understand and generate humanlike text.
LLMs are deep-learning AI algorithms that can recognise, summarise, translate, predict and generate content using very large data sets.
“During extensive training experiments, we made several renamings in the code to meet experimental requirements,” 01.AI’s Lin said in his post on Tuesday. “But we kinda dropped the ball and didn’t switch them back before pushing out our release … We’re sorry for the confusion.”
The company said in a response to the Post via WeChat on Wednesday that it changed the tensor name of its Yi-34B LLM to “fully test the [Llama] model” and that there was no intention to mask the source of the AI model.
Hugging Face community member Hartford, who more than a week ago had raised questions about the tensor name of 01.AI’s Yi-34B, released on November 6, said on Wednesday that, from the open-source community’s perspective, there was nothing odd about a company using another firm’s AI model to help develop its own LLM.
“In our open source community, we often share each other’s code,” Hartford, a senior researcher at conversational AI tech firm Convai, said. “We take ideas from different architectures and share ideas from different architectures. This is normal for us.”
Hartford said various AI models “may have the same architecture, but the data that they were trained with is completely different”.
Because of the investment and tooling around the Llama architecture, there is value in using the same names for the tensors, Hartford suggested in his earlier post on the Hugging Face platform.
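To illustrate why shared tensor names matter for tooling, the following minimal Python sketch maps custom tensor names back to Llama-style ones. The specific names in RENAME_MAP and the sample checkpoint are hypothetical, chosen only for illustration, and are not taken from 01.AI’s or Meta’s code; plain lists stand in for real tensors.

```python
# Hypothetical mapping from custom tensor-name components to Llama-style ones.
# These names are illustrative assumptions, not the actual names in Yi-34B.
RENAME_MAP = {
    "ln1": "input_layernorm",
    "ln2": "post_attention_layernorm",
}


def rename_tensors(state_dict, rename_map):
    """Return a copy of state_dict with dotted name components replaced."""
    renamed = {}
    for name, tensor in state_dict.items():
        # Replace only whole components, e.g. "ln1" in "model.layers.0.ln1.weight".
        parts = [rename_map.get(part, part) for part in name.split(".")]
        new_name = ".".join(parts)
        if new_name in renamed:
            raise ValueError(f"duplicate tensor name after renaming: {new_name}")
        renamed[new_name] = tensor
    return renamed


# A toy checkpoint: tensor names mapped to placeholder values.
checkpoint = {
    "model.layers.0.ln1.weight": [1.0, 1.0],
    "model.layers.0.ln2.weight": [1.0, 1.0],
    "model.layers.0.self_attn.q_proj.weight": [[0.1, 0.2], [0.3, 0.4]],
}

converted = rename_tensors(checkpoint, RENAME_MAP)
for tensor_name in converted:
    print(tensor_name)
```

Tools written for one architecture typically look tensors up by name, so a checkpoint that keeps the expected names can be loaded without a conversion step like this one.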
There is a need “to better honour” the open-source community’s conventions, according to a Hangzhou-based AI entrepreneur who requested anonymity owing to the sensitivity of the topic. That practice is akin to citing a research paper and giving the proper attribution, the entrepreneur said.