"'Fake it before you make it' is an ignoble product of Silicon Valley," said Christopher Manning, director of the Artificial Intelligence Laboratory at Stanford University, commenting on some researchers at the university who plagiarized the achievements by institutions such as China's Tsinghua University.
斯坦福大学人工智能实验室主任克里斯托弗·曼宁在评论该校一些研究人员抄袭中国清华大学等机构的成果时表示:“‘作假,直至成功’,这是硅谷不光彩的文化。”
On May 29, a research team at Stanford University released a large model called Llama3-V, claiming it could rival the performance of large models such as GPT-4V at a pre-training cost of only US$500.
5月29日,斯坦福大学的一个研究小组发布了一种名为Llama3-V的大模型,声称只要500美元的预训练成本,就能用它获得比肩GPT-4V等大模型的效果。
The news spread widely on social media and in the artificial intelligence research community.
随即这一消息在社交媒体和人工智能学术界疯传。
However, industry insiders soon suspected that the Stanford team had plagiarized the MiniCPM-Llama3-V 2.5 large model released by Tsinghua University and other Chinese institutions.
但很快,业内人士就怀疑斯坦福团队抄袭了清华大学等中国机构发布的MiniCPM-Llama3-V 2.5大模型。
Both Llama3-V and the MiniCPM-Llama3-V 2.5 large model are based on the open-source Llama3 large model.
Llama3-V和MiniCPM-Llama3-V 2.5大模型都是基于开源的Llama3大模型。
Still, the Tsinghua team conducted its own unique training, including using the "Tsinghua Bamboo Slips," a collection of Chinese texts written on strips of bamboo that date back to the Warring States Period, to train the model to recognize ancient Chinese characters.
不过,清华大学的研究小组进行了独特的训练,包括利用“清华简”(这是一套写在竹片上的中国文字,可以追溯到战国时期),以训练模型识别古代汉字。
Tests show that the model released by the Stanford University team can also recognize the "Tsinghua Bamboo Slips."
测试显示,斯坦福大学团队发布的大模型居然也能识别“清华简”。
"We are quite sure that the Stanford team has plagiarized our big model research results," Liu Zhiyuan, a tenured associate professor of the Department of Computer Science at Tsinghua University, told Xinhua.
清华大学计算机系长聘副教授刘知远接受新华社采访时说道:“我们非常确定,斯坦福这个团队抄袭了我们的大模型研究成果。”
"The data we scanned and annotated word by word from the 'Tsinghua Bamboo Slips' has never been made public, and Llama3-V has shown the same ability to identify the 'Tsinghua Bamboo Slips,' even the error examples are the same," said Liu, who is also a member of the Tsinghua big model team.
作为清华这个大模型团队成员的刘知远表示:“我们从‘清华简’逐字扫描并标注的数据集从未公开,而Llama3-V展现出了一模一样的识别‘清华简’的能力,甚至错误示例也相同。”
As doubts mounted, the Stanford team deleted the database and promotional articles it had posted online, Liu said, adding that "from the evidence and their reactions, the nature of the plagiarism has been relatively confirmed."
刘知远称,在质疑声发酵后,斯坦福大学团队删除了网上发布的数据库和宣传文章,并表示“从证据和对方的反应来看,抄袭的性质已比较确定”。
Following Manning's criticism, two members of the Stanford team, Aksh Garg and Siddharth Sharma, formally apologized on social media.
在曼宁的批评之后,斯坦福大学团队的两名成员阿克什·加格和西达尔特·夏尔马在社交媒体上正式道歉。
"We've taken all references to Llama3-V down and we apologize once again for the inconvenience we may have caused," they said.
他们表示:“我们已经撤下了所有提及Llama3-V的内容,并再次为我们可能造成的不便表示歉意。”