Blaze News original: China’s DeepSeek Coder claims it is the first open-source model to surpass GPT-4 Turbo amid tense AI race

Chinese AI startup DeepSeek recently made waves with the release of DeepSeek Coder V2, an open-source, mixture-of-experts code language model. The company previously drew attention in the AI world with DeepSeek Chat, a ChatGPT rival trained on 2 trillion English and Chinese tokens.

The company’s latest model excels at both math and coding tasks. DeepSeek Coder V2 supports some 300 programming languages and outperforms closed-source models such as Claude 3 Opus, Gemini 1.5 Pro, and GPT-4 Turbo. The company claims it is the first open-source model to accomplish this, far outperforming Llama 3-70B and other models in the same category.

Moreover, DeepSeek Coder V2 appears to perform well on general language and reasoning tasks as well.

Unique Features of DeepSeek

What sets DeepSeek’s recent developments apart is that they are open source and relatively small-scale. Samuel Hammond, senior economist at the Foundation for American Innovation, told Blaze News that DeepSeek Coder V2 “incorporates the latest ‘mixture of experts’ and sparsity techniques, the same methods that allow newer U.S. models such as GPT-4o and GPT-4 Turbo to run much faster than the original GPT-4.”

Some of the most popular chatbots, such as Gemini, Claude, and ChatGPT, use MoE to address prompts from a diverse user base. To gain broad and deep expertise on a particular subject, these chatbots need to be able to access highly specialized data and share it with their users.

The MoE strategy integrates multiple specialized models, called “experts,” into one comprehensive system, allowing each “expert” to focus on a specific kind of task and produce deeper, more refined output. This approach differs sharply from one-size-fits-all machine learning systems, which can generate a large amount of information but may be under-specialized.
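The routing idea described above can be sketched in a few lines of Python. This is a toy illustration only, not DeepSeek’s actual implementation: the gating weights, the “experts” (plain functions here), and the top-k value are all invented for the example. In a real MoE model, the experts are neural sub-networks and the gate is learned during training; the key point the sketch captures is that only the top-scoring experts run for a given input, so most of the model’s parameters stay inactive.

```python
import math

def softmax(scores):
    """Convert raw gate scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights, k=2):
    """Route input x to the top-k experts by gate score and
    combine their outputs, weighted by normalized gate scores."""
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in top])
    # Only the selected experts run; the rest are skipped entirely.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Hypothetical experts: each is just a simple function of the input.
experts = [
    lambda x: sum(x),            # "summation" expert
    lambda x: max(x),            # "max" expert
    lambda x: sum(x) / len(x),   # "mean" expert
]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

out = moe_forward([2.0, 4.0], experts, gate_weights, k=2)
```

Here the gate scores the three experts at 2, 4, and 3 for the input, so only the “max” and “mean” experts execute; the “summation” expert is never called, which is how MoE models keep per-query compute low relative to their total parameter count.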

Another reason DeepSeek V2 is notable is that it is open source, meaning the source code is publicly available for anyone to use, modify, and distribute. The open-source model lends itself to outside creativity and innovation in a way the closed-source model does not.

“The reason people are surprised is that DeepSeek V2 is one of the best open source MoE models available today. U.S. companies have models that are as good or better, including Meta’s Llama 3 400B, which has been announced but not yet released,” Hammond said.

“DeepSeek V2 highlights the danger that U.S. AI companies may be reluctant to open-source their models due to public pressure and potential legal risks. Independent developers and researchers around the world will always want to use the best open source models available, and we want those best open models to be made in the U.S., not made in China.”

The Potential of Artificial General Intelligence

DeepSeek was founded in 2023 with the stated mission of “unravel[ing] the mystery of AGI [artificial general intelligence] with curiosity.” This may be a lofty goal, and there is still debate in the AI world about whether achieving AGI is even feasible. AGI is generally defined as artificial intelligence with human-like reasoning and problem-solving abilities, as well as the ability to learn and adapt on its own.

OpenAI CEO Sam Altman seems optimistic about the future of AGI, claiming that it is already “pretty close,” but he adds that AGI will “change the world a lot less than we think, and will change jobs a lot less than we think.”

He continued: “People are begging to be disappointed [by what AGI can really do], and they will be. We don’t have an actual [artificial general intelligence], and that is sort of what is expected of us.”

Moreover, Shane Legg, lead AGI scientist at Google DeepMind, has said that there is a 50% chance that AGI will be realized by 2028. However, not everyone in the field is so optimistic.

Grady Booch, IBM Fellow and Principal Scientist in Software Engineering, says AGI will never happen: “As a historian of computing, I have a rather cynical and pessimistic view of the extreme optimism in our field, and so I’m conditioned to be somewhat of a contrarian when it comes to predictions like this.”

A particular challenge for DeepSeek and OpenAI, the two companies that claim AGI is their ultimate goal, is that the problem quickly becomes one of philosophy rather than technology: to achieve AGI, you need to establish what it will look like.

“This is really a philosophical question, so in some ways it’s a very difficult time to be in this field, because we’re a scientific field,” said Sara Hooker, who leads Cohere for AI, a research lab focused on machine learning. She added that much of the discussion around AGI is values-driven rather than technology-driven, which can obscure a meaningful definition of AGI.

Hooker further stated that it is “highly unlikely” that AGI will be defined or achieved by “checking a box at one event and saying, ‘AGI has been achieved.'” For AGI to be realistically achieved, there must be a testable definition that everyone in the field can agree on.

Microsoft Research, in collaboration with OpenAI, published a paper in 2023 suggesting that GPT-4 demonstrates an early example of AGI. The project’s researchers argued that “GPT-4, along with ChatGPT, Google’s PaLM, and others, is part of a new cohort of LLMs that exhibit more general intelligence than previous AI models.”

Until experts and others in the field can productively and concretely define AGI, it is somewhat unclear what researchers mean when they say GPT-4 “exhibit[s] more general intelligence than previous AI models.”

The Battle for AI Supremacy

Earlier this month, data analytics company Govini released a report suggesting that the U.S. is lagging behind China in the AI race, and that if a serious conflict breaks out between the two superpowers, the U.S. would struggle to win a fight against the People’s Liberation Army.

Govini’s scorecard assesses the 15 national security technologies most critical to the federal government, viewed through acquisition, adversarial capital, procurement, supply chain, foreign influence, and science and technology lenses.

According to the Govini report, the U.S. continues to underinvest in valuable AI capabilities and is stalling in the research and development phase: in nine of the 12 areas assessed, more than 65% of government funding remained in the research and development phase as of 2023. As a result, many of these potentially valuable technologies are not yet production-ready.

“Despite artificial intelligence being an extremely high-profile and arguably most significant transformative technology in a major technology race for the U.S. and the world, the Department of Defense still treats it primarily as a research and development effort,” said Tara Murphy Dougherty, CEO of Govini.

“While there is still work to be done in artificial intelligence research and development, it is long past time for the Department of Defense to stop treating AI like merely a science project,” she continued.

Govini’s 2023 report noted that the U.S. is falling behind China in the technology race, putting it at serious risk of “weakness and dependency.” In 2022, the data analytics firm found that the U.S. was not investing enough in AI and ML to win a potential tech race with its Eastern rival.

“When you add in an AI advantage that the U.S. does not have, the war could become unwinnable [for the U.S.],” Dougherty said.

“DeepSeek is one of the Chinese companies investing heavily in [the AI] industry,” Nathan Reimer, executive director of the Digital First Project, told Blaze News.

“China is bringing in talented people, but many Chinese companies also have a history of stealing intellectual property from their Western competitors. AI is an arms race and given that China is pouring billions of dollars into it, it’s not surprising they’re developing cutting-edge technology,” he added.

This seems to echo Dougherty’s point that AI could manifest itself in subtle ways during war: She said China doesn’t need to weaponize AI to have a dramatic effect during a conflict, but rather the People’s Liberation Army could use it to infiltrate the U.S. energy grid, with devastating effect.

While the Pentagon appears to be slow to move AI and ML technologies out of the development stage, Dougherty said there’s still an opportunity: “The AI ​​side of our spending is so heavily R&D-focused that you don’t see us actually building this into the weapon systems and platforms that we currently field, which is understandable, given that it’s artificial intelligence.”

“But I believe the Department of Defense has a good framework in place to govern that and ensure that AI is used appropriately in a military context. So let’s do that,” she added.

Moreover, the United States appears to be lagging behind China in patent grants in 13 of 15 key technology areas. According to Dougherty, China has managed to accelerate patent grants in recent years since the 14th Five-Year Plan for Information Development.

“When you think about patents, they are a leading indicator of technological advantage,” she added.

Hammond, in addition to offering a detailed analysis of why the Pentagon is lagging behind, said a key U.S. advantage lies in restricting [China’s] access to AI hardware: the advanced AI chips needed to scale and serve the largest models.

Software engineer Mike Wacker said the U.S. “should be concerned about AI dominance, both generally and for military applications,” but he added that “it’s interesting that DeepSeek is open source. If this was truly valuable to the Chinese Communist Party, they wouldn’t have let researchers open source it in the first place.”

While the United States may be in a relatively good position in AI development, DeepSeek shows that China is not far behind, and China does not appear to be cutting back on its investments in the AI race with the West.
