In recent times, the buzz surrounding DeepSeek has been hard to ignore. Having downloaded the app myself and used it for several days, I must say I’m thoroughly impressed. This marks my first deep dive into a homegrown AI application, as my previous experience was almost solely with ChatGPT. Unlike those who are driven by novelty and eager to test out new technology, I tend to prefer products that have solid reputations and proven quality. However, as a competitive alternative to ChatGPT, DeepSeek definitely lives up to the hype, often exceeding my expectations with its responses to various queries. Despite some limitations, such as its inability to generate images or engage in voice interactions, its speed and quality have made it a valuable addition to my daily toolkit. ChatGPT can be quite slow at times, and DeepSeek fills that gap effectively.
I've also come across analyses from professionals in the field and reviews from abroad that shed light on DeepSeek's performance. While I won't reiterate those insights here, they do raise an interesting question: could this be China’s moment to turn the tables on American AI dominance? The question is sharpened by the struggles of companies like Meta, headed by Mark Zuckerberg. (I’m intentionally leaving out commentary on Baidu, my previous employer, to remain focused.)
To be honest, the results have exceeded my preconceived notions. I didn’t expect this level of brilliance from a Chinese AI, which prompted me to revisit the article I wrote last year and acknowledge that some of my reasoning still holds.
When comparing China's AI development with that of the United States, I identify four main differences. First, algorithms: this is the area where catching up is feasible and where the performance gap is potentially the smallest. Second, computational power. This isn’t merely a financial issue; the ban on advanced GPU chip sales to China means this gap could widen over time. Third, the data corpus: the Chinese corpus still lags behind its English counterpart in the technology space. Lastly, compliance: the higher compliance costs in China constitute a real problem that deserves its own thorough discussion.
Regarding algorithms, I’ve always believed they present the least substantial gap and one that can be bridged. Nonetheless, I underestimated the prowess of Chinese AI developers. The remarkable performance of DeepSeek, as evidenced by analyses of the team's papers, indicates that their optimization capabilities and training cost efficiency far exceed those of their American counterparts.
From a technical standpoint, I've come to understand that DeepSeek itself acknowledges its foundation in algorithmic concepts derived from ChatGPT’s publicly available insights. Though ChatGPT is not open-source, DeepSeek has taken the ideas presented in published research papers and executed a remarkable reproduction of them. They’ve layered their own innovations on top of this foundation, introducing a system of self-motivation that enables significant "emergent" capabilities even with limited training data and resources. The term “Aha moment,” or epiphany, aptly captures this phenomenon: an insightful breakthrough that allows for a deeper understanding of complex causal relationships and nuances in problem-solving. While ChatGPT's insights stem from considerable training costs and resources, DeepSeek has lowered the barriers to achieving similar breakthroughs, so that even ordinary academic labs or high-end home computers can support such learning. Many educational and research institutions have expressed admiration for DeepSeek, regarding it as a significant treasure.
It’s worth highlighting that the Chinese talent for optimization is exceptionally strong. Before DeepSeek emerged, notable initiatives like Colossal AI, led by Professor You Yang, had already presented frameworks that markedly reduced model training costs and garnered recognition across the industry. Moreover, looking back to the time before AI models became prominent, both Alibaba and Tencent excelled at optimizing database load capacity to a world-class standard.
Given these remarkable optimization capabilities, computational shortfalls become less of an obstacle. Interestingly, Huankuang Quantization has long been one of the top players in China’s computational resources scene. Although its computing capacity wasn’t initially built to support large AI models, the fit has proved fortuitous. Compared with the American AI giants there remains a significant gap, yet within China it ranks among the leaders in computational resources.
When it comes to data sources, my instincts predicted the identity of their data provider, which I later verified privately. Discussing this matter openly remains sensitive. However, I have a hunch, though potentially inaccurate, based on ongoing tests: the core logic and knowledge repertoire of Chinese large models are primarily built from training on English corpora. Many early iterations of Chinese models exposed this; for instance, they often produced erroneous translations where ‘bus’ was interpreted as ‘big rat’ and ‘mouse’ translated to ‘an actual mouse.’ I believe that while English data is essential for quality assurance, a Chinese corpus adds immense value in relevant local contexts. Yet for in-depth technical inquiries, a solid foundation in the English corpus may be necessary to yield satisfactory results.
Thus, we owe a debt of gratitude to the undisclosed data supplier who plays a crucial role in enriching the global quality of data available for training, empowering Chinese models to evolve swiftly.
Next comes compliance, which I couldn’t resist probing during my interactions with DeepSeek. Naturally, there were questions that should not be posed directly; however, it was often possible to get around this by subtly rephrasing them. In these instances, the system would begin an initial response and then abruptly switch to a piece of standard, encouraging content when it approached potentially sensitive issues. This indicates that a thorough vetting mechanism runs concurrently with output generation. While this may be manageable, American users will undoubtedly bring compliance requirements of their own that would need addressing.
After unpacking these four points, we return to the pivotal query: Could this signal a turning point in AI innovation, favoring China over the US?
In my view, it’s premature to claim a definitive shift at this moment. Although DeepSeek, standing on the shoulders of giants, has demonstrated exceptional algorithmic innovation, incremental advances cannot yet establish an insurmountable lead. It may be on par with or even surpass the free versions of ChatGPT, but declaring that it has entirely eclipsed ChatGPT feels imprudent, especially considering that ChatGPT possesses even more robust versions that remain undisclosed.
Furthermore, now that the work is open-sourced, foreign giants will likely respond quickly to any innovations. Hence, maintaining a lead in algorithms will be challenging. However, should this open-source project advance rapidly through collaborative global efforts, we may very well see a day when its major contributors aren’t solely from this Chinese company, a development worthy of celebration. Leveraging the strengths of an open-source community, not just commercial enterprises, could indeed lead to an outcome where ChatGPT is ultimately overtaken. I look forward to seeing a roster of worldwide talent emerge among the contributors to this open-source project.
Lastly, the most pressing risk stems from the DeepSeek team's extraordinary achievements, which greatly increase the likelihood that core members will be wooed away by major tech firms.