ChatGPT or DeepSeek: which one should you use?
Having rapidly evolved over the past few years, AI models like OpenAI's ChatGPT have set the benchmark for performance and versatility. However, a new player, DeepSeek, is making waves, challenging established models with unique capabilities and innovative approaches.
While OpenAI's flagship model GPT 4o reportedly cost about $100 million to develop, DeepSeek built its magnum opus for a fraction of that, an alleged $5 million. Many now claim that DeepSeek's models perform on par with OpenAI's, challenging the notion that generative AI development demands enormous computing power.
So how does this translate to the average user's experience? Does ChatGPT still reign supreme in the realm of AI assistance? Or does the current version of DeepSeek hold up to the competition? Let's take a look. Keep in mind that the comparison is mostly derived from general user consensus across the web, so individual experience may vary.
Text generation: ChatGPT 4o vs DeepSeek V3
Conjuring huge piles of text out of thin air is the bread and butter of Large Language Models (LLMs) like ChatGPT. Since its launch, students, researchers and writers alike have flocked to its versatile generative abilities to improve their writing, whether it is school homework or a journal publication.
DeepSeek V3, the company's base model, holds its own against the base GPT models. It demonstrates exceptional proficiency in producing precise, (mostly) factually accurate text, particularly in multilingual contexts. According to user consensus across the web, however, GPT 4o has an edge in creative nuance and in emulating human-like interaction.
According to DeepSeek's own benchmark analysis, on Massive Multitask Language Understanding (MMLU), a benchmark designed to test model proficiency across many subjects, V3 performed on par with the state-of-the-art GPT 4o model, scoring 88.5% against 88.7%.
Verdict: DeepSeek for concise and to-the-point text. ChatGPT for a more conversational, human-like tone.
Reasoning capabilities: ChatGPT o1 vs DeepSeek R1
AI assistants have become a must-have tool for professionals, whose growing workloads demand intensive critical and analytical reasoning. In response to that demand, DeepSeek launched R1, a model designed specifically for reasoning tasks such as solving complex math problems, writing coherent code, or parsing an airtight legal document.
OpenAI bills its own o1 as its most intelligent problem-solving model. It is, however, dethroned by the Chinese newcomer on certain benchmarks, namely AIME, MATH-500, and SWE-bench, albeit by a razor-thin margin. Even so, for most tasks the o1 model, along with its costlier counterpart o1 pro, still comes out ahead.
Interestingly, this time it is DeepSeek's R1 that turns out to be more human-like in interaction when tested on text generation, whereas o1 is the more factually reliable model. For coding tasks, given a substantially long context, R1 and o1 produce similar results, apart from the occasional stumble by R1.
Verdict: ChatGPT o1/o1 pro for 'zero room for error' scenarios. DeepSeek R1 for a 'close enough' performance with room for error.
Image and audio processing: DALL-E/Sora vs Janus-Pro
OpenAI's Whisper provides advanced speech recognition, complemented by OpenAI's text-to-speech features, and is already in use by many third-party applications. Image processing lets users reach a new level of productivity, because the solution to many problems can be found by simply uploading a picture of the scenario, whether it is a word problem or a situational challenge. OpenAI's DALL-E model allows ChatGPT to produce true-to-life imagery, while Sora combines text, image and video inputs to output a cohesive video.
Inferring conditions directly from a photo is something the DeepSeek R1 and V3 models cannot do on their own. However, DeepSeek also launched its multimodal model Janus-Pro, designed specifically for both image and text processing. When compared with DALL-E 3 and other competitors, the Janus-Pro 7B model achieves the highest average performance on multimodal understanding tasks, while also demonstrating high accuracy on instruction-following benchmarks for text-to-image generation.
Ultimately, which model comes closer to realism is in the eye of the beholder. It is also worth remembering that DeepSeek is still lacking in the audio processing department.
Here are images generated by the two AI models with the prompt: "A modern office space design with collaborative workstations, private meeting pods, and natural light, presented as a 3D-style rendering".
Verdict: OpenAI's suite (DALL-E 3, Whisper, Sora, and ChatGPT) for both audio and visual processing.
Paid versions
If the best that DeepSeek can offer is merely being on par with the state-of-the-art models, why exactly has it taken the world by storm all of a sudden? The answer is cost: not only does it draw far less computing power, but all of its services are also completely free, so far. Even the Janus-Pro image model is free to use, unlike DALL-E 3, which is locked behind a premium subscription paywall.
ChatGPT Plus is currently priced at $20/month and offers limited access to all of OpenAI's AI tools, including 4o, o1, and DALL-E 3. The Pro subscription at $200/month lifts the usage limits, as long as usage stays within ethical boundaries, and adds access to o1 pro, the best reasoning model OpenAI has to offer.
On DeepSeek's end, all of its AI tools, which are on par with and in certain instances even surpass their OpenAI competitors, are completely free of cost. This means state-of-the-art performance without spending a dime.
Even the API pricing is substantially lower, with V3 priced at $0.28 per million tokens against GPT 4o's $2.50 per million tokens. A token here refers to the smallest unit of text that the model processes, so at these rates the same workload costs roughly nine times as much on GPT 4o.
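To make the per-token prices above more concrete, here is a rough Python sketch that turns them into a bill for a given workload. The two prices are the ones quoted in this article; the 50-million-token monthly workload and the api_cost helper are made up purely for illustration, and the sketch ignores any difference between input and output token pricing.

```python
# Prices in USD per one million tokens, as quoted in the article.
PRICE_PER_MILLION = {
    "DeepSeek V3": 0.28,
    "GPT 4o": 2.50,
}


def api_cost(model: str, tokens: int) -> float:
    """Return the cost in USD of processing `tokens` tokens with `model`."""
    return PRICE_PER_MILLION[model] * tokens / 1_000_000


# Hypothetical workload: 50 million tokens in a month (an assumption).
monthly_tokens = 50_000_000

for model in PRICE_PER_MILLION:
    print(f"{model}: ${api_cost(model, monthly_tokens):.2f}")

# How many times more expensive GPT 4o is per token at these prices.
ratio = PRICE_PER_MILLION["GPT 4o"] / PRICE_PER_MILLION["DeepSeek V3"]
print(f"GPT 4o costs about {ratio:.1f}x as much per token")
```

Swap in your own token count to estimate what each API would cost for your use case; actual bills depend on each provider's current price list.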
Verdict: DeepSeek is completely free (as of the time of writing).
Usage limitations
The final nail in the coffin is how accessible the high-performing models are to the general user. ChatGPT's free version allows limited access to 4o and its mini version, with about 5-10 messages per 5-6 hours, while the $20/month Plus version bumps that up to 80 messages per 3 hours. The o1 model tightens the restriction to 50 messages a week.
On the other hand, DeepSeek's GPT competitors, R1 and V3, do not seem to have any usage limits at all thus far. DeepSeek states on its website that it wants to cater to every request, but how long it can keep that promise remains to be seen.
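The ChatGPT limits quoted above use different time windows, which makes them hard to compare at a glance. The following Python sketch converts them to a common messages-per-day figure. It uses only the Plus and o1 numbers from this article; the Limit class and its names are illustrative, and the result is an upper bound that assumes the full quota is used in every window.

```python
from dataclasses import dataclass


@dataclass
class Limit:
    """A message quota: `messages` allowed per `window_hours` hours."""
    messages: int
    window_hours: float

    def per_day(self) -> float:
        # Upper bound: assumes the quota is fully used in every window.
        return self.messages * 24 / self.window_hours


# Limits as quoted in the article.
LIMITS = {
    "GPT 4o (Plus)": Limit(messages=80, window_hours=3),
    "o1": Limit(messages=50, window_hours=7 * 24),
}

for name, limit in LIMITS.items():
    print(f"{name}: up to {limit.per_day():.1f} messages per day")
```

The free-tier figure is left out because the article gives it only as a range (about 5-10 messages per 5-6 hours).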
Verdict: DeepSeek has no usage limits (as of the time of writing).