LLMs Compared: ChatGPT vs Claude vs Gemini vs DeepSeek

So you've been using ChatGPT for a while. Maybe you've heard Claude is better for writing, or Gemini is better for research, or DeepSeek is surprisingly good for the price. But you haven't actually tried them.

This is a practical comparison based on my experience using all of these regularly. No benchmarks, no hype — just honest observations about when each one works best.

The Contenders

ChatGPT (OpenAI): GPT-4o, GPT-4o mini, o1, o3-mini
Claude (Anthropic): Claude 3.5 Sonnet, Claude 3 Opus
Gemini (Google): Gemini 1.5 Pro, Gemini 1.5 Flash
DeepSeek (DeepSeek): DeepSeek V3, DeepSeek R1

I'll focus on the models most people actually use: GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, and DeepSeek V3/R1.

The Quick Summary

Use Case	Best Option
General tasks	ChatGPT or Claude — both excellent
Writing and editing	Claude
Coding	Claude or GPT-4o
Research	Gemini (with Google integration)
Long documents	Claude (200K context)
Math/reasoning	DeepSeek R1 or o1
Budget	DeepSeek V3
Privacy-concerned	DeepSeek (local) or Claude
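
If you think better in code, the table above boils down to a simple lookup. Here is a minimal Python sketch of it. The key names and the pick_tools helper are hypothetical, made up for illustration; none of these products expose anything like this.

```python
# Picks copied from the summary table above.
# The dictionary keys are hypothetical shorthand, not official category names.
BEST_OPTION = {
    "general": ["ChatGPT", "Claude"],
    "writing": ["Claude"],
    "coding": ["Claude", "GPT-4o"],
    "research": ["Gemini"],
    "long_documents": ["Claude"],
    "math_reasoning": ["DeepSeek R1", "o1"],
    "budget": ["DeepSeek V3"],
    "privacy": ["DeepSeek (local)", "Claude"],
}


def pick_tools(use_case: str) -> list[str]:
    """Return the table's picks for a use case, falling back to the general row."""
    return BEST_OPTION.get(use_case, BEST_OPTION["general"])


print(pick_tools("writing"))
print(pick_tools("something else"))  # unknown use cases fall back to the general row
```

The fallback to the general row is a design choice of this sketch: when a task doesn't fit a category, either of the two general-purpose picks is a reasonable place to start.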

Now let's dig into the details.

ChatGPT (OpenAI)

Models: GPT-4o (flagship), GPT-4o mini (fast/cheap), o1/o3-mini (reasoning)
Pricing: Free tier (GPT-4o mini + limited GPT-4o), Plus €20/month, API varies
Context window: 128K tokens

Strengths

Jack of all trades. GPT-4o is consistently good at everything. It's rarely the best at any specific task, but it's rarely bad either.

Ecosystem. Custom GPTs, plugins, DALL-E integration, web browsing. If you want an all-in-one tool, ChatGPT has the most features.

Voice mode. The advanced voice feature is genuinely impressive for conversations.

Widespread familiarity. Most tutorials and examples are written for ChatGPT.

Weaknesses

Can feel generic. Responses sometimes have that "ChatGPT voice": enthusiastic, hedging, and full of repetitive phrases. Claude tends to feel more natural.

Censorship. OpenAI is conservative about content restrictions. Claude is too, but ChatGPT sometimes refuses reasonable requests.

Cost. €20/month isn't bad, but API pricing is higher than competitors for similar quality.

Best for

People who want one tool that does everything reasonably well. Heavy users who'll use the ecosystem features.

Claude (Anthropic)

Models: Claude 3.5 Sonnet (default), Claude 3 Opus (strongest)
Pricing: Free tier (Sonnet with limits), Pro €20/month, API varies
Context window: 200K tokens

Strengths

Writing quality. Claude produces more natural-sounding text than GPT-4o, in my experience. Less "AI-ish." Better at matching tone.

Long context. 200K tokens means you can paste entire codebases, long documents, or book chapters. Claude handles long contexts better than competitors.

Coding. On par with GPT-4o or slightly better for complex coding tasks. Claude's artifact feature (for creating code that renders in a preview) is handy.

Analysis. Give Claude a document and ask for analysis — it's thorough without being fluffy.

Less annoying. Claude doesn't preamble or pad responses as much. It tends to just answer the question.

Weaknesses

No image generation. Can't create images, only analyze them.

Smaller ecosystem. No plugin marketplace. Fewer integrations.

Knowledge cutoff. Trained on data that may be slightly older. No built-in web search (though this might have changed by the time you read this).

Best for

Writers, developers, anyone working with long documents, people who find ChatGPT's tone annoying.

Gemini (Google)

Models: Gemini 1.5 Pro (flagship), Gemini 1.5 Flash (fast/cheap)
Pricing: Free tier (good limits), Advanced €20/month (bundled with Google One AI Premium)
Context window: Up to 1M tokens (experimental), 128K standard

Strengths

Google integration. Can access your Gmail, Drive, Docs. "Search my email for the invoice from Acme Corp" actually works.

Massive context. The 1M token window is unmatched, though most people don't need it.

Research. Good at synthesizing information, providing sources, connecting ideas.

Multimodal. Strong at analyzing images, videos, and mixed content.

Weaknesses

Inconsistent quality. Gemini's responses can vary more than competitors. Sometimes great, sometimes oddly off.

Privacy. It's Google. If you're trying to minimize data collection, maybe don't use the product that integrates with all your Google data.

Less refined conversation. The chat experience feels less polished than ChatGPT or Claude.

Best for

People deep in the Google ecosystem who want AI integrated with their email and docs. Research-heavy tasks where sources matter.

DeepSeek (DeepSeek)

Models: DeepSeek V3 (general), DeepSeek R1 (reasoning)
Pricing: Extremely cheap API (~€0.14/M input tokens), chat interface is free
Context window: 64K tokens

Strengths

Price. DeepSeek's API is roughly an order of magnitude cheaper than OpenAI's. For bulk processing, this matters.
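
To see what that gap means for a bulk job, here is a rough back-of-the-envelope sketch in Python. The €0.14 figure is the approximate DeepSeek input price quoted above. The €1.40 figure is not a real price list entry; it is a placeholder standing in for "about ten times more". The 500M-token batch size is also just an assumed example, and output tokens are ignored entirely.

```python
# Prices in EUR per million input tokens.
# 0.14 is the approximate DeepSeek figure quoted in this article.
# 1.40 is NOT a real price: it is a placeholder for "roughly 10x more".
DEEPSEEK_EUR_PER_M = 0.14
HYPOTHETICAL_10X_EUR_PER_M = 1.40


def input_cost_eur(tokens: int, eur_per_million: float) -> float:
    """Cost of sending `tokens` input tokens at a given per-million price."""
    return tokens / 1_000_000 * eur_per_million


# Assumed example workload: 500 million input tokens.
batch_tokens = 500_000_000

cheap = input_cost_eur(batch_tokens, DEEPSEEK_EUR_PER_M)
pricey = input_cost_eur(batch_tokens, HYPOTHETICAL_10X_EUR_PER_M)

print(f"DeepSeek-priced batch: about EUR {cheap:.2f}")
print(f"10x-priced batch:      about EUR {pricey:.2f}")
```

Under those assumptions the same batch comes out at about €70 versus about €700. Swap in current prices from each provider's pricing page before trusting any number like this.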

Reasoning. DeepSeek R1's chain-of-thought reasoning is competitive with o1 at a fraction of the cost.

Math and logic. R1 in particular excels at problems requiring step-by-step reasoning.

Open weights. You can download and run DeepSeek models locally.

Weaknesses

China-based. DeepSeek is a Chinese company. For sensitive business data, this is a consideration. Your data is subject to Chinese law.

Censorship on certain topics. It will refuse to discuss certain China-sensitive topics (Taiwan, Tiananmen, etc.).

Less refined for creative tasks. General writing quality is good but not quite at Claude's level.

Smaller ecosystem. No plugin system, limited integrations.

Best for

Budget-conscious users, math/reasoning tasks, API users who need to process large volumes, privacy-focused users who run it locally.

Head-to-Head Comparisons

For Writing

Winner: Claude

Claude produces text that sounds more human. It's better at matching tone, less likely to use AI clichés, and more willing to be direct. GPT-4o is a close second.

For professional/business writing, both are fine. For creative writing or anything voice-sensitive, Claude.

For Coding

Winner: Claude or GPT-4o (depends on task)

Both are excellent. Claude handles large codebases better due to context length. GPT-4o has better ecosystem integration (can browse docs, etc.).

For quick questions: either works. For complex refactoring: Claude (context window). For learning/tutorials: GPT-4o (more existing resources).

For Research

Winner: Gemini (with caveats)

If you need citations and sources, Gemini's grounding features help. For research within your Google Docs and Drive, nothing else compares.

For general knowledge questions without needing sources, they're all similar.

For Math and Reasoning

Winner: DeepSeek R1 or o1

Reasoning-focused models like DeepSeek R1 and OpenAI's o1 significantly outperform general-purpose models on complex math and logic problems. R1 is much cheaper.

For Long Documents

Winner: Claude

200K context window, and it actually uses that context effectively. For analyzing long PDFs, code reviews of large repos, or book-length analysis, Claude.
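
If you want a quick sanity check on whether a document will fit, here is a rough Python sketch using the context window sizes quoted in this article. It assumes about 4 characters per token, which is only a rule of thumb for English text; real tokenizers differ by model and language. The function names and the 4,000-token reserve for the reply are my own assumptions for illustration.

```python
# Context windows in tokens, as quoted in this article.
CONTEXT_WINDOWS = {
    "ChatGPT (GPT-4o)": 128_000,
    "Claude": 200_000,
    "Gemini (standard)": 128_000,
    "Gemini (experimental)": 1_000_000,
    "DeepSeek": 64_000,
}

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def models_that_fit(text: str, reserve_for_reply: int = 4_000) -> list[str]:
    """List the models whose window holds the text plus room for a reply."""
    needed = estimate_tokens(text) + reserve_for_reply
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


# Stand-in for a long document: 600,000 characters.
document = "word " * 120_000
print(estimate_tokens(document))
print(models_that_fit(document))
```

Fitting in the window is only the first hurdle. How well a model actually uses a long context is a separate question, and a character count can't answer it.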

For Budget Users

Winner: DeepSeek V3

If you're watching costs — especially for API usage — DeepSeek offers near-frontier quality at dramatically lower prices.

The Privacy Question

Most privacy-conscious (cloud): Claude or ChatGPT (with data opt-outs enabled)
Most privacy-conscious (absolute): DeepSeek or Llama running locally

If you're processing sensitive data:

Check each provider's data usage policies
Consider enterprise plans with better data guarantees
Run local models for truly sensitive work

DeepSeek being Chinese means something different to different people. For personal use, probably fine. For business data subject to compliance requirements, ask your legal team.

My Personal Setup

I use Claude as my primary. The writing quality is noticeably better for my use cases, and I often work with long documents.

I use ChatGPT for voice conversations and when I need image generation.

I use DeepSeek R1 for complex reasoning tasks and when I need to process large batches cheaply.

I rarely use Gemini, but I'm not deep in the Google ecosystem.

Just Pick One

If you're not sure: start with Claude or ChatGPT's free tier. Use it for a week. Then try the other one.

The differences matter less than actually learning to use these tools well. Good prompting and understanding each model's limitations matter more than which model you pick.

They're all pretty good. Pick one, get good at it, and experiment with others when you hit limitations.
