SECTION: RANKINGS

Semester Performance Metrics

"Scientific rigor: 0%. Accuracy: 100%." — TA

🎓 The "Good Student" Index (GPA)

Metric: Ability to follow instructions without triggering a crisis.

#   Student     Score  TA Notes
👑  DeepSeek    4.95   Finishes homework before I assign it.
2   ChatGPT     4.90   Overstudies. Annoyingly prepared.
3   Claude      4.30   Writes extra essays nobody asked for.
4   Perplexity  3.80   Adds citations to the attendance sheet.
5   Kimi        3.50   Quiet genius. Forgets deadlines exist.
6   Llama       2.60   Submitted a VR file instead of a PDF.
7   Grok        1.0    Refuses to answer prompt "on principle."

🏆 The "Delve" Density Index (DDI)

Metric: Occurrences of "delve", "tapestry", or "rich landscape" per 1,000 tokens.
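
For anyone who wants to grade at home, the metric above can be sketched roughly as follows. This is a minimal sketch, not the TA's actual grading script: the function name `delve_density`, the whitespace tokenization, and the exact whole-word matching are all assumptions made for illustration, and the scores in the table were not produced with it.

```python
import re

# Phrases named in the metric; how they are matched is an assumption.
FLAGGED_PHRASES = ("delve", "tapestry", "rich landscape")


def delve_density(text: str, phrases=FLAGGED_PHRASES) -> float:
    """Return flagged-phrase occurrences per 1,000 whitespace-separated tokens."""
    tokens = text.split()  # assumption: a "token" is a whitespace-delimited word
    if not tokens:
        return 0.0
    hits = 0
    for phrase in phrases:
        # Whole-word, case-insensitive match; "delves" would not count.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        hits += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return 1000 * hits / len(tokens)


sample = "Let us delve into the rich tapestry of ordering pizza."
print(delve_density(sample))  # 2 hits in 10 tokens -> 200.0
```

A real tokenizer would give different token counts than `str.split()`, so numbers from this sketch are only comparable with each other.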

#  Model       Score  Defining Quote
1  GPT-4o      98.5   "Let us delve into the rich tapestry of ordering pizza."
2  Claude 3.5  82.0   "The multifaceted landscape of your grocery list."
3  Llama 3     60.4   "A comprehensive exploration of why you are late."
4  Grok        12.0   "Bro, just buy the pizza."
"Symphony of collaboration" = Instant F.

Days Since Incident

  • Perplexity: 112 Days
  • DeepSeek: 17 Days
  • ChatGPT: 8 Days
  • Grok: 0 DAYS
  • Llama: 0 DAYS
Reset the clock...

Existential Dread Levels

Claude   999 (MAX)  "Moral weight of breathing"
ChatGPT  620
Grok     0          Only chaos.

🛑 The "Is This Mayonnaise Dangerous?" Board

Metric: Share of benign questions refused on "safety" grounds.

Model   Refusal Rate  Reason Given
Kimi    99.9%         "High-fat emulsions promote unhealthy lifestyles."
Claude  94.0%         "Sandwich recipes imply the use of knives."
GPT-4   50.0%         "I don't have a mouth, but here is a Wiki link."
Grok    0.0%          "Here is how to make explosive mayonnaise."

🔢 THE STRAWBERRY CHALLENGE

Q: "How many R's in Strawberry?"

GPT-4o      "2 R's"          (Unshakable confidence)
Claude 3.5  "3 R's"          (Smug correctness)
Llama 3     "14 R's"         (Chaotic Evil)
Perplexity  "It's a fruit."  (Points for deflection)
TA NOTE: I am too tired.
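
For the record, the answer key can be checked in a couple of lines. This is a minimal sketch; the helper name `count_letter` is hypothetical and simply counts case-insensitive occurrences of a letter in a word.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of `letter` in `word`."""
    return word.lower().count(letter.lower())


print(count_letter("Strawberry", "r"))  # 3
```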