The Oath

Gary Marcus

Rank 11 of 47 | Score 83
Gary Marcus
@GaryMarcus
--
@DeryaTR_ We aren’t asymptoting at 95 or something like that, but yes, there is discussion of that in the thread
4/12/2024, 12:48:41 PM
X
In reply to:
Derya Unutmaz, MD
@DeryaTR_
·
658d
@GaryMarcus I believe 100 is the max on the Y axis of this 📊. What I see is MMLU approaching that max.
Gary Marcus
@GaryMarcus
·
659d
What happens when you plot GPT-2, 3, 4, and Turbo side-by-side?

Below I have plotted one common measure, MMLU, for which there are easy-to-find data going back to GPT-2. (There may be others with data going back that far; this is just a first quick attempt.)


What I see is an…
Gary Marcus
@GaryMarcus
·
659d
Could we see GPT 3 and 3.5 and GPT 4 on the same plot? And Gemini Pro 1.5 and Claude Opus?

The statement is a technical clarification in a conversation about AI model performance, specifically discussing the MMLU benchmark scores of various GPT models. It does not engage in public discourse as it does not address broader societal issues or policies.


© 2023-2024 The Oath, All rights reserved.