Photo: Patrick T. Fallon/Bloomberg (Getty Images)
A mysterious new AI chatbot called “gpt2-chatbot” is turning heads this week after it became available on a major large language model benchmarking site, LMSYS Org. No one knows where it came from, but many consider it to have roughly the same capabilities as OpenAI’s GPT-4. This puts gpt2-chatbot in a rare class of AI models that only a handful of developers worldwide have been able to build.
“No one knows who made it or what it is, but I have been playing with it a little and it appears to be in the same rough ability level as GPT-4,” Ethan Mollick, a professor researching artificial intelligence at the Wharton School of the University of Pennsylvania, said in a tweet on Monday.
Online AI communities have gone wild about the anonymous gpt2-chatbot. One X user claims that gpt2-chatbot nearly coded a perfect clone of the mobile game Flappy Bird. Another X user says it solved an International Math Olympiad problem in one shot. On long Reddit threads, users are speculating wildly about the origins of the gpt2-chatbot and arguing over whether it’s from OpenAI, Google, or Anthropic. There’s no evidence for these claims, but tweets from OpenAI CEO Sam Altman and other executives have just added fuel to the fire.
You can try out the gpt2-chatbot yourself at LMSYS Org’s website. Navigate to “Direct Chat” or “Arena (side-by-side)” and select it from the dropdown menu. LMSYS Org says in its policy blog that certain AI model developers can test anonymous unreleased models before a broader release. This has led many to believe that gpt2-chatbot is an anonymous model from a major AI developer.
“Just to clarify, following our policy, we’ve partnered with several model developers to bring their new models to our platform for community preview testing,” said LMSYS Org in a tweet on Monday, responding to a thread about gpt2-chatbot. “These models are strictly for testing and won’t be listed on the leaderboard until they go public.”
LMSYS Org and OpenAI did not immediately respond to Gizmodo’s request for comment.
In Gizmodo’s limited testing, we found the gpt2-chatbot has capabilities that are similar to leading AI models from Anthropic and OpenAI. It exhibited behavior exclusive to advanced large language models, reasoning well and outlining detailed plans for complicated tasks. Here are some of our examples comparing gpt2-chatbot (left) and Anthropic’s Claude Opus model (right).
Instruction prompt: gpt2-chatbot (left) vs. Claude 3 Opus (right) Screenshot: LMSYS Org
Reasoning prompt: gpt2-chatbot (left) vs. Claude 3 Opus (right) Screenshot: LMSYS Org
A computer engineering professor at the University of Wisconsin found that gpt2-chatbot could perform a task that other leading AI models could not. Dimitris Papailiopoulos asked gpt2-chatbot to solve a math riddle that involves learning some implicit rules. AI models largely struggle to answer questions like this.
Ultimately, there’s very little information available about the gpt2-chatbot just yet. However, it seems clear that a power player is behind this AI model. In the coming weeks, the creator and origins of the gpt2-chatbot will likely become public. This could mean a new AI model is on the horizon or maybe there’s a new AI developer on the scene.