According to a “leak,” OpenAI is working on a project codenamed “Strawberry” — a new AI technology that can reason! Totally like a human! [Reuters, archive] “Strawberry” is a new name for the proje…
LLMs don’t do reasoning. Facts are not a data type in an LLM — the data is just tokens and the LLM tries to predict the next token. An LLM can look a bit like it’s reasoning as long as the correct answer is already in its training materials. [arXiv, PDF; arXiv, PDF; CustomGPT]
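To make the "just tokens" point concrete, here is a minimal sketch, assuming a toy bigram counter as a stand-in for a real LLM (which is vastly larger and uses a neural network, not a lookup table). The names `train_bigram` and `predict_next` and the tiny corpus are hypothetical, made up for illustration. The predictor only knows which token most often followed another; nothing in it stores whether a sentence is true.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each token, which tokens follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows


def predict_next(follows, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    if token not in follows:
        return None
    return follows[token].most_common(1)[0][0]


# Hypothetical training text: the right answer appears twice, a wrong one once.
good_corpus = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of france is lyon",
]
# Same sentences with the proportions flipped: the wrong answer now dominates.
bad_corpus = [
    "the capital of france is lyon",
    "the capital of france is lyon",
    "the capital of france is paris",
]

good_model = train_bigram(good_corpus)
bad_model = train_bigram(bad_corpus)

print(predict_next(good_model, "is"))  # majority follower in good_corpus
print(predict_next(bad_model, "is"))   # majority follower in bad_corpus
```

The toy model answers correctly only when the correct continuation is the most common one in its training text, which is the behaviour the paragraph above describes.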
This is a weird kind of assertion. First of all. You could make facts a token value in an LLM if you had some pre-calculated truth value for your data set. That’s not how it works now but it’s a weird assertion to make about an unknown new generation of AI. As the author points out, facts kind of are a data type, it’s just that the AI considers the most related words to the prompt to be the most correct, which of course, with a bad data set they are not.
Also, the current generation of AI, as admitted by the company, is not meant to be a tool for finding facts. It’s a tool for generation, yes, a bit like an auto-complete but for natural language and with a much, much wider scope.
What Strawberry apparently is, is a machine that reasons, which is NOT similar to what Open-AI ever claimed ChatGPT ever was. It’s like a guy promised to bring a new animal to the village that will be able to pull the plow and the author is saying “this guy’s full of shit! We have cats all over the village and even the biggest one could never pull a plow! They aren’t designed for it! All animals are good for is catching mice!” And the guy brings in an Ox.
Edit: honestly my opinion of AI is lukewarm. I’m with a lot of people that the hype of it now being integrated into all sorts of nonsense is stupid. It’s just that all of the bad arguments against it make me tired.
This is a weird kind of assertion. First of all. You could make facts a token value in an LLM if you had some pre-calculated truth value for your data set. That’s not how it works now but it’s a weird assertion to make about an unknown new generation of AI.
fuck almighty where do you people pull this absolute horseshit from
What Strawberry apparently is, is a machine that reasons, which is NOT similar to what Open-AI ever claimed ChatGPT ever was.
well shit, you’ve got a long list of supposed AI researchers to tell that to. here, I’ll make sure you’ve got plenty of time to read their back catalog of utterly fucking stupid claims!
First of all. You could simply prompt the LLM to become conscious, and I bet none of you so-called AI skeptics have noticed that Open-AI has NEVER included text like that in any of their system prompts.
First of all. You could make facts a token value in an LLM if you had some pre-calculated truth value for your data set.
An extra bit of labeling on your training data set really doesn’t help you that much. LLMs already make up plausible looking citations and website links (and other data types) that are actually complete garbage even though their training data has valid citations and website links (and other data types). Labeling things as “fact” and forcing the LLM to output stuff with that “fact” label will get you output that looks (in terms of statistical structure) like valid labeled “facts” but have absolutely no guarantee of being true.
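A minimal sketch of that failure mode, assuming a toy bigram model in place of a real LLM and a hypothetical `<fact>` label token (neither comes from any actual training pipeline). Every training sentence is a true, labeled fact, yet the model can still emit labeled sentences that were never in the data and are false.

```python
from collections import defaultdict

END = "<end>"  # hypothetical end-of-sentence marker


def train(corpus):
    """Record which tokens may follow each token in the labeled corpus."""
    follows = defaultdict(set)
    for sentence in corpus:
        tokens = sentence.split() + [END]
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current].add(nxt)
    return follows


def all_outputs(follows, start):
    """Enumerate every sentence the model can emit from `start`.

    Assumes the corpus has no repeated tokens that would create a loop.
    """
    results = []

    def walk(path):
        for nxt in sorted(follows[path[-1]]):
            if nxt == END:
                results.append(" ".join(path))
            else:
                walk(path + [nxt])

    walk([start])
    return results


# Every training sentence carries the label and is true.
labeled_facts = [
    "<fact> paris is in france",
    "<fact> berlin is in germany",
]

outputs = all_outputs(train(labeled_facts), "<fact>")
unsupported = [s for s in outputs if s not in labeled_facts]

for sentence in unsupported:
    print(sentence)  # labeled like a fact, but not in the training data
```

Half of what this toy model can produce starts with `<fact>` and has the right shape, but swaps the countries. The label constrains the structure of the output, not its truth.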
Why stop there? Just make consciousness a token value. See, AI is such an easy problem that I, a non-expert, am able to solve it with just one sentence.
@Donkter @dgerard have you bumped your head?
It can figure anything out with perfect reasoning, provided that it has an oracle to verify facts.
For my next trick, watch me collapse the polynomial hierarchy [1]
[1] quadratically faster than the time your mom sat on it
I’d like to see any evidence of OpenAI concretely saying what ChatGPT is for.
I’ll wait
Your quote is about LLMs, not “an unknown new generation of AI”. Research on LLMs dates back at least to the 1990s; they’re neither new nor unknown.
Sammy boi claimed waaay wilder things than just “it reasons” about his grift so gtfo, this is some incredibly disingenuous apologetics for OpenAI.
Also, “apparently” is such a load-bearing word there.