Foggyminds Social

Aaron Toponce ⚛️:debian:

4 months ago • •

Generative AI is garbage, exhibit 99:

#ai #math

Discussion with ChatGPT 4o:

User: 9.11 and 9.9 - which is bigger
ChatGPT: 9.11 is bigger than 9.9
User: subtract them
ChatGPT: To subtract 9.9 from 9.11, you perform the following calculation: 9.11 - 9.99 = 0.21.
User: use python
ChatGPT: The result of subtracting 9.9 from 9.11 in Python is -0.79. This discrepancy is due to the way floating-point arithmetic is handled in Python, which can lead to small precision errors. The actual intended result should be 0.21.
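
For reference, the arithmetic in the exchange above can be checked directly. This is a minimal sketch, assuming standard Python floats plus the standard-library `decimal` module; the variable names are only illustrative. It suggests that -0.79 is the mathematically correct difference and that floating-point rounding only perturbs digits far to the right, so rounding error would not account for a result of 0.21.

```python
from decimal import Decimal

a, b = 9.11, 9.9

# As decimal numbers, 9.11 is smaller than 9.9 (9.11 < 9.90).
is_bigger = a > b
print(is_bigger)

# Binary floats give a value extremely close to -0.79; the rounding error
# is many orders of magnitude smaller than the gap between -0.79 and 0.21.
float_diff = a - b
print(float_diff)

# Decimal arithmetic on the literal strings avoids binary rounding entirely.
exact_diff = Decimal("9.11") - Decimal("9.9")
print(exact_diff)
```

Comparing `float_diff` against `exact_diff` shows how small the floating-point discrepancy actually is.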

Shiri Bailem

in reply to Aaron Toponce ⚛️:debian: • 4 months ago •

@Aaron Toponce ⚛️:debian: ... so you prove "someone" lacks all intelligence by challenging their math disability?

It only has a handful of cognitive functions, none of which are good at math. Honestly, it does better than I expected.

Jobu Tupaki

in reply to Shiri Bailem • 4 months ago • •

ChatGPT is not a person, and "garbage" is not the opposite of intelligence.

This entry was edited (4 months ago)

Shiri Bailem

in reply to Jobu Tupaki • 4 months ago •

@Jobu Tupaki that's why someone is in quotes, I've gone from understanding people being frustrated with AI to outright irate at people trying to constantly claim victory over things it's not even designed to do...

It exhibits some cognitive functions; "intelligence" is just a junk term that most people use to mean either "magic" or "just plain human". It's not designed to do math, and it doesn't have any of the cognitive abilities to actually process math. The fact that it can figure out the correct math functions half the time is hugely impressive, as is the fact that it knows, when asked to use Python, that floating-point errors are a thing and will affect the operation because it contains a floating-point value.

If you're going to complain about issues around artists, I'll leave you alone aside from minor factual statements. But the whole "AI is useless" or "AI is the new NFT" angles are just speaking out your ass. (To be clear, NFTs as we know them are a bastardization of a function in new and awful ways, the underlying function being little better than a thought experiment with little practical application)

For something that straight up has value in a variety of fields, let alone the effect it has on providing disability accommodation tools (and that's not hypothetical, I've used them for my own accommodations), I'm simply at the point of telling people to go fuck themselves for being high and mighty about it.

words_number

in reply to Shiri Bailem • 4 months ago • •

@shiri @RubyTuesdayDONO On that basis, LLMs can't be criticized for their output at all, because the only thing they are "designed to do" is create answers that look like they could have been written by a human.

"Oh, you got severely poisoned by chlorine gas while trying to clean your washing machine? Well, you shouldn't have done what the LLM suggested, it's not designed to give advice for cleaning washing machines!" 👏

Beyuum

in reply to words_number • 4 months ago • •

@words_number @shiri @RubyTuesdayDONO i mean, if it convinced you to do that, doesn't that mean it sounds like a human?

words_number

in reply to Beyuum • 4 months ago • •

@Beyuum @shiri Certainly! That's why I couldn't criticize it for that 🤷

equi

in reply to Shiri Bailem • 2 weeks ago • •

@shiri @RubyTuesdayDONO you've introduced the term "intelligence" to this discussion. The OP not only called it "garbage" but also alluded to this not being an isolated example. You're arguing it shouldn't be judged by this single example and domain - but it isn't. It sucks quite frequently, and often enough in dangerous ways.

You're also (unintentionally, I think) applying some eristic dialectic (search it) here, please be a bit careful what is actually argued.

purple is mysterious

in reply to Shiri Bailem • 4 months ago • •

@shiri wait sorry are you suggesting it's ableist to expect a literal electronic computer to do half accurate maths?

Shiri Bailem

in reply to purple is mysterious • 4 months ago •

@purple is surprised by the French @Aaron Toponce ⚛️:debian: no, I'm saying it's ignorant as fuck to use this as an argument that it's useless.

Shiri Bailem

in reply to Shiri Bailem • 4 months ago •

@Aaron Toponce ⚛️:debian: @purple is surprised by the French like trying to say notepad is garbage because it can't play videos.

·J Mopp

in reply to Shiri Bailem • 4 months ago • •

We know LLMs are ill-suited to doing maths. The problem is that a good chunk of people don't know that, believing either that they can replace their entire workforce with it or that it will lead us to AGI, which is why we are shouting from the rooftops about it. People having tempered expectations about what LLMs can do is fundamentally a good thing.

This entry was edited (4 months ago)

Veronica Olsen 🏳️‍🌈🇳🇴🌻

in reply to Shiri Bailem • 4 months ago • •

@shiri @purple But nobody expects notepad to play video. That *is* the entire point here. It's true that people generally don't know what LLMs are, and the ridiculous marketing of them doesn't help, so these examples are great at pointing out their limitations. We need more of this, not less.

Shiri Bailem

in reply to Veronica Olsen 🏳️‍🌈🇳🇴🌻 • 4 months ago •

@Veronica Olsen 🏳️‍🌈🇳🇴🌻 @purple is surprised by the French @Aaron Toponce ⚛️:debian: this @·J Mopp I wouldn't bat an eye if it wasn't under the heading of "AI is garbage", OP is trying to argue that it has no value at all because it can't do this one thing.

If it was "Don't trust what AI tells you" or "AI isn't good at everything" I'd probably be boosting it instead.

Anthony Horton

in reply to Shiri Bailem • 4 months ago • •

@shiri @veronica @purple @jmopp What makes this evidence of generative AI being garbage is that it can't do maths but confidently gives you incorrect answers anyway, and even doubles down when challenged.

Notepad can't play videos, no, but if you try to make it load a video file anyway it quickly and clearly becomes apparent to the user that it can't play videos.

Generative AI also has a limited scope of things that it can do (e.g. SEO filler content & misinfo bots) but unlike Notepad it will attempt to do everything else too, and confidently serve up nonsense with no easy way for the user to know when this is happening. That's a fundamental difference and makes it deserving of the label garbage, IMO.

Acvaristul Lenes

in reply to Anthony Horton • 4 months ago • •

@spacelizard @shiri @veronica @purple @jmopp that's perfectly valid for most of the doctors I went to.

purple is mysterious

in reply to Veronica Olsen 🏳️‍🌈🇳🇴🌻 • 4 months ago • •

@veronica @shiri …and nobody has ever marketed notepad as the ‘solution’ to video, or really anything for that matter, whereas major companies are spending millions telling us LLMs are the answer to pretty much everything, including making videos and video players, but also government and healthcare…

imdat celeste :v_tg: :v_nb: :v_genderfluid: [witchzard]

in reply to purple is mysterious • 2 weeks ago • •

@purple @veronica @shiri And the right answer for an LLM for a question like that must be "I am sorry, I can't answer math questions", not fuck up like that.

OpticalNail 🇵🇸

in reply to purple is mysterious • 4 months ago • •

@purple @shiri It's stupid to say that "AI is garbage" and post an image of an LLM getting math wrong. It is functioning pretty well as far as I can tell.

Lord Caramac the Clueless, KSC

in reply to Shiri Bailem • 4 months ago • •

@shiri Large Language Models are not intelligent, their perceived intelligence is part of the language itself, not of the models. #LLMs don't understand anything, they just memorise patterns and generate synthetic patterns that look similar to the ones they have learned. There is no mind in there, no reasoning.

Lord Caramac the Clueless, KSC

in reply to Lord Caramac the Clueless, KSC • 4 months ago • •

@shiri If anything, LLMs just show us how much perceived intelligence exists in the structures of human languages.

Shiri Bailem

in reply to Lord Caramac the Clueless, KSC • 4 months ago •

@Lord Caramac the Clueless, KSC @Aaron Toponce ⚛️:debian: age old argument that boils down to not understanding that intelligence is far fuzzier a concept than you're willing to accept.

When defining intelligence you're either going to exclude a lot of people you would consider intelligent, or you're going to have to accept that we're not the only ones with intelligence.

I'm willing to accept that we're not the only ones, and the definition I accept is "possesses cognitive abilities", with my prime example being the cognitive ability that it is better at than me: Executive Function.

Shiri Bailem

in reply to Shiri Bailem • 4 months ago •

@Aaron Toponce ⚛️:debian: @Lord Caramac the Clueless, KSC Just for reference: I'm ADHD and that means my executive function is severely impaired, someone without impairment is probably better at it than the LLM, but in this case it's exhibiting a cognitive function that exceeds the capacities of some human beings.

It's one of the popular assistive uses of LLMs, see goblin.tools/ for that purpose.

Lord Caramac the Clueless, KSC

in reply to Shiri Bailem • 4 months ago • •

@shiri I know, I myself am struggling with the very same thing, as you could easily tell if you saw me now sitting next to a huge pile of dirty laundry that should have been washed two weeks ago and a whole lot of unfinished projects, some of which haven't been worked on in years.

Powersource

in reply to Shiri Bailem • 4 months ago • •

@shiri omg goblin.tools/ looks amazing!

Fabian Transchel

in reply to Shiri Bailem • 4 months ago • •

@shiri @LordCaramac

"When defining intelligence you're either going to exclude a lot of people you would consider intelligent, or you're going to have to accept that we're not the only ones with intelligence."

While what you say is true, it in no way follows that LLMs cannot not be intelligent.

LLMs are not intelligent, because they by design predict the next token in a deterministic manner.

It doesn't meet *ANY* sensible definition of intelligence, not even your own.

br00t4c

in reply to Lord Caramac the Clueless, KSC • 4 months ago • •

@LordCaramac @shiri 💯

Sofia ☭🇧🇷☭

in reply to Shiri Bailem • 4 months ago • •

@shiri Chatbots aren't people, but when people go around confidently spouting stuff that's this wrong they are fair game to be mocked. An actual person with a disability who isn't able to answer these questions would just say "I don't know"

Sammy 🐾

in reply to Shiri Bailem • 4 months ago • •

it's a piece of software, it has no cognitive function

kami_kadse

in reply to Shiri Bailem • 4 months ago • •

the point is: it doesn't have *any* cognitive functions.

This entry was edited (4 months ago)

Steven Capobianco

in reply to Shiri Bailem • 4 months ago • •

I don’t agree with this mindset.

If you sell a product designed for specific functions (computation in this case) and it does not do that specific function, then it needs more work.

Especially math where it’s very rigid with how problems are solved.

This entry was edited (4 months ago)

TrackerRoo :verified:

in reply to Shiri Bailem • 4 months ago • •

@shiri McDonald's fired their drive through AI because it kept screwing up orders. It's about as useless as a fart in a blizzard. There's nothing to defend.

missed_sla

in reply to Shiri Bailem • 4 months ago • •

@shiri The argument in favor of AI is now "computers aren't good at math" gotcha

cholling

in reply to Shiri Bailem • 4 months ago • •

It has "cognitive functions"? How so?

Iwillyeah

in reply to Shiri Bailem • 4 months ago • •

@shiri it mimics a handful of cognitive functions? Which is very useful in some cases. It becomes less useful as it becomes less trustworthy in its output, but if the stakes are low you can probably rely on it to make life easier in lots of little ways.

Embedded \n

in reply to Shiri Bailem • 4 months ago • •

@shiri "See this computer? We made it bad at arithmetic. Please be nice to it" Can you hear yourself?

ferret

in reply to Shiri Bailem • 4 months ago • •

@shiri a mathematically impaired human might say "I don't know" or "I'm not sure"

Xe :verified:

in reply to Aaron Toponce ⚛️:debian: • 4 months ago • •

This is actually a tokenization error. 9.11 looks larger than 9.9 because 11 tokenizes as a single unit and 11 is usually larger than 9.
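
To illustrate the intuition in the comment above, here is a rough sketch of what happens when the digits after the dot are compared as whole integers rather than as decimal fractions. This is not how a real tokenizer or model works internally; `compare_piecewise` is a hypothetical helper written only for this illustration, and it behaves like a version-number comparison.

```python
def compare_piecewise(x: str, y: str) -> bool:
    """Hypothetical helper: return True if x looks bigger than y when each
    dot-separated chunk is compared as a whole integer (version-number style)."""
    x_parts = tuple(int(part) for part in x.split("."))
    y_parts = tuple(int(part) for part in y.split("."))
    return x_parts > y_parts

# Chunk-wise reading: (9, 11) vs (9, 9), so "9.11" looks bigger.
piecewise_says_bigger = compare_piecewise("9.11", "9.9")
print(piecewise_says_bigger)

# Decimal reading: 9.11 vs 9.90, so 9.11 is smaller.
decimal_says_bigger = float("9.11") > float("9.9")
print(decimal_says_bigger)
```

The two readings disagree, which is the same confusion as treating 9.11 as a software version rather than a decimal number.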

Shiri Bailem

in reply to Xe :verified: • 4 months ago •

@Xe :verified: @Aaron Toponce ⚛️:debian: I'm really curious about this because frankly I'm surprised it was even conceptually close

Xe :verified:

in reply to Shiri Bailem • 4 months ago • •

@shiri You're thinking about numbers the way humans do. Imagine if this was how you conceptualized numbers:

Shiri Bailem

in reply to Xe :verified: • 4 months ago •

@Xe :verified: @Aaron Toponce ⚛️:debian: yeah, but as far as I know it doesn't really have any capacity for computation, just association and linguistic understanding. And it sounded like you had insight on how it reached the conclusion of 0.21

Gustavo

in reply to Xe :verified: • 4 months ago • •

@cadey In other words, despite all efforts to make math work better with LLMs, like adding Python support, it's still bad at it. Also it inherited the overconfidence from the dataset, which should include Reddit.

Ryan Dormanesh

in reply to Aaron Toponce ⚛️:debian: • 4 months ago • •

wow you are either faking it or a really bad prompt engineer: chatgpt.com/share/15cf4411-f27…

flere-imsaho

in reply to Ryan Dormanesh • 4 months ago • •

@Hexa there's always one promptfondler in the thread that doesn't understand that you can't get fully repeatable answers from the confabulation engine, and that any answer to that question is a valid answer within the llm paradigm, no matter if it's incorrect or not.

(there's also another promptfondler who thinks that the problem is just in one particular llm, not in the way llm works)

@atoponce

Ryan Dormanesh

in reply to flere-imsaho • 4 months ago • •

fair point. I apologize. Also “promptfondler” 😆 I’ve never heard that one.

This entry was edited (4 months ago)

Emma (has_many_books of old)

in reply to flere-imsaho • 4 months ago • •

@mawhrin @Hexa came for the promptfondler, stayed for the confabulation engine

Noah Cook

in reply to Aaron Toponce ⚛️:debian: • 4 months ago • •

I wonder: you know how virtual assistants are given feminine names and voices (Siri, Alexa)? And you know how there is a persistent false belief that women are somehow worse at math than men?

I have to wonder whether that combination of biases has any influence on the programmers who create these LLMs? I mean on top of all of the other biases and misunderstandings they already have about neuroscience and language? Are they creating their own stereotype of a ditzy secretary?

flere-imsaho

in reply to Noah Cook • 4 months ago • •

@UncivilServant this has nothing to do with biases; llms don't produce correct answers, they produce statistically-probable text completion. @atoponce

Bill Zaumen

in reply to Aaron Toponce ⚛️:debian: • 4 months ago • •

For the example - ChatGPT botching arithmetic - it actually passed the Turing test. Once in a store, I ordered 2.2 lb of some deli item, and the scale registered 2.02. The guy behind the counter called 2.20 "two point twenty" and 2.02 "two point two". The scale always showed two digits past the decimal point. This guy basically made the same mistake as ChatGPT.

flere-imsaho

in reply to Bill Zaumen • 4 months ago • •

@bzdev nah. don't anthropomorphise a statistical engine.

Erik Jonker

in reply to Aaron Toponce ⚛️:debian: • 4 months ago • •

why would anybody think a large language (!) model should be able to do math?

flere-imsaho

in reply to Erik Jonker • 4 months ago • •

@ErikJonker the authors, apparently, and it is being sold to the public as a universal answering and search engine. @atoponce

Erik Jonker

in reply to flere-imsaho • 4 months ago • •

@mawhrin by using ChatGPT-4 a lot, you quickly find out what it's good for and what it isn't. In my experience, it's limited but does have added value.
