Flutterby™! : Apple & LLM reasoning


Apple & LLM reasoning

2024-10-12 16:46:36.108054+02 by Dan Lyke 0 comments

Apple Machine Learning Research: GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

Furthermore, we investigate the fragility of mathematical reasoning in these models and show that their performance significantly deteriorates as the number of clauses in a question increases. We hypothesize that this decline is because current LLMs cannot perform genuine logical reasoning; they replicate reasoning steps from their training data. Adding a single clause that seems relevant to the question causes significant performance drops (up to 65%) across all state-of-the-art models, even though the clause doesn't contribute to the reasoning chain needed for the final answer.
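To make the "single clause" idea concrete, here is a toy sketch of that kind of perturbation. It is not the paper's code or data: the template, the names, the number ranges, and the `make_variant` helper are all hypothetical, made up for illustration. It builds a grade-school word problem from a template, then optionally splices in a clause that sounds relevant but leaves the answer unchanged.

```python
import random

# Hypothetical question template; names and numbers are filled in at random.
TEMPLATE = (
    "{name} picks {a} apples on Monday and {b} apples on Tuesday. "
    "{distractor}"
    "How many apples does {name} have?"
)

# A clause that sounds relevant but does not change the answer.
DISTRACTOR = "{c} of the apples are a bit smaller than average. "


def make_variant(rng, with_distractor=False):
    """Return (question, answer) for one randomized instance of the template."""
    name = rng.choice(["Ava", "Ben", "Chloe"])
    a = rng.randint(10, 50)
    b = rng.randint(10, 50)
    c = rng.randint(1, 9)  # always drawn, so both variants consume the same random values
    distractor = DISTRACTOR.format(c=c) if with_distractor else ""
    question = TEMPLATE.format(name=name, a=a, b=b, distractor=distractor)
    return question, a + b  # the size of the apples never enters the sum


# Same seed for both calls, so the two questions differ only by the extra clause.
plain_q, plain_answer = make_variant(random.Random(0))
noisy_q, noisy_answer = make_variant(random.Random(0), with_distractor=True)

print(plain_q)
print(noisy_q)
print(plain_answer == noisy_answer)
```

The correct answer is the same for both questions by construction, so any difference in a model's accuracy between the two can only come from the extra clause.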

Via Charlie Stross @cstross@wandering.shop

Here in one paper is the probable reason why Apple abruptly pulled out of OpenAI's current funding round a week ago, after previously being expected to buy at least a billion bucks of equity.

(AI is peripheral to Apple's business model and not tarnishing their brand in the long term is more important than jumping on a passing fad.)
https://appdot.net/@jgordon/113294630427550275

Marcus on AI: LLMs don’t do formal reasoning - and that is a HUGE problem

[ related topics: Apple Computer Theater & Plays Art & Culture Mathematics Macintosh Education Artificial Intelligence ]

Flutterby™ is a trademark claimed by Dan Lyke for the web publications at www.flutterby.com and www.flutterby.net.