Flutterby™! : LLMonday


LLMonday

2024-09-09 18:39:35.138866+02 by Dan Lyke

Baldur Bjarnason: The LLM honeymoon phase is about to end

The usefulness of LLMs was always overblown, but unless the AI vendors discover a new kind of maths to fix the problem, they’re about to have an AltaVista moment.

GPT-fabricated scientific papers on Google Scholar: Key features, spread, and implications for preempting evidence manipulation. Or: yep, LLMs are being used to generate exactly the sort of propaganda masquerading as research that you expected.

RT Matteo Collina @mcollina@fosstodon.org

Finally it happened to me as well: developers complaining that the behavior of my OSS libraries does not match what ChatGPT explains to them. 🤦‍♂️

In the replies, shrimp eating mammal 🦐 @walruslifestyle@octodon.social observes

this is a power game, and openai has the upper hand. if it's not already true, one day there will be "open source developers" who argue that they should modify their project to do what chatgpt says they should do. it'll help adoption, they'll say, it'll help accessibility, they'll say. user first, they'll say.

Which also means that it's going to be interesting to get adoption of new approaches to problems, because the frameworks by which people "understand" concepts will be limited by LLM behavior. (We already see this problem with people who use LLMs to "summarize" documents, because the LLM most certainly is not doing that.)

Governor Newsom seeks to harness the power of GenAI to address homelessness, other challenges. Given that Newsom has gone full-on "let's inflict more trauma on people experiencing trauma response", this bodes ill. (Via)

Edit: Pivot to AI: Promptfondler drama: Shumer’s ‘no-hallucination’ Reflection 70B turns out to be two other models in a trenchcoat

Others tested this new model — but Shumer’s claims didn’t check out. Reflection 70B had similar benchmark scores to Facebook’s LLaMA 3 70B — and lower than LLaMA 3.1, which Shumer had said it was based on. Reddit r/LocalLLaMA concurred — Reflection 70B was just LLaMA 3 with some extra tuning. [Twitter, archive; Reddit; VentureBeat]

Further testing suggested that Reflection 70B was, in fact, a front-end to Anthropic’s Claude 3.5 Sonnet using LLaMA 3 weights. HyperWrite filtered the string “Claude” in an attempt to hide this. [Twitter, archive; Twitter, archive; Reddit]




Flutterby™ is a trademark claimed by

Dan Lyke
for the web publications at www.flutterby.com and www.flutterby.net.