Flutterby™! : Gemini lying about arithmetic

Next unread comment / Catchup all unread comments User Account Info | Logout | XML/Pilot/etc versions | Long version (with comments) | Weblog archives | Site Map | | Browse Topics

Gemini lying about arithmetic

2025-06-18 21:25:05.919478+02 by Dan Lyke 0 comments

This discussion about Google Gemini 2.5 giving a wrong answer with some arithmetic is interesting, especially in discussions about trying it multiple times, and how sometimes it'll use an actual calculator and get the correct answer, and how sometimes it'll say it used a calculator and still give incorrect answers.

Via.

I've just started passing the "okay, you can use external tools" flag to various LLM APIs (I'm using an abstraction layer written by a coworker), and this sense of what it tells us what it's doing vs what it's actually doing, or not doing, is only gonna get worse.

[ related topics: Mathematics Community ]

comments in ascending chronological order (reverse):

Add your own comment:

(If anyone ever actually uses Webmention/indie-action to post here, please email me)




Format with:

(You should probably use "Text" mode: URLs will be mostly recognized and linked, _underscore quoted_ text is looked up in a glossary, _underscore quoted_ (http://xyz.pdq) becomes a link, without the link in the parenthesis it becomes a <cite> tag. All <cite>ed text will point to the Flutterby knowledge base. Two enters (ie: a blank line) gets you a new paragraph, special treatment for paragraphs that are manually indented or start with "#" (as in "#include" or "#!/usr/bin/perl"), "/* " or ">" (as in a quoted message) or look like lists, or within a paragraph you can use a number of HTML tags:

p, img, br, hr, a, sub, sup, tt, i, b, h1, h2, h3, h4, h5, h6, cite, em, strong, code, samp, kbd, pre, blockquote, address, ol, dl, ul, dt, dd, li, dir, menu, table, tr, td, th

Comment policy

We will not edit your comments. However, we may delete your comments, or cause them to be hidden behind another link, if we feel they detract from the conversation. Commercial plugs are fine, if they are relevant to the conversation, and if you don't try to pretend to be a consumer. Annoying endorsements will be deleted if you're lucky, if you're not a whole bunch of people smarter and more articulate than you will ridicule you, and we will leave such ridicule in place.


Flutterby™ is a trademark claimed by

Dan Lyke
for the web publications at www.flutterby.com and www.flutterby.net.