
2025-02-22 11:02:35

annam@nerdculture.de

Discussing using AI to generate alternative texts again. Does anyone have good examples of bad generation? #Accessibility #altText

2025-02-22 15:12:13

Jupiter Rowland

jupiter_rowland@hub.netzgemeinde.eu

@Anna Maier I don't know what constitutes a "good" example in your opinion, but I've got two examples of how bad AI is at describing images with extremely obscure niche content, much less explaining them.

In both cases, I had the Large Language and Vision Assistant (LLaVA) describe one of my images, each time a rendering from within a 3-D virtual world. And then I compared it with my own description of the same image.

That said, I didn't compare the AI description with my short description in the alt-text. I went all the way and compared it with my long description in the post, tens of thousands of characters long, which includes extensive explanations of things that the average viewer is unlikely to be familiar with. This is what I consider the benchmark.

Also, I fed the image to the AI at the resolution at which I posted it, 800x533 pixels. But I myself didn't describe the image by looking at the image. I described it by looking around in-world. If an AI can't zoom in indefinitely and look around obstacles, and it can't, that's actually a disadvantage on the side of the AI and not an unfair advantage on my side.

So without further ado, exhibit A:

This post contains

an image with an alt-text that I've written myself (1,064 characters, including only 382 characters of description and 681 characters of explanation where the long description can be found),
the image description that I had LLaVA generate for me (558 characters),
my own long and detailed description (25,271 characters).

The immediate follow-up comment dissects and reviews LLaVA's description and reveals where LLaVA was too vague, where LLaVA was outright wrong and what LLaVA didn't mention although it should have.

If you've got some more time, exhibit B:

Technically, all this is in one thread. But for your convenience, I'll link to the individual messages.

Here is the start post with

an image with precisely 1,500 characters of alt-text, including 1,402 characters of visual description and 97 characters mentioning the long description in the post, all written by myself,
my own long and detailed image description (60,553 characters).

Here is the comment with the AI description (1,120 characters; I had asked for a detailed description).

Here is the immediate follow-up comment with my review of the AI description.

#Long #LongPost #CWLong #CWLongPost #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #AI #LLaVA #AIVsHuman #HumanVsAI