Jupiter Rowland

jupiter_rowland@hub.netzgemeinde.eu

Duothematic channel. Primary topic is virtual worlds/OpenSim, secondary topic is the Fediverse beyond Mastodon. This channel is NOT about real life!

Image descriptions in the Fediverse

2025-03-17 19:13:51

I have learned a lot about describing images according to Mastodon's standards, and I want to share my knowledge, but I haven't learned enough

It must have been two years ago that I learned about the importance of describing images in the Fediverse.

Now, I'm not someone who's easily satisfied with the absolute bare minimum. If I have to do it, I want to do it right. I want to do it the best I can. "Better than nothing" isn't good enough. In fact, this already holds true for the alt-text police and the Mastodon HOA. And if I have to describe my images, I want to be way ahead of them. I don't want my image descriptions to suddenly be sub-standard because Mastodon has kept raising its standards, but I haven't.

Perfecting image descriptions on Hubzilla because Mastodon requires them

So I've spent these past years educating myself about alt-text and image descriptions and researching what Mastodon users require, what Mastodon users want and what Mastodon users don't want. "Mastodon users" because, seriously, Mastodon is pretty much the only place in the Fediverse where image descriptions matter. Or it used to be until two months ago, when people who were on both Mastodon and Instagram suddenly started escaping from Meta Platforms, Inc., flocked to Pixelfed and brought Mastodon's accessibility rules with them. But that was only two months ago.

Until then, just about nobody outside Mastodon knew or cared about image accessibility. But if your content has a chance of ending up on some Mastodon timeline, it's pretty much mandatory.

For my research, I've had lots of sources of information. Various hashtags on mastodon.social, sometimes also on Mastodon instances targeted at disabled users. A whole number of webpages and blog articles about alt-text and image descriptions, even though they're mostly geared towards static commercial websites or HTML-formatted blogs. These webpages and articles keep contradicting what's happening on Mastodon, and what Mastodon users tend to love, but then again, they also contradict each other, e.g. mentioning a person's race vs never mentioning a person's race because that's racist.

Still, I've learned a lot.

Explanations matter

Early on during my research, I learned that Mastodon users love alt-text as sources of additional information on the topic of the image. What really got me to re-think my way of describing images was this toot by @Stormgren:

Stormgren wrote the following post Mon, 03 Jul 2023 18:20:44 +0200

Alt-text doesn't just mean accessibility in terms of low -vision or no-vision end users.

Done right also means accessibility for people who might not know much about your image's subject matter either.

This is especially true for technical topic photos. By accurately describing what's in the picture, you give context to non-technical viewers, or newbies, as to exactly what they're looking at, and even describe how it works or why it matters.

#AltText is not just an alternate description to a visual medium, it's an enhancement for everyone if you do it right.

(So I can't find any prior post of mine on this, so if I've actually made this point before, well, you got to hear a version of it again.)

This mattered. A lot. Because I didn't want to post real-life cat photos.

What I wanted to post images about, and what I actually had already started posting images about, was 3-D virtual worlds. Super-obscure 3-D virtual worlds based on OpenSimulator. Something that only maybe one in 200,000 Fediverse users knows anything about. Everyone else, so I figured, would require tons of explanation to be able to understand my image posts.

Something else I took away from this is that it's better to give people all information they may not have but need to understand your image post on a silver platter right away than to expect them to look up even only some of this information themselves. This goes doubly when you know that they won't even find that information.

And in fact, I also learned that neurodivergent people may require more extensive explanations than neurotypical people. I've actually had a neurodivergent Mastodon user thank me for an absolutely monstrous image description with an absolute info dump of explanations in it.

That early already, I also learned other things. For example, the rule that alt-text must not exceed 200 characters (or even only 125 characters) does not exist on Mastodon. Instead, many Mastodon users love long, extensive, detailed image descriptions. Well, if they want them, they shall have them.

Inclusion means the same chances for everyone, no matter how

Another thing was that accessibility means that blind or visually-impaired users must have the very same chances to experience an image as sighted users. There's no arguing that, I guess.

Again, my images are about 3-D virtual worlds. A kind of 3-D virtual worlds that have been referred to by the buzzword "metaverse" or "metaverses" as early as 2007, 14 years before Zuckerberg used that word, and I can prove that. What my images show may be referred to as "metaverse". Not an artistic impression of a metaverse, not an AI rendering of a metaverse, but actually existing, living, breathing 3-D virtual worlds that are being referred to as "metaverses", or whose network is being referred to as "the metaverse".

In short: The metaverse exists. And my images show it. They show the actually, really existing metaverse.

Chances are that this has sighted users on the edges of their seats in excitement. What do they do then? Only look at what matters in the image within the context of the post? Of course not! Instead, they go exploring this exciting, recently discovered, whole new universe by taking in all the big and small details in the image.

Now allow me to re-iterate: Accessibility means that blind or visually-impaired users must have the very same chances to experience an image as sighted users. Anything else equals ableism.

In this context, it means that blind or visually-impaired users must have the very same chances to take in all the big and small details in my images just the same as sighted users. But they can't see them. So I have to sit down and describe all the details in the image to them. And explain them, of course, if they don't understand them.

This was when my image descriptions really grew to titanic sizes.

Text transcripts in edge-cases

Also, if there is text in an image, it must be transcribed verbatim. My understanding of that is that any and all text that is anywhere within the borders of an image must be transcribed absolutely identically to the original. Now, I'm not talking about what's called flattened copy. I'm talking about signs or posters or logos or box art or the like strewn about the image.

This rule does not cover a number of edge-cases, though. For example, it does not cover text which is unreadable in the image as it is posted, but which whoever posts the image can source and thereby transcribe verbatim nonetheless. I figured that if no exception is explicitly made, then there is no exception for such text. If it can be transcribed, it must be transcribed. So my first long image description ended up with 22 individual text transcripts of various lengths.

The location where an image was taken, so I learned, should be mentioned, too, unless very good reasons speak against it. None of these reasons apply to my images from virtual worlds. Cue me not only mentioning where an image is from, but explaining it in more and more characters so that everyone understands it with no prior special knowledge required.

Alt-text for the image plus image description in the post

Now the question was where to put all that information. Into the post itself (which would inflate it to ridiculous lengths)? Into the alt-text like everyone on Mastodon (for which it would be too long at several thousand characters)? Into a reply (which would be inconvenient and stay entirely unnoticed by Mastodon users)?

I actually did a test run in the shape of (content warning: eye contact, alcohol) a thread containing the same post four times, each with a different way of describing the image in it. I cross-posted it to Lemmy to have people vote on which is the best place to describe an image. Describing the image in the post itself technically won, but the poll wasn't really representative: It only got five votes.

Then I got into an argument with @Deborah, a user with a physical disability that makes it impossible for her to access alt-text. Money quote from way down this comment thread:

Deborah wrote the following post Mon, 10 Jul 2023 23:30:45 +0200

@Jupiter Rowland

I have a disability that prevents me from seeing alt text, because on almost all platforms, seeing the alt requires having a screenreader or working hands. If you post a picture, is there info that you want somebody who CAN see the picture but DOESN’T have working hands to know? Write that in visible text. If you put that in the alt, you are explicitly excluding people like me.

But you don’t have to overthink it. The description of the image itself is a simple concept.

Her point was clear: Information only available in alt-text, but neither in the post text body nor in the image itself, is inaccessible and therefore lost to all those who cannot access alt-text. And not everyone can access alt-text.

From elsewhere, I learned that externally linked information is inconvenient and potentially inaccessible. Conclusion: It's generally best to provide all information necessary for understanding a post in the post itself.

Okay, so when I describe and explain my images at the level of detail that I deem necessary (and that level is sky high), the description, complete with included explanations, must go into the post text body.

But then there was the alt-text police as a department of the Mastodon HOA. At least some of them demand every image in the Fediverse have a useful (as in sufficiently detailed and accurate) alt-text. Yes, even if there's already an image description in the post. That is, if they can't see the image description in the post right off the bat because the post is hidden behind a summary and CW, then of course that image description doesn't count.

When I realised that, I started describing all my original images twice. Once with a long and detailed image description in the post itself. Once with a shorter, but still extensive image description in the alt-text. That said, I often had to cut text transcripts because multiple dozen text transcripts wouldn't all fit into a maximum of 1,500 characters, especially not including the descriptions necessary for people to even find them.

The longest image descriptions in the Fediverse

Even though Hubzilla doesn't really have a character limit for alt-text, I have to limit myself because I've long since learned that Mastodon cuts alt-texts from outside off at the 1,500-character mark if they're longer than 1,500 characters. I was told that Misskey does the same. And I figured that all their respective forks do that, too. Also, (content warning: eye contact, alcohol) even Hubzilla can only display so many characters of alt-text.
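A draft alt-text can be checked against that cut-off before posting. The following is only a minimal Python sketch: the 1,500-character figure is the one mentioned above, the function names are made up for illustration, and counting with len() rests on the assumption that Mastodon counts one Python character as one character, which may not be exactly how it works.

```python
# Hypothetical helpers. The limit is the figure quoted in the text,
# not a value read from Mastodon's code or API.
MASTODON_ALT_TEXT_LIMIT = 1500


def fits_mastodon_alt_text(alt_text: str, limit: int = MASTODON_ALT_TEXT_LIMIT) -> bool:
    """Return True if the alt-text is short enough not to be cut off."""
    # Assumption: one Python character counts as one character on Mastodon.
    return len(alt_text) <= limit


def lost_to_truncation(alt_text: str, limit: int = MASTODON_ALT_TEXT_LIMIT) -> str:
    """Return the part of the alt-text beyond the limit (empty if it fits)."""
    return alt_text[limit:]
```

Looking at what lost_to_truncation() returns for a draft shows which text transcripts or details would never reach Mastodon users and therefore have to go into the long description in the post instead.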

By the way: I've yet to see anyone on Mastodon sanction someone for an alt-text that's too long or too detailed. As long as I don't, I'll suppose it doesn't happen. I'll suppose that Mastodon is perfectly happy with 1,500-character alt-texts.

As for my long descriptions, they started out humongous. The first one was already most likely the longest image description in the Fediverse. It started out at over 11,000 characters that took me three hours. More research, one edit and another round about two hours later, it stood at (content warning: eye contact, alcohol) over 13,000 characters. Then came (content warning: eye contact, food, tobacco, weapons, elevated point of view) over 37,000 characters for one image. Then came (content warning: eye contact, food) over 40,000 characters for one image. Then came over 60,000 characters for one image which took me two whole days, morning to evening. And I even consider that image obsolete and insufficient nowadays.

My image descriptions have grown so long that they have headlines, often on multiple levels.

I barely get any feedback for these image descriptions, but it doesn't look like I get more criticism than praise.

More things to learn

Still, my learning process continued.

I learned that it's actually good to have both an alt-text and a long image description.

I learned that "picture of", "image of" and "photo of" are very bad style. The photograph, more specifically the digital photograph, can be considered a default nowadays. All other media, however, must be mentioned. So if I have a shaded, but not ray-traced digital 3-D rendering, I have to say so.

I learned that people may want to know about the camera position (its height above the ground in particular) and orientation. And so I mention both if there are enough references in the image to justify them. (For example, it probably isn't worth mentioning that the camera is oriented a few degrees south of west if the background of the image is plain white and absolutely featureless otherwise.)

I learned that technical terms and jargon which not everyone may be familiar with must be avoided if anyhow possible and explained if not. Since I can't constantly write around any and all terms specific to virtual worlds in general and OpenSim in particular in everyday words, this alone added thousands upon thousands of characters of explanations to my long image descriptions.

I learned that abbreviations of any kind must be avoided like the plague if anyhow possible. At the very least, they must be spelled out in full and then associated with their abbreviation on first use. Then, initialisms that are spelled letter by letter must have their letters separated with full stops whereas acronyms that are pronounced like words must not have these full stops.

For example, the proper way to use "OAR" is by first spelling it out: "OpenSimulator Archive," followed by the initialism in parentheses with full stops between the letters, "(O.A.R.)", then explaining what an OAR is without requiring any prior knowledge except for what has already been explained in the image description. Later on, the initialism "O.A.R." may be used unless it is so far down the image description that it has to be spelled out again to remind people of what it means.
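The abbreviation rule is mechanical enough to sketch in a few lines of Python. The function names and the spoken_as_word flag are made up for this illustration; whether an abbreviation is pronounced as a word is something the writer has to decide, not something the code can know.

```python
def later_mention(abbreviation: str, spoken_as_word: bool) -> str:
    """Format an abbreviation for every use after the first one."""
    if spoken_as_word:
        # Acronyms that are pronounced like words stay as they are.
        return abbreviation
    # Initialisms that are spelled out get a full stop after each letter.
    return "".join(letter + "." for letter in abbreviation)


def first_mention(full_form: str, abbreviation: str, spoken_as_word: bool) -> str:
    """Spell the term out and attach its abbreviation in parentheses."""
    return f"{full_form} ({later_mention(abbreviation, spoken_as_word)})"
```

With these, first_mention("OpenSimulator Archive", "OAR", spoken_as_word=False) produces "OpenSimulator Archive (O.A.R.)", and later_mention("OAR", spoken_as_word=False) produces "O.A.R." for the rest of the description. The explanation of what an OAR actually is still has to be written by hand.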

I learned that not only do the sizes of objects in the image belong in the image description, but they must also be explained using references to either what else is in the image or to what people are easily familiar with, like the size of body parts. I only have one image post that actually takes care of this.

I learned that not only do colours belong in the image description, but they must also be described using a small handful of basic colours plus brightness plus saturation. After all, how would a blind person know what sepia or fuchsia or olive green or Prussian blue or Burgundy red is?
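As a rough illustration of "basic colour plus brightness plus saturation", here is a small Python sketch that turns an RGB value into such a description. It uses the standard colorsys module; the hue buckets, the thresholds and the wording ("pale", "vivid" and so on) are arbitrary assumptions made for this example, not an established standard.

```python
import colorsys

# Hypothetical hue buckets: (upper bound in degrees, basic colour name).
HUE_NAMES = [
    (15, "red"), (45, "orange"), (70, "yellow"), (170, "green"),
    (260, "blue"), (330, "purple"), (360, "red"),
]


def describe_colour(red: int, green: int, blue: int) -> str:
    """Describe an RGB colour (0-255 per channel) in basic terms."""
    hue, lightness, saturation = colorsys.rgb_to_hls(
        red / 255, green / 255, blue / 255
    )

    # Colours with almost no saturation are black, white or grey.
    if saturation < 0.1:
        if lightness < 0.15:
            return "black"
        if lightness > 0.85:
            return "white"
        return "grey"

    # All thresholds below are assumptions chosen for this sketch.
    brightness = "dark" if lightness < 0.35 else "light" if lightness > 0.65 else "medium"
    intensity = "pale" if saturation < 0.4 else "vivid" if saturation > 0.8 else "moderately saturated"
    degrees = hue * 360
    name = next(label for upper, label in HUE_NAMES if degrees <= upper)
    return f"{brightness}, {intensity} {name}"
```

For example, describe_colour(255, 0, 0) returns "medium, vivid red". A sketch like this can only suggest wording; whether "dark, vivid yellow" is a fair stand-in for olive green is still a judgment call for whoever writes the description.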

How to describe what amounts to people

I learned that, when describing a person or anything in the image akin to a person (avatar, non-player character, animesh figure, static figure etc.), their gender must never be mentioned unless either the gender is clearly demonstrated, or it has been verified, or it is clearly and definitely known otherwise. I do mention the gender of my own avatar because I've created him, and I've defined him as male. I also mention @Juno Rowland's gender because I've created her, too, and I've defined her as female.

Similarly, I learned that, when describing a person (etc. etc.), their race or ethnicity must never be mentioned although some sources say otherwise. Rather, the skin tone must be mentioned, more specifically, one out of five (dark, medium-dark, medium, medium-light, light; I may expand this to nine with another four tones in-between).

Beyond that, I learned that the following may belong in the description of a person:

the identity of the person if it's of importance
age range or apparent age range
body shape
hair colour
hair length
hair shape/texture
hairstyle
ditto for a beard if applicable
clothes, footwear, jewellery and accessories including:
- shape and fit including details such as sleeve length
- colours
- colour patterns and probably also print patterns
- materials (although blind/visually-impaired users commented in this thread that I do not have to describe what certain fabric weave patterns, e.g. herringbone, look like; I'm still not sure whether I should or should not explain what the toe cap of a specific full brogue shoe looks like)
- designer/creator (although not always, plus it'd lead to even more massive explanations if I were to tell people who made which clothes on which avatar because I'd have to explain and often research who they are in the first place)
facial expression, posture and gestures

Normally, I avoid having anything that looks like a person in my images. One reason was the eye contact trigger which I can't avoid for Mastodon users when posting on Hubzilla. I've since moved all my image posting to (streams) which can actually make Mastodon blank out all images in a post. The other reason is because it's an enormous effort to not only describe an avatar appropriately, but to also explain avatars in OpenSim in general and that one avatar in particular so that everyone understands the image and the post.

I prefer portraits nowadays, especially with a background that's as minimalist as possible. It's enough of an effort to describe the avatar; it'd go completely out of hand if I also had to describe the entire surrounding.

Similarly, I still avoid having realistic-looking buildings in my images. And the last non-realistic building required over 40,000 characters of description alone. Granted, it's both gigantic and highly complex, not to mention that it mostly has glass panes for walls so that much of its inside is visible. But if there were a realistic-looking building in one of my images, I'd first have to spend days researching English architectural terms, and then I'd have to explain all these terms for the laypeople who will actually come across the image.

Details on alt-texts and image descriptions themselves

Style-wise, I learned that alt-text must not contain line breaks. Hubzilla and (streams) themselves showed me that using the quotation marks on your keyboard in alt-text is a bad idea, too. I've never done the former, and I've stopped doing the latter.

Other things which I know don't belong in alt-text are hashtags, hyperlinks (both embedded links and plain URLs), emoji, other Unicode characters which screen readers won't necessarily parse as letters, digits or punctuation, image credits and license information (the latter two must be in plain sight if they are required).

I learned that screen readers may or may not misinterpret all-caps. It's actually better to transcribe text in all-caps without the all-caps and mention in the image description that the original text is in all-caps.
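Several of these style rules can be checked automatically before posting. The following Python sketch is only an illustration: the function name, the regular expressions and the "four or more capital letters" threshold for all-caps words are assumptions made for this example, and it deliberately leaves out emoji and other unusual Unicode characters because those are much harder to detect reliably.

```python
import re

# Simplified patterns; they are assumptions for this sketch, not a standard.
URL_PATTERN = re.compile(r"https?://\S+")
HASHTAG_PATTERN = re.compile(r"(?<!\w)#\w+")
# Four or more capitals in a row, so short initialisms are not flagged.
ALL_CAPS_WORD = re.compile(r"\b[A-Z]{4,}\b")


def check_alt_text(alt_text: str) -> list[str]:
    """Return a list of style problems found in an alt-text draft."""
    problems = []
    if "\n" in alt_text or "\r" in alt_text:
        problems.append("contains line breaks")
    if '"' in alt_text:
        problems.append("contains keyboard quotation marks")
    if URL_PATTERN.search(alt_text):
        problems.append("contains a URL")
    if HASHTAG_PATTERN.search(alt_text):
        problems.append("contains a hashtag")
    if ALL_CAPS_WORD.search(alt_text):
        problems.append("contains an all-caps word")
    return problems
```

An empty list means that none of these particular checks found anything. It does not mean the alt-text is good, only that it avoids the mechanical mistakes listed above.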

I also learned recently that, in fact, extremely long image descriptions are not necessarily bad, not even in social media. Fortunately, I don't have to deal with a character limit for my posts. Only two limits matter: 1,500 characters for alt-text because Mastodon cuts off everything that goes beyond. And the 100,000 characters of post length above which Mastodon probably rejects posts altogether, rendering the image description effort that has inflated the posts beyond this size moot. And yes, I can post over 100,000 characters on Hubzilla.

Whenever I learned something new, I declared all my image descriptions in which I hadn't implemented it yet obsolete.

Lack of information and lack of communication lead to assumptions

But I still don't know enough.

I dare say I have learned a whole lot. But it's all more or less basic concepts. What I still don't know enough about is what the general guidelines are when it comes to applying these concepts to such extremely obscure edge-cases as my virtual world images.

What I'm doing is a first. People have posted virtual world images in the Fediverse before, even on Mastodon. It happens all the time. A few have also added basic alt-text. But I'm the first to actually put some thought into how this has to be done if it shall be done properly.

I've still got a lot of unanswered questions. And truth be told, if one person tries to answer them, they're still unanswered. I don't need one answer from one person. I need a general community consensus for an answer.

When I ask a question on how to do a certain thing when describing my virtual world images, I don't want one person to answer. I don't want one person to answer, another person to answer the exact opposite and these two persons not knowing about each other either. But this is Mastodon's standard modus operandi because people generally can't see who has replied what to any given post before or after them.

I want to ask that question, and then I want one or several dozen people to discuss that question. Not only with me, but even more with each other. Mastodon semi-veterans who live and breathe Mastodon's accessibility culture, non-Mastodon Fediverse veterans who can wrap their minds around having no character limit, accessibility experts, actually blind or visually-impaired people, neurodivergent people who need the kind of info dumps that I provide. Plus myself as the only one of the bunch who knows a thing about these virtual worlds.

Alas, this is impossible in the Fediverse. Mastodon is too limited and too much "microblogging" and "social media" for it. And while the Fediverse does have places that are much better for discussions, Mastodon users don't congregate there, and those who do populate these places know nothing about accessibility or Mastodon's culture.

It doesn't help that I rarely post images, and when I do, I rarely get any feedback. The reasons why I rarely post images are that describing them has become such a huge effort, and that many motifs that I'd like to show are too complex to realistically be described appropriately.

So I have to replace a whole lot of detail knowledge with assumptions based on what I know, what I've experienced, what I can deduce from all this and what appears logical to me.

In fact, a lot of what I do in my image descriptions is based on the idea that if I mention something in an image, and a blind or visually-impaired person doesn't know what it looks like, chances are that they want to know it, and that they expect to be told what it looks like. No matter what it might be. However, it's my assumption that this may actually extend to just about everything in an image.

Educate yourself, then educate others

Still, I think I have amassed a whole lot of knowledge about alt-text in particular and image descriptions in general.

Now I'd really like to share this knowledge with others. For one, I want to give them a chance to have a very big edge over the ever-increasing requirements for good enough alt-text. Besides, I actually keep seeing people making the same glaring mistakes over and over and over again.

On top of all that, what few image description guides there are that touch the Fediverse only cover Mastodon. There are none that disregard post character limits, or at least that don't take triple-digit character limits as a given. This is the only guide for long image descriptions in social media/social networks that deals with long image descriptions at all. But even that guide doesn't take into account the possibility of being able to post tens of thousands, hundreds of thousands, millions of characters at once. Being able to describe one image in over 60,000 characters and then drop these over 60,000 characters all into the same post as the image itself. Not needing the extra capacity of alt-text for information that doesn't fit into the post itself anymore.

Most other image description guides are only for static websites and/or blogs. However, not only does most of the Fediverse not have any HTML, and not only do SEO keywords make no sense in Fediverse alt-text, but Mastodon's alt-text culture which dominates the whole Fediverse is vastly different from what accessibility experts and Web designers have cooked up for static websites and blogs. On a website, an alt-text of 300 characters is way too long. On Mastodon, it may actually be too short.

So, after studying various alt-text guides as well as Mastodon's alt-text culture, I felt the need to write down what I know so that others can learn from it.

For a while, I have toyed with the idea of starting yet another wiki on this Hubzilla channel of mine. This would be my first wiki about the Fediverse after two wikis about OpenSim, one of which is still very incomplete. The downside might be that it'd be hard to find unless I keep pointing individual people to it.

Then, a few months ago, I discovered a draft for an article on image descriptions in the Join the Fediverse Wiki. So I started expanding it last week with what I know. But as it seems, most of the information I've added isn't even welcome in the wiki. This was probably meant to be a rather simple alt-text guide.

Now I may actually create that wiki on Hubzilla. What I want to write won't fit onto one single page anyway, and I need some more structure.

I'm also wondering what to do with the knowledge I've gathered about content warnings, including a massive list of things that people may be warned about.

Image description meta

"Nothing About Us Without Us", only it still is without them most of the time

2024-08-21 22:53:28

When disabled Fediverse users demand participation in accessibility discussions, but there are no discussions in the first place, and they themselves don't even seem to be available to give accessibility feedback

"Nothing about us without us" is the catchphrase used by disabled accessibility activists who are trying to get everyone to get accessibility right. It means that non-disabled people should stop assuming what disabled people need. Instead, they should listen to what disabled people say they need and then give them what they need.

Just like accessibility in the digital realm in general, this is not only targeted at professional Web or UI developers. This is targeted at any and all social media users just as well.

However, this would be a great deal easier if it wasn't still "without them" all the time.

Lack of necessary feedback

Alt-text and image descriptions are one example and one major issue. How are we, the sighted Fediverse users, supposed to know what blind or visually-impaired users really need and where they need it if we never get any feedback? And we never get any feedback, especially not from blind or visually-impaired users.

Granted, only sighted users can call us out for an AI-generated alt-text that's complete rubbish because non-sighted users can't compare the alt-text with the image.

But non-sighted users could tell us whether they're sufficiently informed or not. They could tell us whether they're satisfied with an image description mentioning that something is there, or whether they need to be told what this something looks like. They could tell us which information in an image description is useful to them, which isn't, and what they'd suggest to improve its usefulness.

They could tell us whether certain information that's in the alt-text right now should better go elsewhere, like into the post. They could tell us whether extra information needed to understand a post or an image should be given right in the post that contains the image or through an external link. They could tell us whether they need more explanation on a certain topic displayed in an image, or whether there is too much explanation that they don't need. (Of course, they should take into consideration that some of us do not have a 500-character limit.)

Instead, we, the sighted users who are expected to describe our images, receive no feedback for our image descriptions at all. We're expected to know exactly what blind or visually-impaired users need, and we're expected to know it right off the bat without being told so by blind or visually-impaired users. It should be crystal-clear how this is impossible.

What are we supposed to do instead? Send all our image posts directly to one or two dozen people who we know are blind and ask for feedback? I'm pretty sure I'm not the only one who considers this very bad style, especially in the long run, not to mention no guarantee for feedback.

So with no feedback, all we can do is guess what blind or visually-impaired users need.

Common alt-text guides are not helpful

Now you might wonder why all this is supposed to be such a big problem. After all, there are so many alt-text guides out there on the Web that tell us how to do it.

Yes, but here in the Fediverse, they're all half-useless.

The vast majority of them are written for static Web sites, either scientific or technological or commercial. Some include blogs, again, either scientific or technological or commercial. The moment they start relying on captions and HTML code, you know you can toss them because they translate to almost nothing in the Fediverse.

What few alt-text guides are written for social media are written for the huge corporate American silos: X, Facebook, Instagram, LinkedIn. They do not translate to the Fediverse which has its own rules and cultures, not to mention much higher character limits, if any.

Yes, there are one or two guides on how to write alt-text in the Fediverse. But they're always about Mastodon, only Mastodon and nothing but Mastodon. They're written for Mastodon's limitations, especially only 500 characters being available in the post itself versus a whopping 1,500 characters being available in the alt-text. And they're written with Mastodon's culture in mind which, in turn, is influenced by Mastodon's limitations.

Elsewhere in the Fediverse than Mastodon, you have many more possibilities. You have thousands of characters to use up in your post. Or you don't have any character limit to worry about at all. You don't have all means at hand that you have on a static HTML Web site. Even the few dozen (streams) users who can use HTML in social media posts don't have the same influence on the layout of their posts as Web designers have on Web sites. Still, you aren't bound to Mastodon's self-imposed limitations.

And yet, those Mastodon alt-text guides tell you you have to squeeze all information into the alt-text as if you don't have any room in the post. Which, unlike most Mastodon users, you do have.

It certainly doesn't help that the Fediverse's entire accessibility culture comes from Mastodon, concentrates on Mastodon and only takes Mastodon into consideration with all its limitations. Apparently, if you describe an image for the blind and the visually-impaired, you must describe everything in the alt-text. After all, according to the keepers of accessibility in the Fediverse, how could you possibly describe anything in a post with a 500-character limit?

In addition, all guides always only cover their specific standard cases. For example, an image description guide for static scientific Web sites only covers images that are typical for static scientific Web sites. Graphs, flowcharts, maybe a portrait picture. Everything else is an edge-case that is not covered by the guide.

There are even pictures that are edge-cases for all guides and not sufficiently or not at all covered by any of them. When I post an image, it's practically always such an edge-case, and I can only guess what might be the right way to describe it.

Discussing Fediverse accessibility is necessary...

Even individual feedback on image descriptions, media descriptions, transcripts etc. is not that useful. If one user gives you feedback, you know what this one user needs. But you do not know what the general public with disabilities needs, and that is what actually matters. Another user might give you wholly different feedback. Two different blind users are likely to give you two different pieces of feedback on the same image description.

What is needed so direly is open discussion about accessibility in the Fediverse. People gathering together, talking about accessibility, exchanging experiences, exchanging ideas, exchanging knowledge that others don't have. People with various disabilities and special requirements in the Fediverse need to join this discussion because "nothing about them without them", right? After all, it is about them.

And people from outside of Mastodon need to join, too. They are needed to give insights on what can be done on Pleroma and Akkoma, on Misskey, Firefish, Iceshrimp, Sharkey and Catodon, on Friendica, Hubzilla and (streams), on Lemmy, Mbin, PieFed and Sublinks and everywhere else. They are needed to combat the rampant Mastodon-centricism and keep reminding the Mastodon users that the Fediverse is more than Mastodon. They are needed to explain that the Fediverse outside of Mastodon offers many more possibilities than Mastodon that can be used for accessibility. They are needed for solutions to be found that are not bound to Mastodon's restrictions. And they need to learn about there being accessibility in the Fediverse in the first place because it's currently pretty much a topic that only exists on Mastodon.

There are so many things I'd personally like to be discussed and ideally brought to a consensus of sorts. For example:

Explaining things in the alt-text versus explaining things in the post versus linking to external sites for explanations.
  The first is the established Mastodon standard, but any information exclusively available in the alt-text is inaccessible to people who can't access alt-text, including due to physical disabilities.
  The second is the most accessible, but it inflates the post, and it breaks with several Mastodon principles (probably over 500 characters, explanation not in the alt-text).
  The third is the easiest way, but it's inconvenient because image and explanation are in different places.

What if an image needs a very long and very detailed visual description, considering the nature of the image and the expected audience?
  Describe the image only in the post (inflates the post, no image description in the alt-text, breaks with Mastodon principles, impossible on vanilla Mastodon)?
  Describe it externally and link to the description (no image description anywhere near the image, image description separated from the image, breaks with Mastodon principles, requires an external space to upload the description)?
  Only give a description that's short enough for the alt-text regardless (insufficient description)?
  Refrain from posting the image altogether?

Seeing as all text in an image must always be transcribed verbatim, what if text is unreadable for some reason, but whoever posts the image can source the text and transcribe it regardless?
  Must it be transcribed because that's what the rule says?
  Must it be transcribed so that even sighted people know what's written there?
  Must it not be transcribed?

...but it's nigh-impossible

Alas, this won't happen. Ever. It won't happen because there is no place in the Fediverse where it could sensibly happen.

Now you might wonder what gives me that idea. Can't this just be done on Mastodon?

No, it can't. Yes, most participants would be on Mastodon. And Mastodon users who don't know anything else keep saying that Mastodon is sooo good for discussions.

But seriously, if you've experienced anything in the Fediverse that isn't purist microblogging like Mastodon, you'll have long since come to the realisation that when it comes to discussions with a certain number of participants, Mastodon is utter rubbish. It has no concept of conversations whatsoever. It's great as a soapbox. But it's outright horrible at holding a discussion together. How are you supposed to have a meaningful discussion with 30 people if you burn through most of your 500-character limit mentioning the other 29?

Also, Mastodon has another disadvantage: Almost all participants will be on Mastodon themselves. Most of them will not know anything about the Fediverse outside Mastodon. At least some will not even know that the Fediverse is more than just Mastodon. And that one poor sap from Friendica will constantly try to remind people that the Fediverse is not only Mastodon, but he'll be ignored because he doesn't always mention all participants in this thread. Because mentioning everyone is not necessary on Friendica itself, so he isn't used to it, but on Mastodon, it's pretty much essential.

Speaking of Friendica, it'd actually be the ideal place in the Fediverse for such discussions because users from almost all over the place could participate. Interaction between Mastodon users and Friendica forums is proven to work very well. A Friendica forum can be moderated, unlike a Guppe group. And posts and comments reach all members of a Friendica forum without mass-mentioning.

The difficulty here would be to get it going in the first place. Ideally, the forum would be set up and run by an experienced Friendica user. But accessibility is not nearly as much an issue on Friendica as it is on Mastodon, so the difficult part would be to find someone who sees the point in running a forum about it in the first place. A Mastodon user who does see the point, on the other hand, would have to get used to something that is a whole lot different from Mastodon while being a forum admin/mod.

Then there is the Threadiverse, Lemmy first and foremost. But Lemmy has its own issues. For starters, it's federated with the Fediverse outside the Threadiverse only barely and not quite reliably, and the devs don't seem to be interested in non-Threadiverse federation. So everyone interested in the topic would need a Lemmy account, and many refuse to make a second Fediverse account for whatever purpose.

If it's on Lemmy, it will naturally attract Lemmy natives. But the vast majority of these have come from Reddit straight to Lemmy. Just like most Mastodon users know next to nothing about the Fediverse outside Mastodon, most Lemmy users know next to nothing about the Fediverse outside Lemmy. I am on Lemmy, and I've actually run into that wall. After all, they barely interact with the Fediverse outside Lemmy. As accessibility isn't an issue on Lemmy either, they know nothing about accessibility on top of knowing nothing about most of the Fediverse.

So instead of having meaningful discussions, you'll spend most of the time educating Lemmy users about the Fediverse outside Lemmy, about Mastodon culture, about accessibility and about why all this should even matter to people who aren't professional Web devs. And yes, you'll have to do it again and again for each newcomer who couldn't be bothered to read up on any of this in older threads.

In fact, I'm not even sure if any of the Threadiverse projects are accessible to blind or visually-impaired users in the first place.

Lastly, I've got some doubts that discussing accessibility in the Fediverse would even be possible if there were a perfectly appropriate place for it. I mean, this Fediverse neither gives advice on accessibility within itself beyond always linking to the same useless guides, nor does it give feedback on accessibility measures such as image descriptions.

People, disabled or not, seem to want perfect accessibility. But nobody wants to help others improve their contributions to accessibility in any way. It's easier and more convenient to expect things to happen by themselves.

Fediverse Image description meta

AI superiority at describing images, not so alleged?

2024-09-06 16:04:42

Jupiter Rowland

jupiter_rowland@hub.netzgemeinde.eu

Could it be that AI can image-describe circles even around me? And that the only ones whom my image descriptions satisfy are Mastodon's alt-text police?

I think I've reached a point at which I describe my images only for the alt-text police. A point at which I keep ramping up my efforts, increasing my description quality and declaring all my previous image descriptions obsolete and hopelessly outdated, solely to have an edge over those who try hard to enforce quality image descriptions all over the Fediverse, and who might stumble upon one of my image posts in their federated timelines by chance.

For blind or visually-impaired people, my image descriptions ought to fall under "better than nothing" at best and even that only if they have the patience to have them read out in their entirety. But even my short descriptions in the alt-text are too long already, often surpassing the 1,000-character mark. And they're often devoid of text transcripts due to lack of space.

My full descriptions that go into the post are probably mostly ignored, also because nobody on Mastodon actually expects an image description anywhere that isn't alt-text. But on top of that, they're even longer. Five-digit character counts, image descriptions longer than dozens of Mastodon toots, are my standard. Necessarily so because I can't see it being possible to sufficiently describe the kind of images I post in significantly fewer characters, so I can't help it.

But it isn't only about the length. It also seems to be about quality. As @Robert Kingett, blind points out in this Mastodon post and this blog post linked in the same Mastodon post, blind or visually-impaired people generally prefer AI-written image descriptions over human-written image descriptions. Human-written image descriptions lack effort, they lack details, they lack just about everything. AI descriptions, in comparison, are highly detailed and informative. And I guess when they talk about human-written image descriptions, they mean all of them.

I can upgrade my description style as often as I want. I can try to make it more and more inclusive by changing the way I describe colours or dimensions as much as I want. I can spend days describing one image, explaining it, researching necessary details for the description and explanation. But from a blind or visually-impaired user's point of view, AI can apparently write circles around that in every way.

AI can apparently describe and even explain my own images about an absolutely extreme niche topic more accurately and in greater detail than I can. In all details that I describe and explain, with no exception, plus even more on top of that.

If I take two days to describe an image in over 60,000 characters, it's still sub-standard in terms of quality, informativeness and level of detail. AI only takes a few seconds to generate a few hundred characters which apparently describe and explain the self-same image at a higher quality, more informatively and at a higher level of detail. It may even be able to not only identify where exactly an image was created, even if that place is only a few days old, but also explain that location to someone who doesn't know anything about virtual worlds within no more than 100 characters or so.

Whenever I have to describe an image, I always have to throw someone under the bus. I can't perfectly satisfy everyone equally at the same time. My detailed image descriptions are too long for many people, be it people with a short attention span, be it people with little time. But if I shortened them dramatically, I'd have to cut information to the disadvantage of not only neurodiverse people who need things explained in great detail, but also blind or visually-impaired users who want to explore a new and previously unknown world through only that one image, just like sighted people can let their eyes wander around the image.

Apparently, AI is fully capable of actually perfectly satisfying everyone all the same at the same time because it can convey more information with only a few hundred characters.

Sure, AI makes mistakes. But apparently, AI still makes fewer mistakes than I do.

#AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #AI #AIVsHuman #HumanVsAI

Fediverse Image description meta