Super-long rant about accessibility, the length of alt-texts for pictures taken in virtual worlds and incompatibility issues between Mastodon and Hubzilla
Posts with pictures can really be a pain for me.
This has little to do with the countless #Mastodon users who make up most of my followers and their followers by a gigantic margin, and with how mangled my Hubzilla posts arrive on their timelines. Pictures ripped out of the text where they should be embedded, reversed in their order and all but the last four discarded entirely. I can't do much about that, and I don't think I'll limit myself to only one picture per post with no text following it so that the post looks nearly the same on Mastodon as on #Hubzilla.
No, the pain comes from #AltText which, as it turned out, remains on pictures even after Mastodon converts them from embedded to file attachments.
Also, this doesn't have so much to do with what a hassle it is to add alt-texts to pictures on Hubzilla. We don't have a one-click GUI for that. We have to edit the BBcode for the embedding of the picture. So it's some more work. But this still isn't my main issue.
It's said that alt-text is required to be clear and concise, but informative. It shall be short enough to not overwhelm screen reader users who can't navigate alt-texts, and who are forced to have their screen readers rattle down the whole alt-text in one go. At the same time, however, it shall be informative enough for everyone, absolutely everyone to understand the picture.
This is easy to do with pictures of something that people are at least halfway familiar with. Which usually means real-life photographs.
To pictures taken in #VirtualWorlds, however, this does not apply. Even pictures from within virtual worlds which show something that tries to mimic real life require more explanation, not least because they tend to be even more detailed. And unlike real-life photographs, they are much more likely to be informative and much less likely to be pure art.
On top of that, pictures from within virtual worlds are also very likely to contain things which people not familiar with virtual worlds in general and #SecondLife or #OpenSimulator in particular, or sometimes actually #OpenSim specifically, won't get. People who can see won't understand them from just seeing them, and visually-impaired people will understand them even less without alt-text.
In fact, I'd have to start the alt-text of each in-world picture with a declaration that this picture was taken in a virtual world and not in real life. For the alt-text to really be informative, I'd have to mention that it was taken in something based on OpenSim. This still won't be enough because nobody knows OpenSim. So I'd have to explain what OpenSim is, and I'd have to explain it in such a way that even people who have never even heard of Second Life know and understand what it is.
Second Life users are more likely to get away with just simply saying their picture was taken in Second Life. Most of their followers are Second Life users themselves whose followers are Second Life users in turn, and so forth. They could drop names of regions, creators, mesh bodies or in-world DJs, and just about everyone who follows them knows what they're talking about with no further explanation.
Most of my followers expect me to explain the Fediverse beyond Mastodon to them and follow me for that. They probably believe that the #Metaverse equals #HorizonWorlds ("Facebook's Metaverse") until they've read a few posts of mine in which I talk about OpenSim. The same goes even more for those who may receive boosts of my posts. Even then, if I only drop the term "OpenSim" in an alt-text, they won't necessarily know what it is. Hence, I have to explain it.
It's often necessary to also mention in alt-text where a picture was taken. I guess everyone knows what London or Paris or Tokyo or Los Angeles is. No need for explanation, right? Even if it's London, Ontario, adding "Ontario" should be sufficient. Or maybe adding "Canada" behind that.
But how many of you know what "the Metropolis welcome sim" was where I took this picture? Okay, it's the welcome sim of the now-defunct Metropolis Metaversum.
But what is, or rather was, the Metropolis Metaversum? A grid? What's a grid? And what's a welcome sim? In fact, what's a sim? OpenSim users know. Second Life users know, although they may be unfamiliar with there being "a grid", which implies there being multiple grids and not just one. Even users of other virtual worlds don't know, and most people don't know anything about virtual worlds except for what they've learned about Horizon Worlds through mass media.
So I have to explain what a grid is in OpenSim. Only after doing so, and thus educating people about the existence of regions, can I explain what a sim is, and I have to do that, too. Mind you, I'm still within the same alt-text of only one picture.
Only then can I start describing what is actually in this picture. For those who know OpenSim, I could simply say the picture shows my Metropolis avatar in the outside area of the Metropolis welcome building, waving a last farewell to everyone before the grid shuts down. That's all there is to know for them.
To everyone else, every detail in the picture may be of interest. Particularly, people with poor vision won't be able to clearly see all the details in the fairly small picture under the fairly bad lighting (the sim was permanently set to sunset), and they won't be able to judge what's interesting unless they know what it is that may or may not be interesting.
So I'd have to rattle it all down.
- The rusty steel girder that makes up the floor I'm standing on.
- The rusty steel railing in front of me.
- The steel-and-glass dome that covers everything behind me.
- The four-seat bench to the left of me.
- The info desk right behind the bench.
- The animated NPC model of the robot Maria from Fritz Lang's 1927 silent film Metropolis standing behind the info desk as a greeter, plus an explanation to Second Life users that, yes, this is actually an NPC and not an avatar used as a bot.
- The rotating heart to the left of the greeter and what it does.
- The rocks and the vegetation, especially the tree in the middle.
- The tables and chairs around the tree.
- The old greeter, Bertha senior, sitting at one of the tables as a non-functional statue. Including that she used to be a greeter NPC until someone created robotic Maria from Metropolis as a complete avatar (what does "complete avatar" even mean), and someone else made an NPC out of this complete avatar. Maybe including what Bertha senior looks like, and what she is wearing.
- The walk-in teleporter in the background on the right. Where it takes you (level 2, this is level 3). What level 2 is (the teleporter room). What's shown on the screen (level 2, the teleporter room, with several teleporters to specific places to both sides, probably even where they would take you, and the big teleporter in the middle where you can select a destination).
- The red and black, circular, rotating Metropolis Metaversum logo above the walk-in teleporter. A transcription of the writing on the black ring. The big white M in the middle. The "Metropolis" writing above the big white M, and that it was lifted straight from the film poster in style. Including what this writing on the film poster looks like.
- The big black sign right behind me. Ideally including a full transcription of what's written on it, more so if I mention somewhere that a copy of this place still exists, so I can go see and transcribe the whole sign. It's actually a rule: If there's any writing in a picture, it has to be transcribed in alt-text in its entirety.
- The white screens to both sides of the big black sign, what they show, what they do, what's written on them.
- The three screens or signs to the right of the white screens, what they show, what they (used to) do, what's written on them.
- Maybe even my avatar. Including explaining what #Roth2 v2 is, what a mesh body is, what mesh is, what rigged mesh is etc.
Absolutely nothing of this matters in the context of the picture. Many may say they don't want to know about any of this. They can't take in that much information in one go.
Others, however, may demand to know everything about all this so they can judge for themselves whether it's important to them or not. Leaving out any detail that may be of importance to someone is not inclusive and potentially ableist. #Accessibility, or #a11y, means giving everyone the same chance of experiencing a picture in all details which may or may not matter to anyone.
With all this, I expect a fully informative alt-text for the picture linked above to have at least 5,000 characters.
And this becomes problematic. Mastodon has a hard cap of 1,500 characters for alt-texts; I'm not sure if it can display anything longer when it comes in from Hubzilla or elsewhere outside. For comparison, Twitter has a hard cap of 1,000 characters.
Interestingly, there were two different issues filed on Mastodon's GitHub repository.
One demands a cap at 15,000 or 150,000 characters because full transcripts of screenshots of lots of text, which a11y demands be made, no matter what, quickly exceed 1,500 characters.
The other one comes from a visually-impaired user with a screen reader and demands the hard cap be lowered back to the 420 characters which Mastodon had before version 3.0. The reason: existing screen readers can't navigate alt-texts, they can only rattle them down in one go. If the user in question knew that the Fediverse is not only Mastodon, they'd probably demand either that all alt-texts coming in from outside Mastodon be cut at 420 characters, or that every last other Fediverse project introduce the same 420-character hard cap mandatorily.
A sufficiently informative description of a picture from within a virtual world is not possible within 420 characters. Not without leaving out information that's essential for understanding the picture, regardless of whether the reader of a post is visually impaired or not.
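To put numbers on this, here is a minimal Python sketch that compares the length of a description against the caps quoted above. The function name `check_alt_text` and the dictionary `ALT_TEXT_CAPS` are made up for illustration, the cap values are simply the figures from this post, and counting with `len()` is an assumption: the platforms themselves may count characters differently.

```python
# Caps as quoted in this post; they are the post's figures,
# not values read from any platform's API.
ALT_TEXT_CAPS = {
    "Mastodon (current)": 1500,
    "Twitter": 1000,
    "Mastodon (before 3.0)": 420,
}


def check_alt_text(alt_text, caps=ALT_TEXT_CAPS):
    """Return, per platform, by how many characters the text exceeds the cap.

    A value of 0 means the text fits. Length is measured with len(),
    i.e. in Unicode code points, which is an assumption about how
    each platform counts.
    """
    length = len(alt_text)
    return {name: max(0, length - cap) for name, cap in caps.items()}


if __name__ == "__main__":
    draft = "x" * 5000  # stand-in for a 5,000-character image description
    for name, overflow in check_alt_text(draft).items():
        status = "fits" if overflow == 0 else f"too long by {overflow} characters"
        print(f"{name}: {status}")
```

With a 5,000-character stand-in, the description overshoots every one of these caps by several thousand characters, which is the whole problem in a nutshell.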
Oh, by the way: I would have to include such a monstrous alt-text
So, how can this be solved, if at all?
I could put the image description outside the picture instead of embedding it as alt-text. Sounds good at first glance.
But for one, this would limit me to only one picture per post. Even though I'm not on Mastodon, I have to work around Mastodon's limitations, for most of my readers are on Mastodon. Virtual travel reports which tend to come with lots and lots of pictures would have to be split into tiny chunks because it's impossible on Mastodon to have any text near more than one picture. Whether a post comes in from outside, or it's a native toot, pictures are always put at the end with no text in-between.
Hubzilla could handle this much more elegantly. Pictures can be embedded between paragraphs, descriptions could be placed above or below pictures, and I can have as many pictures in one Hubzilla post as I want to. None of this matters if the huge majority of Mastodon users can't properly see any of this. The only way to circumvent this would be to write articles instead of posts and then put links to the articles into posts. This would come with the inconvenience, especially for mobile users, that the article link would open their Web browser.
Nonetheless, there's still the size of the image description. If I chose a post format that is as Mastodon-compliant as possible, I'd first have the actual post text, then the description, then the picture. In this case, Mastodon users would first see the little bit of text I actually intended to write, then they would have to go through an over-5,000-character wall of text, the image description, before they finally get to the picture itself. Who'd want to put up with this?
Besides, the image wouldn't have an alt-text, full stop. Even if I give an image description of over 5,000 characters before the picture itself instead of an alt-text, and even if I state explicitly that this description serves the same purpose as an alt-text, but it's too long for being an alt-text, this would still trigger the "no alt-text, no boosts" crowd. Because there's no alt-text. An over-5,000-character rant right before a picture is not an alt-text, end of discussion.
Next possible solution: Stop writing about virtual worlds altogether. Most people don't follow me for that anyway, and I guess they're highly irritated whenever they receive a post from me on their timelines in which I don't explain the Fediverse outside of Mastodon.
Won't do that, sorry. OpenSim is and remains the primary topic of this channel, whether you want it or not.
Okay, milder version: Stop posting pictures about virtual worlds. That'd save everyone a whole lot of hassle.
Won't do that either. Virtual worlds are a very visual medium. Some things can only be explained with pictures along with text, and virtual worlds are among these things. Maybe OpenSim as a whole can be described sufficiently in text only. A place in OpenSim, an interesting object or an avatar outfit can't. Imagine National Geographic or Vogue abandoning pictures altogether and going all text. It's the same.
Well, then how about simplifying my pictures so that there isn't so much I have to write about anymore?
Won't work. I could take portrait pictures of avatars in so-called photo studios with a neutral, single-colour, unshaded, otherwise featureless background. I wouldn't have to describe the background much. I wouldn't have to explain where the picture was taken either because the same photo studio looks the same wherever it was placed. Still, the picture would require a lengthy description of the avatar.
And before that, it would also require mentioning that the picture was made in OpenSim. Plus the lengthy explanation of what OpenSim is. Any one of my pictures may be the first one from OpenSim that someone comes across. I can't count on people already knowing all this from earlier posts. Not everyone who may see posts from me is a long-time follower. I can never know who likes which picture why and then boosts it to whom.
As for virtual travel pictures, I guess it's obvious that I can't simplify them, and why. I simply cannot change what's standing in-world and thus shown in the picture.
All this alt-text trouble sometimes makes me wonder if it's worth going through it in the first place. As I've already said, OpenSim is a very, very visual medium. Normally, visually-impaired people wouldn't even be interested in it, especially not in what goes on in-world.
But then I remember that it's technically possible for absolutely blind people to use OpenSim with a special viewer. I guess they can't build in-world, and they may need help with making their avatars look good to the seeing, but they can navigate avatars with few problems. Among other things, a text-to-speech system tells them the name of whichever object is ahead of them. And if there is no object in front of them, they usually know they can walk in that direction.
This means, somewhere out there, there could be visually-impaired Fediverse users who are indeed interested in what there is to find and to do on the grids of OpenSim, and what is going on there. If I didn't provide sufficiently informative alt-texts, that'd be a big obstacle for them.
So this whole problem stays unsolved. And the length of my alt-texts is likely to increase to stay informative.