Accessibility issues with stylized Unicode characters
Unicode characters like 𝔱𝔥𝔢𝔰𝔢 cause accessibility issues and have a strong association with spam and scams.
You've probably landed on this page because you clicked "learn more" on the warning in one of the Unicode text converter tools.
The warning is there because the output of those tools causes accessibility issues.
This page exists to explain those issues, and to convince you either not to use those tools or to use them in a way that doesn't ruin people's experience of the web.
In addition to the accessibility issues, these characters have a strong association with spam and scams. More information can be found at the bottom of the article.
To a computer everything is a number.
For a computer to work with the alphabet you need to give each letter a unique number.
For different computers to work together, they all need to agree on which letters are assigned which numbers.
Unicode is the system the world uses to make sure every computer agrees on which character is assigned which number.
Here are some examples:
capital A is given the number 65
capital B is #66
lowercase a is #97
The world has thousands of languages, and their characters have Unicode numbers too:
Ђ is #1026
က is #4096
㌀ is #13056
Unicode is also used for symbols and emoji:
✩ is #10025
𝔉 is #120073
😂 is #128514
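The character-to-number mapping above can be checked with a short Python sketch. It uses only the built-in `ord` and `chr` functions; the `examples` table and the two helper names are illustrative, not part of any library. The non-ASCII characters are written as escapes so the file works in any editor.

```python
# Characters quoted in the text, paired with the numbers given for them.
examples = {
    "A": 65,
    "B": 66,
    "a": 97,
    "\u0402": 1026,        # Cyrillic capital letter Dje
    "\u1000": 4096,        # Myanmar letter Ka
    "\u3300": 13056,       # Square Apaato
    "\u2729": 10025,       # Stress outlined white star
    "\U0001D509": 120073,  # Mathematical Fraktur capital F
    "\U0001F602": 128514,  # Face with tears of joy
}


def code_point(character: str) -> int:
    """Return the Unicode number assigned to a single character."""
    return ord(character)


def character_for(number: int) -> str:
    """Return the character that a Unicode number is assigned to."""
    return chr(number)


for character, number in examples.items():
    # Print in the conventional U+XXXX notation alongside the decimal number.
    print(f"U+{code_point(character):04X} is #{number}")
```

The hexadecimal `U+XXXX` form is how Unicode charts usually write these numbers; the decimal form is the one used in this article.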
Without a system like Unicode, the internet would not be possible. This page would just be a mess of the wrong symbols.
Unicode has many symbols used in math and phonetics, and these symbols often resemble stylized letters of the Latin alphabet.
They are intended for use in mathematics, but people use them to write stylized text online instead, often on social media sites that don't have the option to use bold or italic text.
Here's an example of what this text looks like (it may not render on your device):
ᴛʜɪs ⒤⒮ 𝘀𝗼𝗺𝗲 𝔣𝔞𝔫𝔠𝔶 𝕥𝕖𝕩𝕥 𝓉𝒽𝒶𝓉 𝙻𝙾𝙾𝙺𝚂 𝚏𝚞𝚗𝚔𝚢 🅐 🅝 🅓 𝒘𝒆𝒊𝒓𝒅.
It's true. It does look funky and weird, but it also comes with some serious accessibility issues.
Screen readers are an assistive technology that reads the contents of the web aloud.
When a screen reader reads 𝕥𝕙𝕖𝕤𝕖 𝕤𝕪𝕞𝕓𝕠𝕝𝕤, it doesn't interpret them visually. It correctly interprets them as the symbols that they are.
So a sentence like: "please buy my plastic 𝔤𝔞𝔯𝔟𝔞𝔤𝔢" is read aloud as "please buy my plastic mathematical fraktur g mathematical fraktur a mathematical fraktur r mathematical fraktur b mathematical fraktur a mathematical fraktur g mathematical fraktur e".
This is incredibly frustrating, and it destroys any chance of the author selling their plastic garbage to screen reader users.
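To get a feel for why this happens, here is a small Python sketch that spells out each stylized character by its official Unicode name, using the standard `unicodedata` module. The `spelled_out` function is a made-up helper for illustration. Real screen readers use their own pronunciation tables, so the exact words they speak may differ from the names printed here.

```python
import unicodedata


def spelled_out(text: str) -> str:
    """Replace every non-ASCII word with the Unicode names of its characters."""
    words = []
    for word in text.split(" "):
        if word.isascii():
            # Plain words are left alone, as a screen reader would read them.
            words.append(word)
        else:
            names = (unicodedata.name(ch, "unknown character") for ch in word)
            words.append(" ".join(name.lower() for name in names))
    return " ".join(words)


# "garbage" written with Mathematical Fraktur letters, as escapes.
garbage = "\U0001D524\U0001D51E\U0001D52F\U0001D51F\U0001D51E\U0001D524\U0001D522"

print(spelled_out("please buy my plastic " + garbage))
```

Seven letters turn into seven multi-word names, which is roughly the experience described above.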
Unicode contains over 100,000 letters and symbols, and more are added every year.
Saving an image on your device for every single one of these symbols would take up a lot of memory, and would be a lot of work for the device creators.
It's not surprising that many devices don't render these symbols at all. Instead they show a bunch of squares.
This is a familiar experience for anyone who has received a message containing a brand new emoji!
Some social media sites and messaging apps allow you to write bold and italic text, but have unintentionally made this a hidden feature.
However, many sites and apps don't allow users to style text at all. So the simple solution is: don't use these symbols. I hope this article has convinced you not to.
Instead, let your words speak for themselves, or just quit social media and go outside and look at a flower.
If you're new to web development and want to stylize text, then you need to learn some CSS. You can learn more about it on MDN, under "CSS: Cascading Style Sheets".
If you need to use these symbols on your website, like they are used on this page, and want screen readers to read them correctly, wrap them in an element with the aria-label attribute. Give the element a role such as img as well, because aria-label is not reliably announced on a plain span. Like so:
<span role="img" aria-label="your text">𝕪𝕠𝕦𝕣 𝕥𝕖𝕩𝕥</span>
If the symbols you are using are purely decorative and have no meaning for the reader, use an aria-hidden attribute instead:
<span aria-hidden="true">✩</span>
In addition to accessibility issues, Unicode symbols like these have a strong association with spam and online scams.
One of the ways email and text message spam filters detect a spam message is by searching the email's content for specific words or phrases.
Some spammers hope to avoid detection by replacing these words or specific letters in these words with stylized characters.
Because of this, many people associate these stylized characters with spam.
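The evasion trick, and one way a filter might counter it, can be sketched in Python. This is an assumption about how a filter could work, not a description of any real spam filter: the `BLOCKED_PHRASES` list and both function names are hypothetical. The only library call is `unicodedata.normalize` with the `NFKC` form, which folds the mathematical letter variants back to ordinary letters.

```python
import unicodedata

# Hypothetical phrase list; a real filter would use far more than this.
BLOCKED_PHRASES = ("free money",)


def fold_stylized(text: str) -> str:
    """Fold compatibility variants (such as math letters) to plain characters."""
    return unicodedata.normalize("NFKC", text)


def naive_match(message: str) -> bool:
    """Search the raw message, as a simple keyword filter would."""
    return any(phrase in message.casefold() for phrase in BLOCKED_PHRASES)


def normalized_match(message: str) -> bool:
    """Search the message after folding stylized characters."""
    folded = fold_stylized(message).casefold()
    return any(phrase in folded for phrase in BLOCKED_PHRASES)


# "free" written in Mathematical Sans-Serif Bold, followed by a plain word.
stylized = "\U0001D5F3\U0001D5FF\U0001D5F2\U0001D5F2 money"

print(naive_match(stylized))       # the raw search misses the phrase
print(normalized_match(stylized))  # the folded search finds it
```

Normalization only undoes variants that Unicode itself defines as equivalent to a plain letter. Look-alike letters from other alphabets, such as a Cyrillic е, pass through unchanged.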
Scarier than spam. Criminal hackers use these unicode characters to trick you into giving up access to your online accounts.
For example โ lets say your example bank has a website with the example url example.com
Someone may send you an email pretending to be your bank asking you to log into your account. It looks real, so you click on the link โย and the link opens a website that looks identical to your bank's website.
Even the url
is the same โย but it's not the same! Instead of saying example.com
it says ๐๐ก๐๐๐๐๐.com
or subtler still ะตxample.com
. (this one uses a Cyrillic ะต
)
This is called a homoglyph attack. A homoglyph is a character that looks very similar to another character.
Unicode keeps a catalogue of all these homoglyphs to help people build tools to prevent such attacks. Unicode refers to them as confusables.
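As a rough illustration of what such a tool might check, here is a Python sketch that flags hostnames whose labels mix alphabets or contain mathematical letters. It does not use the Unicode confusables data. It is a crude heuristic that guesses a character's alphabet from the first word of its Unicode name, and `scripts_in` and `is_suspicious` are hypothetical helper names.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Guess the alphabets used in a label from each letter's Unicode name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # For example "CYRILLIC SMALL LETTER IE" gives "CYRILLIC".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


def is_suspicious(hostname: str) -> bool:
    """Flag a hostname if any label mixes alphabets or uses math letters."""
    for label in hostname.split("."):
        scripts = scripts_in(label)
        if len(scripts) > 1 or "MATHEMATICAL" in scripts:
            return True
    return False


# "example" with a Cyrillic first letter, and in Mathematical Monospace.
cyrillic_spoof = "\u0435xample.com"
monospace_spoof = (
    "\U0001D68E\U0001D6A1\U0001D68A\U0001D696\U0001D699\U0001D695\U0001D68E.com"
)

for host in ("example.com", cyrillic_spoof, monospace_spoof):
    print(ascii(host), is_suspicious(host))
```

A check like this would miss a spoof written entirely in one foreign alphabet, which is the gap the confusables catalogue is meant to cover.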
The association with spam, scams, and online crime is one more reason not to use these Unicode characters.
You may even find that your emails and text messages go straight to people's junk mail.