You've probably landed on this page because you clicked "learn more" on the warning on one of the Unicode text converter tools.
That's because the output of those tools causes accessibility issues.
This page exists to explain those issues — and to convince you to either not use those tools, or to use them in a way that doesn't ruin people's experience of the web.
In addition to the accessibility issues, the output of these tools has a strong association with spam and scams. More information can be found at the bottom of the article.
- Skip to accessibility issues
- Skip to how to solve these issues
- Skip to association with online spam and scams
What is Unicode?
To a computer everything is a number.
For a computer to work with the alphabet you need to give each letter a unique number.
For different computers to work together, they all need to agree on which letters are assigned which numbers.
Unicode is the system the world uses to make sure every computer agrees on which character is assigned which number.
Here are some examples:
capital A is given the number 65
The world has thousands of languages, which have Unicode symbols too:
Unicode is also used for
Without a system like Unicode, the internet would not be possible: this page would just be a mess of the wrong symbols.
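As a rough illustration of the character-to-number mapping described above, here is a small Python sketch. It uses Python's built-in `ord` and `chr` to go from a character to its Unicode number and back. The helper name `code_point_label` and the sample characters are picked purely for illustration and are not from the original examples.

```python
def code_point_label(ch: str) -> str:
    """Format a character's Unicode number in the usual U+XXXX style."""
    return f"U+{ord(ch):04X}"


# Sample characters picked purely for illustration
for ch in ["A", "a", "€", "😀"]:
    number = ord(ch)          # character -> number
    assert chr(number) == ch  # number -> character gives the same symbol back
    print(ch, number, code_point_label(ch))
```

The `U+` form is just the same number written in hexadecimal, which is how Unicode charts usually list characters.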
Unicode mathematical symbols
Unicode has many symbols used in math and phonetics, and these symbols often resemble stylised letters of the Latin alphabet.
These are intended for use in mathematics, but instead people use them to write stylised text online, often on social-media sites that don't have the option to use bold or italic text.
Here's an example of what this text looks like (it may not render on your device): ᴛʜɪs ⒤⒮ 𝘀𝗼𝗺𝗲 𝔣𝔞𝔫𝔠𝔶 𝕥𝕖𝕩𝕥 𝓉𝒽𝒶𝓉 🄻🄾🄾🄺🅂 𝚏𝚞𝚗𝚔𝚢 🅐🅝🅓 𝒘𝒆𝒊𝒓𝒅.
It's true. It does look funky and weird — but it also comes with some serious accessibility issues.
Screen readers are an assistive technology that reads the contents of the web aloud.
When a screen reader reads 𝕥𝕙𝕖𝕤𝕖 𝕤𝕪𝕞𝕓𝕠𝕝𝕤, it doesn't interpret them visually — it correctly interprets them as the symbols that they are.
So a sentence like: "please buy my plastic 𝔤𝔞𝔯𝔟𝔞𝔤𝔢" is read aloud as "please buy my plastic mathematical fraktur g mathematical fraktur a mathematical fraktur r mathematical fraktur b mathematical fraktur a mathematical fraktur g mathematical fraktur e".
This is incredibly frustrating, and it destroys any chance of the author selling their plastic garbage to screen reader users.
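To see roughly where those spoken names come from, here is a minimal Python sketch that looks up each character's official Unicode name with the standard `unicodedata` module. The function name `spoken_names` is made up for this example. The official names also include the word "small" or "capital", and what a particular screen reader actually says may differ from the official name.

```python
import unicodedata


def spoken_names(text: str) -> list[str]:
    """Return the official Unicode name of each non-space character."""
    return [
        unicodedata.name(ch, "unnamed character").lower()
        for ch in text
        if not ch.isspace()
    ]


# Each stylised letter is a separate symbol with its own long name
for name in spoken_names("𝔤𝔞𝔯𝔟𝔞𝔤𝔢"):
    print(name)
```

Seven letters become seven multi-word names, which is why a single stylised word can take so long to read aloud.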
Unicode contains over 100,000 letters and symbols, and more are added every year.
To have an image saved on your device for every single one of these symbols would take up a lot of memory — and be a lot of work for the device creators.
It's not surprising that many devices don't render these symbols at all. Instead they show a bunch of squares.
This is a familiar experience for anyone who has received a message containing a brand new emoji!
How to solve these issues
Social media and messaging apps
Some social media sites and messaging apps allow you to write bold and italic text, but have unintentionally made this a hidden feature.
However, many sites and apps don't allow users to style text. So the simple solution is: don't use these symbols. I hope this article has convinced you not to.
Instead, let your words speak for themselves, or just quit social media and go outside and look at a flower.
If you're new to web development and want to stylise text, then you need to learn some CSS. You can learn more about it on MDN: "CSS: Cascading Style Sheets".
If you need to use these symbols on your website (as they are used on this page) and want screen readers to read them correctly, wrap them in an element with the aria-label attribute, like so:
<span aria-label="your text">𝕪𝕠𝕦𝕣 𝕥𝕖𝕩𝕥</span>
Or, if the symbols you are using are purely decorative and have no meaning for the reader, use an aria-hidden attribute instead.
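If your pages are generated by a script, the two cases above could be handled by one small helper. This is only a sketch in Python: the function name `accessible_span` and its parameters are hypothetical, and it simply builds the same kind of markup described above, using `aria-hidden="true"` for the decorative case.

```python
import html
from typing import Optional


def accessible_span(symbols: str, label: Optional[str] = None) -> str:
    """Wrap symbols in a span with screen-reader hints.

    With a label, the span carries it in aria-label.
    Without one, the symbols are treated as decorative and hidden.
    """
    content = html.escape(symbols)
    if label is None:
        return f'<span aria-hidden="true">{content}</span>'
    # html.escape also escapes quotes, so the label is safe inside the attribute
    return f'<span aria-label="{html.escape(label)}">{content}</span>'


print(accessible_span("𝕪𝕠𝕦𝕣 𝕥𝕖𝕩𝕥", "your text"))
print(accessible_span("★"))
```

Support for these attributes varies between screen readers, so it is worth testing the result with one before relying on it.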
Association with spam and scams
In addition to accessibility issues, Unicode symbols like these have a strong association with spam and online scams.
One of the ways email and text message spam filters detect a spam message is by searching the email's content for specific words or phrases.
Some spammers hope to avoid detection by replacing these words, or specific letters in these words, with stylised characters.
Because of this, many people associate these stylised characters with spam.
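Here is a minimal sketch, in Python, of how a filter might see through this trick. It assumes a plain keyword filter: the blocklist and the function names are hypothetical, and real spam filters are far more sophisticated. The idea is to fold stylised characters back to plain letters with Unicode's NFKC normalisation before searching.

```python
import unicodedata

# Hypothetical blocklist, for illustration only
BLOCKED_PHRASES = {"free money"}


def plain_text(message: str) -> str:
    """Fold compatibility characters (such as mathematical letters) to plain ones."""
    return unicodedata.normalize("NFKC", message).casefold()


def looks_like_spam(message: str) -> bool:
    """Search the normalised message for any blocked phrase."""
    text = plain_text(message)
    return any(phrase in text for phrase in BLOCKED_PHRASES)


print(plain_text("𝕥𝕙𝕖𝕤𝕖 𝕤𝕪𝕞𝕓𝕠𝕝𝕤"))
print(looks_like_spam("Claim your FREE MONEY today"))
```

NFKC only covers characters that Unicode defines as variants of a plain letter. Small capitals like ᴛ and lookalike letters from other alphabets pass through unchanged, so this is not a complete defence.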
Confusables and homoglyph attacks
Scarier than spam: criminal hackers use these Unicode characters to trick you into giving up access to your online accounts.
For example, let's say your example bank has a website with the example URL example.com.
Someone may send you an email pretending to be your bank asking you to log into your account. It looks real, so you click on the link — and the link opens a website that looks identical to your bank's website.
The URL looks the same, but it's not the same! Instead of saying example.com it says 𝚎𝚡𝚊𝚖𝚙𝚕𝚎.com or, subtler still, еxample.com (this one uses a Cyrillic е in place of the Latin e).
This is called a homoglyph attack — a homoglyph is a character that looks very similar to another character.
Unicode keeps a catalogue of all these homoglyphs to help people build tools to prevent such attacks. Unicode refers to them as confusables.
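Real tools can draw on that catalogue. As a much cruder illustration, this Python sketch just lists any character in a domain name that falls outside plain ASCII, along with its official name. The function name `suspicious_characters` is made up for this example, and a check this simple would also flag perfectly legitimate domains written in other alphabets.

```python
import unicodedata


def suspicious_characters(domain: str) -> list[tuple[str, str]]:
    """List each non-ASCII character in a domain with its Unicode name."""
    return [
        (ch, unicodedata.name(ch, "unnamed character"))
        for ch in domain
        if ord(ch) > 127
    ]


# "\u0435" is the Cyrillic letter that looks like a Latin "e"
lookalike = "\u0435xample.com"

print(lookalike == "example.com")        # the two strings are not equal
print(suspicious_characters(lookalike))  # names the odd character out
```

To a computer the two addresses are simply different numbers, even though they can look identical on screen.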
Avoid the association
The association with spam, scams and online crime is one more reason not to use these Unicode characters.
You may even find that your emails and text messages go straight to people's junk mail.