CWE-1007
Insufficient Visual Distinction of Homoglyphs Presented to User
Extended description
Some glyphs, pictures, or icons can be semantically distinct to a program while appearing very similar or identical to a human user. These are referred to as homoglyphs. For example, the lowercase "l" (ell) and uppercase "I" (eye) have different character codes, but depending on the font they can be displayed in exactly the same way. This can also occur between different character sets. For example, the Latin capital letter "A" and the Greek capital letter "Α" (Alpha) are treated as distinct by programs, but may be displayed identically to a user. Accent marks may also cause letters to appear very similar, such as the Latin capital letter "À" with a grave accent and "Á" with an acute accent. Adversaries can exploit this visual similarity for attacks such as phishing, e.g. by providing a link to an attacker-controlled hostname that looks like a hostname the victim trusts. In a different use of homoglyphs, an adversary may create a back door username that is visually similar to the username of a regular user, which makes it more difficult for a system administrator to detect the malicious username while reviewing logs.
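The distinction between what a program sees and what a user sees can be made concrete with a short Python sketch. It prints the code points behind two lookalike letters and guesses which scripts a string mixes. The helper names `describe` and `scripts_in` are illustrative, and the script guess relies on an assumption: Python's standard library exposes no script property, so the first word of each character's Unicode name is used as a rough stand-in.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"

def scripts_in(text: str) -> set[str]:
    """Rough script guess: first word of each letter's Unicode name.

    Assumption: the name prefix (LATIN, GREEK, CYRILLIC, ...) is used as a
    heuristic because the standard library has no script lookup.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    }

latin_a = "\u0041"  # LATIN CAPITAL LETTER A
greek_a = "\u0391"  # GREEK CAPITAL LETTER ALPHA

print(describe(latin_a))
print(describe(greek_a))
print(latin_a == greek_a)         # the program sees two different characters
print(scripts_in("p\u0430ypal"))  # second letter is CYRILLIC SMALL LETTER A
```

A string whose letters come from more than one script is not necessarily malicious, but it is a reasonable signal to surface to a reviewer.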
Common consequences (1)
- Integrity, Confidentiality, Other
An attacker may ultimately redirect a user to a malicious website by deceiving the user into believing that the URL they are accessing belongs to a trusted domain. The same technique can be used to forge log entries by placing homoglyphs in usernames. Homoglyph manipulation is often the first step toward more advanced attacks such as credential theft, Cross-Site Scripting (XSS), or log forgery. If an attacker redirects a user to a malicious site, the attacker can mimic a trusted domain to steal account credentials and then perform actions on behalf of the user without the user's knowledge. Similarly, an attacker could register a username containing homoglyph characters on a website, making it difficult for an administrator to review logs and determine which users performed which actions.
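One way to reduce the log-review problem is to compare a new username against existing ones after folding lookalike characters together. The Python sketch below does this with a deliberately tiny, hypothetical `CONFUSABLES` table. The table, `skeleton`, and `find_lookalikes` are illustrative names rather than part of any library, and a real deployment would need far broader confusables data.

```python
# Hypothetical, deliberately tiny confusables table (an assumption for this
# sketch); real confusables data covers thousands of characters.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
    "I": "l",       # capital "eye" is drawn like lowercase "ell" in some fonts
}

def skeleton(name: str) -> str:
    """Map each character to a representative lookalike."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in name)

def find_lookalikes(candidate: str, existing: list[str]) -> list[str]:
    """Existing usernames that differ from candidate but share its skeleton."""
    target = skeleton(candidate)
    return [u for u in existing if u != candidate and skeleton(u) == target]

users = ["alice", "bob", "paul"]
print(find_lookalikes("\u0430lice", users))  # Cyrillic first letter
print(find_lookalikes("carol", users))
```

A registration flow could reject or flag any candidate for which `find_lookalikes` returns a non-empty list.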
Potential mitigations (2)
- Implementation
Use a browser that displays Punycode for internationalized domain names (IDNs) in the URL and status bars, or that color-codes the different scripts in a URL. Because homoglyph attacks are so common, several browsers now guard against them by falling back to Punycode. For example, Mozilla Firefox and Google Chrome display an IDN as Punycode if its top-level domain does not restrict which characters can be used in domain names, or if a label mixes scripts from different languages.
- Implementation
Use an email client that applies strict filters and prevents messages that mix character sets from ending up in a user's inbox. Certain email clients, such as Google's Gmail, prevent the use of non-Latin characters in email addresses or in links contained within emails. This helps prevent homoglyph attacks by flagging such emails and redirecting them to the user's spam folder.
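The Punycode display described in the first mitigation can be illustrated with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The function name `display_hostname` is hypothetical. The sketch converts every non-ASCII label, which is stricter than the browser policy described above, where Punycode is shown only under certain conditions.

```python
def display_hostname(hostname: str) -> str:
    """Return the ASCII (Punycode) form of a hostname for display.

    Labels that are already ASCII pass through unchanged; labels with
    non-ASCII characters come back with the "xn--" prefix.
    """
    # The built-in "idna" codec follows IDNA 2003, which is enough for a demo.
    return hostname.encode("idna").decode("ascii")

spoofed = "\u0430pple.com"  # first letter is CYRILLIC SMALL LETTER A

print(display_hostname("apple.com"))  # unchanged, already ASCII
print(display_hostname(spoofed))      # shown in its "xn--" form instead
```

Rendered in many fonts, the two hostnames look alike, but their Punycode forms are visibly different, which is what makes the fallback useful.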
Relationships (1)
- ChildOf CWE-451
CVEs referencing this CWE (5)
| CVE | Description | Severity | EPSS | Modified | Flags |
|---|---|---|---|---|---|
| CVE-2021-4221 | If a domain name contained a RTL character, it would cause the domain to be rendered to the right of the path. This could lead to user confusion and spoofing attacks. <br>*This bug only affects Firefox for Android. Other operating systems are unaffected.*<br>*Note*: Due to a clerical error this advisory was not included in the original announcement, and was added in February 2022. This vulnerability affects Firefox < 92. | MEDIUM 4.3 | 0.41% (p33) | 2025-04-16 | |
| CVE-2025-0996 | Inappropriate implementation in Browser UI in Google Chrome on Android prior to 133.0.6943.98 allowed a remote attacker to spoof the contents of the Omnibox (URL bar) via a crafted HTML page. (Chromium security severity: High) | MEDIUM 5.4 | 0.37% (p28) | 2025-04-10 | |
| CVE-2025-27611 | base-x is a base encoder and decoder of any given alphabet using bitcoin style leading zero compression. Versions 4.0.0, 5.0.0, and all prior to 3.0.11, are vulnerable to attackers potentially deceiving users into sending funds to an unintended address. This issue has been patched in versions 3.0.11, 4.0.1, and 5.0.1. | NONE | 0.35% (p27) | 2026-04-15 | |
| CVE-2026-48760 | **Description:** `Symfony\Component\HtmlSanitizer\TextSanitizer\UrlSanitizer::parse()` rejects URLs containing raw Unicode explicit-direction BiDi formatting characters (U+202A–U+202E, U+2066–U+2069) as a defense against visual spoofing of the rendered `href`. The check covers only the raw UTF-8 forms of those code points: the percent-encoded forms (`%E2%80%AE` for U+202E, `%E2%81%A6` for U+2066, etc.) are not matched by the deny regex, survive `league/uri`'s parse/build cycle, and are re-emitted unchanged in the sanitized URL. Any downstream consumer that decodes the link before display restores the BiDi character and the visual spoof that the original defense was meant to prevent; examples include phishing-detection filters that compare `urldecode($href)` against a domain allow-list, audit-log dashboards that show a decoded form for readability, hover-tooltip previews, and federated/syndicated content where the decoder lives on the consuming side.<br>The same `UrlSanitizer::parse()` carries an ASCII-only `/\s/` whitespace check (no `/u` modifier) intended as a backstop against malformed URLs. Without the `/u` modifier, PCRE's `\s` matches only ASCII whitespace, so Unicode whitespace characters pass through unchanged in both raw and percent-encoded forms: NBSP (U+00A0), the zero-width no-break space / BOM (U+FEFF), line/paragraph separators (U+2028, U+2029), ogham space (U+1680), the U+2000–U+200A en/em quad family, narrow / medium / ideographic spaces (U+202F, U+205F, U+3000), and NEL (U+0085). In hostname positions they enable lookalike spoofs (`example<NBSP>.com`); in path/query/fragment they enable allow-list drift when a downstream consumer strips whitespace before comparison.<br>**Resolution:** `UrlSanitizer::parse()` now denies BiDi formatting marks together with Unicode whitespace and the zero-width no-break space, in both the raw input and the percent-decoded form of each parsed URL component (`user`, `pass`, `host`, `path`, `query`, `fragment`). ASCII space remains tolerated in path/query/fragment via the existing percent-encoding step. The patches for this issue are available [here](https://github.com/symfony/symfony/commit/b21a626fd90f5c12d2db432c629eed3e780ba2f8) for branch 6.4 (and forward-ported to 7.4, 8.0 and 8.1).<br>**Credits:** Symfony would like to thank Scott Arciszewski (Trail of Bits) for reporting the issue and Nicolas Grekas for providing the fix. | NONE | no EPSS | 2026-06-15 | |
| CVE-2026-45064 | **Description:** `Symfony\Component\HtmlSanitizer\TextSanitizer\UrlSanitizer::parse()` (used by `UrlSanitizer::sanitize()` and therefore by every `HtmlSanitizer` config that allows links or media) accepts URLs that contain Unicode explicit-direction BiDi formatting characters: U+202A–U+202E (LRE / RLE / PDF / LRO / RLO) and U+2066–U+2069 (LRI / RLI / FSI / PDI). These characters are passed through unchanged into the `href` / `src` attributes produced by `HtmlSanitizer`. When the resulting HTML is rendered in a browser, the override characters reverse or alter the visual ordering of the URL text, so the displayed link can differ arbitrarily from the actual destination: a classic visual-spoofing / phishing primitive against viewers of sanitized content.<br>**Resolution:** `UrlSanitizer::parse()` now rejects URLs containing the explicit-direction BiDi formatting code points (U+202A–U+202E, U+2066–U+2069) before invoking the underlying URL parser. As an unrelated companion fix in the same patch, spaces inside path/query/fragment are now percent-encoded rather than rejected outright, while spaces in the scheme/authority remain rejected by the post-encoding whitespace check. The patch for this issue is available [here](https://github.com/symfony/symfony/commit/743a435e948b897ef2b5564ac438d4beb95d2526) for branch 5.4.<br>**Credits:** Symfony would like to thank Himanshu Anand for reporting the issue and Nicolas Grekas for providing the fix. | NONE | no EPSS | 2026-05-27 | |
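The two Symfony advisories in the table describe a check for BiDi formatting characters that initially missed percent-encoded forms. The idea of checking both the raw and the decoded text can be sketched in Python as follows. This is not Symfony's PHP implementation: `BIDI_CONTROLS` and `has_bidi_control` are hypothetical names. The sketch also inspects the URL as one string rather than per parsed component, decodes only one layer of percent-encoding, and leaves out the Unicode whitespace checks.

```python
from urllib.parse import unquote

# Explicit-direction BiDi formatting characters named in the advisories:
# U+202A-U+202E and U+2066-U+2069.
BIDI_CONTROLS = {
    chr(cp) for cp in (*range(0x202A, 0x202F), *range(0x2066, 0x206A))
}

def has_bidi_control(url: str) -> bool:
    """True if the URL holds a BiDi control, raw or percent-encoded."""
    decoded = unquote(url)  # decodes %XX sequences as UTF-8 by default
    return any(ch in BIDI_CONTROLS for ch in url + decoded)

print(has_bidi_control("https://example.com/report.pdf"))        # False
print(has_bidi_control("https://example.com/\u202Efdp.exe"))      # True
print(has_bidi_control("https://example.com/%E2%80%AEfdp.exe"))   # True
```

Checking only the raw string would miss the third URL, which is the gap the later advisory describes.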