UltraJSON is a fast JSON encoder and decoder written in pure C with bindings for Python 3.7+. Affected versions were found to improperly…
GitHub_M·CWE-670·Published 2022-07-05
UltraJSON is a fast JSON encoder and decoder written in pure C with bindings for Python 3.7+. Affected versions were found to improperly decode certain characters. JSON strings that contain escaped surrogate characters not part of a proper surrogate pair were decoded incorrectly. Besides corrupting strings, this allowed for potential key confusion and value overwriting in dictionaries. All users parsing JSON from untrusted sources are vulnerable. From version 5.4.0, UltraJSON decodes lone surrogates in the same way as the standard library's `json` module does, preserving them in the parsed output. Users are advised to upgrade. There are no known workarounds for this issue.
UltraJSON is a fast JSON encoder and decoder written in pure C with bindings for Python 3.7+. Affected versions were found to improperly decode certain characters. JSON strings that contain escaped surrogate characters not part of a proper surrogate pair were decoded incorrectly. Besides corrupting strings, this allowed for potential key confusion and value overwriting in dictionaries. All users parsing JSON from untrusted sources are vulnerable. From version 5.4.0, UltraJSON decodes lone surrogates in the same way as the standard library's `json` module does, preserving them in the parsed output. Users are advised to upgrade. There are no known workarounds for this issue.
### Impact _What kind of vulnerability is it? Who is impacted?_ Anyone parsing JSON from an untrusted source is vulnerable. JSON strings that contain escaped surrogate characters not part of a proper surrogate pair were decoded incorrectly. Besides corrupting strings, this allowed for potential key confusion and value overwriting in dictionaries. Examples: ```python # An unpaired high surrogate character is ignored. >>> ujson.loads(r'"\uD800"') '' >>> ujson.loads(r'"\uD800hello"') 'hello' # An unpaired low surrogate character is preserved. >>> ujson.loads(r'"\uDC00"') '\udc00' # A pair of surrogates with additional non surrogate characters pair up in spite of being invalid. >>> ujson.loads(r'"\uD800foo bar\uDC00"') 'foo bar𐀀' ``` ### Patches _Has the problem been patched? What versions should users upgrade to?_ Users should upgrade to UltraJSON 5.4.0. From version 5.4.0, UltraJSON decodes lone surrogates in the same way as the standard library's `json` module does, preserving them in the parsed output: ```python3 >>> ujson.loads(r'"\uD800"') '\ud800' >>> ujson.loads(r'"\uD800hello"') '\ud800hello' >>> ujson.loads(r'"\uDC00"') '\udc00' >>> ujson.loads(r'"\uD800foo bar\uDC00"') '\ud800foo bar\udc00' ``` ### Workarounds _Is there a way for users to fix or remediate the vulnerability without upgrading?_ Short of switching to an entirely different JSON library, there are no safe alternatives to upgrading. ### For more information If you have any questions or comments about this advisory: * Open an issue in [UltraJSON](http://github.com/ultrajson/ultrajson/issues)
### Impact _What kind of vulnerability is it? Who is impacted?_ Anyone parsing JSON from an untrusted source is vulnerable. JSON strings that contain escaped surrogate characters not part of a proper surrogate pair were decoded incorrectly. Besides corrupting strings, this allowed for potential key confusion and value overwriting in dictionaries. Examples: ```python # An unpaired high surrogate character is ignored. >>> ujson.loads(r'"\uD800"') '' >>> ujson.loads(r'"\uD800hello"') 'hello' # An unpaired low surrogate character is preserved. >>> ujson.loads(r'"\uDC00"') '\udc00' # A pair of surrogates with additional non surrogate characters pair up in spite of being invalid. >>> ujson.loads(r'"\uD800foo bar\uDC00"') 'foo bar𐀀' ``` ### Patches _Has the problem been patched? What versions should users upgrade to?_ Users should upgrade to UltraJSON 5.4.0. From version 5.4.0, UltraJSON decodes lone surrogates in the same way as the standard library's `json` module does, preserving them in the parsed output: ```python3 >>> ujson.loads(r'"\uD800"') '\ud800' >>> ujson.loads(r'"\uD800hello"') '\ud800hello' >>> ujson.loads(r'"\uDC00"') '\udc00' >>> ujson.loads(r'"\uD800foo bar\uDC00"') '\ud800foo bar\udc00' ``` ### Workarounds _Is there a way for users to fix or remediate the vulnerability without upgrading?_ Short of switching to an entirely different JSON library, there are no safe alternatives to upgrading. ### For more information If you have any questions or comments about this advisory: * Open an issue in [UltraJSON](http://github.com/ultrajson/ultrajson/issues)
UltraJSON es un codificador y decodificador JSON rápido escrito en C puro con enlaces para Python versiones 3.7+. Se encontró que las versiones afectadas descodificaban incorrectamente determinados caracteres. Las cadenas JSON que contienen caracteres suplentes escapados que no forman parte de un par suplente adecuado se decodificaron incorrectamente. Además de corromper las cadenas, esto permitió una posible confusión de claves y sobrescritura de valores en los diccionarios. Todos los usuarios que analizan JSON de fuentes que no son de confianza son vulnerables. A partir de la versión 5.4.0, UltraJSON decodifica sustitutos solitarios de la misma manera que lo hace el módulo "json" de la biblioteca estándar, conservándolos en la salida analizada. Se recomienda a los usuarios que actualicen. No se conocen mitigaciones adicionales para este problema
| Version | Type | Source | Base | Exp | Impact | Vector |
|---|---|---|---|---|---|---|
| 2.0 | Primary | NVD | 5.0 | 10.0 | 2.9 | AV:N/AC:L/Au:N/C:N/I:N/A:P |
| 3.1 | Primary | cve.org | 7.5 | — | — | CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H |
| 3.1 | Primary | NVD | 7.5 | 3.9 | 3.6 | CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H |
| 3.1 | Primary | cve.org | 7.5 | — | — | CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H |
| 3.1 | Secondary | NVD | 7.5 | 3.9 | 3.6 | CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H |
| 3.1 | Secondary | GHSA | 7.5 | — | — | CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H |