The scrapy/scrapy project is vulnerable to XML External Entity (XXE) attacks due to the use of lxml.etree.fromstring for parsing untrusted…
@huntr_ai·CWE-409·Published 2024-02-16
The scrapy/scrapy project is vulnerable to XML External Entity (XXE) attacks due to the use of lxml.etree.fromstring for parsing untrusted XML data without proper validation. This vulnerability allows attackers to perform denial of service attacks, access local files, generate network connections, or circumvent firewalls by submitting specially crafted XML data.
### Impact

Scrapy limits allowed response sizes by default through the [`DOWNLOAD_MAXSIZE`](https://docs.scrapy.org/en/latest/topics/settings.html#download-maxsize) and [`DOWNLOAD_WARNSIZE`](https://docs.scrapy.org/en/latest/topics/settings.html#download-warnsize) settings. However, those limits were only being enforced during the download of the raw, usually-compressed response bodies, and not during decompression, making Scrapy vulnerable to [decompression bombs](https://cwe.mitre.org/data/definitions/409.html).

A malicious website being scraped could send a small response that, on decompression, could exhaust the memory available to the Scrapy process, potentially affecting any other process sharing that memory, and affecting disk usage in case of uncompressed response caching.

### Patches

Upgrade to Scrapy 2.11.1.

If you are using Scrapy 1.8 or a lower version, and upgrading to Scrapy 2.11.1 is not an option, you may upgrade to Scrapy 1.8.4 instead.

### Workarounds

There is no easy workaround.

Disabling HTTP decompression altogether is impractical, as HTTP compression is a rather common practice.

However, it is technically possible to manually backport the 2.11.1 or 1.8.4 fix, replacing the corresponding components of an unpatched version of Scrapy with patched versions copied into your own code.

### Acknowledgements

This security issue was reported by @dmandefy [through huntr.com](https://huntr.com/bounties/c4a0fac9-0c5a-4718-9ee4-2d06d58adabb/).
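To illustrate the Impact section, the following is a minimal sketch of size-limited decompression using only the Python standard library. It is not Scrapy's actual patch: the names `bounded_gunzip` and `DecompressedSizeExceeded` are hypothetical, and `max_size` merely plays the role that a limit such as `DOWNLOAD_MAXSIZE` would play. The idea is to decompress in small chunks and stop as soon as the output grows past the limit, rather than inflating the whole body first and checking afterwards.

```python
import gzip
import zlib


class DecompressedSizeExceeded(Exception):
    """Hypothetical error raised when the decompressed body passes the limit."""


def bounded_gunzip(data: bytes, max_size: int, chunk_size: int = 65536) -> bytes:
    """Decompress gzip data, refusing to produce more than max_size bytes."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    output = bytearray()
    pending = data
    while not decompressor.eof:
        # The second argument caps how many bytes this call may return;
        # unprocessed input is kept in decompressor.unconsumed_tail.
        chunk = decompressor.decompress(pending, chunk_size)
        pending = decompressor.unconsumed_tail
        if not chunk and not pending and not decompressor.eof:
            raise ValueError("truncated gzip stream")
        output += chunk
        if len(output) > max_size:
            raise DecompressedSizeExceeded(
                f"decompressed body exceeds {max_size} bytes"
            )
    return bytes(output)


if __name__ == "__main__":
    # Ten megabytes of zeros shrink to a few kilobytes when compressed.
    bomb = gzip.compress(b"\0" * 10_000_000)
    print(f"compressed size: {len(bomb)} bytes")
    try:
        bounded_gunzip(bomb, max_size=1_000_000)
    except DecompressedSizeExceeded as exc:
        print(f"rejected: {exc}")
```

Because the check runs after every chunk, memory use stays within roughly `max_size` plus one chunk, however large the fully decompressed payload would have been.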
| CVSS version | Type | Source | Base score | Exploitability | Impact | Vector |
|---|---|---|---|---|---|---|
| 3.0 | Primary | cve.org | 7.5 | — | — | CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H |
| 3.0 | Secondary | NVD | 7.5 | 3.9 | 3.6 | CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H |
| 3.1 | Secondary | GHSA | 7.5 | — | — | CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H |
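
The scores in the table follow from the vector string. As a rough sketch of how, the snippet below parses a CVSS v3 vector and applies the base-score equations with the metric weights published in the CVSS v3 specification. The function names (`parse_vector`, `score`) are illustrative only, and the rounding helper is the simple "round up to one decimal" form, which is an assumption that happens to be sufficient for this vector.

```python
import math

# Metric weights from the CVSS v3.x specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when scope is changed.
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def parse_vector(vector: str) -> dict:
    """Split 'CVSS:3.0/AV:N/...' into a {metric: value} mapping."""
    _prefix, *parts = vector.split("/")
    return dict(part.split(":") for part in parts)


def roundup(value: float) -> float:
    """Round up to one decimal place (simplified form of the spec's rule)."""
    return math.ceil(value * 10) / 10


def score(vector: str) -> tuple:
    """Return (base, exploitability, impact), each to one decimal place."""
    m = parse_vector(vector)
    scope_changed = m["S"] == "C"
    c, i, a = (WEIGHTS[k][m[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[m["S"]][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        base = 0.0
    elif scope_changed:
        base = roundup(min(1.08 * (impact + exploitability), 10))
    else:
        base = roundup(min(impact + exploitability, 10))
    return base, round(exploitability, 1), round(impact, 1)


if __name__ == "__main__":
    print(score("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H"))
    # prints (7.5, 3.9, 3.6)
```

For this vector only Availability is affected (`A:H`), so the impact sub-score comes entirely from that metric, while the network-reachable, low-complexity, no-privileges, no-interaction combination gives the highest possible exploitability sub-score.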