---
description:
globs:
alwaysApply: false
---
XXE Prevention
These rules apply to all code that parses or processes XML, regardless of language or framework, including AI-generated code.
Every reported violation must state which rule was triggered and why, so that developers can understand and fix the issue. Generated code must not violate these rules. If code violates a rule, add a comment that explains the issue and suggests a correction.
1. Disable DTDs and External Entities
- **Rule:** XML parsers must have Document Type Definitions (DTDs) disabled and must not resolve external entities.
- **Example (unsafe, Python with lxml):**
  ```python
  from lxml import etree
  etree.fromstring(user_xml)  # External entities allowed ➜ XXE
  ```
- **Example (safe, Python using defusedxml):**
  ```python
  from defusedxml.ElementTree import fromstring
  fromstring(user_xml)  # External entities blocked
  ```
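Where `defusedxml` is not available, a similar effect can be approximated with the standard library alone. The sketch below is a minimal illustration, not a replacement for a maintained library: `parse_untrusted` and `DoctypeForbidden` are hypothetical names, and the approach assumes that rejecting every DOCTYPE declaration is acceptable for the application.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_untrusted(xml_bytes: bytes) -> ET.Element:
    """Parse XML only if it declares no DTD (hypothetical helper)."""
    checker = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort at the DOCTYPE declaration, before any entity is declared.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    checker.StartDoctypeDeclHandler = on_doctype
    checker.Parse(xml_bytes, True)   # first pass: structural check only
    return ET.fromstring(xml_bytes)  # second pass: build the element tree
```

The document is parsed twice, which trades some speed for simplicity: the tree is only built once the first pass has confirmed that no DTD is present.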
2. Use Secure Parser Features / Libraries
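- **Rule:** Choose parsers or wrappers that block XXE by default (e.g., `defusedxml` in Python). Where the default is not safe, harden the parser explicitly. In Java, for example, set `XMLInputFactory.SUPPORT_DTD` and `XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES` to `false` on an `XMLInputFactory`, and enable `XMLConstants.FEATURE_SECURE_PROCESSING` on factories such as `DocumentBuilderFactory`. Verify that the features that disallow external DTDs and entities are actually in effect.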
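As a rough Python illustration of "set the feature, then verify it", the sketch below hardens the standard-library SAX parser and checks the resulting feature values before use. `make_hardened_sax_parser`, `TagCollector`, and `list_tags` are hypothetical names introduced for this example; the feature constants come from `xml.sax.handler`.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class TagCollector(ContentHandler):
    """Records element names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def make_hardened_sax_parser():
    parser = xml.sax.make_parser()
    # Explicitly disable external general and parameter entities.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    # Verify the settings instead of assuming they took effect.
    if parser.getFeature(feature_external_ges) or parser.getFeature(feature_external_pes):
        raise RuntimeError("external entity processing is still enabled")
    return parser


def list_tags(xml_bytes: bytes):
    parser = make_hardened_sax_parser()
    handler = TagCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.tags
```

Setting the features explicitly keeps the behavior independent of the interpreter's defaults, which have changed between Python releases.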
3. Restrict Schema and XInclude Processing
- **Rule:** Do not allow remote or file-based schema, XInclude, or XSLT processing on untrusted XML unless explicitly required and properly sandboxed.
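One way to enforce this before a document reaches any schema, XInclude, or XSLT processor is to scan the parsed tree for the relevant markers and reject the document if any are found. The sketch below is illustrative only: `find_external_processing_hints` is a hypothetical helper, and it assumes the tree was already produced by a parser configured as described in rules 1 and 2.

```python
import xml.etree.ElementTree as ET

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"
XSLT_NS = "http://www.w3.org/1999/XSL/Transform"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Elements in these namespaces trigger inclusion or transformation downstream.
FORBIDDEN_ELEMENT_PREFIXES = (f"{{{XINCLUDE_NS}}}", f"{{{XSLT_NS}}}")
# Attributes that point a validator at a document-supplied schema.
FORBIDDEN_ATTRIBUTES = {
    f"{{{XSI_NS}}}schemaLocation",
    f"{{{XSI_NS}}}noNamespaceSchemaLocation",
}


def find_external_processing_hints(root: ET.Element):
    """Return a description of every XInclude, XSLT, or schema hint found."""
    findings = []
    for elem in root.iter():
        if elem.tag.startswith(FORBIDDEN_ELEMENT_PREFIXES):
            findings.append(f"forbidden element: {elem.tag}")
        for attr in elem.attrib:
            if attr in FORBIDDEN_ATTRIBUTES:
                findings.append(f"forbidden attribute: {attr}")
    return findings
```

A caller would reject the document when the returned list is non-empty, and only pass clean documents on to later processing stages.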
4. Limit Resource Consumption
- **Rule:** Configure parser limits (input size, entity expansion count and depth, parse timeouts) to prevent denial-of-service attacks such as Billion Laughs and quadratic blowup.
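A pre-parse check along these lines might look like the following sketch. The limit values, `check_limits`, and `XMLLimitExceeded` are assumptions made for illustration. Instead of bounding entity expansion, the sketch takes the stricter route of rejecting every entity declaration. It does not implement timeouts.

```python
from xml.parsers import expat

MAX_BYTES = 1_000_000  # assumed size limit; tune per application
MAX_DEPTH = 64         # assumed nesting limit


class XMLLimitExceeded(ValueError):
    """Raised when a document exceeds a configured limit."""


def check_limits(xml_bytes: bytes, max_bytes: int = MAX_BYTES, max_depth: int = MAX_DEPTH) -> None:
    # Reject oversized input before the parser sees it.
    if len(xml_bytes) > max_bytes:
        raise XMLLimitExceeded(f"input is larger than {max_bytes} bytes")

    depth = 0

    def on_start(name, attrs):
        nonlocal depth
        depth += 1
        if depth > max_depth:
            raise XMLLimitExceeded(f"nesting is deeper than {max_depth}")

    def on_end(name):
        nonlocal depth
        depth -= 1

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # No declared entities means nothing to expand exponentially.
        raise XMLLimitExceeded(f"entity declaration {name!r} is not allowed")

    parser = expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)
```

Calling `check_limits(payload)` before the real parse keeps the limits in one place. A document that passes returns `None`, and one that fails raises `XMLLimitExceeded`.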
5. Validate and Sanitize Input Before Processing
- **Rule:** Validate XML against a known schema or allow-list of expected elements. Reject unexpected or malformed content early.
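When a full schema validator is not in use, an allow-list check can be sketched with the standard library. The element names in `ALLOWED_CHILDREN` and the helper `validate_structure` are made up for this example. A real application would derive the table from its own document format.

```python
import xml.etree.ElementTree as ET

# Hypothetical allow-list: each known element maps to the children it may contain.
ALLOWED_CHILDREN = {
    "order": {"id", "item"},
    "id": set(),
    "item": {"sku", "qty"},
    "sku": set(),
    "qty": set(),
}


def validate_structure(root: ET.Element, expected_root: str = "order") -> None:
    """Raise ValueError on the first element that is not on the allow-list."""
    if root.tag != expected_root:
        raise ValueError(f"unexpected root element: {root.tag!r}")
    stack = [root]
    while stack:
        elem = stack.pop()
        allowed = ALLOWED_CHILDREN[elem.tag]
        for child in elem:
            if child.tag not in allowed:
                raise ValueError(f"unexpected element {child.tag!r} inside {elem.tag!r}")
            stack.append(child)
```

The check fails closed: any element that is not listed is rejected, so new element types must be added to the table deliberately.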
6. Avoid Logging Sensitive XML Content
- **Rule:** Do not log raw XML that may contain credentials, personal data, or other sensitive information.
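One possible pattern is to log a short, non-reversible description of the payload instead of its content. In the sketch below, the logger name `xml_intake` and the helpers `describe_payload` and `log_rejected` are hypothetical, and the caller is assumed to pass a `reason` string that does not itself quote the XML.

```python
import hashlib
import logging

logger = logging.getLogger("xml_intake")  # hypothetical logger name


def describe_payload(xml_bytes: bytes) -> str:
    """Summarize a payload by size and digest, never by content."""
    digest = hashlib.sha256(xml_bytes).hexdigest()
    return f"size={len(xml_bytes)} sha256={digest[:16]}"


def log_rejected(xml_bytes: bytes, reason: str) -> None:
    # The digest lets operators correlate repeated payloads without exposing them.
    logger.warning("rejected XML payload %s reason=%s", describe_payload(xml_bytes), reason)
```

The truncated digest is enough to tell whether two log entries refer to the same payload, while the credentials or personal data inside it stay out of the log.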
7. Keep Parser Libraries Patched
- **Rule:** Regularly update XML libraries to incorporate security patches that address newly discovered parser flaws.