Severity: High | CWE-643 | OWASP A03:2021 Injection

XPath Injection

XPath injection allows attackers to manipulate XML database queries to bypass authentication, extract unauthorized data, and enumerate XML document structure. Learn detection and secure parameterization.

Affects: Java, C#, Python, PHP, JavaScript

What is XPath Injection?

XPath injection occurs when user-supplied input is embedded into an XPath query expression without proper encoding or parameterization. XPath is a query language for navigating XML documents, similar in concept to SQL for relational databases. When an application uses XPath to query an XML-based datastore (configuration files, user databases, SOAP services), unsanitized user input can alter the query logic.

Attackers can exploit XPath injection to bypass authentication (similar to SQL injection 1=1 attacks), extract data from any part of the XML document, and in blind scenarios enumerate the entire document structure character by character.

How exploitation works

A login form queries an XML user database:

String query = "//user[name/text()='" + username + "' and password/text()='" + password + "']";

An attacker enters ' or '1'='1 as both username and password:

//user[name/text()='' or '1'='1' and password/text()='' or '1'='1']

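In XPath, and binds more tightly than or, so the trailing or '1'='1' makes the whole predicate true for every node. The query therefore selects all <user> nodes. An application that treats any match as a successful login, or that takes the first match, lets the attacker in as the first user in the document (typically admin) without valid credentials.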
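The bypass can be reproduced in a few lines. The following is a minimal sketch using lxml; the user store, the function name vulnerable_login, and the sample credentials are all invented for illustration and are not taken from any real application.

```python
from lxml import etree

# Hypothetical user store, invented for this illustration.
USERS_XML = b"""<users>
  <user><name>admin</name><password>s3cret</password></user>
  <user><name>alice</name><password>wonderland</password></user>
</users>"""


def vulnerable_login(doc, username, password):
    # Deliberately unsafe: user input is concatenated into the expression.
    query = ("//user[name/text()='" + username +
             "' and password/text()='" + password + "']")
    matches = doc.xpath(query)
    # Like many login routines, treat the first match as the logged-in user.
    return matches[0].findtext("name") if matches else None


doc = etree.fromstring(USERS_XML)

# A wrong password is rejected as expected.
print(vulnerable_login(doc, "alice", "wrong"))

# The payload makes the predicate true for every <user>, so the first
# node in document order (admin) is returned.
print(vulnerable_login(doc, "' or '1'='1", "' or '1'='1"))
```

The second call returns the admin account even though no valid credentials were supplied, because results come back in document order and admin is the first <user> element.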

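When the application reveals only whether a query matched, blind XPath injection uses substring() and string-length() to extract node values one character at a time. Each request tests a single guess, and the success or failure of the response confirms or rejects it: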

username: ' or substring(//user[1]/password/text(),1,1)='a
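The extraction loop can be sketched as follows. This is an illustration under stated assumptions: the oracle user_exists is a hypothetical single-field lookup that only reports whether any node matched, and the XML data is invented. A single-field query is used because in the two-field login query above, and precedence would tie the guess to the password clause.

```python
import string
from lxml import etree

# Hypothetical user store, invented for this illustration.
USERS_XML = b"""<users>
  <user><name>admin</name><password>s3cret</password></user>
  <user><name>alice</name><password>wonderland</password></user>
</users>"""
doc = etree.fromstring(USERS_XML)


def user_exists(username):
    # Deliberately unsafe lookup that leaks only a yes/no answer.
    return bool(doc.xpath("//user[name/text()='" + username + "']"))


def extract_first_password(oracle, max_len=32):
    target = "//user[1]/password/text()"
    # Step 1: learn the length with string-length().
    length = next(
        n for n in range(1, max_len + 1)
        if oracle(f"' or string-length({target})={n} or 'a'='b")
    )
    # Step 2: recover each position with substring(), one guess per request.
    alphabet = string.ascii_letters + string.digits
    recovered = ""
    for pos in range(1, length + 1):
        recovered += next(
            ch for ch in alphabet
            if oracle(f"' or substring({target},{pos},1)='{ch}")
        )
    return recovered


print(extract_first_password(user_exists))
```

The attacker never sees the password node directly; the value is rebuilt purely from true/false responses. The sketch only guesses letters and digits, so a real value containing other characters would need a wider alphabet.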

Vulnerable code examples

Java — XPath with user input

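import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;

// VULNERABLE: User input directly embedded in XPath expression
// xmlDoc is assumed to be the parsed org.w3c.dom.Document, held as a field of the enclosing class
public boolean login(String username, String password) throws Exception {
    XPathFactory xpf = XPathFactory.newInstance();
    XPath xpath = xpf.newXPath();
    // Quotes in username or password terminate the string literal and change the query logic
    String expr = "//user[name='" + username + "' and password='" + password + "']";
    NodeList nodes = (NodeList) xpath.evaluate(expr, xmlDoc, XPathConstants.NODESET);
    return nodes.getLength() > 0;
}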

PHP — XPath query construction

// VULNERABLE: User input in XPath query
$xpath = new DOMXPath($doc);
$query = "//users/user[username='" . $_POST['username'] . "']";
$results = $xpath->query($query);

Secure code examples

Java — parameterized XPath with variable resolver

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;

// SECURE: Use XPath variable binding to separate query from data
// xmlDoc is assumed to be the parsed org.w3c.dom.Document, held as a field of the enclosing class
public boolean login(String username, String password) throws Exception {
    XPathFactory xpf = XPathFactory.newInstance();
    XPath xpath = xpf.newXPath();

    // Bind variables so user input is never concatenated into the expression
    xpath.setXPathVariableResolver(variableName -> {
        if ("username".equals(variableName.getLocalPart())) return username;
        if ("password".equals(variableName.getLocalPart())) return password;
        return null;
    });

    // $username and $password are resolved as string values, not parsed as XPath
    String expr = "//user[name=$username and password=$password]";
    NodeList nodes = (NodeList) xpath.evaluate(expr, xmlDoc, XPathConstants.NODESET);
    return nodes.getLength() > 0;
}

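C# — XPathNavigator with allow-list validation

// SECURE (by validation, not parameterization): allow-list check before building the expression
// Requires: using System.Text.RegularExpressions; using System.Xml.XPath;
public bool Login(XPathNavigator nav, string username, string password)
{
    // Accept only expected character ranges; quotes and XPath operators are rejected.
    // \z anchors at the true end of input ($ would also accept a trailing newline).
    if (!Regex.IsMatch(username, @"^[a-zA-Z0-9_.-]{1,50}\z"))
        return false;
    if (!Regex.IsMatch(password, @"^[a-zA-Z0-9!@#$%^&*]{1,100}\z"))
        return false;

    // HashPassword is assumed to be defined elsewhere and to return a digest
    // that contains no quote characters (for example hex or Base64).
    string passwordHash = HashPassword(password);

    // Safe to embed only because both values are restricted to known-safe characters
    string expr = $"//user[name='{username}' and password='{passwordHash}']";
    return nav.SelectSingleNode(expr) != null;
}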

Python — input allow-list validation

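import re
from lxml import etree

def login(xml_doc, username, password):
    # SECURE: Strict allow-list validation as a first line of defense.
    # fullmatch() rejects a trailing newline that match() with $ would accept.
    if not re.fullmatch(r'[a-zA-Z0-9_.-]{1,50}', username):
        return False

    # hash_password() is assumed to be defined elsewhere in the application
    password_hash = hash_password(password)

    tree = etree.fromstring(xml_doc)
    # lxml binds keyword arguments to XPath variables, so neither value is
    # parsed as part of the expression. XPath 1.0 has no backslash escape,
    # so replacing ' with \' would not protect the query.
    result = tree.xpath(
        "//user[name=$name and password_hash=$pw_hash]",
        name=username, pw_hash=password_hash,
    )
    return len(result) > 0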

What Offensive360 detects

  • String concatenation in XPath expressions — xpath.evaluate(), xpath.query(), or selectNodes() calls where the expression includes tainted data
  • User input in XPath filter predicates — [field=' + userInput + '] patterns in XPath strings
  • Missing variable binding — XPath evaluation without XPathVariableResolver or equivalent parameterization mechanism
  • Absence of input validation — No allow-list or format check before user input is used in XPath
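To make the first two patterns concrete, here is a toy check for them written with Python's ast module. It is not how Offensive360 works and performs no taint tracking; the names XPATH_METHODS and find_suspect_xpath_calls are invented for this sketch, and it only inspects the first argument of calls whose method name looks like an XPath evaluation.

```python
import ast

# Method names treated as XPath evaluation calls (a heuristic, not a complete list).
XPATH_METHODS = {"xpath", "evaluate", "query", "selectNodes"}


def _is_dynamic_string(node):
    # f-string with at least one interpolated value
    if isinstance(node, ast.JoinedStr):
        return any(isinstance(v, ast.FormattedValue) for v in node.values)
    # "a" + b concatenation or "%s" % b formatting
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mod)):
        return True
    # "...".format(...)
    if (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "format"):
        return True
    return False


def find_suspect_xpath_calls(source):
    """Return line numbers of XPath calls whose expression is built dynamically."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in XPATH_METHODS
                and node.args
                and _is_dynamic_string(node.args[0])):
            findings.append(node.lineno)
    return sorted(findings)
```

A check like this misses expressions assembled in a variable before the call and flags dynamic strings that contain no user input at all, which is why real analyzers follow data flow from the input source to the XPath call.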

Remediation guidance

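  1. Use XPath variable binding — Where the XPath API supports variables (for example Java's XPathVariableResolver, lxml's keyword-argument variables, or Saxon variables supplied as XdmValue), bind user input as a variable instead of concatenating it into the expression. Always prefer this over string concatenation.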

  2. Apply strict allow-list validation — If parameterized queries are unavailable, restrict username, ID, and other queried fields to known-safe character sets (alphanumeric, limited punctuation).

  3. Avoid storing auth data in XML — For authentication, use a relational or document database with proper parameterized query support rather than flat XML files.

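  4. Quote string values correctly as a last resort — If parameterization is unavailable and allow-list validation is not possible, make sure the value cannot terminate its own string literal. XPath 1.0 has no escape character, and replacing ' with &apos; helps only when the expression is itself embedded in XML (for example in XSLT), because the XPath engine does not decode entities. Delimit the value with the quote character it does not contain, or, if it contains both, assemble it with concat() from separately quoted pieces.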

  5. Validate XML document structure — Ensure XML datastores are validated against an XSD schema so injected content that alters document structure is rejected at parse time.
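For the last-resort quoting in step 4, the literal can be built with a small helper. This is a minimal sketch for XPath 1.0; the function name xpath_string_literal is invented here, and variable binding remains the better choice wherever the API offers it.

```python
def xpath_string_literal(value):
    """Return an XPath 1.0 expression that evaluates to exactly `value`."""
    # No single quote in the value: single-quote delimiters are safe.
    if "'" not in value:
        return "'" + value + "'"
    # No double quote in the value: double-quote delimiters are safe.
    if '"' not in value:
        return '"' + value + '"'
    # Both quote types present: split on single quotes and rejoin with
    # concat(), emitting each single quote as the double-quoted literal "'".
    pieces = []
    for i, part in enumerate(value.split("'")):
        if i > 0:
            pieces.append('"\'"')
        if part:
            pieces.append("'" + part + "'")
    return "concat(" + ", ".join(pieces) + ")"


# Usage: the payload becomes an ordinary string operand instead of query logic.
payload = "' or '1'='1"
query = "//user[name=" + xpath_string_literal(payload) + "]"
print(query)
```

With this helper the injected text can no longer close the literal, so the comparison is made against the full payload string and matches no user.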

By Offensive360 Security Research. Reviewed: March 2026
