1. Introduction
Supercharge Your XML Querying: The Ultimate XQuery Power Guide
In our exploration of XML technologies, we’ve covered structuring, validating, and transforming XML documents. Now we turn to the language designed specifically for querying and retrieving information from XML data: XQuery. XQuery (short for XML Query) provides a flexible way to extract precisely the data you need from one or more XML documents. Think of XQuery as the SQL for XML: it lets you formulate queries that filter, sort, join, and otherwise manipulate XML data.
XQuery is a functional programming language built upon XPath expressions. It leverages the path-based navigation capabilities of XPath and extends them with powerful constructs for iterating, filtering, and constructing new XML structures. This makes it an indispensable tool for applications that need to process and analyze XML data, whether it resides in files, databases, or is transmitted across networks.
This ultimate guide aims to supercharge your XML querying abilities by providing a comprehensive overview of XQuery’s core concepts, syntax, and powerful expressions. We will delve into the fundamental building blocks of XQuery, including its use of XPath, its key constructs like FLWOR expressions, and its rich set of built-in functions. By the end of this guide, you will be equipped with the knowledge and skills to write effective XQuery queries that can extract, transform, and reshape XML data to meet your specific requirements, truly unlocking the power within your XML documents.
2. Core Concepts of XQuery
To effectively harness the power of XQuery, it’s essential to understand its fundamental concepts and how they build upon our existing knowledge of XML and XPath.
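- XPath Foundation: XQuery is built upon XPath: XQuery 1.0 is a superset of XPath 2.0, and XQuery 3.0 and 3.1 extend XPath 3.0 and 3.1 respectively. This means that the path expressions and functions we learned in our XPath guide are also valid in XQuery, as are nearly all of the axes (XQuery does not support the namespace axis). XQuery uses XPath to select nodes from XML documents.
- Sequences: A fundamental concept in XQuery is the sequence. A sequence is an ordered collection of zero or more items. An item is either a node (from an XML document) or an atomic value (like a string, number, or boolean); XQuery 3.0 and later also allow function items. Sequences never nest: combining sequences yields one flat sequence, and a single item is indistinguishable from a sequence of length one. Every XQuery expression takes sequences as input and returns a sequence.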
- FLWOR Expressions: The most common and powerful construct in XQuery for querying and constructing XML data is the FLWOR expression. FLWOR is an acronym that stands for:
  - for: Defines one or more variables and binds them to sequences of items (often nodes selected using XPath). This is how you iterate over parts of your XML data.
  - let: Allows you to define variables and assign them the result of an expression (which could be a single item or a sequence).
  - where: An optional clause used to filter the items generated in the for clause based on a boolean condition.
  - order by: An optional clause used to specify the order in which the results should be returned.
  - return: Specifies the XML structure or values that should be constructed and returned for each item that satisfies the for and where clauses (and is ordered by the order by clause).
for $x in /bookstore/book
where $x/price > 30
order by $x/title
return <book-title>{$x/title/text()}</book-title>
This query iterates over all book elements in a bookstore document, keeps those with a price greater than 30, orders them by title, and then returns a new XML element <book-title> containing the text of the title.
- XML Construction: XQuery allows you to construct new XML structures in the return clause of a FLWOR expression or in other parts of your queries. You can create elements using direct XML construction (as seen in the example above with <book-title>) or computed element constructors. You can also create attributes and text nodes.
- Functions: XQuery has a rich library of built-in functions for working with strings, numbers, dates, times, booleans, and nodes. You can also define your own custom functions to encapsulate reusable logic.
- Data Sources: XQuery can query data from various sources, including XML documents stored in files, XML databases, and even data exposed through web services. You typically use a function like doc("filename.xml") to access an XML document from a file.
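Because sequences underpin everything else, a short sketch may help. The values below are made up purely for illustration; the point is that nested parentheses are flattened and empty sequences disappear.

```xquery
(: Nested parentheses do not create nested sequences; the result is flat. :)
let $seq := (1, (2, 3), (), "four")
return (
  count($seq),     (: 4 :)
  $seq[2],         (: 2, because positions start at 1 :)
  $seq[last()]     (: "four" :)
)
```

The query itself returns a sequence of three items, which is why the three results are wrapped in parentheses and separated by commas.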
3. The Power of FLWOR Expressions
Let’s delve deeper into the anatomy and power of FLWOR expressions, which are central to writing effective XQuery queries.
- The for Clause: Iterating Over Sequences: The for clause is used to iterate over one or more sequences, binding a variable to each item in turn. Often, this involves using XPath to select a set of nodes from an XML document. You can have multiple for clauses in a single FLWOR expression, which effectively creates nested loops or performs joins between different sequences.
for $book in doc("bookstore.xml")/bookstore/book
for $author in $book/author
return <book-author>{$book/title/text()} - {$author/text()}</book-author>
This example iterates over each book in the bookstore.xml document and then, for each book, iterates over its author elements, returning a new element combining the book’s title and author.
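The join case can be sketched as follows. This is only an illustration: the data is bound inline so the query runs on its own, and the magazine entry is hypothetical, chosen so that one category matches. Two variables bound in one for clause, separated by a comma, behave like nested for clauses.

```xquery
(: Inline sample data; the magazine entry is hypothetical. :)
let $store :=
  <bookstore>
    <book category="COOKING"><title>Everyday Italian</title></book>
    <book category="WEB"><title>Learning XML</title></book>
    <magazine category="COOKING"><title>Hypothetical Food Monthly</title></magazine>
  </bookstore>
(: Every book is paired with every magazine; where keeps matching categories. :)
for $book in $store/book,
    $mag in $store/magazine
where $book/@category = $mag/@category
return <pair book="{$book/title}" magazine="{$mag/title}"/>
```

With this data only the COOKING book has a magazine in the same category, so a single pair element is returned.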
- The let Clause: Defining Variables: The let clause allows you to define variables and assign them the result of any XQuery expression. This is useful for storing intermediate results, performing calculations once and reusing them, or simply making your queries more readable.
for $book in doc("bookstore.xml")/bookstore/book
let $price := $book/price
where $price > 50
return <expensive-book title="{$book/title/text()}" price="{$price/text()}"/>
Here, we define a variable $price to store the price of each book, which is then used in the where clause and in the construction of the result.
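A let variable can also hold a whole sequence or a value computed once and reused. The sketch below, using a trimmed inline copy of the bookstore data, computes the average price a single time and then compares each book against it.

```xquery
let $store :=
  <bookstore>
    <book><title>Everyday Italian</title><price>30.00</price></book>
    <book><title>Learning XML</title><price>39.95</price></book>
    <book><title>Italian Classics</title><price>85.00</price></book>
  </bookstore>
let $prices := $store/book/price     (: a sequence of three price elements :)
let $average := avg($prices)         (: computed once, reused for every book :)
for $book in $store/book
where $book/price > $average
return <above-average title="{$book/title}"/>
```

Only "Italian Classics" is priced above the average of the three books, so it is the only title returned.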
- The where Clause: Filtering Results: The where clause allows you to filter the items generated by the for clause based on a boolean condition. This condition can involve comparisons, function calls, the logical operators and and or, and the not() function.
for $book in doc("bookstore.xml")/bookstore/book
where $book/@category = "COOKING" and $book/year < 2000
return <old-cooking-book title="{$book/title/text()}"/>
This query selects books from the “COOKING” category that were published before the year 2000.
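A where condition can also combine function calls with not(). The following sketch uses a trimmed inline copy of the sample data and keeps books whose title mentions "Italian" and that were not published before 2000.

```xquery
let $store :=
  <bookstore>
    <book><title>Everyday Italian</title><year>2005</year></book>
    <book><title>Learning XML</title><year>2003</year></book>
    <book><title>Italian Classics</title><year>1991</year></book>
  </bookstore>
for $book in $store/book
where contains($book/title, "Italian") and not($book/year < 2000)
return string($book/title)
(: yields "Everyday Italian" :)
```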
- The order by Clause: Sorting Results: The order by clause allows you to specify the order in which the results should be returned. You can sort by one or more expressions and specify whether the order should be ascending (ascending) or descending (descending).
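for $book in doc("bookstore.xml")/bookstore/book
(: Cast to a number: untyped values are sorted as strings, so "9.99" would sort above "85.00". :)
order by xs:decimal($book/price) descending
return <book-with-price title="{$book/title/text()}" price="{$book/price/text()}"/>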
This query returns books ordered from the most expensive to the least expensive.
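Sorting by more than one expression might look like the sketch below, which uses a trimmed inline copy of the sample data. Sort keys are separated by commas and each can carry its own direction. The xs:decimal cast is there because order by compares untyped values as strings.

```xquery
let $store :=
  <bookstore>
    <book category="COOKING"><title>Everyday Italian</title><price>30.00</price></book>
    <book category="WEB"><title>Learning XML</title><price>39.95</price></book>
    <book category="COOKING"><title>Italian Classics</title><price>85.00</price></book>
  </bookstore>
for $book in $store/book
(: First key: category, alphabetically. Second key: price, highest first. :)
order by $book/@category ascending, xs:decimal($book/price) descending
return <book category="{$book/@category}" price="{$book/price}">{string($book/title)}</book>
```

The result lists "Italian Classics", then "Everyday Italian" (both COOKING, most expensive first), and finally "Learning XML".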
- The return Clause: Constructing the Output: The return clause specifies what should be output for each item that satisfies the conditions in the preceding clauses. You can construct new XML elements, attributes, text nodes, or simply return values from the input XML.
1. Direct XML Construction: You can create XML elements by writing literal XML tags directly in the query. You can embed XQuery expressions within element content and attribute values using curly braces {}.
<book-info>
<title>{doc("bookstore.xml")/bookstore/book[1]/title/text()}</title>
<price currency="USD">{doc("bookstore.xml")/bookstore/book[1]/price/text()}</price>
</book-info>
2. Computed Element and Attribute Constructors: You can also create elements and attributes dynamically using computed constructors. This is useful when the names of the elements or attributes are not known in advance or need to be computed.
for $name in ("book-title", "book-price")
return element {$name} { ... }
for $attrName in ("id", "class")
return attribute {$attrName} { ... }
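A complete, runnable variant of the computed constructors might look like this. It is a sketch only: the summary element and the "book-" prefix are illustrative names, not part of the sample data. Each child of the book is copied into a new element whose name is computed at run time.

```xquery
let $book :=
  <book category="WEB"><title>Learning XML</title><price>39.95</price></book>
return
  element summary {
    (: Attributes must be constructed before any child content. :)
    attribute { "category" } { string($book/@category) },
    for $child in $book/*
    return element { concat("book-", local-name($child)) } { string($child) }
  }
```

This produces a summary element with a category attribute and two children, book-title and book-price.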
4. Powerful XQuery Functions
XQuery provides a rich set of built-in functions for various operations. Here are some key categories and examples:
- Node Functions:
  - doc(uri): Retrieves an XML document from the specified URI.
  - root(): Returns the root node of the tree containing the context node.
  - name($node): Returns the name of a node as a string, including any namespace prefix.
  - local-name($node): Returns the local part of the name of a node.
  - namespace-uri($node): Returns the namespace URI of a node.
  - data($node): Returns the typed value of a node.
- String Functions:
  - string($arg): Returns the string representation of an argument.
  - concat($string1, $string2, ...): Concatenates strings.
  - substring($string, $start, $length?): Returns a substring. Character positions start at 1.
  - string-length($string?): Returns the length of a string.
  - upper-case($string): Converts a string to upper case.
  - lower-case($string): Converts a string to lower case.
  - contains($string1, $string2): Checks if a string contains another.
  - starts-with($string1, $string2): Checks if a string starts with another.
  - ends-with($string1, $string2): Checks if a string ends with another.
  - replace($input, $pattern, $replacement): Replaces every match of a regular-expression pattern in a string.
  - tokenize($input, $pattern?): Splits a string into a sequence of substrings wherever a regular-expression pattern matches. The pattern may be omitted only in XPath 3.1, where the string is then split on whitespace.
- Numeric Functions:
  - number($arg): Converts an argument to a number.
  - sum($sequence): Returns the sum of numbers in a sequence.
  - avg($sequence): Returns the average of numbers in a sequence.
  - min($sequence): Returns the minimum value in a sequence.
  - max($sequence): Returns the maximum value in a sequence.
  - count($sequence): Returns the number of items in a sequence.
  - abs($number): Returns the absolute value of a number.
  - ceiling($number): Returns the smallest integer greater than or equal to a number.
  - floor($number): Returns the largest integer less than or equal to a number.
  - round($number): Returns the number rounded to the nearest integer.
- Boolean Functions:
  - true(): Returns true.
  - false(): Returns false.
  - not($arg): Returns the boolean negation of an argument.
  - boolean($arg): Converts an argument to a boolean value.
- Date and Time Functions: XQuery provides a variety of functions for working with dates and times, such as current-date(), current-time(), and current-dateTime(), functions to extract parts of dates and times (year, month, day, hour, minute, second) such as year-from-date(), and, from XQuery 3.0 onward, functions to format dates and times such as format-date().
- Sequence Functions:
  - distinct-values($sequence): Returns a sequence containing only the distinct values from the input sequence.
  - index-of($sequence, $item): Returns a sequence of integers indicating the positions of an item in a sequence.
  - insert-before($sequence, $index, $items): Inserts items into a sequence before a specified index.
  - remove($sequence, $index): Removes the item at a specified index from a sequence.
  - reverse($sequence): Returns a sequence with the items in reverse order.
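To see a few of these functions in action, here is a small sketch that needs no input document. The literal values are arbitrary, and each comment shows the value that call returns.

```xquery
(
  upper-case("xquery"),                    (: "XQUERY" :)
  string-length("XQuery"),                 (: 6 :)
  substring("Everyday Italian", 10, 7),    (: "Italian" :)
  replace("2005-01-15", "-", "/"),         (: "2005/01/15" :)
  tokenize("a,b,c", ","),                  (: "a", "b", "c" :)
  ceiling(1.2),                            (: 2 :)
  floor(-1.5),                             (: -2 :)
  count(distinct-values((1, 2, 2, 3))),    (: 3 :)
  index-of(("a", "b", "a"), "a"),          (: 1, 3 :)
  insert-before(("a", "c"), 2, "b"),       (: "a", "b", "c" :)
  remove(("a", "b", "c"), 2),              (: "a", "c" :)
  reverse(1 to 3)                          (: 3, 2, 1 :)
)
```

Because sequences are flat, the calls that return several items (tokenize, index-of, and so on) simply contribute those items to the overall result sequence.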
5. Examples of Powerful XQuery Queries
Let’s revisit our bookstore.xml example and write some powerful XQuery queries:
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
<book category="COOKING">
<title lang="en">Italian Classics</title>
<author>Marcella Hazan</author>
<year>1991</year>
<price>85.00</price>
</book>
<magazine category="HOME">
<title lang="en">House Beautiful</title>
<publisher>Hearst</publisher>
</magazine>
</bookstore>
- Find all book titles and their prices:
for $book in doc("bookstore.xml")/bookstore/book
return <book-price title="{$book/title/text()}" price="{$book/price/text()}"/>
- Find books published after 2000 and list their titles:
for $book in doc("bookstore.xml")/bookstore/book
where $book/year > 2000
return <title>{$book/title/text()}</title>
- Find the average price of all books:
avg(doc("bookstore.xml")/bookstore/book/price/number(.))
- List all distinct authors:
distinct-values(doc("bookstore.xml")/bookstore/book/author/text())
- Construct an HTML list of book titles:
<html>
<body>
<h1>Book Titles</h1>
<ul>
{
for $book in doc("bookstore.xml")/bookstore/book
order by $book/title
return <li>{$book/title/text()}</li>
}
</ul>
</body>
</html>
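Section 2 mentions that you can define your own functions. A minimal sketch of that follows. The function name local:discounted and the 10 percent discount are purely illustrative, and the data is a trimmed inline copy of the sample document. User-defined functions are declared in the query prolog, before the query body, and conventionally live in the predeclared local namespace.

```xquery
(: Hypothetical helper: applies a percentage discount to a price. :)
declare function local:discounted($price as xs:decimal, $percent as xs:decimal) as xs:decimal {
  $price * (1 - $percent div 100)
};

let $store :=
  <bookstore>
    <book><title>Everyday Italian</title><price>30.00</price></book>
    <book><title>Learning XML</title><price>39.95</price></book>
  </bookstore>
for $book in $store/book
(: The explicit cast is needed because the untyped price is not automatically an xs:decimal. :)
return <sale title="{$book/title}" price="{local:discounted(xs:decimal($book/price), 10)}"/>
```

Declaring parameter and return types is optional, but it lets the processor report type errors early instead of producing surprising results.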
6. Conclusion
XQuery is a powerful and versatile language for querying and manipulating XML data. Built upon the foundation of XPath, it provides expressive constructs like FLWOR expressions and a rich library of functions that let you extract, filter, sort, and reshape XML data precisely. Whether you need to retrieve specific information, perform complex data transformations, or generate reports from XML sources, XQuery will significantly improve your ability to work with XML. Its ability to query data from various sources and construct new XML structures makes it well suited to data processing and integration tasks. Continue to explore XQuery and experiment with different queries to get the most out of your XML data.