XQuery

1. Overview (Current Status of XQuery Support)

eXist implements the core XQuery syntax (with the exception of XML Schema-related features) as specified in the W3C's June 2006 candidate recommendation. Functions in the standard function library should likewise follow the June 2006 version of the "XQuery 1.0 and XPath 2.0 Functions and Operators" specification. However, the function library is not yet completely aligned with the latest version of the spec: some function signatures may differ, and other functions may be missing. Please check against the local function library documentation.

The XQuery implementation is tested against the official XQuery Test Suite (XQTS). This suite contains more than 10,000 tests. As of June 2006, eXist passes more than 80% of the test suite, and we are continuously working to improve these results. Current test results are reported on the web site. A good part of the remaining failures is caused by minor problems with value comparisons, the formatting of floating point numbers, and the missing schema support in eXist.

Note

eXist implements ALL features described in the XQuery specification with the EXCEPTION of the following unsupported features:

  • Schema-related Features: validate, import schema.

    eXist's XQuery processor does not currently support the schema import and schema validation features defined as optional in the XQuery specification. The database does not store type information along with the nodes. It therefore cannot know the typed value of a node and has to assume xs:untypedAtomic. This is the behaviour defined by the XQuery specification.

    Also, you currently can't specify a data type in an element or attribute test. The node test element(test-node) is supported, but the test element(test-node, xs:integer) will result in a syntax error.

    Important

    To avoid misunderstandings: eXist nevertheless supports strong typing whenever the expected type of an expression, a function argument or a function return value is explicitly specified or can be known otherwise. Don't expect eXist to be lax about type checks!

  • Atomic Data Types: xs:gYear, xs:gMonth, etc. (These types are seldom used).

  • XPath Axes: following, preceding

    Both axes work with an explicit test for an attribute or element (e.g. following::foo will work, but following::* will not).

In addition to the standard features, eXist provides extended support for modules and implements the full axis feature, which means you can use the optional axes: ancestor, ancestor-or-self, following, following-sibling, preceding, and preceding-sibling.

2. Function Library

A complete list of eXist-supported XQuery functions can be viewed in the Built-in Functions section. Note that this list is dynamically generated using an XQuery script, so the database needs to be running to view this page. Each function description is taken directly from the signature provided by the class implementing the Function interface.

3. The Module System

Using eXist, you can write entire web applications in XQuery. This may result in rather complex XQuery scripts, consisting of several thousand lines of code. Being able to package related functions into modules is thus an important feature. eXist allows modules to be imported from a variety of sources.

For example, a typical import statement in an XQuery will look like this:

import module namespace status="http://exist-db.org/xquery/admin-interface/status"
at "http://exist-db.org/modules/test.xqm";

Provided that the module namespace does not point to one of the preloaded standard modules (see below), the query engine will try to locate the module source by looking at the URI given after the at keyword. In the example above, the module was specified using a full URI and the query engine will attempt to load the module source from the given URI. However, the module could also be stored in a database collection:

import module namespace status="http://exist-db.org/xquery/admin-interface/status"
at "xmldb:exist:///db/modules/test.xqm";

The query engine recognizes that the module should be stored in the local database instance and tries to directly compile it from there.

If the XQuery module is part of a Java application, it may also be an option to pack the module into a Java archive (.jar file) along with the Java classes and use the following import to load the module from a Java package:

import module namespace status="http://exist-db.org/xquery/admin-interface/status"
at "resource:org/exist/xquery/lib/test.xqm";

Finally, XQuery modules can also be implemented in Java (see below), in which case you can import them by specifying the class path of the Module class:

import module namespace xdiff="http://exist-db.org/xquery/xmldiff"
at "java:org.exist.xquery.modules.xmldiff.XmlDiffModule";

3.1. Using Relative URIs

If the location specified in an import statement is a relative URI, the query engine will try to load the module relative to the current module load path. The module load path is determined as follows:

  1. if the main XQuery was retrieved from the file system, the module load path points to that directory. This applies to queries executed through the XQueryServlet, XQueryGenerator or the Java admin client.

  2. if the main XQuery was loaded from a database collection, the module load path is the URI of that collection.

    For example, if you access an XQuery via the REST server:

    http://localhost:8080/exist/servlet/db/modules/test.xq

    All relative module paths will be resolved relative to the /db/modules collection.
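The resolution rule above can be illustrated with plain URI resolution from the Java standard library. This is only a sketch of the general idea, not eXist's own resolver: the class name ModulePathSketch and the module paths are made up for illustration, and the load path is assumed to end with a slash so that it is treated as a collection.

```java
import java.net.URI;

// Hypothetical helper, not part of eXist: shows how a relative module
// location combines with a module load path.
final class ModulePathSketch {

    // Resolves a relative location against the load path.
    // Assumption: loadPath ends with "/" so it denotes a collection or directory.
    static String resolve(String loadPath, String location) {
        return URI.create(loadPath).resolve(location).toString();
    }

    public static void main(String[] args) {
        // A module stored below the load path
        System.out.println(resolve("/db/modules/", "lib/util.xqm"));
        // A module stored in a sibling collection
        System.out.println(resolve("/db/modules/", "../shared/util.xqm"));
    }
}
```

With /db/modules/ as the load path, the two illustrative locations end up in /db/modules/lib and in /db/shared respectively.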

3.2. Builtin Modules

eXist comes with a set of utility modules, which are all implemented in Java. You can also write your own modules as described below. Some of these utility modules are frequently used in queries, so they are automatically imported into every query by default.

The query engine allows you to configure which modules will be auto-loaded. The <builtin-modules> element in conf.xml lists the namespaces and implementing classes of all modules to be preloaded into queries:

Example: Auto-loaded Modules

<xquery enable-java-binding="no">
    <builtin-modules>
        <module uri="http://exist-db.org/xquery/util"
            class="org.exist.xquery.functions.util.UtilModule"/>
        <module uri="http://exist-db.org/xquery/transform"
            class="org.exist.xquery.functions.transform.TransformModule"/>
                    

You never need to specify a location when importing a preloaded module. The namespace of the module is already known and eXist knows how to load it. Also, auto-loaded modules don't need to be explicitly imported into the main XQuery, though you still need to import them if you want to use them from within another XQuery module.

4. XQuery Caching

XQuery modules executed via the REST interface, the XQueryServlet or the XQueryGenerator are automatically cached: the compiled expression is added to an internal pool of prepared queries. The next time a query or module is loaded from the same location, it is not compiled again. Instead, the already compiled code is reused. The code is only recompiled if eXist decides that the source was modified, or if it has not been used for a long period of time.

Modules are cached along with the main query that imported them.
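The caching behaviour can be pictured with a small pool keyed by source location. The following is a simplified sketch under stated assumptions, not eXist's actual implementation: the class QueryCacheSketch is hypothetical, "compilation" is simulated by a string, a change of the last-modified timestamp stands in for "the source was modified", and the expiry of unused entries is not modelled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a pool of prepared queries.
final class QueryCacheSketch {

    // A pooled entry: the "compiled" form and the timestamp of the source it was built from.
    private static final class Entry {
        final String compiled;
        final long lastModified;

        Entry(String compiled, long lastModified) {
            this.compiled = compiled;
            this.lastModified = lastModified;
        }
    }

    private final Map<String, Entry> pool = new HashMap<String, Entry>();

    // Counts how often a query had to be (re)compiled.
    int compilations = 0;

    // Returns the compiled query for a location, compiling only when the
    // location is unknown or the source timestamp has changed.
    String get(String location, String source, long lastModified) {
        Entry entry = pool.get(location);
        if (entry == null || entry.lastModified != lastModified) {
            compilations++;
            entry = new Entry("compiled:" + source, lastModified);
            pool.put(location, entry);
        }
        return entry.compiled;
    }
}
```

Requesting the same location twice with an unchanged timestamp compiles once; a changed timestamp triggers a second compilation.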

5. eXist Extension Functions

eXist offers a number of additional functions and operators, which are discussed in some detail in the following subsections.

5.1. Specifying the Input Document Set

A database can contain a virtually unlimited set of collections and documents. By default, database queries that use the XML:DB API will only process the documents in the current XML:DB collection. However, four additional functions are available to change this behavior: doc(), document(), collection() and xcollection(). The collection() and doc() functions are standard XQuery/XPath functions, whereas xcollection() and document() are eXist-specific extensions.

Without a URI scheme, eXist interprets the arguments to collection and doc as absolute or relative paths, leading to some collection or document within the database. For example:

doc("/db/collection1/collection2/resource.xml")

refers to a resource stored in /db/collection1/collection2.

doc("resource.xml")

references a resource relative to the base URI property defined in the static XQuery context. The base URI contains an XML:DB URI pointing to the base collection for the current query context, e.g. xmldb:exist:///db.

The base collection depends on how the query context was initialized. If you call a query via the XML:DB API, the base collection is the collection from which the query service was obtained. All relative URLs will be resolved relative to that collection. If a stored query is executed via REST, the base collection is the collection in which the XQuery source resides. In most other cases, the base collection will point to the database root /db.

Note

As it might not always be clear what the base collection is, we recommend using an explicit path to access a document. This makes it easier to use a query via different interfaces.

You can also pass a full URI to the doc function:

doc("http://localhost:8080/exist/servlet/db/test.xml")

In this case, the URI will be retrieved and the data stored into a temporary document in the database.

doc() / document()

While doc() is restricted to a single document-URI argument, document() accepts multiple document paths to be included into the input node set. In addition, calling document() without an argument includes EVERY document node in the current database instance. Some examples:

document()//SPEAKER
document('/db/test/abc.xml', '/db/test/def.xml')//title

collection() / xcollection()

The collection() function specifies the collection of documents to be included in the query evaluation. By default, documents found in subcollections of the specified collection are also included. For example, suppose we have a collection /db/test that includes two subcollections /db/test/abc and /db/test/def. In this case, the function call collection('/db/test') will include all of the resources found in /db/test, /db/test/abc and /db/test/def.

The function xcollection() can be used to change the behavior of collection(). For instance, the function call

xcollection('/db/test')//title

will ONLY include resources found in /db/test, but NOT in /db/test/abc or /db/test/def.
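The difference between the two functions comes down to whether subcollections are descended into. The sketch below models this over a plain list of resource paths; it is only an illustration, the class CollectionScopeSketch and the resource names are hypothetical, and it says nothing about how eXist stores or looks up collections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of collection() versus xcollection() scoping.
final class CollectionScopeSketch {

    // Selects the resources that belong to a collection.
    // includeSubcollections = true  mimics collection()
    // includeSubcollections = false mimics xcollection()
    static List<String> select(List<String> resources, String collection,
                               boolean includeSubcollections) {
        String prefix = collection.endsWith("/") ? collection : collection + "/";
        List<String> result = new ArrayList<String>();
        for (String path : resources) {
            if (!path.startsWith(prefix)) {
                continue; // resource lives outside the collection
            }
            // A direct child has no further "/" after the collection prefix.
            boolean directChild = path.indexOf('/', prefix.length()) < 0;
            if (includeSubcollections || directChild) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> resources = Arrays.asList(
            "/db/test/a.xml", "/db/test/abc/b.xml", "/db/test/def/c.xml");
        System.out.println(select(resources, "/db/test", true));
        System.out.println(select(resources, "/db/test", false));
    }
}
```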

5.2. Querying Text (Fulltext Searching)

The standard XPath/XQuery function library contains most of the common string manipulation functions provided by most programming languages. However, these functions are insufficient for conducting keyword or phrase searches inside a larger portion of text or mixed content. This is a weak point if you have to work with document-centric documents (i.e. mainly free-form text), as opposed to data-centric documents. For many types of documents, the standard string functions do not yield satisfactory search results.

For example, suppose upon reading a chapter in an electronic text, you encountered something about "XML" and "databases", but later you could not recall the exact section where you read it. Using standard XPath, you could try a query like:

//chapter[contains(., 'XML') and contains(., 'databases')]

This query execution will likely be quite slow, since the XPath engine will, in this case, scan the entire character content of all chapter nodes and their descendants. And yet, there is no certainty that all possible text matches will be found - for example, "databases" might have been written with a capital letter at the start of the sentence, and so would not be included in the results.

To resolve this issue, eXist offers additional operators and extension functions for efficient, index-based access to the full-text content of nodes. With eXist, you could alternatively formulate the above query as follows:

//chapter[near(., 'XML database?', 50)]

This returns all chapters that contain both keywords in the given order with fewer than 50 words between them. In addition, the wildcard character ? in database? matches both the singular and the plural form of "database", and the search is NOT case-sensitive. Furthermore, since the query is index-based, it will usually be an order of magnitude faster than the standard XPath query above.

Operators

In this section, we discuss each of eXist's text-search extensions. In cases where the order and distance of search terms is not important, eXist offers two additional operators for simple keyword queries: &= and |=.

node-set &= 'string of keywords'

This operator selects context nodes containing ALL of the keywords in the right-hand argument in any order. The default tokenizer is used to split the right-hand argument into single tokens, i.e. punctuation and whitespace separate the keywords and are then discarded. Note also that wildcards are allowed, and keyword comparison is NOT case-sensitive.

node-set |= 'string of keywords'

Similar to above, this operator selects context nodes containing ANY of the keywords in the right-hand argument.
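The ALL versus ANY semantics of the two operators can be sketched in a few lines of Java. This is a conceptual model only: the class KeywordOperatorSketch is hypothetical, it scans the text directly instead of using an index, it ignores wildcards, and its tokenizer (split on anything that is not a letter or digit, then lower-case) is an assumption standing in for eXist's default tokenizer.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of the &= (all keywords) and |= (any keyword) operators.
final class KeywordOperatorSketch {

    // Assumed tokenizer: punctuation and whitespace separate tokens and are dropped.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<String>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Mimics &=: every keyword must occur in the node text, in any order.
    static boolean containsAll(String nodeText, String keywords) {
        return tokenize(nodeText).containsAll(tokenize(keywords));
    }

    // Mimics |=: at least one keyword must occur in the node text.
    static boolean containsAny(String nodeText, String keywords) {
        Set<String> common = tokenize(nodeText);
        common.retainAll(tokenize(keywords));
        return !common.isEmpty();
    }
}
```

For a line such as "The time is out of joint: O cursed spite," the keyword string "cursed love" fails the ALL test but passes the ANY test.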

Note

With the &= and |= operators, keyword search strings are split into tokens using the default tokenizer function. The current implementation of this operation will work well for all European languages. For non-European languages, however, eXist uses the predefined Unicode code points (0 to 10FFFF) to determine where the string will be split.

Both of the above operators accept simple wildcards in the keyword string. A ? matches zero or one character, * matches zero or more characters. A character range [abc] (as a regular expression) matches any of the characters in that range. You may use a backslash to escape wildcard characters.

To match more complex patterns, full regular expression syntax is supported through additional functions, which are discussed below.
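To make the wildcard rules concrete, the following sketch translates a wildcard keyword into a java.util.regex pattern and matches it against a single keyword. It is an illustration of the rules as described above, not eXist's own code: the class WildcardSketch and its method names are hypothetical, and case-insensitive matching of the whole keyword is assumed.

```java
import java.util.regex.Pattern;

// Hypothetical translation of simple wildcards into a regular expression.
final class WildcardSketch {

    static String toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '\\' && i + 1 < wildcard.length()) {
                // Backslash escapes the next character, which is then matched literally.
                regex.append(Pattern.quote(String.valueOf(wildcard.charAt(++i))));
            } else if (c == '?') {
                regex.append(".?");   // zero or one character
            } else if (c == '*') {
                regex.append(".*");   // zero or more characters
            } else if (c == '[' && wildcard.indexOf(']', i) > i) {
                // Character range such as [abc] is passed through unchanged.
                int end = wildcard.indexOf(']', i);
                regex.append(wildcard, i, end + 1);
                i = end;
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    // Matches the whole keyword, ignoring case.
    static boolean matches(String wildcard, String keyword) {
        return Pattern.compile(toRegex(wildcard), Pattern.CASE_INSENSITIVE)
                      .matcher(keyword).matches();
    }
}
```

Under these assumptions, database? matches "database", "databases" and "Databases", while an escaped question mark only matches a literal "?".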

Note

There is an important semantic difference between the following two expressions:

document()//SPEECH[LINE &= "cursed spite"]

and

document()//SPEECH[LINE &= "cursed" and LINE &= "spite"]

The first expression selects SPEECH nodes that have at least one LINE child containing both search terms. The second expression selects SPEECH nodes that have a LINE child containing "cursed" and a LINE child containing "spite", where the two terms may occur in different lines; it should therefore yield more results than the first one. To make the first expression select the same nodes (at least, nearly the same nodes), you would have to change it to:

document()//SPEECH[. &= "cursed spite"]

Note, however, that this new expression also searches the text of other child nodes of SPEECH, for instance SPEAKER or STAGEDIR.

Functions

near()

As shown in a previous example, the near() function behaves quite similarly to the &= operator, but also pays attention to the order of search terms and their distance from each other in the source document.

The syntax for this function is as follows:

near(node-list, 'string of keywords' [, max-distance])

The function measures the distance between two search terms by counting the number of words between them. A maximum distance of 1 is assumed by default, in which case the search terms occur next to each other. Other values for the maximal and minimal distance may be specified in the optional third argument. As a special case, if the string in the second argument contains only one token, any distance values in the third and fourth argument are ignored, and the function performs identically to the &= operator. For example, with the following search expression:

document()//SPEECH[near(., 'love marriage', 25)]

the search engine will return any SPEECH elements in which the words "love" and "marriage" occur, in that order, within 25 words of each other.
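The ordered proximity test can be sketched as follows. This is a naive text scan for illustration, not eXist's index-based implementation: the class NearSketch is hypothetical, wildcards are ignored, and the distance is assumed to be the difference between word positions, so that a maximum distance of 1 means the terms are adjacent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical model of near(): keywords must occur in order and close together.
final class NearSketch {

    // Assumed tokenizer: split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    static boolean near(String text, String keywords, int maxDistance) {
        List<String> words = tokenize(text);
        List<String> terms = tokenize(keywords);
        if (terms.isEmpty()) {
            return false;
        }
        // Try every occurrence of the first term as a starting point.
        for (int i = 0; i < words.size(); i++) {
            if (words.get(i).equals(terms.get(0))
                    && matchFrom(words, terms, 1, i, maxDistance)) {
                return true;
            }
        }
        return false;
    }

    // Looks for term number t at most maxDistance positions after the previous match.
    private static boolean matchFrom(List<String> words, List<String> terms,
                                     int t, int previous, int maxDistance) {
        if (t == terms.size()) {
            return true; // all terms placed
        }
        int limit = Math.min(words.size() - 1, previous + maxDistance);
        for (int j = previous + 1; j <= limit; j++) {
            if (words.get(j).equals(terms.get(t))
                    && matchFrom(words, terms, t + 1, j, maxDistance)) {
                return true;
            }
        }
        return false;
    }
}
```

In this model, "love and marriage" satisfies the keywords 'love marriage' with a maximum distance of 2 but not with the default of 1, and "marriage then love" never matches because the order is reversed.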

Similar to the &= operator, near() accepts wildcards in the keyword string, and punctuation and whitespace will be skipped according to the default tokenization rules.

match-all()/match-any()

These two functions are variations of the &= and |= operators, and interpret their arguments as regular expressions. However, contrary to the matches() function in the XQuery core library, match-all() and match-any() try to match the regular expression argument against the keywords contained in the full-text index, but NOT against the entire text.

For example, assume you have a document that contains the following paragraph:

<para>Peter lives in Frankfurt</para>

Then the following expression:

match-all(para, "li[vf]e.?", "frank.*")

will match this paragraph because it contains two keywords matching the specified regular expression patterns.

match-all() corresponds to &= in that it will select context nodes with keywords matching ALL of the specified regular expressions. match-any() will select nodes with keywords matching ANY of the specified regular expressions.

Since tokenization doesn't work correctly with regular expression patterns, each keyword has to be specified as a separate argument, so the syntax looks like:

match-all(node-set, 'regexp' [, 'regexp' ...])

Note

Please note that the match-all() and match-any() functions will try to match the regular expression against the entire keyword. For example, the expression

//SPEECH[match-all(LINE, 'li[vf]e')]

will match 'live', 'life', but not 'lives'.

eXist uses the java.util.regex API for regular expressions. A description of the supported regexp syntax can be found in the Sun Java Tutorial.
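Since the regular expressions are handled by java.util.regex, the whole-keyword rule can be reproduced directly with that API. The sketch below is an illustration only: the class KeywordRegexSketch is hypothetical, keywords are obtained by a simple split instead of a lookup in the full-text index, and case-insensitive matching is assumed because the "frank.*" example above matches "Frankfurt".

```java
import java.util.regex.Pattern;

// Hypothetical model of match-all(): each pattern must match an entire keyword.
final class KeywordRegexSketch {

    // True only if the pattern covers the whole keyword (Matcher.matches, not find).
    static boolean matchesKeyword(String regex, String keyword) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                      .matcher(keyword).matches();
    }

    // True if every pattern matches at least one keyword of the text.
    static boolean matchAll(String text, String... regexes) {
        String[] keywords = text.split("[^\\p{L}\\p{N}]+");
        for (String regex : regexes) {
            boolean found = false;
            for (String keyword : keywords) {
                if (!keyword.isEmpty() && matchesKeyword(regex, keyword)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }
}
```

Here 'li[vf]e' matches the keywords "live" and "life" but not "lives", whereas a substring search with Matcher.find() would also accept "lives".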

5.3. Manipulating Database Contents

The XML:DB extension functions can be used to create new database collections, or to store query output into the database. To illustrate, suppose we have a large file containing several RDF metadata records, but we do not want to store the metadata records in a single file, since our application expects each record to have its own document. In this case, we must divide the document into smaller units. Using an XSLT stylesheet would be one way to accomplish this - however, it is also a memory-intensive approach. A preferable option is to use XQuery to do the job.

The XQuery script below shows how to split a large RDF file into a series of smaller documents:

Example: Splitting a Document


xquery version "1.0";

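declare namespace xdb="http://exist-db.org/xquery/xmldb";
declare namespace util="http://exist-db.org/xquery/util";
(: the rdf prefix is used in path expressions below, so it must be declared in the prolog :)
declare namespace rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";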

xdb:register-database("org.exist.xmldb.DatabaseImpl", true()),
let $root := xdb:collection("xmldb:exist:///db", "admin", ""),
    $out := xdb:create-collection($root, "output")
for $rec in /rdf:RDF/* return
    xdb:store($out, concat(util:md5($rec/@rdf:about), ".xml"),
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
            {$rec}
        </rdf:RDF>
    )

                    

Let's look at this script in some detail. First, since the extension functions are based on the XML:DB API, we have to register a database driver with xdb:register-database. Then we can retrieve the root collection using xdb:collection. The variable $root returned by xdb:collection holds the collection as a Java object, and we can pass this variable to xdb:create-collection to create a new sub-collection, called "output".

Next, the for-loop iterates through all child elements of the top RDF element. In each iteration, we use xdb:store to write out the current child node to a new document. Since a unique document name is required for each new document, we need a way to generate unique names. In this case, the URI contained in the rdf:about attribute is unique, so we simply compute an MD5 key from it, append the ".xml" extension, and use it as the document's name.
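The naming scheme can be reproduced outside the database with the Java standard library. This is a sketch for illustration: the class Md5NameSketch is hypothetical, and it assumes that the MD5 key is rendered as a lower-case hexadecimal string, which the text above does not state explicitly.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a document name from an rdf:about URI.
final class Md5NameSketch {

    static String documentName(String about) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(about.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                // Two lower-case hex digits per byte (assumed encoding).
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex + ".xml";
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

The same URI always yields the same name, so storing a record twice overwrites the earlier document instead of creating a duplicate.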

5.4. Utility Functions

The function namespace http://exist-db.org/xquery/util contains a number of common utility functions. The util:md5 function, used in the above example, is one such function. In this subsection, we discuss a few of the important available utility functions. For a complete list, please go to the Built-in Functions section.

However, one important function, util:eval, requires further explanation.

util:eval()

This function is used to dynamically execute a constructed XQuery expression inside a running XQuery script. This can be very handy in some cases - for example, web-based applications that dynamically generate queries based on HTTP request parameters the user has passed.

Consider the following simple example script in which any two numbers submitted by a user are added or subtracted:

Example: Adding/Subtracting Two Numbers

xquery version "1.0";

declare namespace request="http://exist-db.org/xquery/request";
declare namespace util="http://exist-db.org/xquery/util";

declare function local:do-query() as element()
{
    let $n1 := request:get-parameter("n1", ""),
        $n2 := request:get-parameter("n2", ""),
        $op := request:get-parameter("op", "")
    return
        if($n1 = "" or $n2 = "") then
            <p>Please enter two operands.</p>
        else
            let $query := concat($n1, " ", $op, " ", $n2)
            return
                <p>{$query} = {util:eval($query)}</p>
};

<html>
	<body>
		<h1>Enter two numbers</h1>

		<form action="{request:get-uri()}" method="get">
			<table border="0" cellpadding="5">
			<tr>
				<td>First number:</td>
				<td><input name="n1" size="4"/></td>
			</tr>
			<tr>
				<td>Operator:</td>
				<td>
					<select name="op">
						<option value="+">+</option>
						<option value="-">-</option>
					</select>
				</td>
			</tr>
			<tr>
				<td>Second number:</td>
				<td><input name="n2" size="4"/></td>
			</tr>
			<tr>
				<td colspan="2"><input type="submit"/></td>
			</tr>
			</table>
		</form>

		{ local:do-query() }
	</body>
</html>
                                

In this example, there is one XQuery script responsible for evaluating the user-supplied parameters, which uses the parameters from the HTTP request to construct another XQuery expression, which it then passes to util:eval for evaluation. The application would then post-process the returned results, and display them to the user. (For more information on how to write web applications using XQuery, go to our Developer's Guide.)

5.5. XSL Transformations

eXist has a function for directly applying an XSL stylesheet to an XML fragment within the XQuery script. This can be very convenient, for example, if your application requirements are basic, and you do not want to install Cocoon to run the XSLT.

transform:transform()

This XSL transformation function has the following signature:

transform:transform($input as node()?, $stylesheet as item(), $parameters as node()?) as node()?

transform:transform expects the node to be transformed in the first argument $input. If $input is an empty sequence, the function returns immediately.

The XSL stylesheet will be read from the location specified in $stylesheet, which should be either a URI or a node. If $stylesheet is of type xs:anyURI, the function will attempt to load the stylesheet from the specified location. A relative URI is interpreted as a file path. The function then tries to locate the stylesheet in the same way as imported XQuery modules, i.e. relative to the module load directory determined by the static XQuery context.

Some examples for referencing the stylesheet:

transform:transform($root, doc("/db/styles/style.xsl"), ())

Creates the stylesheet from a document node.

transform:transform($root, xs:anyURI("style.xsl"), ())

Loads the stylesheet from the file style.xsl. The function usually expects the file to reside in the same directory as the main query.

transform:transform($root, xs:anyURI("http://exist-db.org/style.xsl"), ())
transform:transform($root, xs:anyURI("xmldb:exist:///db/styles/style.xsl"), ())

The last two examples load the stylesheet from a URI. In the second of them, the "xmldb:" URI points to a resource stored in the database.

The stylesheet will be compiled into a template using the standard Java APIs (javax.xml.transform). The template is shared between all instances of the function and will only be reloaded if modified since its last invocation.

The $parameters argument can be used to pass stylesheet parameters to the XSL processor as an XML fragment - for example:

<parameters>
<param name="param1" value="value1"/>
<param name="param2" value="value2"/>
</parameters>

This will set the stylesheet parameter param1 to the string value value1, and in the XSL stylesheet, the parameter can then be referenced as follows:

<xsl:param name="param1"/>
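For readers who want to see what happens to such a parameter underneath, the following sketch uses the javax.xml.transform API mentioned above to compile a stylesheet into a template and set param1 before transforming. It is a standalone illustration, not eXist's code: the class StylesheetParamSketch, the inline stylesheet and the input document are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example: passing a stylesheet parameter via javax.xml.transform.
final class StylesheetParamSketch {

    // Illustrative stylesheet that simply prints the value of param1.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:param name='param1'/>"
        + "<xsl:template match='/'><xsl:value-of select='$param1'/></xsl:template>"
        + "</xsl:stylesheet>";

    static String run(String value) {
        try {
            // Compile once into a reusable template ...
            Templates templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(XSL)));
            // ... then create a transformer per invocation and set the parameter.
            Transformer transformer = templates.newTransformer();
            transformer.setParameter("param1", value);
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader("<doc/>")),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("value1"));
    }
}
```

Compiling to a Templates object first is what makes it possible to share one compiled stylesheet between several invocations.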

transform:stream-transform()

Identical to the transform:transform function, but it directly streams the transformation result to the HTTP request output stream and doesn't return anything. The function is thus only usable in a web context. Note that the servlet output stream will be closed afterwards.

5.6. HTTP-Related Functions

eXist offers functions for handling HTTP request parameters and session variables that use the http://exist-db.org/xquery/request namespace. Functions in this namespace are only usable if the query is executed through the XQueryGenerator or the XQueryServlet (for more information, consult eXist's Developer's Guide).

request:get-parameter(name, default value)

This HTTP function expects two arguments: the first denotes the name of the parameter, the second specifies a default value, which is returned if the parameter is not set. This function returns a sequence containing the values for the parameter. The above script (Adding/Subtracting Two Numbers) offers an example of how request:get-parameter can be used to read HTTP request parameters.

request:get-uri()

This function returns the URI of the current request. To encode this URI using the current session identifier, use the following function:

session:encode-url(request:get-uri())

session:create()

This function creates a new HTTP session if none exists.

Other session functions read and set session attributes, among other operations. For example, an XQuery or Java object value can be stored in a session attribute, to cache query results. For more example scripts, please look at our Examples page, under the XQuery Examples section.

6. Java Binding

eXist supports calls to arbitrary Java methods from within XQuery. The binding mechanism follows the short-cut technique introduced by Saxon (a collection of tools for processing XML documents). The class where the external function will be found is identified by the namespace URI of the function call. The namespace URI should start with the prefix java: followed by the fully qualified class name of the class. For example, the following code snippet calls the static method sqrt (square-root function) of class java.lang.Math:

Example: Calling a Static Method

declare namespace math="java:java.lang.Math";
math:sqrt(2)
				

Note that if the function name contains a hyphen, the letter following the hyphen is converted to upper-case and the hyphen is removed (i.e. it applies the CamelCase naming convention), and so, to-string() will call the Java method toString().
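The name mapping can be expressed in a few lines. The sketch below only illustrates the rule stated above; the class JavaNameSketch and its method are hypothetical and are not taken from eXist's source.

```java
// Hypothetical helper: maps a hyphenated XQuery function name to a Java method name.
final class JavaNameSketch {

    static String toJavaName(String xqueryName) {
        StringBuilder javaName = new StringBuilder();
        boolean upperNext = false;
        for (char c : xqueryName.toCharArray()) {
            if (c == '-') {
                // Drop the hyphen and remember to capitalize the following letter.
                upperNext = true;
            } else {
                javaName.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            }
        }
        return javaName.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaName("to-string"));
        System.out.println(toJavaName("is-directory"));
    }
}
```

Names without a hyphen, such as sqrt, are left unchanged.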

If more than one method in the class matches the given name and parameter count, eXist tries to select the method that best fits the passed parameter types at runtime. The result of the method call can be assigned to an XQuery variable. If possible, it will be mapped to the corresponding XML schema type. Otherwise, its type is the built-in type object.

Java constructors are called using the function new. Again, a matching constructor is selected by looking at the parameter count and types. The returned value is a new Java object with the built-in type object.

Instance methods are called by supplying a valid Java object as first parameter. The Java object has to be an instance of the given class. For example, the following snippet lists all files and directories in the current directory:

Example: List Contents of the Current Directory

declare namespace file="java:java.io.File";

<files>
    {
        for $f in file:list-files( file:new(".") )
        let $n := file:get-name($f)
        order by $n
        return
            if (file:is-directory($f)) then
                <directory name="{ $n }"/>
            else
                <file name="{ $n }" size="{ file:length($f) }"/>
    }
</files>       			
        		

Note

For security reasons, the Java binding is disabled by default. To enable it, the attribute enable-java-binding in the central configuration file has to be set to yes:

<xquery enable-java-binding="yes">

Enabling the Java binding bears some risks: if you allow users to directly pass XQuery code to the database, e.g. through the sandbox application, they might use Java methods to inspect your system or execute potentially destructive code on the server.

The XACML package therefore allows for fine-grained control of the Java binding feature, e.g. restricting access to certain Java classes. Please make sure you have properly set up XACML if you are planning to access Java code via XQuery on a production system.

7. Creating XQuery Modules

eXist supports XQuery-based library modules. These modules are simply collections of function definitions and global variable declarations, of which eXist knows two types: External Modules, which are themselves written in XQuery, and Internal Modules, which are implemented in Java. The standard XPath/XQuery functions and all extension functions described in the above sections are defined through internal modules.

You can declare an XQuery file as a module and import it using the import module directive. The XQuery engine imports each module only once during compilation. The compiled module is then made available through the static XQuery context.

Users can also provide additional Java modules. With the current release, it is relatively simple to add these modules, although the current API may change in the future. To register a module, eXist requires a namespace URI by which the module is identified, and the list of functions it supplies. For this, you need only to pass a driver class to the XQuery engine, and this class should implement the interface org.exist.xpath.InternalModule.

Moreover, the class org.exist.xpath.AbstractInternalModule already provides an implementation skeleton. The class constructor expects an array of function definitions for all functions that should be registered. A function definition (class FunctionDef) has two properties: the static signature of the function (as an instance of FunctionSignature), and the Java Class that implements the function.

A function is a class extending org.exist.xpath.Function or org.exist.xpath.BasicFunction. Functions without special requirements (e.g. overloading) should subclass BasicFunction. To illustrate, the following is a simple function definition:

Example: A Basic Java Function


public class EchoFunction extends BasicFunction {

    public final static FunctionSignature signature =
        new FunctionSignature(
            new QName("echo", ExampleModule.NAMESPACE_URI, ExampleModule.PREFIX),
            "A useless example function. It just echoes the input parameters.",
            new SequenceType[] {
                new SequenceType(Type.STRING, Cardinality.ZERO_OR_MORE)
            },
            new SequenceType(Type.STRING, Cardinality.ZERO_OR_MORE));

    public EchoFunction(XQueryContext context) {
        super(context, signature);
    }

    public Sequence eval(Sequence[] args, Sequence contextSequence)
            throws XPathException {
        // is argument the empty sequence?
        if (args[0].getLength() == 0)
            return Sequence.EMPTY_SEQUENCE;
        // iterate through the argument sequence and echo each item
        ValueSequence result = new ValueSequence();
        for (SequenceIterator i = args[0].iterate(); i.hasNext();) {
            String str = i.nextItem().getStringValue();
            result.add(new StringValue("echo: " + str));
        }
        return result;
    }

}

In looking at this sample, first note that every function class has to provide a function signature. The function signature defines the QName by which the function is identified, a documentation string, the sequence types of all arguments, and the sequence type of the returned value. In the example above, we accept a single argument of type xs:string and a cardinality of ZERO_OR_MORE. In other words, we accept any sequence of strings containing zero or more items.

Next, the subclass overrides the eval method, which has two arguments: the first contains the values of all arguments passed to the function, the second passes the current context sequence (which might be null). Note that the argument values in the array args have already been checked to match the sequence types defined in the function signature. We therefore do not have to recheck the length of the array: if more or fewer than one argument were passed to the function, an exception would have been thrown before eval gets called.

In XQuery, all values are passed as sequences. A sequence consists of zero or more items, and every item is either an atomic value or a node. Furthermore, a single item is also a sequence. The function signature specifies that any sequence containing zero or more strings is acceptable for our method. We therefore have to check if the empty sequence has been passed. In this case, the function call returns immediately. Otherwise, we iterate through each item in the sequence, prepend "echo: " to its string value, and add it to the result sequence.

In the next step, we want to add the function to a new module, and therefore provide a driver class. The driver class defines a namespace URI and a default prefix for the module. Functions are registered by passing an array of FunctionDef to the constructor. The following is an example driver class definition:

Example: Creating a Driver Class

public class ExampleModule extends AbstractInternalModule {

    public final static String NAMESPACE_URI =
        "http://exist-db.org/xquery/examples";

    public final static String PREFIX = "example";

    private final static FunctionDef[] functions = {
        new FunctionDef(EchoFunction.signature, EchoFunction.class)
    };

    public ExampleModule() {
        super(functions);
    }

    public String getNamespaceURI() {
        return NAMESPACE_URI;
    }

    public String getDefaultPrefix() {
        return PREFIX;
    }

}

Finally, we are able to use this newly created module in an XQuery script:

Example: Importing a Module

xquery version "1.0";

import module namespace example="http://exist-db.org/xquery/examples"
at "java:org.exist.examples.xquery.ExampleModule";

example:echo(("Hello", "World!"))

The query engine recognizes the java: prefix in the location URI, and treats the remaining part (in this case, org.exist.examples.xquery.ExampleModule) as a fully qualified class name leading to the driver class of the module.
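
The handling of the java: prefix can be pictured with plain Java reflection. The following is only a sketch under stated assumptions, not eXist's actual loader: the class JavaLocationResolver and its methods are hypothetical, and java.util.ArrayList merely stands in for a module driver class so that the example runs without eXist on the classpath.

```java
// Hypothetical helper, not part of eXist: resolves a "java:" location URI
// to an instance of the named class using plain reflection.
class JavaLocationResolver {

    static final String SCHEME = "java:";

    // Returns the class name part of the location URI, or null if the
    // URI does not use the java: scheme.
    static String classNameOf(String location) {
        if (location == null || !location.startsWith(SCHEME)) {
            return null;
        }
        return location.substring(SCHEME.length());
    }

    // Loads the class and calls its public no-argument constructor.
    static Object instantiate(String location) {
        String className = classNameOf(location);
        if (className == null) {
            throw new IllegalArgumentException("not a java: location: " + location);
        }
        try {
            Class<?> clazz = Class.forName(className);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load module class " + className, e);
        }
    }

    public static void main(String[] args) {
        // java.util.ArrayList stands in for a module driver class here.
        Object module = instantiate("java:java.util.ArrayList");
        System.out.println(module.getClass().getName());
    }
}
```

A driver class loaded this way needs a public no-argument constructor, which is what the ExampleModule class above provides.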

8. Using Collations

Collations are used to compare strings in a locale-sensitive fashion. XQuery allows one to specify collations in several places by means of a collation URI. For example, a collation can be specified in the order by clause of an XQuery FLWOR expression, as well as in any string-related function. However, the concrete form of the URI is defined by the eXist implementation. Specifically, eXist recognizes the following URIs:

  1. http://www.w3.org/2004/07/xpath-functions/collation/codepoint

    This URI selects the Unicode codepoint collation. This is the default if no collation is specified. Basically, it means that only the standard Java implementations of the comparison and string search functions are used.

  2. http://exist-db.org/collation?lang=xxx&strength=xxx&decomposition=xxx

    or, in a simpler form:

    ?lang=xxx&strength=xxx&decomposition=xxx

    The lang parameter selects a locale, and should have the same form as in xml:lang. For example, we may specify "de" or "de-DE" to select a German locale.

    The strength parameter (optional) value should be one of "primary", "secondary", "tertiary" or "identical".

    The decomposition parameter (optional) has the value of "none", "full" or "standard".
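
The three parameters correspond closely to settings of the standard java.text.Collator class. The Java sketch below shows how such a URI could be turned into a Collator. It is an illustration under assumptions, not eXist's own code: the class CollationUriSketch, the method fromUri, and the mapping of "standard" to canonical decomposition are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical sketch: builds a java.text.Collator from a collation URI
// of the form [http://exist-db.org/collation]?lang=..&strength=..&decomposition=..
class CollationUriSketch {

    static final String PREFIX = "http://exist-db.org/collation";

    static Collator fromUri(String uri) {
        // Accept both the full form and the short "?..." form.
        String query = uri.startsWith(PREFIX) ? uri.substring(PREFIX.length()) : uri;
        if (!query.startsWith("?")) {
            throw new IllegalArgumentException("unsupported collation URI: " + uri);
        }
        String lang = null, strength = null, decomposition = null;
        for (String pair : query.substring(1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) continue; // ignore malformed pairs in this sketch
            String key = pair.substring(0, eq);
            String value = pair.substring(eq + 1);
            switch (key) {
                case "lang": lang = value; break;
                case "strength": strength = value; break;
                case "decomposition": decomposition = value; break;
                default: break; // unknown parameters are ignored
            }
        }
        if (lang == null) {
            throw new IllegalArgumentException("missing lang parameter: " + uri);
        }
        Collator collator = Collator.getInstance(Locale.forLanguageTag(lang));
        if (strength != null) collator.setStrength(strengthOf(strength));
        if (decomposition != null) collator.setDecomposition(decompositionOf(decomposition));
        return collator;
    }

    static int strengthOf(String name) {
        switch (name) {
            case "primary": return Collator.PRIMARY;
            case "secondary": return Collator.SECONDARY;
            case "tertiary": return Collator.TERTIARY;
            case "identical": return Collator.IDENTICAL;
            default: throw new IllegalArgumentException("unknown strength: " + name);
        }
    }

    // Assumption: "standard" means canonical decomposition.
    static int decompositionOf(String name) {
        switch (name) {
            case "none": return Collator.NO_DECOMPOSITION;
            case "full": return Collator.FULL_DECOMPOSITION;
            case "standard": return Collator.CANONICAL_DECOMPOSITION;
            default: throw new IllegalArgumentException("unknown decomposition: " + name);
        }
    }

    public static void main(String[] args) {
        Collator primary = fromUri("?lang=de-DE&strength=primary");
        // At primary strength, case differences are not significant.
        System.out.println(primary.compare("Bauer", "bauer") == 0);
    }
}
```

In java.text.Collator, a weaker strength such as primary makes comparisons more lenient, since it ignores case and accent differences, while identical is the strictest setting.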

The following example selects a German locale for sorting:

for $w in ("das", "daß", "Buch", "Bücher", "Bauer", "Bäuerin", "Jagen", "Jäger")
order by $w collation "?lang=de-DE"
return $w

And returns the following:

Bauer, Bäuerin, Buch, Bücher, das, daß, Jagen, Jäger

You can also change the default collation:

declare default collation "?lang=de-DE";
"Bäuerin" < "Bier"

This returns true. With the default codepoint collation, the same comparison would evaluate to false.
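
The difference between the two collations can be reproduced outside the database with the standard Java library. The sketch below assumes that a German java.text.Collator is a reasonable stand-in for the "?lang=de-DE" collation. The class and method names are hypothetical, and the umlaut is written as a Unicode escape so that the source file encoding does not matter.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical sketch comparing codepoint order with a German collation.
class CollationComparisonSketch {

    // String.compareTo compares UTF-16 code units, which matches
    // codepoint order for the characters used here.
    static boolean lessThanCodepoint(String a, String b) {
        return a.compareTo(b) < 0;
    }

    // Locale-sensitive comparison using the German rules of the JDK.
    static boolean lessThanGerman(String a, String b) {
        return Collator.getInstance(Locale.GERMANY).compare(a, b) < 0;
    }

    public static void main(String[] args) {
        String baeuerin = "B\u00e4uerin"; // "Bäuerin"
        String bier = "Bier";
        System.out.println("codepoint: " + lessThanCodepoint(baeuerin, bier)); // false
        System.out.println("German:    " + lessThanGerman(baeuerin, bier));    // true
    }
}
```

In codepoint order, "ä" (U+00E4) sorts after every unaccented ASCII letter, so "Bäuerin" follows "Bier". The German collation treats "ä" as a variant of "a", which places it before "i".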

9. Serialization Options

The serialization of query results into a binary stream is influenced by a number of parameters. These parameters can be set within the query itself; however, the interpretation of the parameters depends on the context in which the query is called. Most output parameters are applicable only if the query is executed using the XQueryGenerator or XQueryServlet servlets, or the REST server.

Serialization parameters can either be set via pragmas (see the following section) or a declare option statement in the query prolog. In declare option, the serialization parameters can be specified as follows:

declare option exist:serialize "method=xhtml media-type=application/xhtml+xml";

Here, the individual options are specified as key=value pairs within the string literal, separated by whitespace. Note also that the option QName must be exist:serialize, where the exist prefix is bound to the namespace http://exist.sourceforge.net/NS/exist, which is declared by default and need not be specified explicitly.
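
To make the format of the option string concrete, the following Java sketch splits such a string into key/value pairs. It is not eXist's parser: the class SerializeOptionsSketch and its parse method are hypothetical. The only library facts relied on are the constants OutputKeys.METHOD and OutputKeys.MEDIA_TYPE from javax.xml.transform, whose values are "method" and "media-type".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.transform.OutputKeys;

// Hypothetical sketch: parses the string literal of an exist:serialize
// option into key/value pairs.
class SerializeOptionsSketch {

    static Map<String, String> parse(String options) {
        Map<String, String> result = new LinkedHashMap<>();
        // Options are separated by one or more whitespace characters.
        for (String token : options.trim().split("\\s+")) {
            if (token.isEmpty()) continue; // empty input string
            int eq = token.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected key=value but found: " + token);
            }
            result.put(token.substring(0, eq), token.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> opts =
            parse("method=xhtml media-type=application/xhtml+xml");
        System.out.println(opts.get(OutputKeys.METHOD));     // xhtml
        System.out.println(opts.get(OutputKeys.MEDIA_TYPE)); // application/xhtml+xml
    }
}
```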

Note that the same options can be passed using the XPathQueryService.setProperty() and XQueryService.setProperty() methods in Java. The option names are defined in javax.xml.transform.OutputKeys and EXistOutputKeys. The latter eXist-specific options include the following:

The general options include the following:

For example, to disable XInclude expansion and indent the output, you can use the following option declaration:

declare option exist:serialize "expand-xincludes=no indent=yes";

For the output method parameter, eXist currently recognizes three methods: xml, xhtml and text. Unlike the xml method, the xhtml method uses the short form only for elements that are declared empty in the XHTML DTD. For example, the br tag is always returned as <br/>. The text method, on the other hand, returns only the text content of elements: for instance, <A>Content</A> is returned as Content. Attribute values, processing instructions, comments, etc. are all ignored.

10. Other Options

To prevent a badly formulated query from blocking the server, eXist watches all query threads. A blocking query can be killed if it takes longer than a specified amount of time or consumes too much memory on the server. There are two options to control this behaviour:

declare option exist:timeout "time-in-ms";

Specifies the maximum amount of query processing time (in ms) before it is cancelled by the XQuery engine.

declare option exist:output-size-limit "size-hint";

Defines a limit for the maximum size of a document fragment created within an XQuery. The limit is only an estimate, specified in terms of the accumulated number of nodes contained in all generated fragments. This can be used to prevent users from consuming too much memory if they are allowed to pass in their own XQueries.
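
The two limits can be thought of as budgets that the engine checks while a query runs. The following Java sketch illustrates that idea only. It makes no claim about how eXist implements its watchdog: the class QueryLimitsSketch, its methods, and the injectable clock are all hypothetical.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of a cooperative query watchdog with a time
// budget (in ms) and a budget for the number of generated nodes.
class QueryLimitsSketch {

    private final long timeoutMs;
    private final long maxNodes;
    private final LongSupplier clockMs; // injectable clock, eases testing
    private final long startMs;
    private long nodesCreated = 0;

    QueryLimitsSketch(long timeoutMs, long maxNodes, LongSupplier clockMs) {
        this.timeoutMs = timeoutMs;
        this.maxNodes = maxNodes;
        this.clockMs = clockMs;
        this.startMs = clockMs.getAsLong();
    }

    // Called periodically during evaluation; aborts once the time
    // budget is used up.
    void proceed() {
        long elapsed = clockMs.getAsLong() - startMs;
        if (elapsed > timeoutMs) {
            throw new IllegalStateException(
                "query exceeded timeout of " + timeoutMs + " ms");
        }
    }

    // Called whenever a node is added to a generated fragment; aborts
    // once the accumulated node count exceeds the limit.
    void nodeCreated() {
        nodesCreated++;
        if (nodesCreated > maxNodes) {
            throw new IllegalStateException(
                "query exceeded output size limit of " + maxNodes + " nodes");
        }
    }

    public static void main(String[] args) {
        QueryLimitsSketch limits =
            new QueryLimitsSketch(1000, 2, System::currentTimeMillis);
        limits.nodeCreated();
        limits.nodeCreated();
        try {
            limits.nodeCreated(); // third node exceeds the limit of 2
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Counting nodes instead of measuring memory keeps the check cheap, which fits the description of the limit as an estimate.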

11. Pragmas

XQuery pragmas are a way to pass implementation-specific information to the query engine from within an XQuery. The syntax for pragmas has changed between the different drafts of the XQuery specification. In earlier eXist releases, pragmas were used similarly to what is now the "declare option" prolog expression. The new syntax is quite different: pragmas can now be wrapped around an arbitrary XQuery expression (see the specification).

Currently, eXist recognizes only two pragmas: exist:timer and exist:batch-transaction.

11.1. exist:timer

Provides a simple way to measure the time for executing a given expression. For example:

(# exist:timer #) { //some/path/expression }

creates a timer for the expression enclosed in curly braces and prints timing information to the trace logger. Please note that trace needs to be enabled in log4j.xml:

Example: Configure log4j to Display Trace Output

<root>
    <priority value="trace"/>
    <appender-ref ref="console"/>
</root>

11.2. exist:batch-transaction

This pragma is currently available only for the XQuery Update Extensions. It batches updates on the database into a single Transaction, so that a set of updates is guaranteed to be applied atomically. In addition, any Triggers configured for an affected document or collection are called only once: the prepare() method is fired before the first update to the configured resource, and the finish() method is fired after the last update to it.

(# exist:batch-transaction #) {
update value //some/path/expressionA with "valueA",
update value //some/path/expressionB with "valueB"
}

Uses a single Transaction and Trigger events for the expressions enclosed in curly braces.

We will certainly add more pragma expressions in the near future. Among other things, pragmas are a good way to pass optimization hints to the query engine.