Providing Input to XMLUnit

Source and ISource

All core parts of XMLUnit use a single abstraction for the "pieces of XML" they are supposed to work on. For Java this is javax.xml.transform.Source; for .NET we've created Org.XmlUnit.ISource, which is essentially a wrapper around an XmlReader.

For Java, many implementations of this interface are part of the Java class library. For .NET we've added the corresponding implementations:

ReaderSource - just wraps an existing XmlReader
DOMSource - creates a Source from an XmlNode
StreamSource - creates a Source from a TextReader, Stream or a string holding a URI
LinqSource - creates a Source from an XNode

At the time of this writing there is no XML-Serialization based equivalent of JAXBSource for .NET.
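As a rough illustration of the Java side, the sketch below creates a javax.xml.transform.Source from a string and from a DOM tree using only classes that ship with the JDK. The class name JdkSourceExamples and its helper methods are hypothetical and not part of XMLUnit; the identity transformation is just one convenient way to read a Source back into a Document.

```java
import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// Hypothetical helper class for illustration; it is not part of XMLUnit.
final class JdkSourceExamples {

    // Wraps XML text in a StreamSource backed by a Reader.
    static Source fromString(String xml) {
        return new StreamSource(new StringReader(xml));
    }

    // Wraps an in-memory DOM tree in a DOMSource.
    static Source fromDocument(Document doc) {
        return new DOMSource(doc);
    }

    // Materializes any Source as a DOM Document via an identity transformation.
    static Document toDocument(Source source) {
        try {
            DOMResult result = new DOMResult();
            TransformerFactory.newInstance().newTransformer().transform(source, result);
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("could not read source", e);
        }
    }

    public static void main(String[] args) {
        Document doc = toDocument(fromString("<root><child/></root>"));
        System.out.println(doc.getDocumentElement().getNodeName());

        Document copy = toDocument(fromDocument(doc));
        System.out.println(copy.getDocumentElement().getFirstChild().getNodeName());
    }
}
```

Because both helpers return the same Source type, code that consumes a Source does not need to know whether the XML came from text or from a tree.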

To make it easier to create instances of Source or ISource there is a builder that provides a fluent API.

CommentLessSource

CommentLessSource is a decorator of a different source and provides XML that consists of the original source's content with all comments removed.

Use this wrapper if you want XMLUnit to ignore comments.

This class is used under the covers if you tell DiffBuilder to ignore comments.
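The effect of removing comments can be sketched with plain JDK DOM calls. This is only an illustration of the idea, assuming an in-memory DOM tree; it is not how CommentLessSource is implemented, and the class CommentStrippingSketch and its methods are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration; not XMLUnit's CommentLessSource implementation.
final class CommentStrippingSketch {

    // Parses XML text into a DOM tree; comments are kept by default.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Recursively removes every comment node below the given node.
    static void removeComments(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            // Remember the sibling first, removing the child detaches it.
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.COMMENT_NODE) {
                node.removeChild(child);
            } else {
                removeComments(child);
            }
            child = next;
        }
    }

    // Counts comment nodes below the given node.
    static int countComments(Node node) {
        int count = 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += c.getNodeType() == Node.COMMENT_NODE ? 1 : countComments(c);
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse("<a><!-- top --><b><!-- nested -->text</b></a>");
        System.out.println(countComments(doc));
        removeComments(doc);
        System.out.println(countComments(doc));
    }
}
```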

WhitespaceStrippedSource

WhitespaceStrippedSource is a decorator of a different source that removes all empty text nodes and trims the remaining text nodes.

The main use of this decorator is to remove all "element content whitespace", i.e. text content between XML elements that is just an artifact of "pretty printing" XML.

Examples

Empty text nodes are removed:

<element>
</element>

becomes

<element></element>

Text Nodes are stripped:

<element>
  foo
</element>

becomes

<element>foo</element>

If the XML content has been created in memory rather than deserialized from an external source, it may contain adjacent Text nodes, so that

<element>
  foo
  bar
</element>

could become

<element>foobar</element>

or

<element>
foo
bar
</element>

depending on how the document has been structured. To get more control, the input has to be normalized before it is wrapped in a WhitespaceStrippedSource, either by calling Document.normalize() or XmlDocument.Normalize() or by using an additional NormalizedSource wrapper.
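The following sketch shows why the outcome depends on how the Text nodes are split. It applies the "remove empty text nodes, trim the rest" rule by hand with JDK DOM calls, once on two adjacent Text nodes and once after merging them with normalize(). The class AdjacentTextSketch and its methods are hypothetical, and the trimming uses String.trim() as a stand-in; XMLUnit's own code may differ in detail.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration of the stripping rule; not XMLUnit's implementation.
final class AdjacentTextSketch {

    // Builds <element> with one Text child per argument, so nodes may be adjacent.
    static Element elementWithTexts(String... texts) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement("element");
            doc.appendChild(element);
            for (String text : texts) {
                element.appendChild(doc.createTextNode(text));
            }
            return element;
        } catch (Exception e) {
            throw new IllegalStateException("could not build document", e);
        }
    }

    // Trims each Text child and removes those that end up empty.
    static void trimTextChildren(Element element) {
        Node child = element.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE) {
                String trimmed = child.getNodeValue().trim();
                if (trimmed.isEmpty()) {
                    element.removeChild(child);
                } else {
                    child.setNodeValue(trimmed);
                }
            }
            child = next;
        }
    }

    // Trims every Text node separately, as happens without prior normalization.
    static String stripWithoutNormalizing(String... texts) {
        Element element = elementWithTexts(texts);
        trimTextChildren(element);
        return element.getTextContent();
    }

    // Merges adjacent Text nodes first, then trims the single merged node.
    static String normalizeThenStrip(String... texts) {
        Element element = elementWithTexts(texts);
        element.normalize();
        trimTextChildren(element);
        return element.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println(stripWithoutNormalizing("\n  foo\n  ", "bar\n"));
        System.out.println(normalizeThenStrip("\n  foo\n  ", "bar\n"));
    }
}
```

In the first call each node is trimmed on its own, so the whitespace between "foo" and "bar" disappears. In the second call the nodes are merged first and only the outer whitespace is removed.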

WhitespaceNormalizedSource

WhitespaceNormalizedSource is a decorator of a different source that replaces all whitespace characters found in Text nodes with Space characters and collapses consecutive whitespace characters into a single Space.

Examples

<element>a

    b
</element>

becomes

<element>a b </element>
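The rule described above can be approximated on a plain string with a single regular expression. This is a minimal sketch of the rule as stated here, assuming Java's \s character class is an acceptable definition of "whitespace"; it is not taken from XMLUnit's source, and WhitespaceRuleSketch is a hypothetical name.

```java
// Hypothetical illustration of the normalization rule; not XMLUnit's implementation.
final class WhitespaceRuleSketch {

    // Replaces every run of whitespace characters with a single space.
    static String normalizeWhitespace(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // Text content of the example element: "a", blank line, indented "b", newline.
        String content = "a\n\n    b\n";
        // Brackets make the trailing space visible.
        System.out.println("[" + normalizeWhitespace(content) + "]"); // prints "[a b ]"
    }
}
```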

NormalizedSource

NormalizedSource performs XML normalization on the wrapped document. This means adjacent Text nodes are merged into single nodes and empty Text nodes are removed (recursively). For Java, additional normalizations may be performed when wrapping a Document rather than a Node - see XmlNode.Normalize for .NET and Node#normalize as well as Document#normalizeDocument for Java.

When reading documents, a parser usually puts the document into normalized form anyway. You will only need to perform XML normalization on DOM trees you have created programmatically.
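A programmatically built tree that is not in normalized form, and the effect of Node#normalize on it, might look like the sketch below. It uses only the JDK DOM API; the class NormalizeSketch and its method names are hypothetical and not part of XMLUnit.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration of DOM normalization; not XMLUnit's NormalizedSource.
final class NormalizeSketch {

    // Builds <element> with three Text children: "foo", an empty node, and "bar".
    static Element fragmentedElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement("element");
            doc.appendChild(element);
            element.appendChild(doc.createTextNode("foo"));
            element.appendChild(doc.createTextNode(""));
            element.appendChild(doc.createTextNode("bar"));
            return element;
        } catch (Exception e) {
            throw new IllegalStateException("could not build document", e);
        }
    }

    // Number of direct children of the element.
    static int childCount(Element element) {
        return element.getChildNodes().getLength();
    }

    public static void main(String[] args) {
        Element element = fragmentedElement();
        System.out.println(childCount(element)); // three separate Text nodes

        // Merges adjacent Text nodes and drops empty ones.
        element.normalize();
        System.out.println(childCount(element));
        System.out.println(element.getFirstChild().getNodeValue());
    }
}
```

A parser would have produced the single merged Text node directly, which is why this step mainly matters for trees assembled in code.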

InputBuilder
