
Providing Input to XMLUnit

Stefan Bodewig edited this page May 23, 2015 · 16 revisions

Source and ISource

All core parts of XMLUnit use a single abstraction for the "pieces of XML" they are supposed to work on. For Java this is javax.xml.transform.Source; for .NET we've created Org.XmlUnit.ISource, which is essentially a wrapper around an XmlReader.

For Java, many implementations of this interface are part of the Java class library. For .NET we've added the corresponding classes:

  • ReaderSource - just wraps an existing XmlReader
  • DOMSource - creates a Source from an XmlNode
  • StreamSource - creates a Source from a TextReader, Stream or a string holding a URI
  • LinqSource - creates a Source from an XNode

At the time of this writing there is no XML-Serialization based equivalent of JAXBSource for .NET.
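On the Java side, the Source implementations that ship with the JDK can be used directly. The following is a minimal sketch using only JDK classes; the class name SourceExamples and the sample XML are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XMLUnit.
class SourceExamples {
    // Sample input used only for this sketch.
    static final String XML = "<root><child/></root>";

    /** Wraps character data in a StreamSource without parsing it. */
    static Source fromString(String xml) {
        return new StreamSource(new StringReader(xml));
    }

    /** Parses the XML into a DOM Document and wraps it in a DOMSource. */
    static Source fromDom(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return new DOMSource(doc);
    }

    public static void main(String[] args) throws Exception {
        Source stream = fromString(XML);
        Source dom = fromDom(XML);
        System.out.println(stream.getClass().getSimpleName());
        System.out.println(((DOMSource) dom).getNode().getFirstChild().getNodeName());
    }
}
```

Either object can be passed wherever a javax.xml.transform.Source is expected.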

To make it easier to create instances of Source or ISource, there is a builder that provides a fluent API.

CommentLessSource

CommentLessSource is a decorator of a different source and provides XML that consists of the original source's content with all comments removed.

Use this wrapper if you want XMLUnit to ignore comments.

This class is used under the covers if you tell DiffBuilder to ignore comments.
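To illustrate the effect, here is a rough sketch that removes comments from a DOM tree using only JDK classes. It is not how CommentLessSource is implemented; the class CommentStripSketch and its methods are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration, not XMLUnit code.
class CommentStripSketch {
    /** Parses a string into a DOM Document; comments are kept by default. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    /** Recursively removes every Comment node below the given node. */
    static void removeComments(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.COMMENT_NODE) {
                node.removeChild(child);
            } else {
                removeComments(child);
            }
            child = next;
        }
    }

    /** Counts the Comment nodes below the given node. */
    static int countComments(Node node) {
        int count = 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += (c.getNodeType() == Node.COMMENT_NODE) ? 1 : countComments(c);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<a><!-- c1 --><b>text<!-- c2 --></b></a>");
        removeComments(doc);
        System.out.println(countComments(doc));
    }
}
```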

WhitespaceStrippedSource

WhitespaceStrippedSource is a decorator of a different source that removes all empty text nodes and trims the remaining text nodes.

The main use of this decorator is to remove all "element content whitespace", i.e. text content between XML elements that is just an artifact of "pretty printing" XML.

Examples

Empty text nodes are removed:

<element>
</element>

becomes

<element></element>

Text nodes are trimmed:

<element>
  foo
</element>

becomes

<element>foo</element>

It is not completely predictable what happens to larger textual content. For example,

<element>
  foo
  bar
</element>

could become

<element>foobar</element>

or

<element>
foo
bar
</element>

depending on how the XML parser provides the content. The parser is free to create multiple Text nodes from the element's content. To get more control, the input has to be normalized (using Document.normalize()) before it is wrapped in a WhitespaceStrippedSource.
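The difference can be reproduced with plain DOM code. The sketch below builds an element whose content is deliberately split over two adjacent Text nodes, then trims the Text nodes with and without a prior call to normalize(). The class StripSketch and its strip method are hypothetical and only approximate the behavior described above; they are not XMLUnit's implementation.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration, not XMLUnit code.
class StripSketch {
    /** Builds <element> with its content split over two adjacent Text nodes. */
    static Element splitContent() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
        Element e = doc.createElement("element");
        doc.appendChild(e);
        e.appendChild(doc.createTextNode("\n  foo"));
        e.appendChild(doc.createTextNode("\n  bar\n"));
        return e;
    }

    /** Removes empty Text nodes and trims the remaining ones. */
    static void strip(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE) {
                String trimmed = child.getNodeValue().trim();
                if (trimmed.isEmpty()) {
                    node.removeChild(child);
                } else {
                    child.setNodeValue(trimmed);
                }
            } else {
                strip(child);
            }
            child = next;
        }
    }

    public static void main(String[] args) throws Exception {
        // Two Text nodes, each trimmed on its own.
        Element raw = splitContent();
        strip(raw);
        System.out.println(raw.getTextContent());

        // normalize() merges adjacent Text nodes first, so only the outer
        // whitespace of the combined text is trimmed.
        Element merged = splitContent();
        merged.getOwnerDocument().normalize();
        strip(merged);
        System.out.println(merged.getTextContent());
    }
}
```

Without normalization the two nodes collapse to "foobar"; with normalization the whitespace between "foo" and "bar" survives because it is no longer at the edge of a Text node.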

WhitespaceNormalizedSource

WhitespaceNormalizedSource is a decorator of a different source that replaces all whitespace characters found in Text nodes with Space characters and collapses consecutive whitespace characters into a single Space.

Examples

<element>a

    b
</element>

becomes

<element>a b </element>
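The replacement rule itself can be sketched as a small string function. This is an illustration of the rule as stated above, under the assumption that "whitespace" means whatever Character.isWhitespace accepts; the class and method names are hypothetical and this is not XMLUnit's code.

```java
// Hypothetical illustration, not XMLUnit code.
class WhitespaceNormalizeSketch {
    /**
     * Replaces each whitespace character with a space and collapses
     * runs of whitespace into a single space. Does not trim.
     */
    static String normalizeWhitespace(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        boolean lastWasWhitespace = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                if (!lastWasWhitespace) {
                    sb.append(' '); // first whitespace of a run becomes one space
                }
                lastWasWhitespace = true;
            } else {
                sb.append(c);
                lastWasWhitespace = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same text content as the example above.
        System.out.println("[" + normalizeWhitespace("a\n\n    b\n") + "]");
    }
}
```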

InputBuilder
