Extensible Markup Language - XML

The XML parsing module offers Flows for parsing, processing, and writing XML documents.

Reported issues

Tagged issues at GitHub

Artifacts

sbt
libraryDependencies += "com.lightbend.akka" %% "akka-stream-alpakka-xml" % "0.19"
Maven
<dependency>
  <groupId>com.lightbend.akka</groupId>
  <artifactId>akka-stream-alpakka-xml_2.12</artifactId>
  <version>0.19</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.akka', name: 'akka-stream-alpakka-xml_2.12', version: '0.19'
}

XML parsing

An XML processing pipeline starts with an XmlParsing.parser flow, which parses a stream of ByteStrings into XML parser events.

Scala
val parse = Flow[String]
  .map(ByteString(_))
  .via(XmlParsing.parser)
  .toMat(Sink.seq)(Keep.right)
Java
final Sink<String, CompletionStage<List<ParseEvent>>> parse = Flow.<String>create()
  .map(ByteString::fromString)
  .via(XmlParsing.parser())
  .toMat(Sink.seq(), Keep.right());

To parse an XML document, run a source that emits the document into this sink.

Scala
val doc = "<doc><elem>elem1</elem><elem>elem2</elem></doc>"
val resultFuture = Source.single(doc).runWith(parse)
Java
final String doc = "<doc><elem>elem1</elem><elem>elem2</elem></doc>";
final CompletionStage<List<ParseEvent>> resultStage = Source.single(doc).runWith(parse, materializer);
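
The kind of event sequence such a parser yields can be illustrated without Akka on the classpath. The following is a minimal sketch that uses the JDK's StAX reader (javax.xml.stream) as a stand-in; it is not XmlParsing.parser, and the class name ParseEventsSketch, the method events, and the printed labels are hypothetical names chosen for illustration. It assumes that the parser events reported by the module are comparable to StAX events.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustration only: JDK StAX stand-in, not the Alpakka parser.
class ParseEventsSketch {

  // Returns one readable label per event reported by the StAX reader.
  static List<String> events(String xml) {
    List<String> out = new ArrayList<>();
    try {
      XMLInputFactory factory = XMLInputFactory.newInstance();
      // Merge adjacent text chunks so each text node is reported once.
      factory.setProperty(XMLInputFactory.IS_COALESCING, true);
      XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
      // A StAX reader starts out positioned on START_DOCUMENT.
      out.add("StartDocument");
      while (reader.hasNext()) {
        switch (reader.next()) {
          case XMLStreamConstants.START_ELEMENT:
            out.add("StartElement(" + reader.getLocalName() + ")");
            break;
          case XMLStreamConstants.CHARACTERS:
            out.add("Characters(" + reader.getText() + ")");
            break;
          case XMLStreamConstants.END_ELEMENT:
            out.add("EndElement(" + reader.getLocalName() + ")");
            break;
          case XMLStreamConstants.END_DOCUMENT:
            out.add("EndDocument");
            break;
          default:
            break; // comments, processing instructions etc. are skipped here
        }
      }
      reader.close();
    } catch (XMLStreamException e) {
      throw new IllegalStateException("Malformed XML", e);
    }
    return out;
  }

  public static void main(String[] args) {
    String doc = "<doc><elem>elem1</elem><elem>elem2</elem></doc>";
    events(doc).forEach(System.out::println);
  }
}
```

For the sample document, the sketch prints a start-document label, a start/characters/end triple for each elem nested inside the doc element, and an end-document label. The streaming parser is expected to deliver a comparable sequence of ParseEvents, but that correspondence is an assumption here.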

XML writing

An XML processing pipeline ends with an XmlWriting.writer flow, which writes a stream of XML parser events to ByteStrings.

Scala
val writer: Sink[ParseEvent, Future[String]] = Flow[ParseEvent]
  .via(XmlWriting.writer)
  .map[String](_.utf8String)
  .toMat(Sink.fold[String, String]("")((t, u) => t + u))(Keep.right)
Java
final Sink<ParseEvent, CompletionStage<String>> write = Flow.of(ParseEvent.class)
  .via(XmlWriting.writer())
  .map(ByteString::utf8String)
  .toMat(Sink.fold("", (acc, el) -> acc + el), Keep.right());

To write an XML document, run a source of XML parser events into this sink.

Scala
val listEl = List(
  StartDocument,
  StartElement(
    "book",
    namespace = Some("urn:loc.gov:books"),
    prefix = Some("bk"),
    namespaceCtx = List(Namespace("urn:loc.gov:books", prefix = Some("bk")),
                        Namespace("urn:ISBN:0-395-36341-6", prefix = Some("isbn")))
  ),
  StartElement(
    "title",
    namespace = Some("urn:loc.gov:books"),
    prefix = Some("bk")
  ),
  Characters("Cheaper by the Dozen"),
  EndElement("title"),
  StartElement(
    "number",
    namespace = Some("urn:ISBN:0-395-36341-6"),
    prefix = Some("isbn")
  ),
  Characters("1568491379"),
  EndElement("number"),
  EndElement("book"),
  EndDocument
)

val doc =
  """<?xml version='1.0' encoding='UTF-8'?><bk:book xmlns:bk="urn:loc.gov:books" xmlns:isbn="urn:ISBN:0-395-36341-6"><bk:title>Cheaper by the Dozen</bk:title><isbn:number>1568491379</isbn:number></bk:book>"""
val resultFuture: Future[String] = Source.fromIterator[ParseEvent](() => listEl.iterator).runWith(writer)
resultFuture.futureValue(Timeout(3.seconds)) should ===(doc)
Java
final String doc = "<?xml version='1.0' encoding='UTF-8'?>"+
        "<bk:book xmlns:bk=\"urn:loc.gov:books\" xmlns:isbn=\"urn:ISBN:0-395-36341-6\">"+
        "<bk:title>Cheaper by the Dozen</bk:title><isbn:number>1568491379</isbn:number></bk:book>";
final List<Namespace> nmList = new ArrayList<>();
nmList.add(Namespace.create("urn:loc.gov:books",Optional.of("bk")));
nmList.add(Namespace.create("urn:ISBN:0-395-36341-6", Optional.of("isbn")));
final List<ParseEvent> docList= new ArrayList<>();
docList.add(StartDocument.getInstance());
docList.add(StartElement.create("book", Collections.emptyList(), Optional.of("bk"), Optional.of("urn:loc.gov:books"), nmList));
docList.add(StartElement.create("title", Collections.emptyList(), Optional.of("bk"),Optional.of("urn:loc.gov:books")));
docList.add(Characters.create("Cheaper by the Dozen"));
docList.add(EndElement.create("title"));
docList.add(StartElement.create("number", Collections.emptyList(), Optional.of("isbn"),Optional.of("urn:ISBN:0-395-36341-6")));
docList.add(Characters.create("1568491379"));
docList.add(EndElement.create("number"));
docList.add(EndElement.create("book"));
docList.add(EndDocument.getInstance());


final CompletionStage<String> resultStage = Source.from(docList).runWith(write, materializer);
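
How the individual events map to serialized text can be seen in isolation with the JDK's XMLStreamWriter. This is a rough sketch under stated assumptions: it uses javax.xml.stream instead of XmlWriting.writer, it omits the document declaration, and WriteEventsSketch and writeBook are hypothetical names. Each writer call stands in for one event from the list above.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Illustration only: JDK StAX stand-in, not the Alpakka writer.
class WriteEventsSketch {

  // Serializes the namespaced "book" document, one call per event.
  static String writeBook() {
    StringWriter out = new StringWriter();
    try {
      XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
      // StartElement "book" (prefix "bk") and its namespace context
      w.writeStartElement("bk", "book", "urn:loc.gov:books");
      w.writeNamespace("bk", "urn:loc.gov:books");
      w.writeNamespace("isbn", "urn:ISBN:0-395-36341-6");
      // StartElement "title", Characters, EndElement "title"
      w.writeStartElement("bk", "title", "urn:loc.gov:books");
      w.writeCharacters("Cheaper by the Dozen");
      w.writeEndElement();
      // StartElement "number", Characters, EndElement "number"
      w.writeStartElement("isbn", "number", "urn:ISBN:0-395-36341-6");
      w.writeCharacters("1568491379");
      w.writeEndElement();
      // EndElement "book", then EndDocument
      w.writeEndElement();
      w.writeEndDocument();
      w.close();
    } catch (XMLStreamException e) {
      throw new IllegalStateException("Could not write XML", e);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(writeBook());
  }
}
```

The namespace declarations are written once on the outermost element, which mirrors how the namespace context is attached only to the "book" StartElement in the event list.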

XML Subslice

Use XmlParsing.subslice to filter out all elements that do not correspond to a given path.

Scala
val parse = Flow[String]
  .map(ByteString(_))
  .via(XmlParsing.parser)
  .via(XmlParsing.subslice("doc" :: "elem" :: "item" :: Nil))
  .toMat(Sink.seq)(Keep.right)
Java
final Sink<String, CompletionStage<List<ParseEvent>>> parse = Flow.<String>create()
  .map(ByteString::fromString)
  .via(XmlParsing.parser())
  .via(XmlParsing.subslice(Arrays.asList("doc", "elem", "item")))
  .toMat(Sink.seq(), Keep.right());

To get a subslice of an XML document, run a source that emits the document into this sink.

Scala
val doc =
  """
    |<doc>
    |  <elem>
    |    <item>i1</item>
    |    <item><sub>i2</sub></item>
    |    <item>i3</item>
    |  </elem>
    |</doc>
  """.stripMargin
val resultFuture = Source.single(doc).runWith(parse)
Java
final String doc =
  "<doc>" +
  "  <elem>" +
  "    <item>i1</item>" +
  "    <item><sub>i2</sub></item>" +
  "     <item>i3</item>" +
  "  </elem>" +
  "</doc>";
final CompletionStage<List<ParseEvent>> resultStage = Source.single(doc).runWith(parse, materializer);
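
The filtering idea behind a path-based subslice can be sketched with plain JDK classes. The code below is not the XmlParsing.subslice implementation: it is a hypothetical SubsliceSketch built on the JDK's StAX reader, and it assumes that only events strictly inside the element addressed by the path are kept, so the start and end events of the matched element itself are dropped.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustration only: path filtering over JDK StAX events.
class SubsliceSketch {

  // Keeps only the events that occur inside the element addressed by path.
  static List<String> subslice(String xml, List<String> path) {
    List<String> out = new ArrayList<>();
    Deque<String> open = new ArrayDeque<>(); // open elements, outermost first
    try {
      XMLInputFactory factory = XMLInputFactory.newInstance();
      factory.setProperty(XMLInputFactory.IS_COALESCING, true);
      XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
      while (reader.hasNext()) {
        switch (reader.next()) {
          case XMLStreamConstants.START_ELEMENT:
            // Check before pushing, so the matched element itself is not emitted.
            if (inside(open, path)) out.add("StartElement(" + reader.getLocalName() + ")");
            open.addLast(reader.getLocalName());
            break;
          case XMLStreamConstants.CHARACTERS:
            if (inside(open, path)) out.add("Characters(" + reader.getText() + ")");
            break;
          case XMLStreamConstants.END_ELEMENT:
            // Pop first, so the end of the matched element is not emitted either.
            open.removeLast();
            if (inside(open, path)) out.add("EndElement(" + reader.getLocalName() + ")");
            break;
          default:
            break;
        }
      }
      reader.close();
    } catch (XMLStreamException e) {
      throw new IllegalStateException("Malformed XML", e);
    }
    return out;
  }

  // True when the stack of open elements begins with every name in path.
  static boolean inside(Deque<String> open, List<String> path) {
    if (open.size() < path.size()) return false;
    Iterator<String> it = open.iterator();
    for (String name : path) {
      if (!name.equals(it.next())) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    String doc = "<doc>  <elem>    <item>i1</item>    <item><sub>i2</sub></item>"
        + "    <item>i3</item>  </elem></doc>";
    subslice(doc, List.of("doc", "elem", "item")).forEach(System.out::println);
  }
}
```

With the path doc, elem, item, the sketch keeps the text of each item plus the nested sub element, and drops the whitespace between items because that text belongs to elem rather than to item.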