Text and charsets

The text flows translate a stream of text data between character sets. They support conversion between ByteString and String, as well as conversion of the character set of binary text data held in ByteStrings.

The main use case for these flows is transcoding text read from a source whose character set cannot be used with other flows or sinks. For example, CSV data may arrive in UTF-16 encoding, but the Alpakka CSV parser supports only UTF-8.

Reported issues

Tagged issues at GitHub

Artifacts

sbt
libraryDependencies += "com.lightbend.akka" %% "akka-stream-alpakka-text" % "0.20"
Maven
<dependency>
  <groupId>com.lightbend.akka</groupId>
  <artifactId>akka-stream-alpakka-text_2.12</artifactId>
  <version>0.20</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.akka', name: 'akka-stream-alpakka-text_2.12', version: '0.20'
}

Text transcoding

The text transcoding flow converts incoming binary text data (ByteString) to binary text data of another character encoding.

The flow fails with an UnmappableCharacterException if a character cannot be represented in the target character set.

Scala
import java.nio.charset.StandardCharsets
import akka.stream.alpakka.text.scaladsl.TextFlow
import akka.stream.scaladsl.{FileIO, Source}
import akka.util.ByteString

val byteStringSource: Source[ByteString, _] = // ...

byteStringSource
  .via(TextFlow.transcoding(StandardCharsets.UTF_16, StandardCharsets.UTF_8))
  .runWith(FileIO.toPath(targetFile))
Java
Source<ByteString, ?> byteStringSource = // ...
byteStringSource
    .via(TextFlow.transcoding(StandardCharsets.UTF_16, StandardCharsets.UTF_8))
    .runWith(FileIO.toPath(targetFile), materializer);
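
To make the decode-then-encode idea behind transcoding concrete, the following is a minimal JDK-only sketch that works on a single in-memory byte array. It is not Alpakka's implementation; the class `TranscodeSketch` and the helper `transcode` are hypothetical names used only for illustration. It relies on the fact that JDK decoders and encoders report errors by default rather than substituting a replacement character.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class TranscodeSketch {

    // Hypothetical helper: decode the bytes with the incoming charset,
    // then re-encode the characters with the outgoing charset.
    static byte[] transcode(byte[] input, Charset incoming, Charset outgoing)
            throws CharacterCodingException {
        // New decoders and encoders report malformed or unmappable input by default.
        CharBuffer chars = incoming.newDecoder().decode(ByteBuffer.wrap(input));
        ByteBuffer encoded = outgoing.newEncoder().encode(chars);
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result);
        return result;
    }

    public static void main(String[] args) throws CharacterCodingException {
        // Sample CSV text encoded as UTF-16 (the JDK adds a byte order mark).
        byte[] utf16 = "a,b\n1,2\n".getBytes(StandardCharsets.UTF_16);
        byte[] utf8 = transcode(utf16, StandardCharsets.UTF_16, StandardCharsets.UTF_8);
        System.out.print(new String(utf8, StandardCharsets.UTF_8));
    }
}
```

A streaming flow additionally has to cope with characters that are split across ByteString boundaries, which this single-buffer sketch does not attempt.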

Text encoding

The text encoding flow converts incoming Strings to binary text data (ByteString) with the given character encoding.

The flow fails with an UnmappableCharacterException if a character cannot be represented in the target character set.

Scala
import java.nio.charset.StandardCharsets
import akka.stream.alpakka.text.scaladsl.TextFlow
import akka.stream.scaladsl.{FileIO, Source}
import akka.util.ByteString

val stringSource: Source[String, _] = // ...

stringSource
  .via(TextFlow.encoding(StandardCharsets.US_ASCII))
  .intersperse(ByteString("\n"))
  .runWith(FileIO.toPath(targetFile))
Java
import akka.stream.alpakka.text.javadsl.TextFlow;
import akka.stream.IOResult;
import akka.stream.javadsl.FileIO;
import akka.stream.javadsl.Sink;
import akka.stream.javadsl.Source;
import akka.util.ByteString;

import java.nio.charset.StandardCharsets;

Source<String, ?> stringSource = // ...
stringSource
    .via(TextFlow.encoding(StandardCharsets.US_ASCII))
    .intersperse(ByteString.fromString("\n"))
    .runWith(FileIO.toPath(targetFile), materializer);
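
The failure mode described above can be reproduced with the JDK alone. The sketch below is an illustration, not the flow's actual code: `StrictEncodeSketch`, `encodeStrict` and `isEncodable` are hypothetical names. It shows how a strict encoder rejects characters that US-ASCII cannot represent, which may be handy for validating input before it reaches the stream.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class StrictEncodeSketch {

    // Hypothetical helper: encodes text and reports unmappable characters
    // with an exception instead of silently replacing them.
    static byte[] encodeStrict(String text, Charset charset) throws CharacterCodingException {
        ByteBuffer encoded = charset.newEncoder().encode(CharBuffer.wrap(text));
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result);
        return result;
    }

    // Hypothetical helper: true if every character can be encoded in the charset.
    static boolean isEncodable(String text, Charset charset) {
        try {
            encodeStrict(text, charset);
            return true;
        } catch (CharacterCodingException e) {
            // Covers UnmappableCharacterException and MalformedInputException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isEncodable("plain text", StandardCharsets.US_ASCII));       // prints "true"
        System.out.println(isEncodable("Gr\u00fc\u00dfe", StandardCharsets.US_ASCII)); // prints "false"
    }
}
```

By contrast, `String.getBytes(Charset)` substitutes a replacement byte for unmappable characters, so it hides the problem that a strict encoder reports.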

Text decoding

The text decoding flow converts incoming ByteStrings to Strings using the given character encoding.

Scala
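import java.nio.charset.StandardCharsets
import akka.stream.alpakka.text.scaladsl.TextFlow
import akka.stream.scaladsl.{Sink, Source}
import akka.util.ByteString
import scala.collection.immutable
import scala.concurrent.Future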

val byteStringSource: Source[ByteString, _] = // ...

val result: Future[immutable.Seq[String]] =
  byteStringSource
    .via(TextFlow.decoding(StandardCharsets.UTF_16))
    .runWith(Sink.seq)
Java
Source<ByteString, ?> byteStringSource = // ...
byteStringSource
    .via(TextFlow.decoding(StandardCharsets.UTF_16))
    .runWith(Sink.seq(), materializer);
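
One subtlety of decoding a stream is that a multi-byte character may be split across two incoming chunks. The following JDK-only sketch shows one way to carry undecoded bytes over to the next chunk with a `CharsetDecoder`. It is an assumption about how such a stage could work, not a description of Alpakka's internals; `ChunkedDecodeSketch` and `decodeChunks` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class ChunkedDecodeSketch {

    // Hypothetical helper: decodes a sequence of byte chunks, keeping the bytes of an
    // incomplete character until the next chunk completes it.
    static String decodeChunks(List<byte[]> chunks, Charset charset) throws CharacterCodingException {
        CharsetDecoder decoder = charset.newDecoder();
        StringBuilder result = new StringBuilder();
        ByteBuffer pending = ByteBuffer.allocate(0); // bytes left over from the previous chunk
        for (byte[] chunk : chunks) {
            ByteBuffer in = ByteBuffer.allocate(pending.remaining() + chunk.length);
            in.put(pending).put(chunk).flip();
            CharBuffer out = outputFor(in, decoder);
            check(decoder.decode(in, out, false)); // false: more input may follow
            out.flip();
            result.append(out);
            pending = in; // its remaining bytes are the undecoded tail
        }
        CharBuffer out = outputFor(pending, decoder);
        check(decoder.decode(pending, out, true)); // true: a dangling partial character is an error
        check(decoder.flush(out));
        out.flip();
        return result.append(out).toString();
    }

    // Sized from maxCharsPerByte, so an overflow is not expected for this input.
    private static CharBuffer outputFor(ByteBuffer in, CharsetDecoder decoder) {
        return CharBuffer.allocate((int) Math.ceil(in.remaining() * decoder.maxCharsPerByte()) + 1);
    }

    private static void check(CoderResult r) throws CharacterCodingException {
        if (r.isError()) r.throwException();
    }

    public static void main(String[] args) throws CharacterCodingException {
        // U+00E9 is the two bytes C3 A9 in UTF-8; here they arrive in different chunks.
        List<byte[]> chunks = Arrays.asList(
            new byte[] {'h', (byte) 0xC3},
            new byte[] {(byte) 0xA9, '!'});
        System.out.println(decodeChunks(chunks, StandardCharsets.UTF_8).equals("h\u00e9!")); // prints "true"
    }
}
```

Decoding each chunk independently with `new String(chunk, charset)` would instead turn the split character into replacement characters, which is why the leftover bytes are carried forward.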