FlatMap Operator
The Flat-Map Operator allows an input from the source to be split into multiple entries in the sink. Because it is a map operator, split inputs can be mapped too. However, unlike a map operator, it creates a one to many mapping. Its a powerful tool that could be used to digest arrays, objects, or other nested data.
Prerequisites
This guide uses local
Fluvio cluster. If you need to install it, please follow the instructions at here.
Syntax
Below is an example of a tranform function using flat-map. The function takes the string and splits in half. Both halves are inserted into the sink.
transforms:
- operator: flat-map
run: |
fn halfword(input: String) -> Result<Vec<String>> {
let mut ret: Vec<String> = Vec::new();
let mid = input.len() / 2;
ret.push(format!("first half: {}",&input[..mid]));
ret.push(format!("second half: {}",&input[mid..]));
Ok(ret)
}
In the example function, the return type is a vector of strings. The split string is re-encoded. In general, the transform function should return a vector of the sink's type.
Running the Example
Copy and paste following config and save it as dataflow.yaml
.
# dataflow.yaml
apiVersion: 0.5.0
meta:
name: flat-map-example
version: 0.1.0
namespace: examples
config:
converter: raw
topics:
sentences:
schema:
value:
type: string
halfword:
schema:
value:
type: string
services:
flat-map-service:
sources:
- type: topic
id: sentences
transforms:
- operator: flat-map
run: |
fn halfword(input: String) -> Result<Vec<String>> {
let mut ret: Vec<String> = Vec::new();
let mid = input.len() / 2;
ret.push(format!("{}",&input[..mid]));
ret.push(format!("{}",&input[mid..]));
Ok(ret)
}
sinks:
- type: topic
id: halfword
To run example:
$ sdf run --ephemeral
Produce sentences to in sentence
topic:
$ echo "0123456789" | fluvio produce sentences
Consume topic halfword
to retrieve the result in another terminal:
$ fluvio consume halfword -Bd
first half: 01234
second half: 56789
Here two strings are produced from the input.
Cleanup
Exit sdf
terminal and clean-up. The --force
flag removes the topics:
$ sdf clean --force
Conclusion
We just covered another basic operator in SDF, the Flat-Map Operator.