Preprocessors
Preprocessors operate on the raw data stream and transform it. They are run before data reaches the codec and do not know or care about tremor's internal representation.
Online codecs, preprocessors can be chained to perform multiple operations in succession.
Supported Preprocessors
lines
Splits the input into lines, using character 10 \n
as the line separator.
Buffers any line fragment that may be present (after the last line separator), till more data arrives. This makes it ideal for use with streaming onramps like tcp, to break down incoming data into distinct events.
Any empty lines present are forwarded as is -- if you want to remove them, please chain the remove-empty preprocessor with this preprocessor. An example:
preprocessors:
- lines
- remove-empty
Note: The proliferation of various lines preprocessors here will go away once preprocessors support configuration.
lines-null
Variant of the lines preprocessor that uses null byte \0
as the line separator.
lines-pipe
Variant of the lines preprocessor that uses pipe character |
as the line separator.
lines-no-buffer
Variant of the lines preprocessor that does not buffer any data that may be present after the last line separator -- the fragment is forwarded as is (i.e. treated as a full event).
lines-cr-no-buffer
Variant of the lines-no-buffer preprocessor that uses character 13 \r
(carriage return) as the line separator.
base64
Decodes base64 encoded data to the raw bytes.
decompress
Decompresses a data stream. It is assumed that each message reaching the decompressor is a complete compressed entity.
The compression algorithm is detected automatically from the supported formats. If it can't be detected, the assumption is that the data was decompressed and will be sent on. Errors then can be transparently handled in the codec.
Supported formats:
- gzip
- zlib
- xz
- snappy
- lz4
gzip
Decompress GZ compressed payload.
zlib
Decompress Zlib (deflate) compressed payload.
xz
Decompress Xz2 (7z) compressed payload.
snappy
Decompress framed snappy compressed payload (does not support raw snappy).
lz4
Decompress Lz4 compressed payload.
gelf-chunking
Reassembles messages that were split apart using the GELF chunking protocol.
If the GELF messages were sent compressed, you can decompress them by chaining the decompress preprocessor. An example is documented here -- you may need to apply decompress
before and/or after the reassembly here, depending on how your GELF client(s) behave.
remove-empty
Removes empty messages (aka zero len).
length-prefixed
Separates a continuous stream of data based on length prefixing. The length for each package in a stream is based on the first 64 bit decoded as an unsigned big endian value.
textual-length-prefix
Extracts the message based on prefixed message length given in ascii digits which is followed by a space as used in RFC 5425 for TLS/TCP transport for syslog.