Log processing

Whenever a log is sent to the server with the correct Protocol, it will run be processed with the following procedure.

Metadata collected

Earliest log line timestamp
Latest log line timestamp
Count of lines per category
Count of lines per severity

Or in Go representation:

type logChunkMetadata struct {
	Start           time.Time
	End             time.Time
	LineCount       int32
	CategoriesCount map[string]int // Used to get a string[] of categories
	SeveritiesCount map[string]int
}

Procedure

The log is processed in O(1) complexity, reading one character at a time.

A buffer is built with all line characters until encountering a \n. The line buffer is then used to extract metadata for that specific line. The metadata collected is:

type logChunkLineMetadata struct {
	Timestamp time.Time
	Category  string
	Severity  string
}

If a line cannot be processed due to invalid format (as described in Protocol), the line is still counted, to make sure that absolute line numbers remain correct when filtering. A line is deemed invalid if there is an error parsing, or if any of the fields is empty.

After line processing, the metadata is added to logChunkMetadata. The LineCount is incremented regardless of whether the metadata for a line could be extracted. The other fields are only updated when all metadata fields are set.

Storage

After metadata is collected, the original log chunk is saved to blob storage. The contents remain unmodified. This ensures no data loss in case of a processing error.

Filtering logs

When requesting logs without any filters, there is no filtering logic applied and all chunks will be streamed, in chronological order

When filters are set, the same extraction algorithm for incoming chunks is used for performance, and logs are streamed to the client whenever a chunk is done processing, to show the first lines as soon as possible.

Chunk metadata from the database is used to determine if a chunk contains any lines that should be sent. If there are no matches, the chunk is not loaded from blobstorage.

Performance

Due to the per-character algorithm for log chunk processing, it allows the server to be very performant with extracting the metadata.

The current benchmarks for extracting metadata from a 100.000-line incoming log chunk (which is much larger than what can be expected in a production setting) is around 60ms.

The benchmark for filtering lines from a 100.000-line chunk for streaming to a user (as is used for the admin panel) is around 70ms.

Metadata collected​

Procedure​

Storage​

Filtering logs​

Performance​

Metadata collected

Procedure

Storage

Filtering logs

Performance