The article HSL – High Speed Logging introduced High Speed Logging itself and showed how this purely technical, infrastructure-sourced data can be used to support SLA reporting.
This article provides insight into how to process this volume of data and ship it from its source into a data store. The design deliberately avoids cloud-native services to keep it reusable across various deployment strategies.
Source data generation on premises vs in the Cloud
In an on-premises environment, data generation and shipping have to be handled differently than in a Cloud-based deployment.
Log data collected on a load balancer has to be shipped to a central syslog host. Often some data manipulation comes on top, such as date format translation, rewriting of individual fields, or adding information for later processing. Logstash’s syslog input plugin can be used to create a central syslog message endpoint, and Logstash’s filters can then massage the data as needed. The so-called ELK stack is covered in more depth elsewhere in this blog.
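As a minimal sketch of such a central syslog endpoint (the port, field names and output target below are illustrative assumptions, not taken from the actual setup):

```
input {
  syslog {
    port => 5514
  }
}

filter {
  # Example of the massaging mentioned above: normalize the date format
  # and add a field for later processing
  date {
    match => [ "timestamp", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
  }
  mutate {
    add_field => { "log_source" => "loadbalancer" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```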
In a Cloud-based deployment the data collection changes entirely: the Cloud provider takes care of delivering the logs into storage, from where they can be picked up. AWS uses Simple Storage Service (S3) for this purpose.
Taking AWS as an example: an AWS ELB can store its access logs in an S3 bucket, from where the logs (actually objects located in the bucket) can be pulled individually for further processing.
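A minimal sketch of pulling the object inventory from such a bucket. The function only assumes an object with boto3’s S3 client interface, so in practice it would be handed `boto3.client("s3")`; bucket and prefix names are whatever the ELB’s access-log configuration points at:

```python
def list_log_keys(s3_client, bucket, prefix):
    """Yield the key of every ELB log object under a prefix.

    Uses the list_objects_v2 paginator so buckets holding more than
    1000 objects are handled correctly.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page with no matching objects has no "Contents" entry at all
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

With boto3 installed this would be called as `list_log_keys(boto3.client("s3"), "my-elb-logs", "AWSLogs/")`, where the bucket name and prefix are placeholders.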
Each component in the stack is chosen so that it can be replaced by a different solution and/or deployment strategy. Connections and communication between the components are loosely coupled to avoid a hard-wired setup.
Redis in this design provides a flexible, fast-acting data structure store used as an in-memory database.
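One concrete use of Redis here is the ELB bookkeeping described further below (first seen vs. last reviewed). A sketch, assuming redis-py semantics; the key layout and field names are invented for illustration:

```python
import time

def touch_elb(redis_client, elb_name):
    """Record when an ELB was first seen and last reviewed, in a Redis hash."""
    key = f"elb:{elb_name}"
    now = int(time.time())
    # hsetnx writes first_seen only once; last_reviewed is updated every pass
    redis_client.hsetnx(key, "first_seen", now)
    redis_client.hset(key, "last_reviewed", now)
```

With redis-py this would be called as `touch_elb(redis.Redis(), "my-elb")`.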
RabbitMQ in this design provides the broker mechanism that handles the work packages for the actual objects (logs).
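Producing such a work package could look like the following sketch. The queue name and message shape are assumptions; the `channel` argument is anything with pika’s `basic_publish` interface:

```python
import json

def publish_work_package(channel, queue, bucket, keys):
    """Publish one work package (a chunk of S3 object keys) to the broker."""
    body = json.dumps({"bucket": bucket, "keys": keys})
    # With pika this publishes to the default exchange, routed by queue name
    channel.basic_publish(exchange="", routing_key=queue, body=body)
    return body
```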
Containers are used to ensure reliability, availability and scalability for the individual microservices. A single container image is designed in which an environment parameter switches the container’s behavior during startup. Docker Swarm services provide an easy-to-use environment for scaling as well as for recovering a microservice.
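The startup switch could be sketched like this; `HSL_ROLE` is a hypothetical variable name, since the source only says an environment parameter selects the behavior:

```python
import os

def run(role=None):
    """Select which microservice this container instance runs."""
    role = role if role is not None else os.environ.get("HSL_ROLE", "")
    dispatch = {
        "listing": lambda: "listing started",
        "receiving": lambda: "receiving started",
        "parsing": lambda: "parsing started",
    }
    if role not in dispatch:
        # Fail fast so the orchestrator restarts the container
        raise SystemExit(f"unknown HSL_ROLE: {role!r}")
    return dispatch[role]()
```

In a Swarm service definition the same image would then be deployed three times, differing only in the environment variable.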
Three software components
- Listing – builds a library of the available, configured ELBs in a region. The listing function sets the stage for the next two components.
- Receiving – an orchestration job which creates work packages for processing.
- Parsing – the actual workhorse of the design, with a need to scale up and down; it translates the ELB logs into JSON arrays and enriches the data for later analytics.
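The parsing step can be sketched as follows. The field list reflects the classic ELB access log format; it should be verified against the AWS documentation for the ELB generation in use, and the enrichment shown is just one illustrative example:

```python
import shlex

# Field names of a classic ELB access log entry (assumed order)
FIELDS = [
    "time", "elb", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol",
]

def parse_elb_line(line):
    """Translate one ELB log line into a dict ready for JSON serialization."""
    values = shlex.split(line)  # respects the quoted request/user-agent fields
    record = dict(zip(FIELDS, values))
    # Enrich for later analytics, e.g. split out the HTTP verb
    if "request" in record:
        record["http_method"] = record["request"].split(" ", 1)[0]
    return record
```

A list of such dicts can then be serialized with `json.dumps` into the JSON arrays mentioned above.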
- Region-based deployment, such as us-east-1 or eu-west-1
- Self-learning system which identifies the ELBs in a given region; this eliminates ongoing hands-on tasks to maintain log processing for newly added ELBs
- Automated discovery, for reporting and alerting purposes, of ELBs whose configuration has logging disabled
- The system tracks when an ELB was seen for the first time, when it was last reviewed, and when its objects were last processed
- Message brokering to handle work packages which process (download and unpack) chunks of objects (logs)
- Automated removal of objects from the various S3 buckets after processing, which turns each S3 bucket into a buffer and adds a layer that allows fail-safe operation
- In case a process dies inside a container, it recovers itself after a moment to ensure highly available processing mechanics.
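Putting the broker and buffer behavior together, the consumer side of a work package might look like this sketch. The message shape, the key names and the gzip check are assumptions; `s3_client` is any object with boto3’s `get_object`/`delete_object` interface, and `parse_line` is any callable turning a log line into a dict:

```python
import gzip
import json

def handle_message(s3_client, body, parse_line):
    """Process one work package: fetch, parse and then delete each object."""
    package = json.loads(body)
    records = []
    for key in package["keys"]:
        obj = s3_client.get_object(Bucket=package["bucket"], Key=key)
        data = obj["Body"].read()
        # Some ELB log objects are delivered gzip-compressed ("unpack" step)
        if data[:2] == b"\x1f\x8b":
            data = gzip.decompress(data)
        for line in data.decode().splitlines():
            records.append(parse_line(line))
        # Deleting only after successful parsing turns the bucket into a
        # buffer and allows a crashed worker's package to be re-processed
        s3_client.delete_object(Bucket=package["bucket"], Key=key)
    return records
```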