This document discusses the use of Fluentd as a unified logging layer. Fluentd can parse log data into a common format, decouple log collection from data sources, and provide reliability, scalability and extensibility as logs are ingested from various systems and sent to different outputs like search, databases and storage. It provides an example of using Fluentd to parse Apache web server logs and tag them with metadata before outputting. The document advocates that Fluentd can help bring together logs from multiple systems and services to support common needs like analytics, monitoring and incident response.
Pivotal Open Source: Using Fluentd to gain insights into your logs
10. Acquire Data (or so you think)
WUT!? Invalid UTF8?
Fix the encoding issue...
Yell at the engineers
Some columns are missing!?
Run the script... DIVISION BY ZERO!!!
22. How to Unify Logging (1)
Common Interface + Decoupling
Mobile, Web, IoT
Message Queue, Search Backend, Analytic DB, Archival Storage
Unified Logging Layer
Parse into a common data format
Decouple from data sources
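To make "parse into a common data format" concrete, here is a minimal sketch in Python. It assumes the Apache "common" access-log format and models an event as a (tag, time, record) triple, which is how Fluentd describes its events. The function name parse_apache_line and the default tag apache.access are hypothetical choices for illustration, not something taken from the slides.

```python
import re
from datetime import datetime

# Apache "common" log format (assumed here):
#   host ident user [time] "method path protocol" status size
APACHE_COMMON = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d{3}) (?P<size>\d+|-)$'
)


def parse_apache_line(line, tag="apache.access"):
    """Turn one raw access-log line into a (tag, time, record) event.

    Returns None when the line does not match the expected format.
    The tag is a hypothetical label used for routing downstream.
    """
    m = APACHE_COMMON.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    ts = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    record = {
        "host": fields["host"],
        "user": None if fields["user"] == "-" else fields["user"],
        "method": fields["method"],
        "path": fields["path"],
        "code": int(fields["code"]),
        "size": 0 if fields["size"] == "-" else int(fields["size"]),
    }
    # Every output now sees the same structure, whatever the source was.
    return tag, int(ts.timestamp()), record
```

Once every source is reduced to the same triple, the search backend, the analytic DB and the archive can all consume it without knowing that it started life as an Apache log line.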
23. How to Unify Logging (2)
Reliability & Scalability
Mobile, Web, IoT
Message Queue, Search Backend, Analytic DB, Archival Storage
Unified Logging Layer
Need persistence/buffering
Robust retries and recovery
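The buffering and retry requirement can be sketched roughly as follows. This is an illustration under stated assumptions, not Fluentd's implementation: the class name BufferedOutput and all of its parameters are hypothetical, and an in-memory queue stands in for a buffer that a real system would persist to disk.

```python
import time
from collections import deque


class BufferedOutput:
    """Hypothetical output wrapper: queue events, deliver in chunks, retry on failure."""

    def __init__(self, send, chunk_size=3, max_retries=4,
                 base_delay=0.01, sleep=time.sleep):
        self.send = send              # callable that delivers a list of events
        self.chunk_size = chunk_size
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.sleep = sleep            # injectable so tests do not really wait
        self.queue = deque()          # in-memory stand-in for a persistent buffer

    def emit(self, event):
        self.queue.append(event)

    def flush(self):
        """Deliver queued events chunk by chunk; return how many were delivered."""
        delivered = 0
        while self.queue:
            size = min(self.chunk_size, len(self.queue))
            chunk = [self.queue[i] for i in range(size)]
            if not self._send_with_retry(chunk):
                break  # leave undelivered events queued for the next flush
            for _ in chunk:
                self.queue.popleft()
            delivered += len(chunk)
        return delivered

    def _send_with_retry(self, chunk):
        for attempt in range(self.max_retries):
            try:
                self.send(chunk)
                return True
            except Exception:
                # Exponential backoff: base_delay, 2x, 4x, ...
                self.sleep(self.base_delay * (2 ** attempt))
        return False
```

Events are removed from the queue only after a chunk has been sent successfully, so a backend outage delays delivery instead of dropping data.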
24. How to Unify Logging (3)
Extensibility
?, Web, IoT
?, Search Backend, Analytic DB, Archival Storage
Unified Logging Layer
Adding a new in/output must be easy
Same for filters
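One way to keep new inputs, outputs and filters cheap to add is a small plugin registry. The sketch below is illustrative only: the registries, the register decorator, the plugin names and run_pipeline are all hypothetical, and it is not how Fluentd's own plugin system is implemented.

```python
# Hypothetical registries mapping a plugin name to a callable.
INPUTS, FILTERS, OUTPUTS = {}, {}, {}


def register(registry, name):
    """Decorator that adds a function to a registry under the given name."""
    def decorator(func):
        registry[name] = func
        return func
    return decorator


@register(INPUTS, "static")
def static_input(config):
    # Yields copies of the records listed in the config.
    for record in config["records"]:
        yield dict(record)


@register(FILTERS, "add_field")
def add_field(config, record):
    # Adds one key/value pair to every record passing through.
    record[config["key"]] = config["value"]
    return record


@register(OUTPUTS, "list")
def list_output(config, record):
    # Appends the record to a caller-supplied list.
    config["sink"].append(record)


def run_pipeline(source, filters, outputs):
    """source, filters and outputs are (plugin_name, config) pairs."""
    name, cfg = source
    for record in INPUTS[name](cfg):
        for fname, fcfg in filters:
            record = FILTERS[fname](fcfg, record)
        for oname, ocfg in outputs:
            # Each output gets its own copy of the record.
            OUTPUTS[oname](ocfg, dict(record))
```

Supporting a new source or destination then means registering one more function; run_pipeline and the existing plugins stay untouched.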
so, I am a big fan of spoilers when it comes to tech talks. I think spoilers give the audience a much better idea of what to expect. so here it is.
phew. I just said that. no, this talk is definitely not about "big data"
besides poking fun at the buzzword...
people can't seem to agree on what it is. I want to talk about something far more concrete
I worked as a quantitative analyst for three years
definitely not just data engineers' problem
started to think more deeply about why logging becomes haphazard.
talked to hundreds of people at Treasure Data.
eventually, I had a couple of observations.
and here is another observation
the first requirement is common interface between data inputs and outputs.
why?
common interface -> the same data can be stored in multiple places with the same semantics.
You don't know whether you will stick with the same backend system.
You probably will need to piece together information from multiple data sources.
data pipelines fail
format changes
volume spikes
hardware/IaaS hiccups
Scalability matters
You need to be able to scale out the logging layer.
New data sources/outputs come up
Need to be able to extend your system
So, here is a rather self-aggrandizing claim: Fluentd can be that unified logging layer.
In the rest of the talk, I will show you how.