Comparison, Technical

How Fluentd plays a central role in Kubernetes logging

Written by Twain Taylor

Collecting logs is a complex challenge with containerized applications. Docker enables you to decompose a monolith into microservices, which brings more control over each part of the application. Containers help to separate each layer – infrastructure, networking, and application – from the other layers. This gives you flexibility in where to run your app: in a private data center, on a public cloud platform, or even across cloud platforms as the need arises. Networking tools plug in to the container stack and enable container-to-container communication at large scale. At the application layer, microservices are the preferred model when adopting containers, although containers can be leveraged to improve efficiency in monolithic applications as well. While this is a step up from the vendor-centric options of yesterday, it brings complexity across the application stack. Managing this complexity takes deep visibility, the kind that starts with metrics but goes deeper with logging.

The thing about logging containers is that there are too many data sources. At every level of the stack logs are generated in great detail, certainly much more than traditional monolithic applications. Log data is generated as instances are created and retired, their configurations changed, as network proxies communicate with each other, as services are deployed, and requests are processed. This data is rich with nuggets of information vital for monitoring and controlling the performance of the application.

With the amount of log data ever increasing, it requires specialized tools to manage and adapt to the specific needs of containers at every step of the log lifecycle. Kubernetes is built with an open architecture that leaves room for this type of innovation. It allows for open source logging tools to be created which can extract logs from Kubernetes and process these logs on their own. In response, there have been logging tools that have stepped up to the task. These logging tools are predominantly open source, and give you flexibility in how you’d like to manage logging. There are tools for every step including log aggregation, log analysis, and log visualization. One such tool that has risen to the top in terms of log aggregation is Fluentd.

What is Fluentd?

Fluentd is an open source tool that focuses exclusively on log collection, or log aggregation. It gathers log data from various data sources and makes it available to multiple endpoints. Fluentd aims to create a unified logging layer. It is source and destination agnostic and is able to integrate with tools and components of any kind. Fluentd has first-class support for Kubernetes, the leading container orchestration platform, and it is the recommended way to capture Kubernetes events and logs for monitoring. Adopted by the CNCF (Cloud Native Computing Foundation), Fluentd’s future is in step with Kubernetes, and in this sense, it is a reliable tool for the years to come.


EFK is the new ELK

Previously, the ELK stack (Elasticsearch, Logstash, Kibana) was the best option to log applications using open source tools. Elasticsearch is a full-text search engine and database that’s ideally suited to process and analyze large quantities of log data. Logstash is similar to Fluentd – a log aggregation tool. Kibana focuses exclusively on visualizing the data coming from Elasticsearch. Logstash is still widely used but now has competition from Fluentd – more on this later. So today, we’re not just talking about the ELK stack, but the EFK stack. Although, admittedly, it’s not as easy to pronounce.

Fluentd Deployment

You can install Fluentd from its Docker image which can be further customized.

Kubernetes also has an add-on that lets you easily deploy the Fluentd agent. If you use Minikube, you can install Fluentd via its Minikube addon. All it takes is a single command: minikube addons enable efk. This installs Fluentd alongside Elasticsearch and Kibana. While Fluentd is lightweight, the other two are resource heavy, so the VM that hosts them will need additional memory and they will take some time to initialize.

Kops, the Kubernetes cluster management tool, also has an addon to install Fluentd as part of the EFK trio.

Another way to install Fluentd is to use a Helm chart. If you have Helm set up, this is the simplest and most future-proof way to install Fluentd. Helm is a package manager for Kubernetes and lets you install Fluentd with a single command:

$ helm install --name my-release incubator/fluentd-elasticsearch

Once installed, you can further configure the chart with many options for annotations, Fluentd configmaps and more. Helm makes it easy to manage versioning for Fluentd, and even has a powerful rollback feature which lets you revert to an older version if needed. It is especially useful if you want to install Fluentd on remote clusters as you can share Helm charts easily and install them in different environments.

If you use a managed Kubernetes service, it may have its own platform-specific way of installing Fluentd. For example, with GKE, you’ll need to define variables that are specific to the Google Cloud platform like region, zone, and Project ID. Then, you’ll need to create the service account, create a Kubernetes cluster, deploy a test logger, and finally deploy the Fluentd daemonset to the cluster.

How it works

The Docker runtime collects logs from every container on every host and stores them under /var/log. The Fluentd image is already configured to forward all logs from /var/log/containers and some logs from /var/log. Fluentd reads the logs and parses them into JSON format. Because the logs are stored as JSON, they can be shared widely with any endpoint.
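As a rough illustration of the parsing step, the sketch below reads one line in the shape written by Docker’s default json-file logging driver (a JSON object with log, stream, and time fields) and turns it into a structured record. The function name parse_docker_log_line and the sample line are made up for illustration; this is not Fluentd’s own code.

```python
import json


def parse_docker_log_line(line: str) -> dict:
    """Parse one line written by Docker's json-file logging driver.

    Hypothetical helper for illustration only; Fluentd does this
    work with its own input and parser plugins.
    """
    entry = json.loads(line)
    return {
        # The driver keeps the trailing newline of each log line.
        "message": entry["log"].rstrip("\n"),
        "stream": entry["stream"],
        "time": entry["time"],
    }


# A made-up sample line in the json-file format.
sample = '{"log":"GET /healthz 200\\n","stream":"stdout","time":"2018-06-19T13:13:58.000000000Z"}'
record = parse_docker_log_line(sample)
print(record["message"])
```

Once a line is a dictionary like this, it can be serialized again for whichever endpoint needs it.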

Fluentd also adds some Kubernetes-specific information to the logs. For example, it adds labels to each log message to give the logs some metadata, which can be critical in better managing the flow of logs across different sources and endpoints. It reads Docker logs, etcd logs, and Kubernetes logs.
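To give a feel for where some of that metadata can come from, here is a minimal sketch that assumes the file names under /var/log/containers follow the pattern pod_namespace_container-id.log, with a 64-character hexadecimal container ID. The helper names and the sample file name are hypothetical, and a real metadata plugin would typically also look up labels from the Kubernetes API, which this sketch does not attempt.

```python
import re

# Assumed naming convention: <pod>_<namespace>_<container>-<64 hex chars>.log
CONTAINER_LOG_NAME = re.compile(
    r"^(?P<pod_name>[^_]+)_(?P<namespace>[^_]+)_"
    r"(?P<container_name>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)


def kubernetes_metadata(filename: str) -> dict:
    """Extract pod, namespace, and container names from a log file name."""
    match = CONTAINER_LOG_NAME.match(filename)
    if match is None:
        return {}
    return match.groupdict()


def enrich(record: dict, filename: str) -> dict:
    """Return a copy of the record with a 'kubernetes' metadata field."""
    return {**record, "kubernetes": kubernetes_metadata(filename)}


# Made-up file name: pod "web-7d4b9c", namespace "default", container "nginx".
example_name = "web-7d4b9c_default_nginx-" + "a" * 64 + ".log"
enriched = enrich({"message": "GET /healthz 200"}, example_name)
print(enriched["kubernetes"]["namespace"])
```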

The most popular endpoint for log data is Elasticsearch, but you can configure Fluentd to send logs to an external service such as LogDNA for deeper analysis. By using a specialized log analysis tool, you can save time troubleshooting and monitoring. With features like instant search, saved views, and archival storage of data, a log analysis tool is essential if you’re setting up a robust logging system that involves Fluentd.

Fluentd Alternatives


Logstash is the most similar alternative to Fluentd and does log aggregation in a way that works well for the ELK stack.

Logstash uses if-then rules to route logs, while Fluentd uses tags to know where to route logs. Both are powerful ways to route logs exactly where you want them to go with great precision. Which you prefer will depend on the style of programming you’re familiar with: procedural (Logstash’s if-then rules) or declarative (Fluentd’s tags).
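The tag-based approach can be sketched in a few lines. The code below is a simplified model, not Fluentd’s implementation: it supports only "*" (exactly one tag part) and a trailing "**" (any remaining parts), and it picks the first matching route in order. The route table and destination names are hypothetical.

```python
def tag_matches(pattern: str, tag: str) -> bool:
    """Simplified tag matcher.

    '*' matches exactly one dot-separated tag part; a trailing '**'
    matches zero or more remaining parts. Other uses of '**' are not
    handled in this sketch.
    """
    pattern_parts = pattern.split(".")
    tag_parts = tag.split(".")
    if pattern_parts[-1] == "**":
        prefix = pattern_parts[:-1]
        if len(tag_parts) < len(prefix):
            return False
        # Only the parts covered by the prefix still need to match.
        tag_parts = tag_parts[:len(prefix)]
        pattern_parts = prefix
    if len(pattern_parts) != len(tag_parts):
        return False
    return all(p == "*" or p == t for p, t in zip(pattern_parts, tag_parts))


# Hypothetical routes, checked in order; the first match wins.
ROUTES = [
    ("kubernetes.**", "elasticsearch"),
    ("app.*", "s3"),
    ("**", "stdout"),
]


def route(tag: str):
    """Return the destination of the first route whose pattern matches."""
    for pattern, destination in ROUTES:
        if tag_matches(pattern, tag):
            return destination
    return None


print(route("kubernetes.var.log.containers"))
print(route("app.web"))
```

The routing table is pure data, which is what makes this style declarative: adding a destination means adding a row, not another branch of conditional logic.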

Next, both Fluentd and Logstash have a vast library of plugins which make them both versatile. In terms of getting things done with plugins, both are very capable and have wide support for pretty much any job. You have plugins for the most popular input and output tools like Elasticsearch, Kafka, and AWS S3 and plugins for tools that may be used by a niche group of users as well. Fluentd here has a bit of an edge as it has a comparatively bigger library of plugins.


When it comes to size, Fluentd is more lightweight than Logstash. This has a bearing on the logging agent that’s attached to containers. The bigger the production application, the larger the number of containers and data sources, and the more agents are required. A lighter logging agent like Fluentd’s is preferred for Kubernetes applications.

Fluent Bit

While Fluentd is pretty light, there’s also Fluent Bit, an even lighter version of the tool that removes some functionality and has a limited library of 30 plugins. However, it’s extremely lightweight, weighing in at ~450KB next to the ~40MB of the full-blown Fluentd.


Logging is a critical function when running applications in Kubernetes. Logging is difficult with Kubernetes, but thankfully, there are capable solutions at every step of the logging lifecycle. Log aggregation is at the center of logging with Kubernetes. It enables you to collect all logs end-to-end and deliver them to various data analysis tools for consumption. Fluentd is the leading log aggregator for Kubernetes: its small footprint, larger plugin library, and ability to add useful metadata to the logs make it ideal for the demands of Kubernetes logging. There are many ways to install Fluentd – via the Docker image, Minikube, kops, Helm, or your cloud provider.

Being tool-agnostic, Fluentd can send your logs to Elasticsearch or a specialized logging tool like LogDNA. If you’re looking to centralize logging from Kubernetes and other sources, Fluentd can be the unifying factor that brings more control and consistency to your logging experience. Start using Fluentd and get more out of your log data.


The Dollars & Sense of Pricing: Daily vs. Monthly vs. Metered

SaaS has been around for what seems like forever, but one standard hasn’t emerged as the victor for pricing format — and that statement applies to the logging and monitoring industry as well.  The three standards that have gained the most adoption, however, are daily data caps, monthly data caps, and metered billing. In this article, we’ll break down the pros and cons of each. To do this, we’ll analyze Badass SaaS, a fictitious company that produced the following log data in a month:

[Figure: Badass SaaS daily log volume over one month]

This data volume represents the typical peaks and valleys that we see companies produce in a given month.  Let’s get into it.

Daily Volume Cap

If Badass SaaS were to use a logging platform with a daily volume cap, they’d have to base their plan on the highest daily usage (or face the mighty paywall); using our example above, we see that the highest usage is 512 GB. When choosing a plan, they would also have to budget for possible future spikes (for times in future months where the max is above 512 GB). Then they would have to choose the closest package that the logging provider offers. In this case, let’s say it’s 600 GB/day.

It becomes painfully obvious that Badass SaaS is paying for a 600 GB daily limit, but is using far less than that on the average day. To quantify the waste, Badass is averaging 207 GB/day, but is paying for almost three times that. The more variability in your data, the more you’re getting squeezed by a company that implements a daily volume cap. There’s a tremendous amount of waste that comes into play with daily volume caps.

Monthly Volume Cap

If Badass SaaS were to go with a logging platform that uses a monthly volume cap, it eliminates the waste that comes through daily variability, but the same problem arises when we look at things from a monthly perspective.  It makes sense that Badass would have monthly variability in their data (similar to the case with daily usage), and they would have to choose a monthly plan that covers the highest anticipated monthly usage. If their monthly variability typically ranges from 4 TB to 12 TB, they would have to pick a plan with at least 12 TB of monthly data, or again face the dreaded paywall.  This again leads to lots of waste — Badass pays for 12 TB of monthly data, and uses much less than that most months.

Badass couldn’t realistically choose a 12 TB monthly limit, since these data volumes are predictions about the future rather than historical data. Badass would likely choose a plan of at least 15 TB to take into account any unforeseen upside variance.

Metered Billing

With metered billing, there’s no need to guess at what your data volume might or might not be in the future.  You choose a per-GB price, and you get billed based on your actual usage at the end of each month. It’s that simple.  

This style of billing wasn’t very prevalent until Amazon’s recent implementation of it with AWS. Now with AWS’ adoption, everybody is familiar with it.

Daily vs. Monthly vs. Metered

Let’s see how Badass SaaS’ metered bill would compare to their bill if they had used a provider with daily or monthly limits.

Using the example above, Badass would have paid for a total of 600 GB/day, or 18,000 GB over a month, while their total 30-day usage was 6,211 GB.

With a monthly data cap plan, Badass would be on a 15 TB plan given our example above, and again used 6,211 GB.

With a metered billing setup, Badass doesn’t have to pick a fixed data bucket; they just pay for what they use.  In this case, they pay for just the 6,211 GB they use.

Plan Type | Actual Usage (GB) | Data Paid For (GB) | Wastage
Daily | 6,211 | 18,000 | 65.5%
Monthly | 6,211 | 15,360 | 59.6%
Metered | 6,211 | 6,211 | 0%
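The wastage figures follow from a single formula: the share of paid-for data that was never used. The short sketch below reproduces the comparison from the numbers in the example (a 600 GB/day cap over 30 days, and a 15 TB monthly plan counted as 15 × 1,024 GB); the function and variable names are our own.

```python
ACTUAL_USAGE_GB = 6211  # total 30-day usage from the example


def wastage_percent(paid_gb: float, used_gb: float) -> float:
    """Percentage of the paid-for volume that went unused."""
    return round((paid_gb - used_gb) / paid_gb * 100, 1)


paid_for_gb = {
    "Daily": 600 * 30,            # 600 GB/day cap over a 30-day month
    "Monthly": 15 * 1024,         # 15 TB monthly cap expressed in GB
    "Metered": ACTUAL_USAGE_GB,   # pay only for what was used
}

results = {
    plan: wastage_percent(paid, ACTUAL_USAGE_GB)
    for plan, paid in paid_for_gb.items()
}

for plan, wasted in results.items():
    print(f"{plan}: paid for {paid_for_gb[plan]} GB, wasted {wasted}%")
```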

Doing Your Own Analysis

Comparing plans involves more than multiplying a daily cap by 30 and lining it up against a monthly or metered plan. As you’ve seen here, variability plays a huge role in the true cost of both daily and monthly plans, and in what you’re getting (and throwing away): the more variability in your data, the more wastage. If you’re already using logging software, the best way to compare prices is to look at your actual daily and monthly usage over time and work out the true cost of a daily, monthly, or metered plan. Don’t forget to take into account possible future variance.

At LogDNA, we implemented metered pricing with the customer in mind.  We could have implemented another ‘me too’ daily or monthly capped plan, and collected money for data our customers weren’t ingesting.  But instead, we were the first (and are still the only) logging company to implement metered billing because that’s the best thing for our customers.  We pride ourselves on our user experience, and that doesn’t stop at a beautiful interface.


3 Logging Use Cases

The versatility of logs allows them to be used across the development lifecycle, and to solve various challenges within an organization. Let’s look at three logging use cases from leading organizations at various stages of the product life cycle, and see what we can learn from them.


Transferwise – Improving Mobile App Reliability

Transferwise is an online payments platform for transferring money across countries easily, and their app runs across multiple platforms. One of the challenges they face with their mobile app is analyzing crashes. It’s particularly difficult to reproduce crashes with mobile as there are many more possible issues – device-specific features, carrier network, memory issues, battery drain, interference from other apps, and many more. A stack trace doesn’t have enough information to troubleshoot the issue. To deal with this, Transferwise uses logs to better understand crashes. They attach a few lines of logs to a crash report which gives them vital information on the crash.

To implement this, they use the open source tool CocoaLumberjack. It transmits crash logs to external loggers where they can be analyzed further. It enables Transferwise to print a log message to the console. You can save the log messages to the cloud or include them in a user-generated bug report. As soon as the report is sent, the user is notified that Transferwise is already working on fixing the issue. This is much better than being unaware of the crash, or ignoring it because they can’t find the root cause.

You should make sure to exclude sensitive data from the log messages. To have more control over how log messages are reported and classified, Transferwise uses a logging policy. They classify logs into 5 categories – error, warning, info, debug, and verbose – each of which has a different priority level and is reported differently.
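A policy like this can be sketched with Python’s standard logging module. This is a language-neutral illustration, not CocoaLumberjack’s API and not Transferwise’s actual policy: the VERBOSE level number, the card-number pattern, the logger name, and the choice to report only info and above are all assumptions made for the example.

```python
import logging
import re

# Python has no built-in VERBOSE level; register one below DEBUG (assumption).
VERBOSE = 5
logging.addLevelName(VERBOSE, "VERBOSE")

# Hypothetical notion of "sensitive data": any run of 13 to 16 digits.
CARD_NUMBER = re.compile(r"\b\d{13,16}\b")


class RedactSensitiveData(logging.Filter):
    """Replace anything that looks like a card number before it is reported."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = CARD_NUMBER.sub("[REDACTED]", record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True


class ListHandler(logging.Handler):
    """Collect formatted messages in memory, standing in for a crash report."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


logger = logging.getLogger("crash_report")
logger.setLevel(logging.INFO)  # debug and verbose are dropped in this sketch
logger.propagate = False
logger.addFilter(RedactSensitiveData())
handler = ListHandler()
logger.addHandler(handler)

logger.error("Transfer failed for card %s", "4111111111111111")
logger.debug("Request payload dumped")
logger.log(VERBOSE, "Entered view controller")
print(handler.messages)
```

Only the error line reaches the handler, and the digits are replaced before anything is stored.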

While CocoaLumberjack works only on Mac and iOS, you can find a similar tool like Timber or Hugo for Android. But the key point of this case study is that logging can give you additional insight into crashes especially in challenging environments like mobile platforms. It takes a few unique tools and some processes and policies in place to ensure the solution is safe enough to handle sensitive data, but the value is in increased visibility into application performance, and how you can use it to improve user experience.

[Read more here.]

Wealthfront – Enhancing User Experience with A/B Tests

Wealthfront is a wealth management solution that uses data analytics to help its users invest wisely and earn more over the long term. Though the Wealthfront web app is the primary interface for a user to make transactions, their mobile app is more actively engaged with and is an important part of the solution. Wealthfront is a big believer in A/B testing to improve the UI of their applications. While they have a mature A/B testing process set up for the web app, they didn’t have an equivalent for their mobile apps. As a result they just applied the same learnings across both web and mobile. This is not the best strategy, as mobile users are different from web users, and the same results won’t work across both platforms. They needed to set up an A/B testing process for their mobile apps too.

For inspiration, they looked to Facebook, who had set up something similar for their mobile apps with Airlock – a framework for A/B testing on mobile. Wealthfront focused their efforts on four fronts – backend infrastructure, API design, the mobile client, and experiment analysis. They found logs essential for the fourth part – experiment analysis. This is because logs are a much more accurate representation of the performance and results of an experiment than a backend database. With mobile, the backend infrastructure is very loosely coupled with the frontend client, and reporting can be inaccurate if you rely on backend numbers. With logs, however, you can gain visibility into user actions and each step of a process as it executes. One reason why logging is more accurate is that the logging is coded along with the experiment. Thus, logging brings you deeper visibility into A/B testing and enables you to provide a better user experience. This is what companies like Facebook and Wealthfront have realized, and it can work for you too.
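One way to picture logging that is "coded along with the experiment" is to emit the log event in the same place the variant is chosen. The sketch below is a generic illustration, not Airlock or Wealthfront’s code: the hash-based bucketing, the event fields, and every name in it are assumptions.

```python
import hashlib
import json

VARIANTS = ("control", "treatment")


def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically bucket a user so repeat visits see the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]


def exposure_event(user_id: str, experiment: str) -> str:
    """Build the log line at the moment the client shows the variant."""
    return json.dumps(
        {
            "event": "experiment_exposure",
            "experiment": experiment,
            "user_id": user_id,
            "variant": assign_variant(user_id, experiment),
        },
        sort_keys=True,
    )


# Hypothetical experiment and user identifiers.
print(exposure_event("user-42", "new_onboarding_flow"))
```

Because the event is produced by the same code path that renders the variant, analysis can count what users actually saw rather than what the backend intended to serve.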

[Read more here.]

Twitter – Achieving Low Latencies for Distributed Systems

Twitter runs distributed systems to manage data at very large scale, and uses high-performance replicated logs to solve various challenges brought on by distributed architectures. Leigh Stewart of Twitter comments that “Logs are a building block of distributed systems and once you understand the basic pattern you start to see applications for them everywhere.”

To implement this replicated log service they use two tools. The first is the open source Apache BookKeeper, which is a low-level log storage tool. They chose BookKeeper for its low latency and high durability even under peak traffic. Second, they built a tool called DistributedLog to provide higher-level features on top of BookKeeper. These features include naming and metadata for log streams, and data management policies like log retention and segmentation. Using this combination, they were able to achieve write latencies of 10ms, not exceeding 20ms even at the slowest write speed. This is very efficient, and is possible because they used the right open source and proprietary tools in combination with each other.

[Read more here.]

As the above examples show, logs play a vital role in various situations across multiple teams and processes. They can be used to make apps more reliable by reducing crashes, improve the user interface using A/B tests, and achieve low latencies in distributed systems. As you look to improve your applications in these areas, the way these organizations have made use of logs is worth taking note of and implementing in a way that’s specific to your organization. You also need a capable log analysis platform like LogDNA to collect, process and present your log data in a way that’s usable and actionable. Working with log data is challenging, but with the right goals, the right approach, and the right tools, you can gain a lot of value from log data to improve various aspects of your application’s performance.


LogDNA Helps Developers Adopt the AWS Billing Model for More Cost-Effective Logging

Amazon Web Services (AWS) uses a large scale pay-as-you-go model for billing and pricing some seventy plus cloud services. LogDNA has taken a page from that same playbook and offers similar competitive scaling for our log management system. For most companies, managing data centers and a pricey infrastructure is a thing of the past. Droves of tech companies have transitioned into cloud-based services. This radical shift in housing backend data and crucial foundations has completely revolutionized the industry and created a whole new one in the process.


For such an abrupt change – one would think that an intelligent shift in pricing methods would have followed. For the majority of companies this is simply not the case.

New industries call for new pricing arrangements. Dynamically scalable pricing is practically a necessity for data-based SaaS companies. Flexible pricing just makes sense and accounts for vast and variable customer data usage.

AWS, and for that matter, LogDNA, have taken the utilities approach to a complex problem. The end user will only pay for what they need and use. Adopting this model comes with a set of new challenges and advantages that can be turned into actionable solutions. There is no set precedent for a logging provider using the AWS billing model. We are on the frontier of both pricing and innovation of cloud logging.

LogDNA Pricing Versus a Fixed System

The LogDNA billing model is based on a pay-per-gig foundation. That means that each GB used is charged on an individual basis before being totaled at the end of the month. What follows then is for each plan: low minimums, no daily cap, and scaling functionality.

Here is an example of a fixed tiered system with a daily cap. For simplicity’s sake, here is a four day usage-log (no pun intended) of a log management system with a 1 GB /day cap.

Monthly Plan: 30 GB Monthly – $99

Day 1: 0.2 GB

Day 2: 0.8 GB

Day 3: 1 GB

Day 4: 0.5 GB

This four-day usage adds up to 2.5 GB logged, against the 4 GB of capacity that the daily cap allowed over the same period. That’s an incredible amount of waste because of a daily cap and variable use. Let’s dive into a deeper comparison of the amount of money wasted compared to our lowest tiered plan.

LogDNA’s Birch Plan charges $1.50 per GB. If we had logged that same amount of usage with our pricing system it would cost roughly $3.75. While the fixed system doesn’t show us the price per GB – we can compare it to LogDNA with some simple math. If a monthly plan at a fixed rate of $99 per month is equal to 30 GB usage per month then you can reasonably say that each GB is equal to about $3.30 in this situation.
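The simple math above can be written out as follows. The figures come straight from the example ($99 for 30 GB, a 1 GB/day cap, and Birch at $1.50 per GB); the variable names and the unused-capacity line are our own additions for illustration.

```python
daily_usage_gb = [0.2, 0.8, 1.0, 0.5]   # the four days from the example
daily_cap_gb = 1.0

used_gb = sum(daily_usage_gb)
unused_gb = daily_cap_gb * len(daily_usage_gb) - used_gb

# Metered: pay per GB actually logged on the Birch plan.
birch_rate_per_gb = 1.50
birch_cost = birch_rate_per_gb * used_gb

# Fixed plan: $99 per month for 30 GB, whether or not it is all used.
fixed_price_per_gb = 99 / 30

print(f"Used: {used_gb:.1f} GB, unused capacity: {unused_gb:.1f} GB")
print(f"Birch cost for usage: ${birch_cost:.2f}")
print(f"Fixed plan effective price: ${fixed_price_per_gb:.2f} per GB")
```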

Can you spot the difference in not only pricing, but cloud waste as well? With a daily cap, the end-user isn’t even getting to use all of that plan anyhow. A majority of cloud users are underestimating how much they’re wasting. Along with competitive pricing, our billing model cuts down tremendously on wasted cloud spend.       

Challenges of the Model

It’s important again to note that our model is unique amongst logging providers. This unearths a number of interesting challenges. AWS itself has set a great example by publishing a number of guides and guidelines.

The large swath of AWS services (which seems to be growing by the minute) are all available on demand. For simple operations, this means that only a few services will be needed without any contracts or licensing. The scaled pricing allows the company to grow at any rate they can afford, without having to adjust their plan. This lessens the risk of provisioning too much or too little. Simply put, we scale right along with you. So there’s no need to contact a sales rep.

LogDNA as an all-in-one system deals with a number of these same challenges. The ability to track usage is a major focus area to us so that we can ensure you have full transparency into what your systems are logging with us. Our own systems track and bill down to the MB, so that the end-user can have an accurate picture of the spend compared to usage rates. This is not only helpful, but allows us to operate in a transparent manner with no hidden fees. Though it is powered by a complex mechanism internally, it provides a simplified, transparent billing experience for our customers.

LogDNA users have direct control over their billing. While this may seem like just another thing to keep track of, it’s rather a powerful form of agency you can now use to take control of your budget and monetary concerns. Users can take their methodical logging mentality and apply that to their own billing process, allowing greater control over budgets and scale.   

Say, for example, that there is an unexpected spike in data volume. Your current pricing tier will be able to handle the surge without any changes to your LogDNA plan. As an added bonus, we also notify you in the event of a sudden increase in volume. Due to the ever-changing stream of log data – we even offer the tools of ingestion control so that you can even exclude logs you don’t need and not be billed for them.

Our focus on transparency as part of the user experience not only builds trust, but also fosters a sense of partnership.

Scaling for All Sizes & Purposes

Our tiered system takes into account how many team members (users) will be using LogDNA on the same instance and the length of retention (historical log data access for metrics and analytic purposes). We also offer a HIPAA-compliant scaled pricing tier for protected health information, which includes a Business Associate Agreement, or BAA, for handling sensitive data.

Pictured here is a brief chart of some basic scaled prices for our three initial individual plans. The full scope of the plans is listed here. This is a visualization of a sample plan per each tier.

Plan Estimator

BIRCH – $1.50 /GB – Retention: 7 Days – Up to 5 Users
Monthly GB Used | 1 GB | 4 GB | 16 GB | 30 GB
Cost Per Month | $1.50 | $6 | $24 | $45

Monthly Minimum: $3.00

MAPLE – $2.00 /GB – Retention: 14 Days – Up to 10 Users
Monthly GB Used | 10 GB | 30 GB | 120 GB | 1 TB
Cost Per Month | $20 | $60 | $240 | $2,000

Monthly Minimum: $20.00

OAK – $3.00 /GB – Retention: 30 Days – Up to 25 Users
Monthly GB Used | 50 GB | 60 GB | 150 GB | 1 TB
Cost Per Month | $150 | $180 | $450 | $3,000

Monthly Minimum: $100.00
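The estimator reduces to one multiplication per plan. The sketch below reproduces the table values from the listed per-GB rates. How the monthly minimum is applied is not spelled out here, so treating it as a floor on the bill is an assumption, as are the function names.

```python
PLANS = {
    "BIRCH": {"rate_per_gb": 1.50, "monthly_minimum": 3.00},
    "MAPLE": {"rate_per_gb": 2.00, "monthly_minimum": 20.00},
    "OAK": {"rate_per_gb": 3.00, "monthly_minimum": 100.00},
}


def usage_cost(plan: str, gb_used: float) -> float:
    """Cost of the data alone: per-GB rate times GB used."""
    return PLANS[plan]["rate_per_gb"] * gb_used


def estimated_bill(plan: str, gb_used: float) -> float:
    """Usage cost with the monthly minimum applied as a floor (assumption)."""
    return max(usage_cost(plan, gb_used), PLANS[plan]["monthly_minimum"])


for plan, gb in [("BIRCH", 30), ("MAPLE", 120), ("OAK", 150)]:
    print(f"{plan}: {gb} GB costs ${usage_cost(plan, gb):.2f}")
```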

Custom Solutions for All & Competitive Advantages

Many pricing systems attempt to offer a one-size-fits-all model. Where they miss the mark, we succeed with usability that scales from small shops to large enterprise solutions. Our Willow (Free) Plan is a single-user system that lets an individual see whether a log management system is right for their own project, or for a team effort that may eventually move to a paid tier. High data plans are also customized and still retain the AWS billing model. We also offer a full-featured 14-day trial.

The adoption of this model creates a competitive advantage in the marketplace for both parties. LogDNA can provide services to all types of individuals and companies with a fair transparent pricing structure. The end user is given all relevant data usage and pricing information along with useful tools to manage it as they see fit.

For example, imagine you are logging conservatively, focusing only on the essentials like poor performance and exceptions. In the middle of putting out a fire, your engineering team realizes that they are missing crucial diagnostic information. Once the change is made, those new log lines will start flowing into LogDNA without anyone having to spend time mulling over how to adjust your plan. Having direct control over your usage and spending without touching billing is enormously beneficial to our customers, and it also reduces our own internal overhead for managing billing.

Competitive Scenario – Bridging the Divide Between Departments

Picture this scenario: there has been an influx of users experiencing difficulty while using your app. The support team has been receiving ticket after ticket. Somewhere there is a discrepancy between what the user is doing and what the app is returning. The support team needs to figure out why these users are having difficulty using the app. These support inquiries have stumped the department, so the director needs to ask the engineering team how they can retrieve the information needed to find a fix.

LogDNA helps bridge the divide by providing the support team with relevant information to the problem at hand. For this particular example, the engineering team instruments new code to log all customer interactions with API endpoints. The support team has a broader vision of how users are interacting with the interface. They’ve now been equipped with a new tool in their arsenal from the engineers. There was nothing lost in translation between the departments during this exchange.  

After looking through the new logged information, the support team is able to solve the problem many of its users were experiencing. The support team has served its purpose by responding to these inquiries and making the end-user happy. All it took was some collaboration between two different departments.   

The log volume has increased due to new logs being funneled through the system. But the correlation between increased log volume and better support is worth it. During this whole process no changes are required to your current account plan with LogDNA. Future issues that may arise will be easily fixed as a result of this diagnostic information being readily available. The cost of losing users outweighs the cost of extra logs.

LogDNA places the billing model on an equal level of importance with the log management software itself. It can be used to make decisions all across the board. LogDNA’s billing model adapts to budgetary concerns, supports the user experience, and gives you a better grasp of your own data, all at once.