Reflections

The future of logging

I skate to where the puck is going to be, not where it has been.

– Wayne Gretzky

While computer logs have been around for decades, their implementation has changed over time. From a dedicated server with a single log file to microservices distributed across multiple virtualized machines with dozens of log files, the way we generate and consume logs changes as we adopt new infrastructure and programming paradigms. Even our own market space, cloud-based logging, saw little adoption until recently, and the way we use logs will continue to evolve. The million-dollar question is: how?

The alluring cloud

Cloud-based logging is relatively new, a result of the general industry trend of moving away from on-premises servers to cloud-based services. In fact, Gartner estimates that between 2016 and 2020, more than $1 trillion in IT spending will be affected by the shift to the cloud. More and more businesses are moving their operations to the cloud, including their logs. This has some interesting implications.

Part of the beauty of moving to the cloud is the ability to easily deploy and scale your infrastructure without undertaking a large internal infrastructure project. This is a significant reduction in both time and cost, and is central to the cloud’s value proposition. Taking this reasoning further: since there are only a relatively small number of large-scale hosting providers, businesses can be built entirely on making cloud infrastructure management simpler, easier, and more flexible. Enter platform as a service (PaaS).

[Image: cloud migration. Credit: biblipole.com]

In addition to hosting providers like Amazon Web Services, DigitalOcean, and Microsoft Azure, all sorts of PaaS businesses have popped up, such as Heroku, Elastic Beanstalk, Docker Cloud, Flynn, Cloud Foundry, and a whole host of others. These PaaS offerings have become increasingly common, and increasingly lucrative. In 2010, Heroku was bought by Salesforce for $212 million, and last year Microsoft attempted to buy Docker for a rumored $4 billion. This demonstrates a significant shift from raw hosting providers to simplified, managed services that automate the grunt work of directly managing cloud infrastructure, for much the same reasons businesses migrated to the cloud in the first place.

So what does this have to do with logging? It means that providing ingestion integrations with PaaS offerings as well as hosting providers becomes increasingly important. Do you integrate with Heroku? Can I send you my Docker logs? I use Flynn; how do I get my logs to you? If you’re a cloud logging provider, the answer to all of these questions should be yes. And don’t forget to create integrations as new PaaS offerings appear.
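To make that concrete, here is a minimal sketch of what an ingestion integration often boils down to: shipping structured log lines over HTTPS. The endpoint URL, key, and payload shape below are hypothetical placeholders for illustration, not the actual API of LogDNA or any other provider.

```python
import json
import time
import urllib.request

# Hypothetical ingestion endpoint and key -- placeholders only,
# not any real provider's API.
INGEST_URL = "https://logs.example.com/ingest"
INGEST_KEY = "YOUR_INGESTION_KEY"

def ship_log_line(app, level, message):
    """Send one structured log line to the ingestion endpoint."""
    payload = json.dumps({
        "app": app,
        "level": level,
        "message": message,
        "timestamp": int(time.time() * 1000),  # epoch milliseconds
    }).encode("utf-8")
    req = urllib.request.Request(
        INGEST_URL,
        data=payload,  # providing data makes this a POST request
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + INGEST_KEY,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

ship_log_line("checkout-service", "error", "payment gateway timed out")
```

A PaaS integration is much the same idea with the transport handled for you; Heroku, for instance, can forward its log stream to a drain URL, so the provider’s job becomes accepting and parsing whatever format each platform emits.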

The rise of containers

With the adoption of cloud-based infrastructure, distributed microservices architectures have become more popular. One of the primary benefits of a microservices architecture is its highly modular nature: parts can be swapped out quickly and efficiently, with less risk of disrupting your customers. However, this also brings a higher risk of development environment fragmentation. This is the problem containers were built to solve.

Containers are essentially wrappers that isolate individual apps within a consistent environment. With containers, it shouldn’t matter what hosting provider or development infrastructure you use: your applications should run exactly as they do on your development machines. Matching development and production environments means more reliable testing and less time spent chasing down environmental issues. Containers also let you reliably run multiple apps on the same machine and respond quickly to fluctuating load. According to Gartner, running apps in containers is even considered more secure than running them on a bare OS.

[Image: Kubernetes is coming. Credit: memegenerator.net]

While containers themselves solve the problem of development environment fragmentation, managing lots of individual containers can be a pain. Hence the rise of container orchestration tools, most notably Kubernetes. With Kubernetes, you can deploy and manage hundreds of containers and automate nearly everything, including networking and DNS. This is particularly appealing, since managing these things at scale with a traditional hosting provider takes significant effort.
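For a taste of what that looks like in practice, here is a minimal sketch using the official Kubernetes Python client to pull recent log lines from every pod in a namespace. The namespace and line count are arbitrary choices for illustration.

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (e.g. ~/.kube/config).
config.load_kube_config()
v1 = client.CoreV1Api()

# Walk every pod in the namespace and grab its most recent log lines.
for pod in v1.list_namespaced_pod(namespace="default").items:
    name = pod.metadata.name
    logs = v1.read_namespaced_pod_log(
        name=name, namespace="default", tail_lines=20
    )
    print("=== %s ===" % name)
    print(logs)
```

Doing even this much by hand across hundreds of containers is exactly the kind of chore that orchestration and log tooling exist to absorb.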

By providing a consistent environment and automating networking, Kubernetes has seen steadily increasing adoption and is poised to become a tool of choice. This also means that acquiring and organizing log data from Kubernetes is paramount to understanding the state of your infrastructure. LogDNA provides integrations for this, and we strongly believe that containers and container orchestration will become highly prevalent within the next few years.

Machine learning and big data

So far we’ve focused primarily on infrastructure evolution, but it is also important to consider the trajectory and impact of software trends on logging. We’ve all heard “big data” and “machine learning” touted as popular buzzword answers to difficult software problems, but what do they actually mean?

Before we dive deeper into logging applications (no pun intended), let’s consider the general benefits of machine learning and big data. For example, Netflix uses machine learning to provide movie recommendations based on the context-dependent aggregate preferences of its users. Google Now uses machine learning to provide you with pertinent on-demand information based on multiple contexts, such as your location, the time of day, and your browsing habits. In both cases, these services look for patterns in large datasets to predict what information will be useful to you.

[Image: big data. Credit: imgflip.com]

Predicting useful patterns is the key value proposition of big data and machine learning, so how does this apply to logs? Since logs enable quicker debugging of infrastructure and code issues, machine learning could notify us of useful patterns we might otherwise miss. For example, if the number of requests in your web server log suddenly spikes, or the number of 400 errors climbs, machine learning could flag these events before they have a serious impact on your infrastructure.
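As a toy illustration of the idea, and a simple statistical baseline rather than real machine learning (and certainly not LogDNA’s actual algorithm), even a rolling average over per-minute error counts can surface that kind of spike:

```python
from statistics import mean, stdev

def find_spikes(counts, window=30, threshold=3.0):
    """Return indices whose count exceeds the rolling mean of the
    previous `window` values by more than `threshold` std deviations."""
    spikes = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (counts[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes

# Per-minute counts of 400-level responses; the last minute is anomalous.
counts = [4, 5, 3, 6, 4, 5, 4, 3, 5, 4] * 4 + [42]
print(find_spikes(counts))  # -> [40]
```

A real system would learn seasonality and trends rather than use a fixed threshold, but the payoff is the same: the anomaly finds you, instead of the other way around.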

Taken further, machine learning could be used to find relationships between logs, such as tracing a request through multiple servers without explicitly knowing the path of the request beforehand. Given additional contextual inputs, like git commits, deployments, continuous integration, and bug tracking, machine learning could even be used to find relationships between code changes and log statements. Imagine being automatically notified that a particular set of commits is likely responsible for the errors you’re seeing in production. Pretty mind-blowing.

So why hasn’t this been done already? Even though finding general relationships between entities isn’t hugely difficult, discerning useful and meaningful insights from unstructured data is actually pretty challenging. How do you find useful relationships in large unstructured datasets consisting primarily of background noise? The answer, however unsexy, is classification and training.

As evidenced by both the Netflix and Google Now examples, human feedback is key to making machine learning insights actually worthwhile; hence the ‘learning’ part of the name. While this initial effort may seem to detract from the promised convenience of machine learning, it is actually what makes machine learning work so well. Instead of hunting down the data and finding patterns yourself, you are prompted to verify that the insights generated are helpful and correct. The more choices we humans make, the more useful these machine learning insights become, and the fewer verification prompts we receive. That is the point at which machine learning fulfills its highest potential.

Moving forward

From PaaS and containers to machine learning and big data, we keep all of these trends in mind as we improve LogDNA. Like machine learning, we also rely on recommendations and feedback from our customers. Without them, our product would not be where it is today.

What do you think is the future of logging? Let us know at feedback@logdna.com!


Log Everything First

When launching an application into production, everyone knows that even the best code can fail for any number of reasons. Implementing strategies that let you quickly understand why something is failing and proactively identify the culprit will give you peace of mind, and may even let you catch some shut-eye at night.

The first question developers always face is: how much should I log?

Troubleshooting no longer just means looking at web server and infrastructure logs, but also custom application event logs, database logs, mail server logs, operating system logs, network logs, and so on. It’s tough to figure out where to look and how to correlate events with each other. Weary DevOps teams spend many late evenings grepping files on each server and watching dozens of shells running tail -f.
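To make the pain concrete, each of those shells is doing something like the following bare-bones Python rendition of tail -f. The log path and status codes here are arbitrary examples.

```python
import time

def follow(path):
    """A bare-bones `tail -f`: yield lines as they are appended to a file."""
    with open(path) as f:
        f.seek(0, 2)  # jump to the end so we only see new lines
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.5)  # nothing new yet; poll again shortly
                continue
            yield line.rstrip("\n")

# One of these loops per log file, per server -- which is exactly the problem.
for line in follow("/var/log/nginx/access.log"):
    if " 500 " in line or " 404 " in line:
        print(line)
```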

However, preventing these kinds of scenarios is where cloud log management solutions really shine. LogDNA’s agent allows you to specify the logs you want to watch; you can not only watch them in real time, but also go back in time to see everything that happened. You can also set up views and alerts based on custom smart filters so that you can quickly identify, and even prevent, situations that have happened before.

Unfortunately, most cloud logging solutions force you to decide what to log right away by charging by volume. LogDNA takes that question away by starting you off with 50 GB/month for free. There’s so much that goes into building and launching your application; don’t stress about optimizing how much to log on day one.

This way you have time to figure out what logging events are significant and what views and alerts you need. As you troubleshoot your beta application issues, you’ll undoubtedly dig deeper and add more logging in certain areas and remove logging from others. Keep iterating, keep improving and keep delighting your customers.


You got 99 problems, but logging ain’t one

There’s never a dull moment in the world of running a startup. As founders and as business and technical leaders, we are faced with what feels like 99 problems on a daily basis. We truly believe that log management shouldn’t be one of them.

I would be deceiving you if I said we always wanted to build a log management system, or that our number one goal in life was to disrupt the logging ecosystem. But as Steve Jobs once said, “You can’t connect the dots looking forward; you can only connect them looking backwards.” And it all makes sense: all the dots aligned when we looked back at our journey…

Y COMBINATOR 2015

We came up with LogDNA by accident; we built the system for internal use because we were frustrated with the existing options. We never intended our log system to be a standalone product, but hey, things happen for a reason, right?

Let me rewind for a second. It was winter 2015, and my brilliant co-founder Lee Liu and I were in Y Combinator’s Winter 2015 program. We had a company called PushMarket, and we thought our personalization engine was going to revolutionize the eCommerce space. I’ll be the first to admit that we struggled to find “product-market fit,” the holy grail for every aspiring startup. “Build something people want. Build something people want! Build something people want!!!” were the words slung at me at our weekly YC dinners. A few months after the program, it was clear we weren’t building something people wanted, so we drew up a new strategy and a new game plan.

We looked back at our PushMarket journey, examined every aspect of the product and its design, and worked harder to reinvent it. At one point in our quest to perfect PushMarket, we built our own logging system out of basic necessity. And as the old proverb goes, necessity is the mother of invention.

THE CURRENT LANDSCAPE IS INTOLERABLE

A look at the logging space is very revealing. I’ll break down the basics for you:

1. The business model is “We want to help you succeed, but we will charge you for every gig that you store with us”.

We thought this model was rubbish. Instead of helping us succeed, it limited us. We restricted the amount of log data we sent for fear of being overcharged, and we would constantly change logging code to log less or log more compactly. This was not a good use of our engineering resources. So we set out to design a system that gives developers the flexibility to log everything, takes away the unnecessary burden of creating and maintaining an ELK stack, and alleviates the stress of log management altogether.

2. The industry limits the search window for logs.

The standard two-week retention window was too brief. There were many occasions when we needed to search four weeks, eight weeks, even six months back. We wanted to change that.

3. User interface and user experience are clunky at best.

They gave us a “take it or leave it” option, and we wanted better choices. We wanted a beautiful UI but, more importantly, awesome UX, so that you can jump in, get what you need, and get back to working on your product. The word “frictionless” comes to mind, and it is one of our design goals.

OUR “SLACK” MOMENT

Late last summer I attended a dinner where a few tech friends of mine were grumbling about their log management systems. It struck me that their frustrations mirrored ours; we were not alone in our dissatisfaction. A few weeks later I started asking others. The general consensus was that the current systems are unsatisfying in terms of usability and functionality, but the major aggravation was storage limits and the expense of data storage. It’s 2016! That’s the equivalent of having a monthly mobile phone plan and still being charged by the minute!

It was clear that we were onto something; that’s when we had our “Slack” moment. We realized that what we had built internally was more powerful than our original business plan, similar to how Slack started off as an MMO gaming company before becoming the illustrious enterprise company it’s known as today. Their transformation was ingenious, and we were inspired!

And just like that, LogDNA was born.

THE JOURNEY BEGINS

I don’t know what the future holds, but I’m glad we are going down this path. I encourage you to give LogDNA a test drive! I’d love to hear your feedback and suggestions on how we can make this space better.

We’re already planning to leverage infrastructure data and apply machine learning to help forecast certain future outcomes. We’re developers ourselves, and it’s gratifying to have built something that every team needs.

Here’s to a great 2016!