Discover five actionable steps organizations can take to better manage their applications: reduce application downtime, improve performance, and improve your end users' experience.
A Sumo Logic White Paper
Five Steps to Better Application Performance
305 Main Street
Redwood City, CA 94063
Toll-Free 1 855-LOG-SUMO
Int’l 1 650 810 8700
www.sumologic.com
Introduction
Modern enterprises increasingly rely on custom-built applications to
provide differentiated services to their customers and to improve
employee productivity. These business-critical applications are
distributed, large-scale, and rely on in-house-developed software as
well as open-source and third-party components, making them difficult
to manage, monitor and troubleshoot. Managing these applications
is further complicated by the combination of physical, virtual and
cloud computing infrastructures in which these applications are
deployed. In short, the job of managing and supporting today’s
applications has never been more important—or more difficult.
The solution to many of the above issues can be found in one of the most
overlooked and undervalued IT assets: log data. Managing logs requires a
solution that, unlike today’s home-grown and on-premises solutions, is built
from the ground up to handle the volume, type and location of today’s rapidly
proliferating logs.
The following paper outlines the ways that next-generation log
management and analytics can dramatically improve your applications—
and thus your business performance. The five steps to better application
performance fall into three broad categories:
+ Reducing application downtime
+ Improving application performance
+ Improving your end users’ experience
Reducing application downtime
Step 1: Troubleshoot and get to root cause quickly
When something goes wrong in a distributed production application,
troubleshooting and root cause analysis can take hours. The ability to
quickly isolate the problematic application server, module, or even a single
line of code dramatically reduces mean-time to identification (MTTI) and
mean-time to resolution (MTTR). But to do so requires a powerful and fast
log analysis system that enables you to look across your entire application
and underlying infrastructure in one fell swoop. Because application downtime
directly impacts business performance and profitability, proper log collection
and retention, coupled with a powerful analytics tool, is key to
immediate and accurate diagnosis and resolution.
Today’s next-generation solutions collect, process and analyze all your application
logs, regardless of volume, type or location, and do it in real time, so you can
query the up-to-the-minute information that your application generates, then
quickly narrow down the root cause and get your application running again.
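As a minimal sketch of this kind of isolation, the snippet below parses a few hypothetical log lines (the timestamp/host/module/level format and the sample messages are illustrative assumptions, not Sumo Logic output) and counts errors per host and module to point at the likely trouble spot:

```python
import re
from collections import Counter

# Hypothetical log lines; in practice these would stream from your collectors.
logs = [
    "2024-05-01T12:00:01 app-03 checkout ERROR NullPointerException in CartService.total",
    "2024-05-01T12:00:02 app-01 search INFO query served in 45ms",
    "2024-05-01T12:00:03 app-03 checkout ERROR NullPointerException in CartService.total",
    "2024-05-01T12:00:04 app-02 checkout ERROR timeout calling payment gateway",
]

# Assumed line layout: timestamp, host, module, level, free-text message.
LINE = re.compile(r"^(?P<ts>\S+) (?P<host>\S+) (?P<module>\S+) (?P<level>\S+) (?P<msg>.*)$")

def errors_by_location(lines):
    """Count ERROR-level messages per (host, module) to isolate the hot spot."""
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        if m and m.group("level") == "ERROR":
            counts[(m.group("host"), m.group("module"))] += 1
    return counts

# The most error-prone (host, module) pair tells you where to dig first.
top = errors_by_location(logs).most_common(1)
```

A real log analytics system does this at scale with indexed search rather than a linear scan, but the aggregation step is the same idea: group errors by source, then drill into the dominant group.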
Step 2: Uncover unknown issues lurking within your application
Certain types of issues causing application failure are difficult to identify by
asking questions in a traditional query language. Often the issue has not been
observed before and isn’t easy to express as a query, so the root cause
remains hidden, eventually causing a major, long-lasting failure.
Identifying new patterns in log data, such as new exception types, requires
new learning-based analytics technology. IT needs to be able to remove the
noise and find a new type of log message among millions of others without
having to identify it using a traditional query language. As a result, application
operations teams can quickly address the issue rather than spend hours
writing queries and chasing ghosts in their log data.
Step 3: Use your logs proactively
Keeping your applications running should not involve fighting one fire after
another. To keep applications running smoothly it is important to set up
proactive monitoring based on precise rules that help detect issues before
they turn into critical events. The right solution should be able to deliver
proactive notifications based on specific conditions or new patterns seen in
log data with near-zero latency.
Conditions can be extremely precise and can describe, for example, a
number of occurrences of a specific type of exception. More importantly,
notifications should be configurable to fire when patterns in logs deviate
from the baseline observed during normal application operation. These
types of notifications improve response time and help uncover important new
conditions within applications before outages occur.
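A baseline-deviation rule like the one described can be sketched in a few lines: track exception counts over recent "normal" windows and fire when the current window exceeds a multiple of the rolling mean. The window size and threshold factor here are illustrative assumptions:

```python
from collections import deque

class RateAlert:
    """Fire when a window's exception count exceeds the baseline mean by a factor."""
    def __init__(self, window=5, factor=3.0):
        self.history = deque(maxlen=window)  # counts from recent normal windows
        self.factor = factor

    def observe(self, count):
        """Return True if this window's exception count looks anomalous."""
        if len(self.history) < self.history.maxlen:
            self.history.append(count)
            return False  # still learning the baseline
        baseline = sum(self.history) / len(self.history)
        if count > self.factor * max(baseline, 1.0):
            return True   # anomaly: keep it out of the baseline
        self.history.append(count)
        return False

alert = RateAlert(window=3, factor=3.0)
for c in [4, 5, 6]:
    alert.observe(c)  # establishing the baseline (mean of 5 per window)
alert.observe(6)      # within normal range, no notification
alert.observe(40)     # well past 3x baseline: fire a proactive notification
```

The key design point, echoed in the paper's argument, is that the rule is evaluated continuously as logs arrive, so the notification lands before the condition snowballs into an outage.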
Improving overall application performance
Step 4: Dig deep into how your application is performing at every level
Once problem diagnosis and resolution is under control and your
applications are running with minimal downtime, you can focus on
improving overall application performance. Understanding underlying
application performance does not stop with profilers and visibility into how
long particular method calls take.
In order to get a deep and actionable understanding of application
performance in the world of distributed applications, IT teams must dig
deep and understand the performance of:
+ Individual application components, modules and methods
+ Transactions that span many modules and systems
+ Underlying infrastructure such as servers and networks
Next-generation log management technology enables your organization to
access and understand all of these metrics without new instrumentation. You
can capture fine-grained individual API call metrics from your logs to gain a deep
understanding of where your application spends its cycles, then extract those
metrics, analyze them further, and share them with engineering as evidence.
Furthermore, with this new technology you can extract far more meaningful
end-to-end performance metrics, such as the duration of customer transactions
that span multiple systems and even multiple applications. With end-to-end
transaction performance information you get a holistic view of where the
biggest time sinks are and exactly how they impact your end users. You can
then set meaningful, end-user-relevant targets that your teams can work
toward achieving.
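As an illustration of how per-system timings roll up into an end-to-end view, the sketch below joins hypothetical log records on a transaction ID (the field names and sample durations are assumptions for the example):

```python
from collections import defaultdict

# Hypothetical parsed log records: (transaction_id, system, duration_ms),
# already extracted from each system's logs.
records = [
    ("tx1", "web", 120), ("tx1", "inventory", 80), ("tx1", "payment", 310),
    ("tx2", "web", 95),  ("tx2", "inventory", 70), ("tx2", "payment", 100),
]

def transaction_profile(records):
    """Roll per-system timings up into an end-to-end total per transaction."""
    per_tx = defaultdict(dict)
    for tx, system, ms in records:
        per_tx[tx][system] = per_tx[tx].get(system, 0) + ms
    # Map each transaction to (end-to-end total, per-system breakdown).
    return {tx: (sum(parts.values()), parts) for tx, parts in per_tx.items()}

profile = transaction_profile(records)
# For tx1, the breakdown makes the dominant time sink ("payment") obvious,
# without adding any instrumentation to the application itself.
```

Correlating on an ID that already appears in the logs is exactly why no new instrumentation is needed: the join key was being emitted all along.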
Finally, you can capture and correlate how network, server, and third-party
software performance impacts your application performance. For example,
you might find that the performance issue is not related to your application but
rather is the result of insufficient network bandwidth in a particular location.
Make your applications better for your end users
Step 5: Analyze and improve your end users’ experience
Once the application is stable and performs well, the final step is improving the
overall application from the end-user perspective. This is especially valuable
for customer-facing and revenue-generating applications, where end-user
satisfaction is critical to business operations. The teams who develop, operate,
market and sell these applications need to be able to glean insights into which
features are driving customer engagement, where usage drop-offs are, and
what the overall user experience is.
Next-generation solutions help enterprises quickly gain a deep understanding
of what works and what doesn’t for their end users. Without any new
instrumentation you can collect and analyze how customers interact with
your application (e.g. which features are being used, which features are hard
to discover) or figure out which workflows lead to confusion and understand
where customers drop off (e.g. which steps in the purchasing process take too
long). Log-ins by user, page load times, URL requests, form posts, session IDs,
engagement duration, and much more are already in your logs; you just need
the right log analysis tool to tap into them.