
Challenges in log analysis

The current log analysis process mostly involves checking logs on multiple servers, written by the different components and systems across your application. This has various problems that make it a time-consuming and tedious job. Let's look at some of the common problem scenarios:

  • Non-consistent log format
  • Decentralized logs
  • Expert knowledge requirement

Non-consistent log format

Every application and device logs in its own special way, so each format needs its own expert to interpret it. It is also difficult to search across logs because their formats differ.

Let's take a look at some of the common log formats. Notice how different logs represent timestamps in different ways, have their own ways of indicating levels such as INFO and ERROR, and order these components differently. Just by looking at a log line, it is difficult to tell which piece of information sits at which position. This is where tools such as Logstash help.

Tomcat logs

A typical Tomcat server startup log entry will look like this:

May 24, 2015 3:56:26 PM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive \soft\apache-tomcat-7.0.62\webapps\sample.war has finished in 253 ms

Apache access logs – combined log format

A typical Apache access log entry will look like this:

127.0.0.1 - - [24/May/2015:15:54:59 +0530] "GET /favicon.ico HTTP/1.1" 200 21630
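As a sketch of how Logstash can normalize such a line, a minimal pipeline configuration might look like the following. The file path is an assumption, and because the sample line above carries only the common-log fields (no referrer or user-agent), the sketch uses Logstash's built-in COMMONAPACHELOG grok pattern:

```
input {
  file {
    path => "/var/log/apache2/access.log"   # assumed log location
    start_position => "beginning"
  }
}

filter {
  grok {
    # Split the raw line into named fields (clientip, timestamp,
    # verb, request, response, bytes, and so on)
    match => { "message" => "%{COMMONAPACHELOG}" }
  }
  date {
    # Normalize the Apache timestamp into a common @timestamp field
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  stdout { codec => rubydebug }
}
```

Once fields are extracted and timestamps normalized this way, logs from different sources become searchable with one consistent query vocabulary.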

IIS logs

A typical IIS log entry will look like this:

2012-05-02 17:42:15 172.24.255.255 - 172.20.255.255 80 GET /images/favicon.ico - 200 Mozilla/4.0+(compatible;MSIE+5.5;+Windows+2000+Server)
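To see why each format needs its own parser, the three sample lines above can be dissected with three completely different regular expressions. This is a sketch; the patterns are assumptions derived from the samples, not official definitions of these formats:

```python
import re

# One sample line per format, taken from the text above.
tomcat_line = ("May 24, 2015 3:56:26 PM "
               "org.apache.catalina.startup.HostConfig deployWAR")
apache_line = ('127.0.0.1 - - [24/May/2015:15:54:59 +0530] '
               '"GET /favicon.ico HTTP/1.1" 200 21630')
iis_line = ("2012-05-02 17:42:15 172.24.255.255 - 172.20.255.255 80 "
            "GET /images/favicon.ico - 200 "
            "Mozilla/4.0+(compatible;MSIE+5.5;+Windows+2000+Server)")

# Hypothetical patterns -- one per format.
tomcat_re = re.compile(r"^(\w+ \d+, \d{4} \d+:\d+:\d+ [AP]M) (\S+) (\S+)$")
apache_re = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]+)" (\d{3}) (\d+)$')
iis_re = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
                    r"(\S+) \S+ (\S+) (\d+) (\S+) (\S+) \S+ (\d{3}) (\S+)$")

# Even the "same" information (the timestamp) lives in a different
# place and shape in each format.
print(tomcat_re.match(tomcat_line).group(1))   # May 24, 2015 3:56:26 PM
print(apache_re.match(apache_line).group(2))   # 24/May/2015:15:54:59 +0530
print(iis_re.match(iis_line).group(1))         # 2012-05-02 17:42:15
```

Three formats already mean three patterns to write and maintain; a real environment with dozens of log sources multiplies this effort accordingly.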

Variety of time formats

Not only log formats but also timestamp formats differ among different types of applications and among the events generated across multiple devices. Differing time formats across the components of your system also make it difficult to correlate events occurring across multiple systems at the same time:

  • 142920788
  • Oct 12 23:21:45
  • [5/May/2015:08:09:10 +0000]
  • Tue 01-01-2009 6:00
  • 2015-05-30 T 05:45 UTC
  • Sat Jul 23 02:16:57 2014
  • 07:38, 11 December 2012 (UTC)
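The variety above translates directly into parsing effort: each style needs its own format string. The following sketch parses a few of the listed samples with Python's strptime; the format strings are assumptions inferred from the samples:

```python
from datetime import datetime, timezone

# One strptime pattern per timestamp style -- there is no single
# format string that covers them all.
samples = [
    ("Oct 12 23:21:45",               "%b %d %H:%M:%S"),
    ("[5/May/2015:08:09:10 +0000]",   "[%d/%b/%Y:%H:%M:%S %z]"),
    ("Sat Jul 23 02:16:57 2014",      "%a %b %d %H:%M:%S %Y"),
    ("07:38, 11 December 2012 (UTC)", "%H:%M, %d %B %Y (UTC)"),
]
for text, fmt in samples:
    print(datetime.strptime(text, fmt))

# A bare number only makes sense if you already know it is a Unix
# epoch value -- nothing in the log line itself tells you so.
print(datetime.fromtimestamp(142920788, tz=timezone.utc))
```

Note also that some styles omit the year or the time zone entirely, so correlating such events across systems requires extra context that the log line itself does not carry.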

Decentralized logs

Logs are mostly spread across all the applications, which may sit on different servers and in different components. The complexity of log analysis increases when multiple components log to multiple locations. For a setup of one or two servers, finding information in logs involves running cat or tail commands, or piping their output to a grep command. But what if you have 10, 20, or even 100 servers? Searches of this kind mostly do not scale to a large cluster of machines and call for a centralized log management and analysis solution.
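The cat/grep approach can be sketched as a small script that scans every log file under a directory. The function name and the assumption that all logs are local .log files are hypothetical; on a real fleet you would first have to copy or mount the logs from every host, which is exactly the step that stops scaling:

```python
import pathlib
import re

def search_logs(root, pattern):
    """Scan every *.log file under root and yield matching lines,
    prefixed with file name and line number (grep -rn style)."""
    regex = re.compile(pattern)
    for path in sorted(pathlib.Path(root).rglob("*.log")):
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield f"{path}:{number}:{line.rstrip()}"

# Usage: print every ERROR line found under /var/log (assumed path).
# for hit in search_logs("/var/log", "ERROR"):
#     print(hit)
```

This works for one machine; repeating it per host, per format, and per investigation is the tedium that a centralized solution removes by shipping all logs to one searchable store.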

Expert knowledge requirement

People interested in extracting the required business-centric information from logs generally don't have access to the logs, or may lack the technical expertise to find the appropriate information in the quickest possible way. This can make analysis slow, and sometimes impossible.
