Data collection – concurrent searches

If you are building a new Splunk deployment for your company, you and your user communities may have limited estimates of how many ad hoc and scheduled searches for reports, dashboards, and alerts you can expect to employ, so this factor may be a bit difficult to nail down exactly. It may help to understand the various types of searches that are available from a Splunk perspective. We will cover searches more thoroughly in Chapter 6, Searching with Splunk, but for the purpose of gathering an estimate of the number of concurrent searches expected, you only need to be aware of the basic types of searches so that you can discuss this information with your user community:

  • Ad hoc: These are run by a user from Splunk Web for troubleshooting, one-time investigative reports, and so on.
  • Scheduled historical search: These are searches that run against already-indexed data on behalf of a scheduled report or alert, or a dashboard that updates its panels periodically. These will likely represent the vast majority of the searches on your system.
  • Real-time search: These are searches that are run against events as they are received for indexing, typically for time-sensitive monitoring. Real-time searches can be run against events before they are indexed, or just after they are indexed with a slight delay. The number of concurrent real-time searches running can greatly affect indexing performance; for this reason, only users with the admin role can run and save them by default (this ability can also be assigned to specific users or roles), and it is best to limit their use.
  • Summary indexing search: This is a frequently running search that extracts information of interest (specific fields out of each event, for example) and saves the results into a designated summary index. You can then run searches and reports for longer reporting periods against this significantly smaller summary index (instead of against the entire range of full-sized events for the given time period) for increased performance.

Again, it is usually difficult to determine the number of expected concurrent searches because there are so many variables, and there are no hard and fast rules for estimating them. It is a good idea to sit down with each of your user communities and discuss the types and numbers of reporting products they may want to leverage from the Splunk deployment. Fig 2.2 depicts an example of a Splunk reporting products document that you may want to prepare to aid in these discussions, and to let your user base provide some useful planning feedback:

Fig 2.2: Splunk report planning worksheet
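
Once the worksheets come back from your user communities, the requests can be tallied by search type to give a rough picture of what each community expects. The following is a minimal Python sketch of that tally, not a Splunk feature; the community names, search-type labels, and counts are all hypothetical placeholders, and the totals are requested items rather than a measured number of concurrent searches.

```python
from collections import Counter

# Hypothetical worksheet rows gathered from user communities:
# (user community, search type, number of reports/alerts/dashboards requested)
WORKSHEET = [
    ("Network operations", "scheduled", 12),
    ("Network operations", "real-time", 1),
    ("Security", "scheduled", 20),
    ("Security", "ad hoc", 5),
    ("Application support", "summary indexing", 3),
    ("Application support", "ad hoc", 4),
]


def totals_by_type(rows):
    """Sum the requested items per search type across all communities."""
    totals = Counter()
    for _community, search_type, count in rows:
        totals[search_type] += count
    return dict(totals)


if __name__ == "__main__":
    # Print one line per search type, sorted alphabetically.
    for search_type, total in sorted(totals_by_type(WORKSHEET).items()):
        print(f"{search_type}: {total}")
```

A tally like this is only a starting point for discussion; how many of these items actually run at the same time depends on their schedules and on user behavior.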

Don't let this part of the data collection process discourage you. Assuming you plan to implement a reasonable amount of search capability to start with, and your user community doesn't want to run an exceptionally high number of concurrent searches, this part of the data collection process isn't nearly as critical as determining the volume of data you should expect to ingest. As a general rule, you will need to add more indexers before you run out of processing capability in search heads. If you are expanding an existing Splunk environment, you can run reports to measure the number of searches by type and compare this to the numbers of users and data sources to establish a user-to-concurrent-search ratio for better capacity planning and management.
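
As an illustration of the user-to-concurrent-search ratio for an existing environment, the arithmetic might look like the sketch below. The function names and the sample figures are hypothetical, and the projection assumes that concurrent searches scale linearly with the number of users, which is a simplification you should check against your own measurements.

```python
def searches_per_user(peak_concurrent_searches, active_users):
    """Ratio of measured peak concurrent searches to active users."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return peak_concurrent_searches / active_users


def projected_concurrent_searches(peak_concurrent_searches, active_users, expected_users):
    """Scale the measured peak to a new user count, rounding up.

    Assumes linear scaling with user count (a simplifying assumption).
    Uses integer ceiling division to avoid floating-point rounding surprises.
    """
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return -(-(peak_concurrent_searches * expected_users) // active_users)


if __name__ == "__main__":
    # Hypothetical measurements: 24 concurrent searches at peak, 120 active users.
    print(searches_per_user(24, 120))
    # Hypothetical growth target: 200 users after the expansion.
    print(projected_concurrent_searches(24, 120, 200))
```

The same calculation can be repeated per search type or per data source if your reports break the measurements down that way.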

Before we dive into using the information we've just collected to choose Splunk hardware options, there are a few more topics that need to be covered. By the way, in the discussion and examples of implementing Splunk in this book, we are going to assume the use of a Splunk Enterprise license since that is the most likely scenario you will be working with eventually.
