Introduction to Apex

The world is producing data at unprecedented levels, with a rapidly growing number of mobile devices, sensors, industrial machines, financial transactions, web logs, and so on. Often, the streams of data generated by these sources can offer valuable insights if processed quickly and reliably, and companies are finding it increasingly important to take action on this data-in-motion in order to remain competitive. MapReduce and Apache Hadoop were among the first technologies to enable processing of very large datasets on clusters of commodity hardware. The prevailing paradigm at the time was batch processing, which evolved from MapReduce's heavy reliance on disk I/O to Apache Spark's more efficient, memory-based approach.

Still, the downside of batch processing systems is that they accumulate data into batches, sometimes over hours, and cannot address use cases that require a short time to insight for continuous data in motion. Such requirements can be handled by newer stream processing systems, which can process data in real time, sometimes with latency as low as a few milliseconds. Apache Storm was the first ecosystem project to offer this capability, albeit with difficult trade-offs, such as having to choose between reliability and low latency. Today, there are newer, production-ready frameworks that don't force the user to make such choices. Rather, they offer low latency, high throughput, reliability, and a unified architecture that can be applied to both streaming and batch use cases. This book will introduce Apache Apex, a next-generation platform for processing data in motion.

In this chapter, we will cover the following topics:

  • Unbounded data and continuous processing
  • Use cases and case studies
  • Application Model and API
  • Value proposition of Apex