Mastering Apache Storm
By Ankit Jain
Updated: 2021-07-02 20:33:02
開(kāi)會(huì)員,本書免費(fèi)讀 >
Latest chapter: Summary
If you are a Java developer who wants to enter into the world of real-time stream processing applications using Apache Storm, then this book is for you. No previous experience in Storm is required as this book starts from the basics. After finishing this book, you will be able to develop not-so-complex Storm applications.
Brand: 中圖公司
Listed on: 2021-07-02 18:31:00
Publisher: Packt Publishing
The digital rights to this book are provided by 中圖公司 and licensed to Shanghai Yuewen Information Technology Co., Ltd. for production and distribution.
- cover
- Title Page
- Copyright
- Mastering Apache Storm
- Credits
- About the Author
- About the Reviewers
- www.PacktPub.com
- Why subscribe?
- Customer Feedback
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- Downloading the example code
- Downloading the color images of this book
- Errata
- Piracy
- Questions
- Real-Time Processing and Storm Introduction
- Apache Storm
- Features of Storm
- Storm components
- Nimbus
- Supervisor nodes
- The ZooKeeper cluster
- The Storm data model
- Definition of a Storm topology
- Operation modes in Storm
- Programming languages
- Summary
- Storm Deployment, Topology Development, and Topology Options
- Storm prerequisites
- Installing Java SDK 7
- Deployment of the ZooKeeper cluster
- Setting up the Storm cluster
- Developing the hello world example
- The different options of the Storm topology
- Deactivate
- Activate
- Rebalance
- Kill
- Dynamic log level settings
- Walkthrough of the Storm UI
- Cluster Summary section
- Nimbus Summary section
- Supervisor Summary section
- Nimbus Configuration section
- Topology Summary section
- Dynamic log level settings
- Updating the log level from the Storm UI
- Updating the log level from the Storm CLI
- Summary
- Storm Parallelism and Data Partitioning
- Parallelism of a topology
- Worker process
- Executor
- Task
- Configure parallelism at the code level
- Worker process, executor, and task distribution
- Rebalance the parallelism of a topology
- Rebalance the parallelism of a SampleStormClusterTopology topology
- Different types of stream grouping in the Storm cluster
- Shuffle grouping
- Field grouping
- All grouping
- Global grouping
- Direct grouping
- Local or shuffle grouping
- None grouping
- Custom grouping
- Guaranteed message processing
- Tick tuple
- Summary
- Trident Introduction
- Trident introduction
- Understanding Trident's data model
- Writing Trident functions, filters, and projections
- Trident function
- Trident filter
- Trident projection
- Trident repartitioning operations
- Utilizing shuffle operation
- Utilizing partitionBy operation
- Utilizing global operation
- Utilizing broadcast operation
- Utilizing batchGlobal operation
- Utilizing partition operation
- Trident aggregator
- partitionAggregate
- aggregate
- ReducerAggregator
- Aggregator
- CombinerAggregator
- persistentAggregate
- Aggregator chaining
- Utilizing the groupBy operation
- When to use Trident
- Summary
- Trident Topology and Uses
- Trident groupBy operation
- groupBy before partitionAggregate
- groupBy before aggregate
- Non-transactional topology
- Trident hello world topology
- Trident state
- Distributed RPC
- When to use Trident
- Summary
- Storm Scheduler
- Introduction to Storm scheduler
- Default scheduler
- Isolation scheduler
- Resource-aware scheduler
- Component-level configuration
- Memory usage example
- CPU usage example
- Worker-level configuration
- Node-level configuration
- Global component configuration
- Custom scheduler
- Configuration changes in the supervisor node
- Configuration setting at component level
- Writing a custom supervisor class
- Converting component IDs to executors
- Converting supervisors to slots
- Registering a CustomScheduler class
- Summary
- Monitoring of Storm Cluster
- Cluster statistics using the Nimbus thrift client
- Fetching information with Nimbus thrift
- Monitoring the Storm cluster using JMX
- Monitoring the Storm cluster using Ganglia
- Summary
- Integration of Storm and Kafka
- Introduction to Kafka
- Kafka architecture
- Producer
- Replication
- Consumer
- Broker
- Data retention
- Installation of Kafka brokers
- Setting up a single node Kafka cluster
- Setting up a three node Kafka cluster
- Multiple Kafka brokers on a single node
- Share ZooKeeper between Storm and Kafka
- Kafka producers and publishing data into Kafka
- Kafka Storm integration
- Deploy the Kafka topology on the Storm cluster
- Summary
- Storm and Hadoop Integration
- Introduction to Hadoop
- Hadoop Common
- Hadoop Distributed File System
- Namenode
- Datanode
- HDFS client
- Secondary namenode
- YARN
- ResourceManager (RM)
- NodeManager (NM)
- ApplicationMaster (AM)
- Installation of Hadoop
- Setting passwordless SSH
- Getting the Hadoop bundle and setting up environment variables
- Setting up HDFS
- Setting up YARN
- Write Storm topology to persist data into HDFS
- Integration of Storm with Hadoop
- Setting up Storm-YARN
- Storm-Starter topologies on Storm-YARN
- Summary
- Storm Integration with Redis, Elasticsearch, and HBase
- Integrating Storm with HBase
- Integrating Storm with Redis
- Integrating Storm with Elasticsearch
- Integrating Storm with Esper
- Summary
- Apache Log Processing with Storm
- Apache log processing elements
- Producing Apache log in Kafka using Logstash
- Installation of Logstash
- What is Logstash?
- Why are we using Logstash?
- Installation of Logstash
- Configuration of Logstash
- Why are we using Kafka between Logstash and Storm?
- Splitting the Apache log line
- Identifying country, operating system type, and browser type from the log file
- Calculate the search keyword
- Persisting the process data
- Kafka spout and define topology
- Deploy topology
- MySQL queries
- Calculate the page hit from each country
- Calculate the count for each browser
- Calculate the count for each operating system
- Summary
- Twitter Tweet Collection and Machine Learning
- Exploring machine learning
- Twitter sentiment analysis
- Using Kafka producer to store the tweets in a Kafka cluster
- Kafka spout, sentiments bolt, and HDFS bolt
- Summary