
How it works...

In this recipe, we implement a Command-Line Interface (CLI) program that reads events from the data lake S3 bucket and sends them to a specific AWS Lambda function. When replaying events, we do not re-publish them to the stream, because doing so would broadcast the events to all subscribers. Instead, we replay events to a specific function, either to repair that service or to seed a new one.

When executing the program, we provide the name of the data lake bucket and the specific path prefix as arguments. The prefix allows us to replay only a portion of the events, such as a specific month, day, or hour. The program uses functional reactive programming with the Highland.js library. We use a generator function to page through the objects in the bucket and push each object down the stream. Backpressure is a major advantage of this programming approach, as we will discuss in Chapter 8, Designing for Failure. If we retrieved all the data from the bucket in a loop, as we would in the imperative programming style, then we would likely run out of memory and/or overwhelm the Lambda function and receive throttling errors.

Instead, we pull data through the stream. When downstream steps are ready for more work, they pull the next piece of data, which triggers the generator function to paginate more data from S3 only when the program is ready for it.
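The pull-based paging described above can be sketched as follows. This is a minimal illustration, not the recipe's actual code: the hypothetical `listPage` function stands in for S3's `listObjectsV2` continuation-token loop (which the real program drives from inside a Highland.js generator), and the page data is faked in memory. The key property is that a page is only fetched when the consumer pulls for more data, so memory stays bounded.

```javascript
// Fake S3 listing, three pages; the real program would call S3 here.
const PAGES = [
  { keys: ['p1/a.gz', 'p1/b.gz'], next: 1 },
  { keys: ['p1/c.gz', 'p1/d.gz'], next: 2 },
  { keys: ['p1/e.gz'], next: null },
];

let pagesFetched = 0;

// Hypothetical stand-in for listObjectsV2 with a continuation token.
function listPage(token) {
  pagesFetched += 1; // count fetches to demonstrate laziness
  return PAGES[token];
}

// Generator yields one object key at a time; the next page is
// listed only when the consumer pulls past the current one.
function* objectKeys() {
  let token = 0;
  while (token !== null) {
    const page = listPage(token);
    yield* page.keys;
    token = page.next;
  }
}

const iter = objectKeys();
console.log(iter.next().value); // → 'p1/a.gz'
console.log(pagesFetched);      // → 1 (later pages not yet fetched)
```

Pulling a single key fetches only the first page; an imperative loop would have listed the entire bucket up front.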

When storing events in the data lake bucket, Kinesis Firehose buffers the events until either a maximum time or a maximum file size is reached. This buffering maximizes write performance when saving the events. When transforming the data for these files, we delimited the events with an EOL character; therefore, when we retrieve a specific file, we leverage the Highland.js split function to stream the rows in the file one at a time. The split function also supports backpressure.
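Each retrieved file therefore holds many events, one JSON object per line. In the real program, Highland's `split` streams those rows with backpressure; the plain function below is only a sketch of the parsing step itself, with a fabricated two-event file body for illustration.

```javascript
// A fake Firehose file body: one JSON event per line, EOL-delimited.
const body = [
  '{"id":"e1","type":"order-submitted"}',
  '{"id":"e2","type":"order-shipped"}',
].join('\n');

// Split the file body into individual event objects, as the
// streaming split step would, one row at a time.
function toEvents(fileBody) {
  return fileBody
    .split('\n')                     // one row per event
    .filter((row) => row.length > 0) // ignore a trailing empty line
    .map((row) => JSON.parse(row));
}

console.log(toEvents(body).length); // → 2
```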

For each event, we invoke the function specified in the command-line arguments. These functions are designed to listen for events from a Kinesis stream; therefore, we must wrap each event in the Kinesis input format that these functions expect. This is one reason why we included the Kinesis metadata when saving the events to the data lake in the Creating a data lake recipe. To maximize throughput, we invoke the Lambda function asynchronously with the Event InvocationType, provided that the payload size is within the limit. Otherwise, we invoke it synchronously with the RequestResponse InvocationType. We also leverage the Lambda DryRun feature so that we can see which events would be replayed before actually effecting the change.
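The wrapping and invocation-type selection might look like the following sketch. The `Records` envelope follows the standard Kinesis event format that Lambda delivers to stream listeners; the function name is hypothetical, and the 256 KB threshold reflects Lambda's payload limit for asynchronous (Event) invocations at the time of writing.

```javascript
const ASYNC_LIMIT = 256 * 1024; // Lambda async (Event) payload limit

// Wrap a stored record (which retains its original kinesis metadata
// from the data lake) back into the batch shape listeners expect.
function wrap(record) {
  return { Records: [record] };
}

// Choose the InvocationType: DryRun to preview, Event for throughput,
// RequestResponse (synchronous) when the payload exceeds the async limit.
function invocationType(payload, dryRun) {
  if (dryRun) return 'DryRun';
  return Buffer.byteLength(JSON.stringify(payload)) <= ASYNC_LIMIT
    ? 'Event'
    : 'RequestResponse';
}

const record = {
  eventSource: 'aws:kinesis', // metadata saved with the event
  kinesis: {
    data: Buffer.from('{"type":"order-submitted"}').toString('base64'),
  },
};

// Parameters for lambda.invoke(); FunctionName is a placeholder.
const params = {
  FunctionName: 'my-listener',
  InvocationType: invocationType(wrap(record), false),
  Payload: JSON.stringify(wrap(record)),
};

console.log(params.InvocationType); // → 'Event'
```

Running with DryRun first lets us inspect which events would be replayed; flipping the flag off performs the actual invocations.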
