官术网_书友最值得收藏!

High-level Stream Java API

The high-level Apex Stream Java API provides an abstraction from the lower level DAG API. It is a declarative, fluent style API that is easier to learn for someone new to Apex. Instead of identifying inpidual operators, the developer works with methods on the stream interface to specify the transformations.

The API will internally keep track of the operator(s) needed for each of the transformations and eventually translate it into the lower level DAG. The high-level API is part of the Apex library and outside of the Apex engine.

Here is the Word Count example application written with the high-level API (using Java 8 syntax):

StreamFactory.fromFolder("/tmp") 
   .flatMap(input -> Arrays.asList(input.split(" ")), name("Words")) 
   .window(new WindowOption.GlobalWindow(), 
           new 
TriggerOption().accumulatingFiredPanes().withEarlyFiringsAtEvery(1)) .countByKey(input -> new Tuple.PlainTuple<>(new KeyValPair<>(input, 1L)),
name("countByKey")) .map(input -> input.getValue(), name("Counts")) .print(name("Console")) .populateDag(dag);

Windowing is supported and stateful transformations can be applied to a windowed stream, as shown with countByKey in the preceding code listing. The inpidual windowing options will be explained later in the Windowing and time section, as they are applicable in a broader context.

In addition to the transformations that are directly available through the Stream API, the developer can also use other (possibly custom) operators through the addOperator(..) and endsWith(..) methods. For example, if output should be written to JDBC, the connector from the library can be integrated using these generic methods instead of requiring the stream API to have a method like toJDBC.

The ability to add additional operators is important, because not all possible functionality can be baked into the API and larger projects typically require customizations to operators or additional operators that are not part of the Apex library. Additionally, there are many connectors available as part of the library, each with its own set of dependencies and, sometimes, these dependencies and connectors may conflict. In this situation it isn't practical or possible to add a method for each connector to the API. Instead, the developer needs to be able to plug-in the required connector and use it along with the generally applicable transformations that are part of the Stream API.

It is also possible to extend the Stream API with custom methods to provide new transformations without exposing the details of the underlying operator. An example for this extension mechanism can be found in the API unit tests.

For readers interested to explore the API further, there is a set of example applications in the apex-malhar repository at https://github.com/apache/apex-malhar/tree/master/examples/highlevelapi.

The Stream API is relatively new and there are several enhancements planned, including expansion of the set of windowed transforms, watermark handling, and custom trigger support. The community is also discussing expanding the language support to include a native API for Scala and Python.

主站蜘蛛池模板: 独山县| 游戏| 扎兰屯市| 壶关县| 永城市| 瑞金市| 罗源县| 西丰县| 云龙县| 辰溪县| 长宁区| 巴林左旗| 奇台县| 永春县| 酒泉市| 富源县| 新乡市| 阿合奇县| 兰西县| 贺兰县| 康乐县| 武强县| 宣城市| 历史| 芦山县| 锦屏县| 津南区| 四子王旗| 营口市| 莒南县| 定西市| 黔西| 松滋市| 驻马店市| 巴塘县| 廉江市| 剑川县| 安宁市| 蓬莱市| 瓦房店市| 吕梁市|