- Mastering Java for Data Science
- Alexey Grigorev
- 2021-07-02 23:44:36
AOL Cyclops React
As we have already seen, the Java Streams API is a powerful way of processing data in a functional style. The Cyclops React library extends this API with additional stream operations and gives more control over how the flow is executed. To include the library, add this to the pom.xml file:
<dependency>
  <groupId>com.aol.simplereact</groupId>
  <artifactId>cyclops-react</artifactId>
  <version>1.0.0-RC4</version>
</dependency>
Among the methods it adds are zipWithIndex and cast, along with convenience collectors such as toList, toSet, and toMap. It also gives more control over parallel execution: for example, you can provide a custom executor for processing the data, or intercept exceptions declaratively.
Also, with this library it is easy to create a parallel stream from an iterator, which is hard to do with the standard library alone.
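For comparison, here is a minimal sketch of what wrapping an iterator in a parallel stream looks like with the standard library only. The class and method names (IteratorStreams, parallelStream) are made up for this illustration. The stream is parallel in name, but a spliterator built from a plain iterator can only split by pulling elements off in batches, so the work may not be distributed well across threads.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IteratorStreams {

    // Wraps the iterator in a spliterator of unknown size and
    // requests a parallel stream over it (second argument = true).
    static <T> Stream<T> parallelStream(Iterator<T> it) {
        Spliterator<T> split =
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
        return StreamSupport.stream(split, true);
    }

    public static void main(String[] args) {
        // hypothetical in-memory data standing in for lines of a file
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();

        List<String> upper = parallelStream(it)
                .map(String::toUpperCase)
                .collect(Collectors.toList());

        System.out.println(upper); // prints [A, B, C]
    }
}
```

There is also no direct way here to say which executor should run the stream operations, which is one of the things Cyclops React lets us configure.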
For example, let's take words.txt, extract all POS tags from it, and then create a map that associates each tag with a unique index. For reading data, we will use LineIterator from Commons IO, which otherwise would be hard to parallelize using only standard Java APIs. Additionally, we create a custom executor, which will be used for executing the stream operations in parallel:
LineIterator it = FileUtils.lineIterator(new File("data/words.txt"), "UTF-8");
ExecutorService executor = Executors.newCachedThreadPool();

// build a parallel stream over the lines, running on the custom executor
LazyFutureStream<String> stream =
        LazyReact.parallelBuilder().withExecutor(executor).from(it);

Map<String, Integer> map = stream
        .map(line -> line.split("\t"))       // columns are tab-separated
        .map(arr -> arr[1].toLowerCase())    // the POS tag is the second column
        .distinct()
        .zipWithIndex()                      // pair each tag with its index
        .toMap(Tuple2::v1, t -> t.v2.intValue());

System.out.println(map);

executor.shutdown();
it.close();
This is a very simple example that does not come close to covering all the functionality available in this library. For more information, refer to its documentation, which can be found at https://github.com/aol/cyclops-react. We will also use it in other examples in later chapters.
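To make the result of the Cyclops React pipeline concrete, the following sketch computes the same kind of tag-to-index map sequentially with the standard library only. It is an illustration, not the library's implementation: the class name TagIndex, the method index, and the sample rows are hypothetical, and the indices follow first-seen order, which a parallel run need not reproduce.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TagIndex {

    // Maps each distinct lowercased tag (second tab-separated column)
    // to its position in first-seen order: a sequential stand-in for
    // distinct() followed by zipWithIndex() and toMap().
    static Map<String, Integer> index(List<String> lines) {
        List<String> tags = lines.stream()
                .map(line -> line.split("\t"))
                .map(arr -> arr[1].toLowerCase())
                .distinct()
                .collect(Collectors.toList());

        Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < tags.size(); i++) {
            map.put(tags.get(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        // hypothetical sample rows, not taken from words.txt
        List<String> lines = Arrays.asList("cat\tNOUN", "run\tVERB", "dog\tnoun");
        System.out.println(index(lines)); // prints {noun=0, verb=1}
    }
}
```

The explicit loop over positions is what zipWithIndex saves us from writing by hand.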