- Mastering Java for Data Science
- Alexey Grigorev
- 2021-07-02 23:44:36
AOL Cyclops React
As we have already seen, the Java Streams API is a powerful way of processing data in a functional style. The Cyclops React library extends this API with additional stream operations and gives more control over how the flow is executed. To include the library, add the following dependency to the pom.xml file:
<dependency>
<groupId>com.aol.simplereact</groupId>
<artifactId>cyclops-react</artifactId>
<version>1.0.0-RC4</version>
</dependency>
The methods it adds include zipWithIndex and cast, as well as convenience collectors such as toList, toSet, and toMap. It also gives more control over parallel execution: for example, you can provide a custom executor that will be used for processing the data, or intercept exceptions declaratively.
With this library it is also easy to create a parallel stream from an iterator, which is hard to do with the standard library alone.
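For comparison, here is a minimal sketch of what wrapping an iterator in a parallel stream looks like with only the standard library. The class name IteratorStreams and the helper parallelStream are hypothetical names used for illustration. The JDK calls are Spliterators.spliteratorUnknownSize and StreamSupport.stream, and the JDK documentation notes that a spliterator built this way permits only limited parallelism.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IteratorStreams {

    // Wraps an iterator of unknown size in a parallel stream.
    static <T> Stream<T> parallelStream(Iterator<T> it) {
        Spliterator<T> spliterator =
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
        return StreamSupport.stream(spliterator, true); // true = parallel
    }

    public static void main(String[] args) {
        // Illustrative in-memory data standing in for a file iterator
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
        List<String> upper = parallelStream(it)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        System.out.println(upper); // [A, B, C]
    }
}
```

This works, but it is more verbose than a single builder call and offers no direct way to choose the executor.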
For example, let's take words.txt, extract all the POS tags from it, and then create a map that associates each tag with a unique index. For reading the data, we will use LineIterator from Commons IO, which would otherwise be hard to parallelize using only standard Java APIs. Additionally, we create a custom executor, which will be used for executing the stream operations in parallel:
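// Read the file lazily, one line at a time
LineIterator it = FileUtils.lineIterator(new File("data/words.txt"), "UTF-8");
// Custom executor that runs the stream operations in parallel
ExecutorService executor = Executors.newCachedThreadPool();
LazyFutureStream<String> stream =
    LazyReact.parallelBuilder().withExecutor(executor).from(it);
Map<String, Integer> map = stream
    .map(line -> line.split("\t"))       // columns are tab-separated
    .map(arr -> arr[1].toLowerCase())    // second column holds the POS tag
    .distinct()
    .zipWithIndex()                      // pair each tag with its position
    .toMap(Tuple2::v1, t -> t.v2.intValue());
System.out.println(map);
// Release the thread pool and the underlying file handle
executor.shutdown();
it.close();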
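To make the result of this pipeline concrete, the following sketch computes the same kind of tag-to-index map with the standard library only, on a few in-memory lines. The sample lines, the class name TagIndex, and the helper indexTags are all hypothetical. The only assumption about words.txt is the one the code above already makes: columns are tab-separated and the second column holds the POS tag.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TagIndex {

    // Assigns each distinct lower-cased tag (second tab-separated column)
    // an index in order of first appearance.
    static Map<String, Integer> indexTags(List<String> lines) {
        List<String> tags = lines.stream()
                .map(line -> line.split("\t"))
                .map(arr -> arr[1].toLowerCase())
                .distinct()
                .collect(Collectors.toList());

        Map<String, Integer> index = new LinkedHashMap<>();
        for (int i = 0; i < tags.size(); i++) {
            index.put(tags.get(i), i);
        }
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines; the real contents of words.txt may differ
        List<String> lines = Arrays.asList(
                "dog\tNN", "run\tVB", "cat\tNN", "quick\tJJ");
        System.out.println(indexTags(lines)); // {nn=0, vb=1, jj=2}
    }
}
```

Because this sketch runs sequentially, the indexes simply follow the order in which each tag first appears.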
The Cyclops React example above is deliberately simple and does not come close to covering all the functionality available in the library. For more information, refer to the documentation at https://github.com/aol/cyclops-react. We will also use the library in other examples in later chapters.