Chapter 2. Data Pipelines

The first chapter introduced some basic concepts of data processing, clustering, and classification.

This chapter is dedicated to creating and maintaining a flexible end-to-end workflow to train models and classify data. The first section introduces a data-centric (functional) approach to building number-crunching applications, followed by a description of a configurable workflow computation model. The chapter concludes with an overview of model validation techniques.

You will learn how to do the following:

  • Apply the concept of monadic design to create dynamic workflows
  • Leverage some of Scala's advanced patterns, such as the cake pattern, to build portable computational workflows
  • Take into account the bias-variance trade-off in selecting a model
  • Overcome overfitting in modeling
  • Break down data into training, test, and validation sets
  • Implement model validation in Scala using precision, recall, and F score