- The Artificial Intelligence Infrastructure Workshop
- Chinmay Arankalle Gareth Dwyer Bas Geerdink Kunal Gera Kevin Liao Anand N.S.
- 201 words
- 2021-06-11 18:35:26
Summary
In this chapter, we have discussed many ways to prepare data for machine learning and other forms of AI. Raw data from source systems has to be transported across the data layers of a modern data lake, which include a historical data archive, a set of (virtualized) analytics datasets, and a machine learning environment. There are several kinds of tools for creating such a data pipeline: simple scripts and traditional software, ETL tools, big data processing frameworks, and streaming data engines.
We have also introduced the concept of feature engineering, the step in which data is prepared so that it can be consumed by a machine learning model. This is an important piece of work in any AI system. Regardless of the programming language and frameworks chosen, an AI team has to spend significant time writing the features and ensuring that the resulting code and binaries are well managed and deployed together with the models themselves.
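As a reminder of what feature engineering looks like in practice, here is a minimal sketch in plain Python. The record layout, the field names (`amount`, `timestamp`, `channel`), and the `build_features` helper are all hypothetical and chosen only for illustration; they are not taken from the chapter's exercises.

```python
import math
from datetime import datetime

# Hypothetical raw records, as they might arrive from a source system.
RAW_EVENTS = [
    {"user": "a", "amount": "12.50", "timestamp": "2021-03-01T09:15:00", "channel": "web"},
    {"user": "b", "amount": "250.00", "timestamp": "2021-03-06T22:40:00", "channel": "mobile"},
]

# Assumed fixed vocabulary used for one-hot encoding the channel field.
CHANNELS = ["web", "mobile"]


def build_features(event):
    """Turn one raw event into a flat dict of numeric model inputs."""
    ts = datetime.fromisoformat(event["timestamp"])
    amount = float(event["amount"])
    features = {
        # Compress the long tail of transaction amounts.
        "log_amount": math.log1p(amount),
        # Time-based features derived from the raw timestamp.
        "hour_of_day": ts.hour,
        "is_weekend": int(ts.weekday() >= 5),
    }
    # One-hot encode the categorical channel field.
    for channel in CHANNELS:
        features[f"channel_{channel}"] = int(event["channel"] == channel)
    return features


feature_rows = [build_features(event) for event in RAW_EVENTS]
```

Keeping the transformation in a single function such as `build_features` is one way to make sure that the same code runs during training and at prediction time, which is why the feature code needs to be versioned and deployed together with the model.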
In the exercises and activities, we worked with Bash scripts, Jupyter Notebooks, Spark, and finally, stream processing with live Twitter data.
In the next chapter, we will look into a less technical but very important topic for data engineering and machine learning: the ethics of AI.