- Machine Learning with Go Quick Start Guide
- Michael Bironneau Toby Coleman
- 2021-06-24 13:34:00
Defining problem and objectives
Before any development begins, the problem to be solved must be defined, together with objectives that describe what a good outcome looks like, so that expectations are set. How the problem is formulated matters a great deal, as it can mean the difference between intractability and a simple solution. Defining the problem is also likely to involve a conversation about where the input data for any algorithm will come from.
The typical formulation of an ML problem takes the form "given X dataset, predict Y". The availability of data, or lack thereof, can affect the formulation of the problem, the solution, and its feasibility. For example, consider the problem "given a large labeled set of images of handwritten digits[18], predict the label of a previously unseen image". Deep learning algorithms have demonstrated that it is possible to achieve relatively high accuracy on this particular problem with little work on the part of the engineer, as long as the training dataset is sufficiently large[19]. If the training set is not large, the problem immediately becomes more difficult and requires careful selection of the algorithm to use. The size of the training set also affects the achievable accuracy and thus the set of attainable objectives.
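One way to make "given X, predict Y" and its objective concrete is to write both down as code before choosing an algorithm. The following is a minimal sketch, not a prescribed design: the `Classifier` interface, the `constant` baseline, and the `target` value of 0.9 are all hypothetical names and numbers introduced for illustration.

```go
package main

import "fmt"

// Classifier is a hypothetical interface for "given X, predict Y":
// X is a feature vector and Y is an integer class label.
type Classifier interface {
	Predict(x []float64) int
}

// Accuracy returns the fraction of inputs whose predicted label matches
// the true label. It returns 0 for empty or mismatched inputs.
func Accuracy(c Classifier, xs [][]float64, ys []int) float64 {
	if len(xs) == 0 || len(xs) != len(ys) {
		return 0
	}
	correct := 0
	for i, x := range xs {
		if c.Predict(x) == ys[i] {
			correct++
		}
	}
	return float64(correct) / float64(len(xs))
}

// constant is a trivial baseline that always predicts the same label.
type constant struct{ label int }

func (c constant) Predict([]float64) int { return c.label }

func main() {
	xs := [][]float64{{0.1}, {0.9}, {0.8}, {0.2}}
	ys := []int{0, 1, 1, 1}

	// Hypothetical objective agreed before development starts.
	const target = 0.9

	acc := Accuracy(constant{label: 1}, xs, ys)
	fmt.Printf("accuracy=%.2f meets objective=%v\n", acc, acc >= target)
}
```

Measuring a trivial baseline against the agreed target early on gives a reference point for judging whether the objective is attainable with the data available.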
Experiments performed by Michael Nielsen on the MNIST handwritten digit dataset show that going from 1 labeled example per digit to 5 examples per digit improved accuracy from around 40% to around 65% for most algorithms tested[20]. Using 10 examples per digit usually raised the accuracy by a further 5%.
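To study the effect of training set size on your own data, one option is to build reduced training sets with a fixed number of examples per class. The sketch below is an assumption about how such a subset could be selected, not a description of how the cited experiments were run. The function name `FirstKPerClass` is hypothetical, and a real experiment would probably shuffle the data first.

```go
package main

import "fmt"

// FirstKPerClass returns the indices of the first k examples seen for
// each label, preserving the original order of the dataset.
func FirstKPerClass(labels []int, k int) []int {
	seen := make(map[int]int) // label -> number of examples picked so far
	var picked []int
	for i, label := range labels {
		if seen[label] < k {
			seen[label]++
			picked = append(picked, i)
		}
	}
	return picked
}

func main() {
	labels := []int{0, 1, 0, 2, 1, 0, 2, 2}
	fmt.Println(FirstKPerClass(labels, 1)) // [0 1 3]
	fmt.Println(FirstKPerClass(labels, 2)) // [0 1 2 3 4 6]
}
```

Training the same algorithm on the subsets for k = 1, 5, and 10 and comparing accuracy on a fixed test set indicates how much additional labeled data is likely to help.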
If insufficient data is available to meet the project objectives, it is sometimes possible to boost performance by artificially expanding the dataset with slightly modified copies of existing examples. In the experiments mentioned above, Nielsen observed that adding slightly rotated or translated images to the dataset improved performance by as much as 15%.
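The translation half of this idea can be sketched in a few lines of Go. This is an illustrative sketch only: the `Image` and `Example` types and the `Translate` and `Augment` functions are hypothetical names, images are assumed to be grayscale and stored row by row, and rotation is omitted for brevity.

```go
package main

import "fmt"

// Image is a grayscale image stored row by row; Image[y][x] is a pixel.
type Image [][]uint8

// Example pairs an input image with its label.
type Example struct {
	Pixels Image
	Label  int
}

// Translate returns a copy of img shifted dx pixels right and dy pixels
// down. Pixels shifted outside the frame are dropped and the uncovered
// area is left at zero.
func Translate(img Image, dx, dy int) Image {
	out := make(Image, len(img))
	for y := range img {
		out[y] = make([]uint8, len(img[y]))
	}
	for y := range img {
		for x := range img[y] {
			ny, nx := y+dy, x+dx
			if ny < 0 || ny >= len(img) || nx < 0 || nx >= len(img[ny]) {
				continue
			}
			out[ny][nx] = img[y][x]
		}
	}
	return out
}

// Augment returns the original examples plus one shifted copy of each
// example per offset. Labels are unchanged by the shift.
func Augment(examples []Example, offsets [][2]int) []Example {
	out := make([]Example, 0, len(examples)*(1+len(offsets)))
	for _, ex := range examples {
		out = append(out, ex)
		for _, off := range offsets {
			shifted := Translate(ex.Pixels, off[0], off[1])
			out = append(out, Example{Pixels: shifted, Label: ex.Label})
		}
	}
	return out
}

func main() {
	img := Image{
		{0, 0, 0},
		{0, 9, 0},
		{0, 0, 0},
	}
	examples := []Example{{Pixels: img, Label: 1}}
	offsets := [][2]int{{1, 0}, {-1, 0}, {0, 1}, {0, -1}}

	augmented := Augment(examples, offsets)
	fmt.Println(len(augmented))         // 5
	fmt.Println(augmented[1].Pixels[1]) // [0 0 9]
}
```

Shifts should stay small relative to the image size so that the shifted image still plausibly carries the original label.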