- Hands-On Deep Learning for Games
- Micheal Lanham
The Cost function
A Cost function describes the average sum of errors for a batch across our entire network, and is often defined by a quadratic (mean squared error) equation of this form:

$$C(w, b) = \frac{1}{2n} \sum_{x} \lVert y(x) - a \rVert^2$$

Here, $w$ and $b$ are the network's weights and biases, $n$ is the number of samples in the batch, $y(x)$ is the expected output for input $x$, and $a$ is the output the network actually produced.
The inputs are the network's weights, and the output is the total average cost encountered over the processed batch. Think of this cost as the average sum of errors. Now, our goal is to minimize this function, driving the cost of errors to the lowest value possible. In the previous couple of examples, we saw a technique called gradient descent being used to minimize this cost function. Gradient descent works by differentiating the Cost function to determine its gradient with respect to each weight. Then, for each weight, or dimension if you will, the algorithm alters the weight in the direction of the negative gradient, the direction that decreases the Cost function.
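To make that update rule concrete, here is a minimal NumPy sketch, not taken from the book, that fits a single weight to toy data by repeatedly stepping against the gradient of the cost; the data and variable names are illustrative assumptions:

```python
import numpy as np

# Toy data (assumed for illustration): the true relationship is y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

w = 0.0             # the single weight we want to learn
learning_rate = 0.05

for step in range(100):
    a = w * x                          # the network's output for each sample
    cost = np.mean((y - a) ** 2) / 2   # average sum of squared errors
    grad = np.mean((a - y) * x)        # dC/dw: gradient of the cost w.r.t. w
    w -= learning_rate * grad          # alter the weight against the gradient

print(w)  # approaches 2.0 as the cost is minimized
```

Each pass shrinks the cost a little; applying the same step to every weight in the network at once is all gradient descent does.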
Before we get into the heavy math behind the differentiation, let's see how gradient descent works in two dimensions, with the following diagram:

[Diagram: gradient descent descending a two-dimensional cost curve in gradual steps toward the minimum]
In simpler terms, all the algorithm is doing is trying to find the minimum in small, gradual steps. We use small steps in order to avoid overshooting the minimum, which, as you saw earlier, can happen (remember the wobble). This is where the term learning rate comes in: it determines how fast we train. The slower the training, the more confident you can be in the result, but usually at a cost in time. The alternative is to train more quickly with a higher learning rate, but, as you can now see, that makes it easy to overshoot the minimum entirely.
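You can see the effect of the learning rate in a quick sketch; the one-dimensional cost C(w) = (w - 3)^2 below is a hypothetical stand-in, chosen only because its minimum at w = 3 is easy to check by eye:

```python
# Hypothetical 1D cost C(w) = (w - 3)**2, whose gradient is 2 * (w - 3)
def gradient(w):
    return 2.0 * (w - 3.0)

def descend(learning_rate, steps=20, w=0.0):
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w

print(descend(0.1))   # small steps creep steadily toward the minimum at 3
print(descend(0.9))   # bigger steps bounce back and forth across it (the wobble)
print(descend(1.1))   # too large: every step overshoots further and diverges
```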
Gradient descent is the simplest form we will talk about, but keep in mind that there are several more advanced optimization algorithms we will also explore. In the TF example, for instance, we used AdamOptimizer to minimize the Cost function, and there are several other variations besides. For now, though, we will focus on how to calculate the gradient of the Cost function and cover the basics of backpropagation with gradient descent in the next section.
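As a rough sketch of what swapping optimizers looks like in TensorFlow 1.x-style code (the quadratic cost here is a stand-in assumption, not the book's actual graph), the two train steps below are interchangeable one-line choices:

```python
import tensorflow as tf

w = tf.Variable(0.0)
cost = tf.square(w - 3.0)  # stand-in Cost function for illustration

# Plain gradient descent with a fixed learning rate...
train_step = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

# ...or Adam, which adapts each weight's step size as training progresses
train_step = tf.train.AdamOptimizer(learning_rate=0.1).minimize(cost)
```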