Supervised Machine Learning with Python
Taylor Smith
Loss functions
First, we'll cover loss functions, and then, prior to diving into hill climbing and descent, we'll take a quick math refresher.
So, as mentioned before, a machine learning algorithm has to measure how close it is to some objective. We define this as a cost function, or a loss function. Sometimes, we hear it referred to as an objective function. Although not all machine learning algorithms are designed to directly minimize a loss function, we're going to learn the rule here rather than the exception. The point of a loss function is to determine the goodness of a model fit. It is typically evaluated over the course of a model's learning procedure and converges when the model has maximized its learning capacity.
A typical loss function computes a scalar value from the true labels and the predicted labels. That is, given our actual $y$ and our predicted $y$, which is $\hat{y}$. This notation might be cryptic, but all it means is that some function, $L$, which we're going to call our loss function, is going to accept the ground truth, $y$, and the predictions, $\hat{y}$, and return some scalar value. The typical form of the loss function is given as follows:

$$L = f(y, \hat{y})$$
So, I've listed several common loss functions here, which may or may not look familiar. Sum of Squared Error (SSE) is a metric that we're going to be using for our regression models:
$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
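As a quick sketch of how this looks in code, here is a minimal NumPy version of SSE; the function name and the toy arrays are illustrative, not part of the book's codebase:

```python
import numpy as np

def sse(y_true, y_pred):
    """Sum of squared errors between true and predicted values."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sum((y_true - y_pred) ** 2))

# Toy example: three predictions against three ground-truth values
print(sse([3.0, 5.0, 2.5], [2.5, 5.0, 4.0]))  # 0.25 + 0.0 + 2.25 = 2.5
```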
Cross entropy is a very commonly used classification metric:
$$L(y, \hat{y}) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]$$
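Here is a minimal NumPy sketch of the binary form shown above; clipping the probabilities away from 0 and 1 with a small epsilon is an assumption made here purely to keep the logarithms finite:

```python
import numpy as np

def cross_entropy(y_true, y_prob, eps=1e-15):
    """Binary cross entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logs stay finite
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_prob)
                          + (1 - y_true) * np.log(1 - y_prob)))

# Confident, correct predictions give a low loss...
print(cross_entropy([1, 0, 1], [0.9, 0.1, 0.8]))  # ~0.14
# ...while confident, wrong predictions are punished heavily
print(cross_entropy([1, 0, 1], [0.1, 0.9, 0.2]))  # ~2.07
```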
In the following diagram, the $L$ function on the left simply indicates that it is our loss function over $y$, given the parameter $\theta$. So, for any algorithm, we want to find the set of $\theta$ parameters that minimizes the loss. That is, if we're predicting the cost of a house, for example, we may want to estimate the cost per square foot as accurately as possible so as to minimize how wrong we are.
Parameters are often in a much higher dimensional space than can be represented visually. So, the big question we're concerned with is the following: How can we minimize the cost? It is typically not feasible for us to attempt every possible value to determine the true minimum of a problem. So, we have to find a way to descend this nebulous hill of loss. The tough part is that, at any given point, we don't know whether the curve goes up or down without some kind of evaluation. And that's precisely what we want to avoid, because it's very expensive:
[Figure: the loss $L(y; \theta)$ plotted as a curve over $\theta$, with an unknown minimum we want to descend to]
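To make the one-parameter house-price case concrete, here is a sketch that brute-force scans SSE over a coarse grid of candidate $\theta$ values; the data, grid, and numbers are entirely made up for illustration. With a single parameter this scan is cheap, but the number of grid points grows exponentially with the number of parameters, which is exactly why attempting every possible value is infeasible:

```python
import numpy as np

# Hypothetical data: square footage and sale price (in $1,000s)
sqft  = np.array([1000., 1500., 2000., 2500.])
price = np.array([ 200.,  290.,  405.,  500.])

def loss(theta):
    """SSE of the one-parameter model: predicted price = theta * sqft."""
    return np.sum((price - theta * sqft) ** 2)

# Brute-force scan over candidate price-per-sqft values
thetas = np.linspace(0.1, 0.3, 21)
losses = [loss(t) for t in thetas]
best = thetas[int(np.argmin(losses))]
print(f"best theta on this grid: {best:.2f}")  # ~0.20 ($1,000s per sqft)
```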
We can describe this problem as waking up in a pitch-black room with an uneven floor and trying to find the lowest point in the room. You don't know how big the room is. You don't know how deep or how high it gets. Where do you step first? One thing we can do is to examine exactly where we stand and determine which direction around us slopes downward. To do that, we have to measure the slope of the curve.
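One way to "feel the floor" without seeing the whole room is a central finite-difference estimate of the slope: evaluate the loss a tiny step to either side of where you stand and compare. This is only an illustrative sketch (the toy loss and step size are assumptions); in practice the slope is usually computed analytically, since extra loss evaluations are exactly the expense we said we want to avoid:

```python
def slope(loss, theta, h=1e-6):
    """Central-difference estimate of d(loss)/d(theta) at a point."""
    return (loss(theta + h) - loss(theta - h)) / (2 * h)

# Toy one-parameter loss with its minimum at theta = 2
toy_loss = lambda t: (t - 2.0) ** 2

print(slope(toy_loss, 0.0))  # ~ -4.0: the curve slopes down to the right
print(slope(toy_loss, 3.0))  # ~ +2.0: the curve slopes up, so step left
```

A negative slope tells us the floor drops off to the right, a positive slope that it drops off to the left; that single local measurement is what lets us pick a downhill direction without evaluating the entire curve.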