- Machine Learning with Spark (Second Edition)
- Rajdeep Dua, Manpreet Singh Ghotra, Nick Pentreath
- 2021-07-09 21:07:51
Hypothesis
Here, x denotes the input variables, also called input features, and y denotes the output or target variable that we are trying to predict. A pair (x, y) is called a training example, and the dataset used for learning, a list of m training examples {(x(i), y(i)); i = 1, ..., m}, is called a training set. We also use X to denote the space of input values and Y to denote the space of output values. Given a training set, the goal is to learn a function h: X → Y so that h(x) is a good predictor for the corresponding value of y. The function h is called a hypothesis.
When the target variable to be predicted is continuous, we call the learning problem a regression problem. When y can take a small number of discrete values, we call it a classification problem.
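As a minimal sketch of this terminology, the Python snippet below represents a training set as a list of (x, y) pairs and a hypothesis as a plain function. The data values and the names `training_set`, `h`, and `predictions` are made up for illustration and do not come from the book.

```python
from typing import List, Tuple

# Hypothetical training set of m = 3 examples: each x is a feature vector
# and each y is a continuous target, so this is a regression problem.
training_set: List[Tuple[List[float], float]] = [
    ([1.0, 2.0], 5.0),
    ([2.0, 0.5], 4.5),
    ([3.0, 1.0], 7.0),
]

m = len(training_set)  # number of training examples


def h(x: List[float]) -> float:
    """A hand-picked hypothesis h: X -> Y (assumed here, not learned from data)."""
    return 2.0 * x[0] + 1.0 * x[1]


# Apply the hypothesis to every input to get one prediction per example.
predictions = [h(x) for x, _ in training_set]
```

If the targets were instead drawn from a small set of labels such as 0 and 1, the same structure would describe a classification problem.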
Let's say we choose to approximate y as a linear function of x.
The hypothesis function is as follows:

$$h(x) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n$$
In this hypothesis function, the θi's are the parameters, also known as weights, which parameterize the space of linear functions mapping from X to Y. To simplify the notation, we also introduce the convention of letting x0 = 1 (this is the intercept term), such that:

$$h(x) = \sum_{i=0}^{n} \theta_i x_i = \theta^T x$$

On the right-hand side, we view θ and x both as vectors, and n is the number of input variables (not counting x0).
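The x0 = 1 convention can be sketched in plain Python as follows. This is only an illustration: the function name `hypothesis` and the numeric values of `theta` and `x` are assumptions, not taken from the book.

```python
from typing import Sequence


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    """Return theta^T x after prepending the intercept term x0 = 1."""
    if len(theta) != len(x) + 1:
        raise ValueError("theta needs one more entry than x (for the intercept)")
    x_with_intercept = [1.0, *x]  # x0 = 1, followed by x1 .. xn
    return sum(t * xi for t, xi in zip(theta, x_with_intercept))


theta = [0.5, 2.0, -1.0]  # hypothetical weights theta_0, theta_1, theta_2
x = [3.0, 4.0]            # n = 2 input variables
value = hypothesis(theta, x)  # 0.5*1 + 2.0*3 + (-1.0)*4
```

Prepending the constant 1 lets the intercept θ0 be handled like any other weight, so the whole hypothesis reduces to a single dot product.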
Before we proceed any further, note that we are now moving from mathematical fundamentals to learning algorithms. Optimizing the cost function and learning θ lays the foundation for understanding machine learning algorithms.
Given a training set, how do we learn the parameters θ? One reasonable approach is to make h(x) close to y, at least for the training examples we have. To do this, we define a function that measures, for each value of θ, how close the h(x(i))'s are to the corresponding y(i)'s. This function is called the cost function.
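The specific cost function is not reproduced in this text. One common choice, assumed here rather than taken from the source, is the least-squares cost $J(\theta) = \frac{1}{2}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)^2$. The sketch below evaluates it in plain Python; the function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    """Linear hypothesis theta^T x with the intercept convention x0 = 1."""
    return sum(t * xi for t, xi in zip(theta, [1.0, *x]))


def least_squares_cost(theta: Sequence[float], training_set: List[Example]) -> float:
    """Assumed cost: half the sum of squared errors over the training set."""
    return 0.5 * sum((hypothesis(theta, x) - y) ** 2 for x, y in training_set)


# Toy data generated from y = 1 + 2x, so theta = [1, 2] fits it exactly.
training_set: List[Example] = [([1.0], 3.0), ([2.0], 5.0), ([3.0], 7.0)]

perfect_cost = least_squares_cost([1.0, 2.0], training_set)
worse_cost = least_squares_cost([0.0, 2.0], training_set)
```

A θ that reproduces every target gives a cost of zero, while a θ that misses each target by 1 gives a larger cost, which is the sense in which the cost measures how close the predictions are to the targets.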
