- Machine Learning with Swift
- Alexander Sosnovshchenko
- 2021-06-24 18:55:02
Calculating the distance
How do we calculate a distance? Well, that depends on the kind of problem. In two-dimensional space, we usually calculate the distance between two points, $(x_1, y_1)$ and $(x_2, y_2)$, as $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$, the Euclidean distance. But this is not how taxi drivers calculate distance, because in the city you can't cut corners and go straight to your goal. So, they use (knowing it or not) another distance metric: Manhattan distance or taxicab distance, also known as the L1 norm: $|x_1 - x_2| + |y_1 - y_2|$. This is the distance if we're only allowed to move along coordinate axes.
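To make the difference between the two metrics concrete, here is a minimal Swift sketch for the two-dimensional case. The `Point` type and the function names are assumptions introduced for illustration only; they do not come from the book.

```swift
// A 2D point; this type is a hypothetical helper for the sketch.
struct Point {
    let x: Double
    let y: Double
}

// Euclidean distance: the length of the straight line between two points.
func euclideanDistance(_ a: Point, _ b: Point) -> Double {
    let dx = a.x - b.x
    let dy = a.y - b.y
    return (dx * dx + dy * dy).squareRoot()
}

// Manhattan (taxicab) distance: movement is allowed only along the axes,
// so the absolute differences per coordinate are summed.
func manhattanDistance(_ a: Point, _ b: Point) -> Double {
    return abs(a.x - b.x) + abs(a.y - b.y)
}

let start = Point(x: 0, y: 0)
let goal = Point(x: 3, y: 4)
print(euclideanDistance(start, goal)) // 5.0
print(manhattanDistance(start, goal)) // 7.0
```

For the same pair of points, the taxicab route (7 blocks) is longer than the straight line (5 units), which matches the intuition about not being able to cut corners.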

Jewish German mathematician Hermann Minkowski proposed a generalization of both Euclidean and Manhattan distances. Here is the formula for the Minkowski distance:
$$d(p, q) = \left( \sum_{i=1}^{n} |p_i - q_i|^c \right)^{1/c}$$
where p and q are n-dimensional vectors (or coordinates of points in n-dimensional space, if you wish). But what does c stand for? It is the order of the Minkowski distance: with c = 1, it gives the Manhattan distance, and with c = 2, it gives the Euclidean distance.
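The Minkowski distance of order c can be sketched in Swift as follows. This is an illustrative implementation under the assumption that both vectors are plain `[Double]` arrays of equal length; the function name and the `order` label are hypothetical.

```swift
import Foundation

// Minkowski distance of order c between two n-dimensional vectors.
// Assumes p and q have the same number of elements and c >= 1.
func minkowskiDistance(_ p: [Double], _ q: [Double], order c: Double) -> Double {
    precondition(p.count == q.count, "Vectors must have the same dimension")
    precondition(c >= 1, "The order must be at least 1")
    // Sum |p_i - q_i|^c over all coordinates.
    let sum = zip(p, q).reduce(0.0) { partial, pair in
        partial + pow(abs(pair.0 - pair.1), c)
    }
    // Take the c-th root of the sum.
    return pow(sum, 1.0 / c)
}

let p = [0.0, 0.0]
let q = [3.0, 4.0]
let manhattan = minkowskiDistance(p, q, order: 1) // Manhattan distance, 7
let euclidean = minkowskiDistance(p, q, order: 2) // Euclidean distance, about 5
print(manhattan, euclidean)
```

Because the result goes through floating-point `pow`, it is safer to compare it against an expected value with a small tolerance than with exact equality.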
In machine learning, we generalize the notion of distance to any kind of objects for which we can calculate how similar they are, using a function called a distance metric. In this way, we can define the distance between two pieces of text, two pictures, or two audio signals. Let's take a look at two examples.
When you deal with two pieces of text, you can use an edit distance. For strings of equal length, the simplest example is the Hamming distance: the number of positions at which the two strings differ, which is also the minimum number of substitutions needed to transform one string into the other. More general edit distances, such as the Levenshtein distance, also allow insertions and deletions, so they work for strings of different lengths. To calculate such an edit distance, we use dynamic programming, an iterative approach where the problem is broken into small subproblems, and the result of each step is remembered for future computations. Edit distance is an important measure in applications that deal with text revisions; for example, in bioinformatics (see the following diagram):
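As a rough illustration of both ideas, the sketch below computes the Hamming distance by direct comparison and the Levenshtein distance (a common edit distance that also allows insertions and deletions) with dynamic programming. The function names are assumptions for this example, and the Levenshtein variant is one possible choice of edit distance, not necessarily the one the book goes on to use.

```swift
// Hamming distance: the number of positions at which two strings differ.
// Returns nil for strings of different lengths, where it is undefined.
func hammingDistance(_ a: String, _ b: String) -> Int? {
    let s = Array(a), t = Array(b)
    guard s.count == t.count else { return nil }
    return zip(s, t).filter { pair in pair.0 != pair.1 }.count
}

// Levenshtein distance: the minimum number of substitutions, insertions,
// and deletions needed to turn a into b, computed row by row.
func levenshteinDistance(_ a: String, _ b: String) -> Int {
    let s = Array(a), t = Array(b)
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }
    // previous[j] is the distance between the processed prefix of s
    // and the first j characters of t.
    var previous = Array(0...t.count)
    for i in 1...s.count {
        var current = [i] + Array(repeating: 0, count: t.count)
        for j in 1...t.count {
            let substitution = previous[j - 1] + (s[i - 1] == t[j - 1] ? 0 : 1)
            let deletion = previous[j] + 1
            let insertion = current[j - 1] + 1
            // Each cell reuses the remembered results of smaller subproblems.
            current[j] = min(substitution, deletion, insertion)
        }
        previous = current
    }
    return previous[t.count]
}

print(hammingDistance("karolin", "kathrin") ?? -1) // 3
print(levenshteinDistance("kitten", "sitting"))    // 3
```

Keeping only two rows of the table is a memory optimization; a full table would be needed to recover the actual sequence of edits rather than just their count.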

Often, we store different signals (audio, motion data, and so on) as arrays of numbers. How do we measure the similarity of two such arrays? We can use a combination of Euclidean distance and edit distance, called dynamic time warping (DTW).
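A textbook form of DTW can be sketched as follows. This is a minimal version under several assumptions: one-dimensional signals stored as `[Double]`, the absolute difference as the local cost between two samples, and no windowing constraint. The function name is hypothetical.

```swift
// Dynamic time warping distance between two non-empty numeric sequences.
// Like edit distance, it fills a table of subproblem results; like
// Euclidean distance, it accumulates numeric differences between samples.
func dtwDistance(_ a: [Double], _ b: [Double]) -> Double {
    precondition(!a.isEmpty && !b.isEmpty, "Sequences must not be empty")
    let n = a.count, m = b.count
    // cost[i][j] is the best alignment cost of the first i samples of a
    // with the first j samples of b.
    var cost = Array(repeating: Array(repeating: Double.infinity, count: m + 1),
                     count: n + 1)
    cost[0][0] = 0
    for i in 1...n {
        for j in 1...m {
            let local = abs(a[i - 1] - b[j - 1])
            // Extend the cheapest of the three neighboring alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
        }
    }
    return cost[n][m]
}

// The same shape at a different speed aligns perfectly.
print(dtwDistance([1, 2, 3], [1, 2, 2, 3])) // 0.0
// A shifted signal is penalized only at its ends.
print(dtwDistance([1, 2, 3], [2, 3, 4]))    // 2.0
```

Unlike a point-by-point comparison, this measure tolerates sequences of different lengths and local stretching in time, which is why it suits signals such as audio or motion data.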