- Python Deep Learning
- Ivan Vasilev Daniel Slater Gianmario Spacagna Peter Roelants Valentino Zocca
- 2021-07-02 14:31:01
Naive Bayes
Naive Bayes is different from many other machine learning algorithms. Most machine learning techniques try to evaluate the probability of a certain event, Y, given conditions, X, which we denote with p(Y | X). For example, when we are given a picture that represents a digit (that is, a picture with a certain distribution of pixels), what is the probability that the digit is five? If the pixel distribution is close to the pixel distribution of other examples that were labeled as five, the probability of that event will be high. If not, the probability will be low.
Sometimes we have the opposite information: given that event Y has occurred, we know the probability that our sample is X. Bayes' theorem states that p(Y | X) = p(X | Y) * p(Y) / p(X), where p(X | Y) means the probability of X given Y. Because it models p(X | Y), naive Bayes is called a generative approach. For example, we can calculate the probability that a certain pixel configuration represents the number five if we know the probability that, given a five, a random pixel configuration matches the given one.
This is best understood in the realm of medical testing. Let's say we conduct a test for a specific disease, such as cancer. Here, we want to know the probability of a patient having the disease, given that the test result was positive. Most tests have a reliability value, which is the percentage chance of the test being positive when administered to people who have the disease, that is, p(test=positive | cancer). Applying Bayes' theorem to reverse this expression, we get the following:
p(cancer | test=positive) = p(test=positive | cancer) * p(cancer) / p(test=positive)
Let's assume that the test is 98% reliable. This means that if the person has cancer, the test will be positive in 98% of cases. Conversely, if the person does not have cancer, the test will be negative in 98% of cases. Let's make some assumptions about this kind of cancer:
- This particular kind of cancer mostly affects older people
- Only 2% of people under 50 have this kind of cancer
- The test administered on people under 50 is positive only for 3.9% of the population (we could have derived this fact from the data, but we provide this information for the purpose of simplicity)
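The 3.9% figure in the last assumption can be reproduced with the law of total probability. The following is a minimal sketch, not the authors' code. It assumes a 2% false-positive rate (that is, the test is negative for 98% of people without cancer); treat that value, like the function and variable names, as an assumption of this example.

```python
# Values taken from the text.
p_cancer = 0.02             # prior: share of people under 50 with this cancer
p_pos_given_cancer = 0.98   # test reliability (sensitivity)

# ASSUMPTION of this sketch: 2% of healthy people still test positive.
p_pos_given_healthy = 0.02


def marginal_positive(prior, sensitivity, false_positive_rate):
    """Return p(test=positive) using the law of total probability."""
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives + false_positives


p_pos = marginal_positive(p_cancer, p_pos_given_cancer, p_pos_given_healthy)
print(f"p(test=positive) = {p_pos:.4f}")
```

Under these assumptions, the result rounds to the 3.9% used in the text. True positives and false positives contribute equally, which is why the low prior matters so much.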
We can ask the following question: if the test is 98% reliable and a 45-year-old person takes it and gets a positive result, what is the probability that they have cancer? Using the preceding formula, we can calculate the following:
p(cancer | test=positive) = 0.98 * 0.02 / 0.039 = 0.50
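The same calculation can be written as a small Python function. This is an illustrative sketch. The function name `posterior` and its parameter names are hypothetical, and the numbers are the ones given in the text.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' theorem: p(Y | X) = p(X | Y) * p(Y) / p(X)."""
    if evidence <= 0:
        raise ValueError("evidence must be a positive probability")
    return likelihood * prior / evidence


p_cancer_given_pos = posterior(
    likelihood=0.98,  # p(test=positive | cancer)
    prior=0.02,       # p(cancer) for people under 50
    evidence=0.039,   # p(test=positive) for people under 50
)
print(f"p(cancer | test=positive) = {p_cancer_given_pos:.2f}")
```

Even with a 98% reliable test, a positive result implies only about a 50% chance of cancer, because the disease is rare in this age group.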
We call this classifier naive because it assumes that different events are independent when calculating their probability. For example, if the person took two tests instead of one, the classifier would assume that the outcome of test 2 does not depend on the outcome of test 1, that is, that the two tests are independent of one another. This means that taking test 1 cannot change the outcome of test 2, and therefore the result of test 2 is not biased by the first test.
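To make the independence assumption concrete, here is a hedged sketch of how two positive test results could be combined. Under the naive assumption, the per-test likelihoods are simply multiplied within each class. The function name is hypothetical, and the 2% false-positive rate for each test is an assumption of this example.

```python
import math


def naive_bayes_posterior(prior, likelihoods_if_cancer, likelihoods_if_healthy):
    """Posterior p(cancer | all tests positive), assuming the tests are
    conditionally independent given the patient's true state."""
    # Naive assumption: the joint likelihood is the product of per-test terms.
    joint_cancer = prior * math.prod(likelihoods_if_cancer)
    joint_healthy = (1.0 - prior) * math.prod(likelihoods_if_healthy)
    # Normalize so that the two posteriors sum to one.
    return joint_cancer / (joint_cancer + joint_healthy)


# One positive test: 98% sensitivity, ASSUMED 2% false-positive rate.
one_test = naive_bayes_posterior(0.02, [0.98], [0.02])

# Two positive tests, each with the same (assumed) characteristics.
two_tests = naive_bayes_posterior(0.02, [0.98, 0.98], [0.02, 0.02])

print(f"one positive test:  {one_test:.2f}")
print(f"two positive tests: {two_tests:.2f}")
```

With these numbers, a second positive result raises the posterior from about 0.50 to about 0.98. If the two tests were correlated in practice (for example, if they shared a common source of error), this product would overstate the evidence, which is the price of the naive assumption.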