A simple explanation of Bayes' theorem

Bayes' theorem is described in detail in a separate article. It is an excellent piece of work, but it runs to 15,000 words. This translation of Kalid Azad's article, by contrast, briefly explains the very essence of the theorem.

Bayes' theorem turns test results into the probabilities of events.

Understanding the method

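The article referenced at the beginning of this essay deals with a diagnostic method (mammography) for detecting breast cancer. Let us consider this method in detail. The figures used below are: 1% of the people tested are sick; the method gives a positive result for 80% of those who are sick; and it gives a positive result for 9.6% of those who are not sick.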

Now let's create the following table:

| | Sick (1%) | Not sick (99%) |
|---|---|---|
| Positive result of the method | 80% | 9.6% |
| Negative result of the method | 20% | 90.4% |

How do we read this data?

How accurate is the method?

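Now consider a positive test result. What is the probability that the person is really sick: 80%, 90%, 1%?

Let's think it through. A positive result puts us in the top row of the table, where it may be either a true positive or a false positive. Multiplying each column's share of the population by the corresponding rate of the method gives the probability of each cell.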

Now the table looks like this:
| | Sick (1%) | Not sick (99%) |
|---|---|---|
| Positive result of the method | True positive: 1% * 80% = .008 | False positive: 99% * 9.6% = .09504 |
| Negative result of the method | False negative: 1% * 20% = .002 | True negative: 99% * 90.4% = .89496 |

What is the probability that a person is really sick if the mammogram is positive? The probability of an event is the ratio of the number of outcomes in which the event occurs to the total number of possible outcomes.

event probability = event outcomes / all possible outcomes

The probability of a true positive result is .008. The probability of any positive result is the probability of a true positive plus the probability of a false positive:

.008 + .09504 = .10304

So the probability of disease given a positive result is .008 / .10304 = .0776, or about 7.8%.
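
The same arithmetic can be reproduced in a few lines of Python. This is only a minimal sketch built from the figures in the table; the variable names are illustrative and do not come from the original article.

```python
# Figures taken from the table above.
p_sick = 0.01                # share of people who are sick
p_pos_given_sick = 0.80      # positive result when the person is sick
p_pos_given_healthy = 0.096  # positive result when the person is not sick

# Joint probabilities for the top (positive) row of the table.
true_positive = p_sick * p_pos_given_sick
false_positive = (1 - p_sick) * p_pos_given_healthy

# Probability of any positive result, true or false.
p_positive = true_positive + false_positive

# Probability of being sick given a positive result.
p_sick_given_pos = true_positive / p_positive

print(f"{p_positive:.5f}")        # 0.10304
print(f"{p_sick_given_pos:.4f}")  # 0.0776
```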

That is, a positive mammogram means that the probability of having the disease is only 7.8%, not 80% (the latter is just the share of sick people that the method detects). This result seems strange at first, but keep in mind that the method gives a false positive in 9.6% of cases, which is quite a lot, so the sample will contain many false positives. For a rare disease, most positive results will be false positives.
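
To see how strongly the rarity of the disease drives this result, the sketch below holds the method's rates fixed (80% detection, 9.6% false positives) and varies only the share of sick people. The function name `posterior` and every prevalence other than 1% are assumptions made for illustration; they are not figures from the article.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Probability of disease given a positive test result."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Same method, different prevalence of the disease.
# Only the 1% case comes from the article; the rest are illustrative.
for prior in (0.001, 0.01, 0.10, 0.50):
    result = posterior(prior, 0.80, 0.096)
    print(f"prevalence {prior:6.1%} -> P(sick | positive) = {result:.1%}")
```

The rarer the disease, the larger the share of positive results that are false positives.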

Let's take a quick look at the table and try to grasp the meaning of the theorem intuitively. Out of 100 people, only one has the disease (1%). For this person the method gives a positive result with 80% probability. Of the remaining 99 people, about 10% will get a positive result, which gives us, roughly speaking, 10 false positives out of 100. If we look at all the positive results, only 1 out of 11 is true. Thus, given a positive result, the probability of the disease is about 1/11.

Above, we calculated this probability as 7.8%, so the number is actually closer to 1/13, but simple reasoning gave us a rough estimate without a calculator.
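
The gap between the rough estimate and the exact value can be checked with exact fractions. This is a small sketch using Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

# Rough estimate from the text: 1 true positive among about 11 positives.
rough = Fraction(1, 11)

# Exact value from the table, kept as fractions to avoid rounding.
true_pos = Fraction(1, 100) * Fraction(80, 100)
false_pos = Fraction(99, 100) * Fraction(96, 1000)
exact = true_pos / (true_pos + false_pos)

print(exact)             # 25/322
print(float(1 / exact))  # 12.88, i.e. roughly 1 in 13
print(float(rough), float(exact))
```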

Bayes' theorem

Now let's express this line of reasoning as a formula, which is known as Bayes' theorem. The theorem lets us correct the test results for the distortion introduced by false positives:

$$\Pr(A|X) = \frac{\Pr(X|A)\,\Pr(A)}{\Pr(X|A)\,\Pr(A) + \Pr(X|\text{not } A)\,\Pr(\text{not } A)}$$

In other words, to get the probability of the event, divide the probability of a true positive by the probability of all positive results. Here A is the event of interest (the person is sick) and X is the evidence (a positive test result). The equation can now be simplified:

$$\Pr(A|X) = \frac{\Pr(X|A)\,\Pr(A)}{\Pr(X)}$$

Pr(X) is the normalization constant. It has served us well: without it, a positive test result would give us an 80% chance of the event.
Pr(X) is the probability of any positive result, whether a true positive among those who are sick (1%) or a false positive among those who are healthy (99%).

In our example, Pr(X) is a fairly large number (.10304), because the probability of a false positive is high.

Dividing by Pr(X) produces the result of 7.8%, which at first glance seems counterintuitive.
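
As a check, the numbers from the mammogram example can be substituted into the formula, taking A to mean "the person is sick" and X to mean "the result is positive". The result matches the value obtained from the table:

$$
\Pr(A|X) = \frac{0.8 \times 0.01}{0.8 \times 0.01 + 0.096 \times 0.99} = \frac{0.008}{0.10304} \approx 0.0776
$$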

The meaning of the theorem

We run tests to find out the true state of affairs. If our tests were perfect, the probabilities of the test results and the probabilities of the events would coincide: every positive result would be a true positive, and every negative result a true negative. But we live in the real world, and in our world tests sometimes give wrong results. Bayes' theorem takes the distorted results into account, corrects for the errors, reconstructs the population, and finds the probability that a positive result is a true one.

Spam filter

Bayes' theorem is used successfully in spam filters.

We have:

$$\Pr(\text{spam}|\text{words}) = \frac{\Pr(\text{words}|\text{spam})\,\Pr(\text{spam})}{\Pr(\text{words})}$$

The filter takes the test results (the presence of certain words in a message) and predicts whether the message is spam. Everyone understands that the word "Viagra", for example, is more common in spam than in regular mail.

A blacklist-based spam filter has a flaw: it often gives false positives.

A spam filter based on Bayes' theorem takes a balanced and reasonable approach: it works with probabilities. When we analyze the words in a message, we can calculate the probability that the message is spam instead of making a yes/no decision. If that probability is 99%, the message almost certainly is spam.

Over time, the filter is trained on a larger sample and updates its probabilities. Advanced filters based on Bayes' theorem, for example, check several words in a row and use them as data.
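
A toy version of such a filter might look like the sketch below. It uses the common "naive" simplification that words occur independently of each other, which the article does not spell out. The function name `spam_probability` and all the word probabilities are hypothetical values chosen for illustration; a real filter would estimate them from labeled mail.

```python
# Hypothetical probabilities; none of these numbers come from the article.
P_SPAM = 0.5  # assumed share of spam among all messages
P_WORD_GIVEN_SPAM = {"viagra": 0.30, "free": 0.20, "meeting": 0.01}
P_WORD_GIVEN_HAM = {"viagra": 0.001, "free": 0.05, "meeting": 0.10}  # "ham" = not spam


def spam_probability(words):
    """Pr(spam | words), treating the words as independent pieces of evidence."""
    spam = P_SPAM      # running value of Pr(words | spam) * Pr(spam)
    ham = 1 - P_SPAM   # running value of Pr(words | not spam) * Pr(not spam)
    for word in words:
        if word in P_WORD_GIVEN_SPAM:  # skip words we have no data for
            spam *= P_WORD_GIVEN_SPAM[word]
            ham *= P_WORD_GIVEN_HAM[word]
    # Pr(words) is the normalization constant: the spam and non-spam parts added up.
    return spam / (spam + ham)


print(spam_probability(["free", "viagra"]))  # close to 1
print(spam_probability(["meeting"]))         # well below 0.5
```

The filter returns a probability rather than a verdict, so the threshold for moving a message to the spam folder remains a separate decision.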

Source: https://habr.com/ru/post/408775/