Stunningly Simple
- The mathematics of Bayes’s theorem is stunningly simple. In its most basic form, it is just an equation with three known variables and one unknown.
- This simple formula can lead to surprising predictive insights.
Bayes and Laplace
- The intimate connection between probability, prediction, and scientific progress was well understood by Bayes and Laplace in the eighteenth century, the period when human societies were beginning to take the explosion of information that had become available with the invention of the printing press several centuries earlier and finally translate it into sustained scientific, technological, and economic progress.
Conditional Probability
- Bayes’s theorem is concerned with conditional probability. That is, it tells us the probability that a hypothesis is true if some event has happened.
Bayes’s Theorem
Probability that your partner is cheating on you, given an event
- Event: you come home from a business trip to discover a strange pair of underwear
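For reference, one common way to write Bayes’s theorem for this kind of question is sketched below. The symbols are assumptions chosen to match the example that follows: c is the hypothesis (cheating), u is the observed event (the underwear), and "not c" is the hypothesis being false.

```latex
p(c \mid u) = \frac{p(u \mid c)\, p(c)}{p(u \mid c)\, p(c) + p(u \mid \neg c)\,\bigl(1 - p(c)\bigr)}
```

The three known quantities are p(u|c), p(u|not c), and the prior p(c); the single unknown is the posterior p(c|u).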
Underwear Example*
* The Signal and the Noise: Why So Many Predictions Fail--but Some Don't, Nate Silver, 2012
p(u|c)
The probability of underwear u given cheating c
- Probability of underwear appearing, conditional on his cheating
- 50%
p(u|not c)
The probability of the underwear u appearing given NO cheating
- Probability of the underwear’s appearing conditional on the hypothesis being false
- 5%
p(c)
The probability of cheating c
- What is the probability you would have assigned to him cheating on you before you found the underwear?
- 4%
Active Learning – Calculate Cheating Probability
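One way to check the exercise is a small JavaScript sketch that plugs the three estimates from the example (50%, 5%, 4%) into Bayes’s theorem. The function name `posterior` and its parameter names are illustrative assumptions, not something defined in the slides.

```javascript
// Posterior probability of a hypothesis after observing an event.
//   pEventGivenH:    p(u|c), probability of the event if the hypothesis is true
//   pEventGivenNotH: p(u|not c), probability of the event if it is false
//   prior:           p(c), probability of the hypothesis before the event
function posterior(pEventGivenH, pEventGivenNotH, prior) {
  const numerator = pEventGivenH * prior;
  // Total probability of seeing the event under either hypothesis.
  const evidence = numerator + pEventGivenNotH * (1 - prior);
  return numerator / evidence;
}

// Estimates from the underwear example: 50%, 5%, and a 4% prior.
const pCheating = posterior(0.5, 0.05, 0.04);
console.log(pCheating.toFixed(3)); // 0.294, i.e. roughly 29%
```

Even with strong-looking evidence, the low prior keeps the posterior under one in three.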
Example: Classification of Drew
- We have two classes: c1=male, and c2=female
- Classifying Drew as male or female is equivalent to asking whether it is more probable that Drew is male or female, given the name.
Using Data
Bayesian Approach
- Posterior probability based on prior probability plus a new event
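As a rough illustration of the Bayesian approach applied to the Drew question, the sketch below scores each class by p(name|class) × p(class) and normalises. The list of people is hypothetical toy data invented for this example, and `classPosteriors` is an assumed helper name.

```javascript
// Hypothetical training data (not from the slides): eight people with known class.
const people = [
  { name: "Drew", cls: "male" },
  { name: "Alberto", cls: "male" },
  { name: "Sergio", cls: "male" },
  { name: "Drew", cls: "female" },
  { name: "Drew", cls: "female" },
  { name: "Claudia", cls: "female" },
  { name: "Karin", cls: "female" },
  { name: "Nina", cls: "female" },
];

// Returns p(class|name) for every class seen in the data.
// Note: a name that never appears in the data yields NaN (0 / 0).
function classPosteriors(name, data) {
  const scores = {};
  for (const cls of new Set(data.map((d) => d.cls))) {
    const inClass = data.filter((d) => d.cls === cls);
    const prior = inClass.length / data.length; // p(class)
    const likelihood =
      inClass.filter((d) => d.name === name).length / inClass.length; // p(name|class)
    scores[cls] = likelihood * prior;
  }
  // Dividing by the sum plays the role of the shared denominator p(name).
  const total = Object.values(scores).reduce((a, b) => a + b, 0);
  for (const cls in scores) scores[cls] /= total;
  return scores;
}

const result = classPosteriors("Drew", people);
console.log(result); // male is about 1/3, female about 2/3 for this toy data
```

With these made-up counts, "female" is the more probable class for the name Drew.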
Classification of Documents
Questions We Can Answer
- Is this spam?
- Who wrote which Federalist papers?
- Positive or negative movie review?
- What is the subject of this article?
Text Classification
- Assigning subject categories, topics, or genres
- Authorship identification
- Age/gender identification
- Language identification
- Sentiment analysis
- ...
For Active Learning we will use*
* http://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering
Calculating Probabilities
- probability that word shows up in a language
- probability that word is not in language
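The two quantities above could be estimated from word counts, as in the sketch below. It rests on several assumptions: the corpora are tiny hypothetical strings, "not in language" is read as the word's frequency in all other languages pooled together, and the helper names `wordFrequency` and `wordProbabilities` are invented for illustration.

```javascript
// Hypothetical toy corpora; real training text would be far larger.
const corpora = {
  english: "the cat and the dog and the bird",
  spanish: "el gato y el perro y el pajaro",
};

// Relative frequency of `word` in a whitespace-tokenised text.
function wordFrequency(word, text) {
  const tokens = text.toLowerCase().split(/\s+/).filter(Boolean);
  const hits = tokens.filter((t) => t === word).length;
  return hits / tokens.length;
}

// Estimates p(word|language) and p(word|not language).
function wordProbabilities(word, language) {
  const inLanguage = wordFrequency(word, corpora[language]);
  // Pool every other corpus to stand in for "not this language".
  const others = Object.keys(corpora)
    .filter((l) => l !== language)
    .map((l) => corpora[l])
    .join(" ");
  const notInLanguage = wordFrequency(word, others);
  return { inLanguage, notInLanguage };
}

console.log(wordProbabilities("the", "english"));
```

In practice, estimates of exactly 0 or 1 are usually smoothed, because a single unseen word would otherwise dominate the combined result.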
Underflow Prevention
- Multiplying many probabilities together can result in floating-point underflow. Since log(xy) = log(x) + log(y), it is better to sum the logs of the probabilities than to multiply the probabilities themselves.
- Add probability of words (per language) using:
- In JavaScript, ln(x) is Math.log(x) and e^x is Math.exp(x)
- At completion of each language:
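The sketch below shows one way the log-sum could look in JavaScript. It assumes the intended formulas are the log-domain ones from the Wikipedia article cited above: η = Σ [ln(1 − p_i) − ln(p_i)] accumulated over the words, then p = 1 / (1 + e^η) once all words are processed. The function names are illustrative.

```javascript
// Combine per-word probabilities p_i (each strictly between 0 and 1)
// by summing logs instead of multiplying probabilities.
function combineProbabilities(probs) {
  let eta = 0;
  for (const p of probs) {
    eta += Math.log(1 - p) - Math.log(p);
  }
  // Convert the accumulated log-odds back into a probability.
  return 1 / (1 + Math.exp(eta));
}

// Direct product form of the same combination, kept only for comparison.
function combineDirect(probs) {
  const prodP = probs.reduce((acc, p) => acc * p, 1);
  const prodNotP = probs.reduce((acc, p) => acc * (1 - p), 1);
  return prodP / (prodP + prodNotP);
}

// For a few words the two forms agree (both roughly 0.982 here).
console.log(combineProbabilities([0.9, 0.8, 0.6]));
console.log(combineDirect([0.9, 0.8, 0.6]));

// With many words the products underflow to 0, so the direct form is 0 / 0.
const many = new Array(2000).fill(0.5);
console.log(combineDirect(many));        // NaN
console.log(combineProbabilities(many)); // 0.5
```

Running this once per language and picking the language with the highest combined value is one plausible way to finish the classification.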