Naive Bayes Classifier
Bayes classifier
- Logistic regression is trained to model $P(y|x)$ directly
- A Bayes classifier instead classifies using Bayes' theorem
$P(y|x) = \frac{p(x|y)P(y)}{p(x)}$
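As a minimal sketch of applying Bayes' theorem, the posterior can be computed from an assumed prior $P(y)$ and likelihood $p(x|y)$; the class names and probability values below are made-up illustrative numbers, not from any real data:

```python
# Hypothetical prior P(y) and likelihood p(x|y) for a single observed x.
priors = {"spam": 0.3, "ham": 0.7}
likelihood = {"spam": 0.8, "ham": 0.1}

# Evidence p(x) marginalizes over the classes.
evidence = sum(priors[y] * likelihood[y] for y in priors)

# Posterior P(y|x) via Bayes' theorem.
posterior = {y: priors[y] * likelihood[y] / evidence for y in priors}
```

Note that the evidence $p(x)$ is the same for every class, which is why the inference rule later in the notes drops it and compares only $p(x|y)P(y)$.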
Curse of Dimensionality
- The curse of dimensionality arises as the number of independent variables K grows
- The amount of data needed to approximate the joint distribution over the K-dimensional space increases exponentially
- When the number of independent variables is small, a small number of samples suffices to estimate the probability distribution
- With more independent variables, exponentially more data is needed
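The exponential blow-up above can be illustrated with a quick count; the choice of 10 buckets per variable is an arbitrary assumption for the example:

```python
# With 10 buckets per variable, a joint histogram over K variables
# needs 10**K cells, each of which must receive enough samples to
# estimate its probability reliably.
bins = 10
cells = {K: bins ** K for K in (1, 2, 5, 10)}
# K = 10 already requires ten billion cells.
```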
Naive Bayes classifier
- When many independent variables exist, a workaround for the curse of dimensionality is needed.
- The naive Bayes classifier sidesteps it by assuming the independent variables are conditionally independent given the class, so each variable's distribution can be estimated separately.
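A sketch of what the independence assumption buys: the joint conditional $p(\boldsymbol{x}|y)$ is replaced by a product of per-variable conditionals, each estimable on its own. The probability values below are hypothetical:

```python
import math

# Hypothetical per-variable conditionals p(x_1|y), p(x_2|y), p(x_3|y)
# for one class y.
per_feature = [0.5, 0.2, 0.9]

# Under the naive assumption, their product approximates p(x|y),
# so only K one-dimensional distributions need to be estimated
# instead of one K-dimensional joint distribution.
joint = math.prod(per_feature)
```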
How to get a probability distribution
- For a discrete variable: counting
- For a continuous variable, two options:
- Discretize: make the variable countable by bucketing its values
- Probability density estimation: assume a distribution family, e.g. a normal distribution
$f(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
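Both estimation routes can be sketched on a small made-up sample; the data, bucket width, and use of a plain sample mean/variance fit are all assumptions for illustration:

```python
import math
from collections import Counter

samples = [1.2, 1.9, 2.1, 2.4, 3.0, 3.3]  # hypothetical continuous data

# Route 1: discretize (width-1 buckets), then count.
buckets = Counter(int(x) for x in samples)
probs = {b: c / len(samples) for b, c in buckets.items()}

# Route 2: assume a normal distribution and fit mu and sigma.
mu = sum(samples) / len(samples)
var = sum((x - mu) ** 2 for x in samples) / len(samples)
sigma = math.sqrt(var)

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2), matching the formula above."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
```

Discretization keeps no assumption about the shape of the distribution but loses resolution inside each bucket; the parametric route keeps full resolution but is only as good as the normality assumption.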
Inference
- Under the naive Bayes classifier's conditional-independence assumption, prediction picks the class maximizing the (factorized) joint:
$f_{naive\ bayes}(x) = \displaystyle\operatorname*{argmax}_{y}p(\boldsymbol{x}|y)P(y)$
$p(\boldsymbol{x}|y)P(y) = P(y)\displaystyle\prod_{i=1}^{K}p(x_i|y)$
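The two equations above can be sketched end-to-end for discrete features. Everything here is hypothetical (the toy dataset, the feature names, and the Laplace smoothing added so unseen feature values don't zero out a class):

```python
from collections import defaultdict

# Toy training data: (features, label) pairs. All values are made up.
data = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
]

# Estimate P(y) and p(x_i|y) by counting.
class_counts = defaultdict(int)
cond_counts = defaultdict(int)  # (y, feature, value) -> count
for x, y in data:
    class_counts[y] += 1
    for f, v in x.items():
        cond_counts[(y, f, v)] += 1

def predict(x):
    """argmax_y P(y) * prod_i p(x_i|y), with add-one (Laplace) smoothing."""
    def score(y):
        s = class_counts[y] / len(data)  # P(y)
        for f, v in x.items():
            # Smoothed p(x_i|y); the +2 assumes two values per feature.
            s *= (cond_counts[(y, f, v)] + 1) / (class_counts[y] + 2)
        return s
    return max(class_counts, key=score)
```

Because each factor $p(x_i|y)$ is a one-dimensional count, the estimator needs only K small tables per class rather than one exponentially large joint table.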