Chapter 11 Classification

11.2 Bayes Classifier

  • TODO: note that this is not the same as the naïve Bayes classifier

\[ p_k(x) = P\left[ Y = k \mid X = x \right] \]

\[ C^B(x) = \underset{k \in \{1, 2, \ldots, K\}}{\text{argmax}} P\left[ Y = k \mid X = x \right] \]


11.2.1 Bayes Error Rate

\[ 1 - \mathbb{E}\left[ \underset{k}{\text{max}} \ P[Y = k \mid X] \right] \]
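
One way to see where this expression comes from, sketched using only the definitions of \(p_k(x)\) and \(C^B(x)\) above: at a fixed \(x\), the Bayes classifier is correct exactly when \(Y\) equals the most probable class, so its conditional error is one minus the largest conditional probability. Averaging over \(X\) then gives the Bayes error rate.

\[ P\left[ C^B(X) \neq Y \mid X = x \right] = 1 - \underset{k}{\text{max}} \ p_k(x) \]

\[ P\left[ C^B(X) \neq Y \right] = \mathbb{E}\left[ 1 - \underset{k}{\text{max}} \ p_k(X) \right] = 1 - \mathbb{E}\left[ \underset{k}{\text{max}} \ P[Y = k \mid X] \right] \]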

11.3 Building a Classifier

\[ \hat{p}_k(x) = \hat{P}\left[ Y = k \mid X = x \right] \]

\[ \hat{C}(x) = \underset{k \in \{1, 2, \ldots, K\}}{\text{argmax}} \hat{p}_k(x) \]

  • TODO: first estimate the conditional distribution, then classify to the label with the highest estimated probability

Joint distribution of \(X\) and \(Y\):

            \(X = 1\)   \(X = 2\)   \(X = 3\)   \(X = 4\)
\(Y = A\)   0.12        0.01        0.04        0.14
\(Y = B\)   0.05        0.03        0.10        0.15
\(Y = C\)   0.09        0.06        0.08        0.13

Marginal distribution of \(X\):

\(X = 1\)   \(X = 2\)   \(X = 3\)   \(X = 4\)
0.26        0.10        0.22        0.42

Marginal distribution of \(Y\):

\(Y = A\)   \(Y = B\)   \(Y = C\)
0.31        0.33        0.36

##              A          B         C
## [1,] 0.2608696 0.39130435 0.3478261
## [2,] 0.1481481 0.07407407 0.7777778
##           A          B         C
## 1 0.2608696 0.39130435 0.3478261
## 2 0.1481481 0.07407407 0.7777778
##           A          B         C
## 1 0.2608699 0.39130496 0.3478251
## 2 0.1481478 0.07407414 0.7777781
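
As a worked illustration, the joint distribution of \(X\) and \(Y\) tabulated earlier in this section can be turned into the conditional distribution \(p_k(x)\), the Bayes classifier, and the Bayes error rate. The following is a minimal sketch in base R. The object names (`joint`, `cond_y_given_x`, and so on) are chosen here for illustration and do not come from the text, and the code is not the code that produced the printed output above.

```r
# Joint distribution P[X = x, Y = k], copied from the table in this section:
# rows are the classes of Y, columns are the values of X.
joint <- matrix(
  c(0.12, 0.01, 0.04, 0.14,
    0.05, 0.03, 0.10, 0.15,
    0.09, 0.06, 0.08, 0.13),
  nrow = 3, byrow = TRUE,
  dimnames = list(Y = c("A", "B", "C"), X = c("1", "2", "3", "4"))
)

# Marginal distributions: sum the joint over the other variable.
marg_x <- colSums(joint)  # 0.26 0.10 0.22 0.42
marg_y <- rowSums(joint)  # 0.31 0.33 0.36

# Conditional distribution p_k(x) = P[Y = k | X = x]:
# divide each column of the joint by P[X = x].
cond_y_given_x <- sweep(joint, 2, marg_x, FUN = "/")

# Bayes classifier: for each x, the class with the largest conditional probability.
bayes_class <- rownames(cond_y_given_x)[apply(cond_y_given_x, 2, which.max)]
names(bayes_class) <- colnames(cond_y_given_x)
bayes_class  # A, C, B, B for X = 1, 2, 3, 4

# Bayes error rate: 1 - E[max_k P[Y = k | X]], with the expectation taken over X.
bayes_error <- 1 - sum(apply(cond_y_given_x, 2, max) * marg_x)
bayes_error  # 0.57
```

Because the true distribution is known here, nothing is estimated: the error of 0.57 is the lowest error rate any classifier can achieve for this distribution.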

11.4 Modeling

11.4.1 Linear Models

  • TODO: use nnet::multinom()
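
Until this section is filled in, here is a rough sketch of how `nnet::multinom()` might be used to estimate \(\hat{p}_k(x)\) and then classify. The built-in `iris` data and the object names are assumptions made for illustration; they are not the data used elsewhere in this chapter.

```r
library(nnet)  # provides multinom(); nnet is a recommended package shipped with R

# Multinomial logistic regression for the three iris species (illustrative data).
# trace = FALSE only silences the optimizer's iteration log.
fit_multinom <- multinom(Species ~ ., data = iris, trace = FALSE)

# Estimated conditional probabilities, one column per class.
p_hat <- predict(fit_multinom, newdata = iris, type = "probs")

# Classify to the label with the highest estimated probability.
c_hat <- predict(fit_multinom, newdata = iris, type = "class")

head(p_hat)
table(predicted = c_hat, actual = iris$Species)
```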

11.4.2 k-Nearest Neighbors

  • TODO: use caret::knn3()
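
A possible starting point with `caret::knn3()`, again using `iris` purely as an assumed stand-in dataset. The choice `k = 5` is arbitrary here and would normally be tuned. For k-nearest neighbors, the estimated probability of a class is the proportion of the nearest neighbors that belong to it.

```r
library(caret)  # provides knn3(); assumes the caret package is installed

# k-nearest neighbors classifier on the illustrative iris data.
fit_knn <- knn3(Species ~ ., data = iris, k = 5)

# Estimated conditional probabilities: class proportions among the neighbors.
p_hat <- predict(fit_knn, newdata = iris, type = "prob")

# Classify to the label with the highest estimated probability.
c_hat <- predict(fit_knn, newdata = iris, type = "class")

head(p_hat)
table(predicted = c_hat, actual = iris$Species)
```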

11.4.3 Decision Trees

  • TODO: use rpart::rpart()
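
A similar sketch for `rpart::rpart()`, with `iris` once more assumed as example data and default tuning parameters. For a classification tree, the estimated probabilities are the class proportions in the terminal node that an observation falls into.

```r
library(rpart)  # provides rpart(); rpart is a recommended package shipped with R

# Classification tree on the illustrative iris data.
fit_tree <- rpart(Species ~ ., data = iris, method = "class")

# Estimated conditional probabilities: class proportions in each terminal node.
p_hat <- predict(fit_tree, newdata = iris, type = "prob")

# Classify to the label with the highest estimated probability.
c_hat <- predict(fit_tree, newdata = iris, type = "class")

head(p_hat)
table(predicted = c_hat, actual = iris$Species)
```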

11.5 MISC TODO STUFF