Case Study: Spam Prediction
Modeling Techniques
- The following modeling techniques were used to build a spam prediction model. All models were tested against the validation dataset.
- Gains#: Semi-adaptive logistic using 23 variables selected by forward stagewise selection
- Gains#: Adaptive logistic using the same 23 variables selected by forward stagewise selection
- Linear logistic regression model using the 23 selected variables
- Linear discriminant analysis using the 23 selected variables
- K-nearest-neighbor classifier with K=5 using the 23 selected variables
- CART tree with 21 nodes, using the Gini index for splitting
- MARS model with all 57 variables as inputs. MARS selected 22 direct (32 terms total)
- A feed-forward MLP neural network with one hidden layer of 10 neurons
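
The CART entry above names the Gini index as its splitting criterion. As a rough illustration of what that criterion measures, here is a minimal pure-Python sketch. It is not the software used in the case study, and the function names and the toy word-frequency data are hypothetical. For a node with class proportions p_k, the Gini impurity is 1 - sum(p_k^2), and a candidate split is scored by the size-weighted impurity of its two child nodes.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gini(values, labels, threshold):
    """Size-weighted Gini impurity of the two children of `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values, labels):
    """Return (threshold, impurity) minimizing the weighted impurity.

    Candidates are midpoints between consecutive distinct values, so the
    input is assumed to contain at least two distinct values.
    """
    xs = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    scored = ((t, split_gini(values, labels, t)) for t in candidates)
    return min(scored, key=lambda pair: pair[1])


# Hypothetical toy data: frequency of one word per message, and a spam flag.
freq = [0.0, 0.1, 0.2, 0.9, 1.2, 1.5]
is_spam = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(freq, is_spam)
print(threshold, impurity)
```

A tree builder would repeat this search over every input variable at every node and keep the split with the lowest weighted impurity. The sketch covers a single variable only.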