Case Study: Spam Prediction

      Background
  • The spam dataset was originally provided by George Forman of Hewlett-Packard Laboratories in Palo Alto, CA. The dataset is available at ftp.ics.uci.edu. You may also download a copy here.
  • It contains information on 4601 emails, of which 1813 are categorized as spam and 2788 as legitimate.
  • It has 57 continuous ordinal predictors:
               - Percentage of words and characters in the email that match specific words and characters, such as 'free', 'address', 'business', '!', '?', and '#'
               - Average, sum, and maximum length of uninterrupted sequences of capital letters
  • The goal is to build a spam filter by predicting whether a given incoming email is spam.
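
The two kinds of predictors described above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the dataset: the function names, the tokenization rule (runs of letters and digits count as words), and the sample email are all assumptions made for the example.

```python
import re


def word_percentage(text, word):
    """Percentage of words in `text` equal to `word` (case-insensitive).

    Assumption: a "word" is any run of letters or digits.
    """
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not words:
        return 0.0
    hits = sum(1 for w in words if w.lower() == word.lower())
    return 100.0 * hits / len(words)


def char_percentage(text, ch):
    """Percentage of characters in `text` equal to the character `ch`."""
    if not text:
        return 0.0
    return 100.0 * text.count(ch) / len(text)


def capital_run_stats(text):
    """Average, sum, and maximum length of uninterrupted capital-letter runs."""
    runs = [len(run) for run in re.findall(r"[A-Z]+", text)]
    if not runs:
        return 0.0, 0, 0
    return sum(runs) / len(runs), sum(runs), max(runs)


# Hypothetical sample email, made up for this example.
email = "FREE offer!!! Claim your free prize NOW"

features = {
    "pct_word_free": word_percentage(email, "free"),
    "pct_char_exclamation": char_percentage(email, "!"),
    "capital_runs": capital_run_stats(email),
}
print(features)
```

Each email is reduced to numbers of this kind, so a model never sees the raw text, only the 57 numeric predictors.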

      Predictive Modeling
  • 1536 records were randomly selected for testing, and the remaining 3065 records were allocated to the training data.
  • Various modeling techniques were used to build a spam prediction model.  See more detail...
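
A random split of this size could be reproduced roughly as follows. This is a minimal sketch: the source does not say how the random selection was made, so the shuffling approach, the fixed seed, and the helper name `split_records` are assumptions for illustration only.

```python
import random


def split_records(records, n_test, seed=0):
    """Randomly pick `n_test` records for testing; the rest are for training.

    The seed is an assumption, used only to make the example repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Record indices stand in for the 4601 emails in the dataset.
record_ids = range(4601)
train, test = split_records(record_ids, n_test=1536)

print(len(train), len(test))
```

Holding the test records out of model building means the comparison reflects performance on emails the models have not seen.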

      Model Comparison
  • The comparison shows that Gains# has excellent prediction results with great interpretability.

Caution: One test does not prove that Gains# can always outperform other methods. It does, however, show that Gains# can compete with these techniques while being transparent and easy to interpret.

Customers Say ...

"I can't think of modeling software that can generate, modify or validate a response model faster; and it offers the precision, flexibility and elegance that one might think comes only from customized nonlinear modeling programs and a lot of sophisticated thought."

"Gains# shows great strength at the model building stage. Among the tools we used in our recent projects, it is the only one that can produce stable models, and it takes the least amount of effort."

"No other modeling software I have used identifies variable transformations as effectively as Gains#."

"Gains# is very fast to learn."
 
Copyright © 2001-2009 InfoDecipher Corp. All Rights Reserved.