Archive for April, 2010

Progress so far

Thursday, April 15th, 2010

Unfortunately, Kwaku's hard drive died, so we lost some work.

But this is what we’ve done so far.

  • Grab list of emails
  • Read emails onto a stack
    • Each email is concatenated into one (super) long string
    • Each email will be a spam email
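    • (define read
        (lambda (type)
          (lambda (state)
            ;; `in` is assumed to be an input port already opened on the email file
            (let loop ((temp ""))
              (if (eof-object? (peek-char in))
                  ;; at end of input, push the accumulated string onto the `type` stack
                  (push temp type state)
                  ;; otherwise append the next character and keep reading
                  (loop (string-append temp (string (read-char in)))))))))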
  • GP will try to evolve regexes for the emails (in progress)
    • One problem we have to figure out is how to keep our system from trying to evolve one gigantic catch-all regex (which just fails horribly)
    • We have some ideas (actually, not really) but need to talk to Lee
  • Fitness function will apply the regex to another sample of emails (in progress)
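
To make the fitness idea a bit more concrete, here is a rough sketch of what scoring one evolved regex against a held-out sample might look like. It is not our actual Schush code: it assumes a Racket (PLT Scheme) environment for the built-in regexp functions, and the names fitness, count-matches, spam-sample and ham-sample are made up for illustration. It also assumes we have a few non-spam emails to test against, which our current plan (spam only) does not include. Without them, a catch-all regex would score perfectly.

```scheme
#lang racket

;; Hypothetical held-out samples; real ones would come from our email list.
(define spam-sample
  (list "buy cheap meds now"
        "you have won a free prize"
        "cheap loans approved today"))
(define ham-sample
  (list "lunch on thursday?"
        "notes from the gp meeting"))

;; Count how many emails in `emails` the compiled regex `rx` matches anywhere.
(define (count-matches rx emails)
  (length (filter (lambda (email) (regexp-match? rx email)) emails)))

;; Fitness = spam emails caught minus non-spam emails wrongly caught
;; (higher is better). A pattern string that is not a valid regex gets
;; the worst possible score instead of crashing the run.
(define (fitness pattern spam ham)
  (with-handlers ((exn:fail? (lambda (e) (- (length ham)))))
    (let ((rx (regexp pattern)))
      (- (count-matches rx spam) (count-matches rx ham)))))
```

With these toy samples, (fitness "cheap|free" spam-sample ham-sample) gives 3, while the catch-all (fitness ".*" spam-sample ham-sample) only gives 1 because it also matches both non-spam emails. That penalty is one possible way to discourage the gigantic catch-all regex, though we have not settled on it.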

New Timeline

Wednesday, April 7th, 2010

This Week:
– Finish implementing RegEx as a stack in Schush
– Acquire list of emails that are classified as spam (either from database or from actual email collection as marked)

Next Week:
– Figure out a way of extracting relevant lexical information from email bodies, and how to divide it into strings
– Decide on a set of operations and code modification types that would be useful for creating a filter

Week after that:
– Try to get something running in some way
– Pray that it works

Week after the week after that:
– Find out a way to realistically implement it
– Analyze methods that were found using Genetic Methods

Last Week?
– Compile findings
– Package everything somehow (finding a way to implement it, put it online for other people to use, etc.)

-Ted