
Ten-pass cross-validation scripts.

tenpass/split-log-into-buckets:  Split a mass-check logfile into n
identically-sized buckets, evenly taking messages from all checked corpora and
preserving comments.  It does this evenly by running through all buckets
sequentially as each line is read.  Output files are named 'out-N.log'.

  usage: tenpass/split-log-into-buckets 10 < mass-check.log

10pass-run:  the workhorse.   Generate a corpus, run this from the 'masses'
directory and leave it overnight.  Note that you will need to change
NSBASE and SPBASE  at the top of the script, to point to the basename and
path of the split logfiles.

  usage: tenpass/10pass-run

10pass-compute-tcr:  compute TCR, SpamRecall and SpamPrecision based on results
data from 10pass-run.

  usage: tenpass/10pass-compute-tcr

