Genomic Sweeping for Hypermethylated Genes

Liang Goh, Susan K. Murphy, Sayan Mukherjee, Terrence S. Furey
Institute for Genome Sciences & Policy, Duke University

A brief guide to using the software available...

findk: Algo 3.1 of paper
Finds stable clusters of the unlabeled majority. It returns a data struct containing information about the clusters and the ranking of the k values. Usually the first part of the code is run first, with a large range of k, to try to satisfy Csize > Pminority && Csize < Pfreq. The clusters are then tabulated to obtain the range of cluster sizes. This range is then used to set the parameters criteria1 and criteria2 in the second part of the code.

dvcq_mfold: Creation of Majority Class Sets & boosting (Fig 2 of paper)
This is the main code for cluster_boost. It uses the cluster information from findk to partition the unlabeled majority into m parts and then iteratively runs boost_mlp1 m times.

boost_mlp1: Algo 3.2 of paper
The boosting algorithm for the MLP is implemented through the weight-adjustment algorithms of MATLAB's newff, via dwtmse and wtmse. For more info, refer to MATLAB's documentation on Advanced Topics for the Neural Network Toolbox.

Have fun!
Liang
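
Appendix (illustrative sketch, not part of the package): the partition-then-train loop described for dvcq_mfold can be pictured roughly as below. This is a minimal Python sketch, not the MATLAB code itself. The names partition_by_cluster, run_per_partition and train_fn are hypothetical, and train_fn merely stands in for a call such as boost_mlp1. It assumes that each unlabeled majority sample already carries a cluster label (as findk would provide) and that every partition is paired with the same minority set.

```python
from collections import defaultdict


def partition_by_cluster(cluster_labels):
    """Group majority-sample indices by cluster label (hypothetical helper)."""
    parts = defaultdict(list)
    for idx, label in enumerate(cluster_labels):
        parts[label].append(idx)
    # Return the m parts in a stable order, sorted by cluster label.
    return [parts[label] for label in sorted(parts)]


def run_per_partition(cluster_labels, minority_indices, train_fn):
    """Call train_fn once per majority partition, paired with the minority set.

    train_fn is a placeholder for the per-partition learner; here it is
    any callable taking (majority_part, minority_indices).
    """
    results = []
    for part in partition_by_cluster(cluster_labels):
        results.append(train_fn(part, minority_indices))
    return results


# Made-up example: 6 majority samples in 3 clusters, 2 minority samples.
# Majority and minority indices refer to separate arrays.
labels = [0, 1, 0, 2, 1, 2]
minority = [0, 1]
sizes = run_per_partition(labels, minority, lambda maj, mino: (len(maj), len(mino)))
print(sizes)
```

Here the dummy train_fn only reports the sizes of the two sets it receives; in practice it would train and return one classifier per partition, giving m classifiers in total.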