_PROBLEM CoEPrA-2006_Classification_001
_GROUP_NAME Scott Oloff
_GROUP_MEMBERS Scott Oloff
_ADDRESS Boehringer Ingelheim, 900 Ridgebury Rd., Ridgefield, CT 06877
_MODELING_PROCEDURE
I used a newly developed machine learning approach called kScore, which is based on kNN and SVM technologies. The approach is kernel-free and predicts a test compound from all training compounds, weighted by distance, rather than from only its k nearest neighbors. The algorithm also incorporates generalization constraints, a soft-margin classifier (for classification), and an epsilon loss function (for regression), similar to those proposed by V. Vapnik. Each kScore model has a built-in applicability domain; a more detailed explanation of the method is being submitted for publication.

The descriptors of the original training set were normalized to the range 0 to 1. The training set was then randomly divided into 5 training/test splits, with each test set containing 20% of the compounds. For each training/test split, several kScore models were generated (using the CoEPrA-provided descriptors) to find the values of C and Epsilon that gave the best external prediction. The most predictive models were collected and used in a consensus fashion to predict the external test set. The average leave-one-out (LOO) training accuracies were 0.9 to 0.95, and the average test accuracies for each class varied between 0.8 and 1.0. The most predictive values of Epsilon and C varied from 0.6 to 0.8 and from 500 to 1000, respectively. The descriptor weights of the most predictive model are attached.
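
For illustration only, the workflow described above (0-to-1 normalization, 5 random splits with 20% held out, keeping the more predictive models, consensus prediction) might be sketched in Python as follows. This is not the kScore code, which is unpublished: `weighted_vote` is a simple inverse-distance vote over all training compounds used as a stand-in, it has no C or Epsilon parameters, and the function names, the `power` exponent, and the 0.8 accuracy threshold are assumptions made for this sketch.

```python
import random


def min_max_normalize(rows):
    """Scale each descriptor column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [[(v - lo) / sp if sp > 0 else 0.0
             for v, lo, sp in zip(row, lows, spans)] for row in rows]


def random_splits(n, n_splits=5, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated random splits."""
    rng = random.Random(seed)
    n_test = max(1, round(n * test_fraction))
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]


def weighted_vote(train_x, train_y, x, power=2.0):
    """Stand-in classifier (NOT kScore): every training compound votes
    for its own class (+1 or -1) with weight 1 / distance**power."""
    score = 0.0
    for tx, ty in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(tx, x))
        if d2 == 0.0:
            return ty  # exact match with a training compound
        score += ty / d2 ** (power / 2.0)
    return 1 if score >= 0 else -1


def build_consensus(X, y, min_test_accuracy=0.8, seed=0):
    """Keep one model per split whose held-out accuracy meets the
    (assumed) threshold; a model here is just its training subset."""
    models = []
    for train_idx, test_idx in random_splits(len(X), seed=seed):
        tx = [X[i] for i in train_idx]
        ty = [y[i] for i in train_idx]
        hits = sum(weighted_vote(tx, ty, X[i]) == y[i] for i in test_idx)
        if hits / len(test_idx) >= min_test_accuracy:
            models.append((tx, ty))
    return models


def consensus_predict(models, x):
    """Majority vote over the kept models; ties (or no models) give +1."""
    votes = sum(weighted_vote(tx, ty, x) for tx, ty in models)
    return 1 if votes >= 0 else -1
```

A call such as `consensus_predict(build_consensus(X, y), x_new)` returns +1 or -1 for a compound `x_new` that has been scaled with the same column minima and ranges as the training descriptors `X`.
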
The predictions with the lowest confidence are: Obj_00002, Obj_00019, Obj_00023, Obj_00028, Obj_00041, Obj_00052, Obj_00057, Obj_00059, Obj_00060, Obj_00068, Obj_00078.
_PREDICTION
Obj_00001 -1
Obj_00002 -1
Obj_00003 -1
Obj_00004 -1
Obj_00005 +1
Obj_00006 +1
Obj_00007 -1
Obj_00008 -1
Obj_00009 -1
Obj_00010 +1
Obj_00011 +1
Obj_00012 +1
Obj_00013 +1
Obj_00014 +1
Obj_00015 -1
Obj_00016 -1
Obj_00017 -1
Obj_00018 +1
Obj_00019 -1
Obj_00020 -1
Obj_00021 +1
Obj_00022 +1
Obj_00023 +1
Obj_00024 +1
Obj_00025 +1
Obj_00026 +1
Obj_00027 +1
Obj_00028 -1
Obj_00029 +1
Obj_00030 +1
Obj_00031 +1
Obj_00032 +1
Obj_00033 -1
Obj_00034 -1
Obj_00035 +1
Obj_00036 -1
Obj_00037 -1
Obj_00038 -1
Obj_00039 -1
Obj_00040 +1
Obj_00041 -1
Obj_00042 -1
Obj_00043 -1
Obj_00044 +1
Obj_00045 +1
Obj_00046 -1
Obj_00047 +1
Obj_00048 +1
Obj_00049 +1
Obj_00050 +1
Obj_00051 +1
Obj_00052 +1
Obj_00053 -1
Obj_00054 -1
Obj_00055 +1
Obj_00056 +1
Obj_00057 -1
Obj_00058 -1
Obj_00059 +1
Obj_00060 +1
Obj_00061 -1
Obj_00062 +1
Obj_00063 +1
Obj_00064 -1
Obj_00065 +1
Obj_00066 +1
Obj_00067 +1
Obj_00068 -1
Obj_00069 +1
Obj_00070 -1
Obj_00071 -1
Obj_00072 +1
Obj_00073 -1
Obj_00074 -1
Obj_00075 -1
Obj_00076 -1
Obj_00077 -1
Obj_00078 +1
Obj_00079 -1
Obj_00080 +1
Obj_00081 +1
Obj_00082 -1
Obj_00083 -1
Obj_00084 +1
Obj_00085 -1
Obj_00086 -1
Obj_00087 +1
Obj_00088 -1