_PROBLEM CoEPrA-2006_Regression_002
_GROUP_NAME Scott Oloff
_GROUP_MEMBERS Scott Oloff
_ADDRESS Boehringer Ingelheim, 900 Ridgebury Rd., Ridgefield, CT 06877

_MODELING_PROCEDURE
I used a newly developed machine learning approach called kScore, which builds on kNN and SVM techniques. The approach is kernel-free and predicts a test compound from all training compounds, weighted by distance, rather than from only its k nearest neighbors. The algorithm also incorporates generalization constraints, a soft-margin classifier (for classification), and an epsilon loss function (for regression) similar to those proposed by V. Vapnik. Each kScore model has a built-in applicability domain. A more detailed explanation of the method is being submitted for publication.

The descriptors of the original training set were normalized to the range 0 to 1. The training set was then randomly divided into 5 training/test splits, with each test set containing 20% of the compounds. For each training/test split, several kScore models were generated (using the CoEPrA-provided descriptors) to find the values of C and Epsilon that gave the best external prediction. The most predictive models were collected and used in a consensus fashion to predict the external test set. The average leave-one-out (LOO) training Q2 values were between 0.65 and 0.85, and the average test R2 values were also between 0.65 and 0.85. The most predictive values of Epsilon and C ranged from 0.0 to 0.6 and from 200 to 500, respectively. The descriptor weights of the most predictive model are attached.

Five objects fell just outside the models' applicability domain.
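The workflow described above (0-to-1 normalization, five random 80/20 splits, prediction from all training compounds weighted by distance, and consensus averaging) can be illustrated with a minimal Python sketch. This is not kScore: the method is unpublished, so the sketch substitutes plain inverse-distance weighting and leaves out the generalization constraints, the C and Epsilon parameters, the descriptor weights, and the applicability domain. All function names, the `power` parameter, and the reading of "5 training/test splits" as five independent random splits are assumptions made for illustration only.

```python
import math
import random


def min_max_normalize(rows):
    """Scale each descriptor column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def random_splits(n, n_splits=5, test_fraction=0.2, seed=0):
    """Return n_splits random (train_idx, test_idx) pairs (assumed scheme)."""
    rng = random.Random(seed)
    n_test = max(1, round(n * test_fraction))
    splits = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        splits.append((sorted(idx[n_test:]), sorted(idx[:n_test])))
    return splits


def distance_weighted_predict(train_x, train_y, query, power=2.0):
    """Predict from ALL training compounds using inverse-distance weights.

    Generic stand-in for illustration, not the kScore weighting scheme.
    """
    num = den = 0.0
    for x, y in zip(train_x, train_y):
        d = math.dist(x, query)
        if d == 0.0:
            return y  # query coincides with a training compound
        w = 1.0 / d ** power
        num += w * y
        den += w
    return num / den


def consensus_predict(models, query):
    """Average the predictions of several (train_x, train_y) models."""
    preds = [distance_weighted_predict(tx, ty, query) for tx, ty in models]
    return sum(preds) / len(preds)
```

In this sketch a "model" is simply the training subset from one split. A consensus prediction for a normalized query compound would be obtained by building one such subset per split from `random_splits` and passing the list to `consensus_predict`.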
If predictions for the five objects outside the applicability domain are nevertheless required for comparison in the competition, they are:
Obj_00012 7.6617
Obj_00047 7.2749
Obj_00058 7.2346
Obj_00065 5.8769
Obj_00068 5.8811

_PREDICTION
Obj_00001 7.678
Obj_00002 7.802
Obj_00003 7.807
Obj_00004 7.790
Obj_00005 7.800
Obj_00006 8.087
Obj_00007 7.962
Obj_00008 7.831
Obj_00009 6.331
Obj_00010 6.333
Obj_00011 8.012
Obj_00012 7.662
Obj_00013 7.538
Obj_00014 7.682
Obj_00015 7.801
Obj_00016 7.790
Obj_00017 7.729
Obj_00018 6.544
Obj_00019 7.322
Obj_00020 7.730
Obj_00021 7.662
Obj_00022 7.687
Obj_00023 7.797
Obj_00024 6.457
Obj_00025 7.593
Obj_00026 7.837
Obj_00027 7.694
Obj_00028 7.823
Obj_00029 7.878
Obj_00030 7.756
Obj_00031 7.440
Obj_00032 7.800
Obj_00033 7.624
Obj_00034 7.669
Obj_00035 6.440
Obj_00036 7.823
Obj_00037 7.812
Obj_00038 7.778
Obj_00039 7.796
Obj_00040 6.188
Obj_00041 6.260
Obj_00042 7.672
Obj_00043 7.877
Obj_00044 7.803
Obj_00045 7.694
Obj_00046 8.091
Obj_00047 7.275
Obj_00048 7.836
Obj_00049 8.054
Obj_00050 6.373
Obj_00051 8.058
Obj_00052 7.368
Obj_00053 6.761
Obj_00054 8.149
Obj_00055 7.702
Obj_00056 6.280
Obj_00057 7.769
Obj_00058 7.235
Obj_00059 7.598
Obj_00060 7.820
Obj_00061 7.673
Obj_00062 7.981
Obj_00063 8.046
Obj_00064 7.800
Obj_00065 5.877
Obj_00066 7.811
Obj_00067 7.817
Obj_00068 5.881
Obj_00069 7.608
Obj_00070 7.697
Obj_00071 7.983
Obj_00072 7.702
Obj_00073 7.839
Obj_00074 7.827
Obj_00075 7.828
Obj_00076 6.457