_PROBLEM
CoEPrA-2006_Regression_001

_GROUP_NAME
Joao Aires-de-Sousa

_GROUP_MEMBERS
Goncalo Carrera
Sunil Gupta
Yuri Binev
Joao Aires-de-Sousa

_ADDRESS
REQUIMTE, CQFB, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal.
E-mail: jas@fct.unl.pt; Fax: +351 21 2948550

_MODELING_PROCEDURE
From the initial pool of 5787 descriptors provided by the organizers, 433 descriptors (file descriptors.txt) were retained after excluding:
- descriptors inter-correlated with a Pearson correlation coefficient above 0.62;
- descriptors with values in the test set outside the range of the training set.

With the selected descriptors, regression models were built with Support Vector Machines (SVMs), using an ensemble of 5 SVMs. The R program (version 2.2.1, 2005-12-20, r36812) was used with the kernlab library (version 0.8-1). After manual optimization of the parameters C and nu, the values C=8 and nu=0.008 were chosen on the basis of 3-fold cross-validation results for the training set. The default value of epsilon (0.1) was used, with a Gaussian radial basis function kernel. The output files are named regr_svmx_output.txt (x from 1 to 5), and the parameters are in the file regr_parameters_svm.txt.

Experiments with Random Forests were performed to select the most relevant descriptors for training the SVMs. These experiments always resulted in higher 3-fold cross-validation errors than the experiments with all 433 descriptors and no feature selection. This suggests that different mechanisms of action are involved, and led us to use the SVMs trained with the 433 descriptors.

The final predictions for the test set were obtained by averaging the individual predictions of the 5 SVMs in the ensemble.
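
The two descriptor filters and the final ensemble averaging described above can be illustrated with a minimal sketch. It is written in plain Python for illustration only and is not the R code used for this submission. The function names (`pearson`, `select_descriptors`, `ensemble_average`) are hypothetical, and several details are assumptions that the description does not specify: the correlation is computed on the training set only, its absolute value is compared with the 0.62 threshold, and correlated descriptors are removed greedily in column order (keeping the first of each correlated pair).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant column counts as uncorrelated
    return sxy / sqrt(sxx * syy)


def select_descriptors(train, test, threshold=0.62):
    """Return the column indices kept by the two filters.

    train, test: lists of rows (one row per object, one column per descriptor).
    """
    n_desc = len(train[0])
    train_cols = [[row[j] for row in train] for j in range(n_desc)]
    test_cols = [[row[j] for row in test] for j in range(n_desc)]
    kept = []
    for j in range(n_desc):
        lo, hi = min(train_cols[j]), max(train_cols[j])
        # Filter 1: drop descriptors whose test values leave the training range.
        if any(v < lo or v > hi for v in test_cols[j]):
            continue
        # Filter 2 (assumed greedy, order-dependent): drop a descriptor if it
        # correlates above the threshold with any descriptor already kept.
        if any(abs(pearson(train_cols[j], train_cols[k])) > threshold
               for k in kept):
            continue
        kept.append(j)
    return kept


def ensemble_average(predictions):
    """Average per-object predictions over the ensemble members.

    predictions: one list of predicted values per model, all the same length.
    """
    return [sum(vals) / len(vals) for vals in zip(*predictions)]
```

In this sketch, `select_descriptors` would be applied once to the full descriptor matrix, and `ensemble_average` would be given the five per-model prediction lists to produce the final value for each test object.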

_PREDICTION
Obj_00001 3.768
Obj_00002 5.828
Obj_00003 5.175
Obj_00004 4.974
Obj_00005 6.539
Obj_00006 6.149
Obj_00007 5.216
Obj_00008 4.857
Obj_00009 4.365
Obj_00010 6.051
Obj_00011 5.681
Obj_00012 5.818
Obj_00013 5.651
Obj_00014 5.767
Obj_00015 4.568
Obj_00016 5.159
Obj_00017 4.544
Obj_00018 5.634
Obj_00019 5.238
Obj_00020 4.751
Obj_00021 5.460
Obj_00022 5.616
Obj_00023 5.195
Obj_00024 6.626
Obj_00025 6.572
Obj_00026 5.569
Obj_00027 6.278
Obj_00028 5.665
Obj_00029 6.158
Obj_00030 5.529
Obj_00031 5.687
Obj_00032 5.399
Obj_00033 4.859
Obj_00034 4.897
Obj_00035 5.909
Obj_00036 4.650
Obj_00037 4.895
Obj_00038 4.884
Obj_00039 4.600
Obj_00040 6.418
Obj_00041 6.013
Obj_00042 4.625
Obj_00043 4.625
Obj_00044 5.531
Obj_00045 5.628
Obj_00046 4.916
Obj_00047 5.448
Obj_00048 5.796
Obj_00049 7.126
Obj_00050 6.013
Obj_00051 6.654
Obj_00052 5.706
Obj_00053 5.695
Obj_00054 4.944
Obj_00055 5.894
Obj_00056 5.328
Obj_00057 5.553
Obj_00058 4.703
Obj_00059 5.483
Obj_00060 4.723
Obj_00061 5.299
Obj_00062 5.410
Obj_00063 5.985
Obj_00064 4.903
Obj_00065 5.752
Obj_00066 5.461
Obj_00067 5.616
Obj_00068 5.140
Obj_00069 5.756
Obj_00070 4.942
Obj_00071 5.282
Obj_00072 5.423
Obj_00073 4.971
Obj_00074 4.796
Obj_00075 5.561
Obj_00076 4.637
Obj_00077 4.655
Obj_00078 5.839
Obj_00079 4.562
Obj_00080 5.942
Obj_00081 6.875
Obj_00082 5.005
Obj_00083 4.600
Obj_00084 5.708
Obj_00085 4.657
Obj_00086 5.014
Obj_00087 5.530
Obj_00088 4.427