_PROBLEM CoEPrA-2006_Classification_002

_GROUP_NAME Artem Cherkasov

_GROUP_MEMBERS
Emre Karakoc
Cenk Sahinalp
Artem Cherkasov

_ADDRESS
University of British Columbia, Medicine
Simon Fraser University, Computer Science
Vancouver, BC, Canada

_MODELING_PROCEDURE

Descriptors: Based on our previous experience with QSAR modeling of peptides and with general QSAR clustering and classification of bioactivity properties, we used the following strategy. First, we optimized the geometry of the studied peptides using the MMFF94 force field; carboxylic groups were deprotonated, amino groups were protonated, and partial charges were computed according to [1]. We then computed QSAR descriptors that describe the entire peptide molecule (global parameters), as well as descriptors for the constituent amino acids considered in the context of their peptide environment (we did not use the 'isolated amino acids' approximation). Thus, for all peptides in the testing and training sets of the Classification_002 problem, we initially calculated more than 400 3D and 2D QSAR parameters, which included:
- 50 global 'inductive' QSAR descriptors as described in [2].
- 10 local 'inductive' QSAR descriptors (computed toward the CA atom) for each amino acid of a given 8-mer, giving 80 additional 'inductive' QSAR descriptors.
- 260 global atomic type-specific 'inductive' QSAR descriptors (previously unpublished parameters), computed additively for the specific atomic types present in the studied peptides.
- ~90 conventional 3D and 2D global QSAR parameters implemented within the MOE modeling package [3].
All 'inductive' QSAR descriptors described above were calculated with our own SVL scripts for MOE; most of them can be freely downloaded through the SVL exchange.
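The descriptor counts listed above can be tallied with a short sketch. It is illustrative only: the block names, the column ordering, and the helper functions are hypothetical and are not part of the submission, and the MOE block is taken as exactly 90 columns although the text gives only an approximate count (~90).

```python
# Illustrative layout of one peptide's descriptor vector.
# Block names and their ordering are assumptions made for this sketch.
N_RESIDUES = 8          # the peptides are 8-mers
LOCAL_PER_RESIDUE = 10  # local 'inductive' descriptors per amino acid

BLOCKS = [
    ("global_inductive", 50),
    ("local_inductive", N_RESIDUES * LOCAL_PER_RESIDUE),  # 8 * 10 = 80
    ("atom_type_inductive", 260),
    ("moe_conventional", 90),  # the text says ~90; 90 is assumed here
]


def column_ranges(blocks):
    """Map each block name to its half-open [start, stop) column range."""
    ranges, start = {}, 0
    for name, width in blocks:
        ranges[name] = (start, start + width)
        start += width
    return ranges


def local_column(residue, k, ranges):
    """Column of local descriptor k (0-9) for residue position 0-7."""
    if not (0 <= residue < N_RESIDUES and 0 <= k < LOCAL_PER_RESIDUE):
        raise IndexError("residue or descriptor index out of range")
    return ranges["local_inductive"][0] + residue * LOCAL_PER_RESIDUE + k


RANGES = column_ranges(BLOCKS)
TOTAL = RANGES["moe_conventional"][1]  # total number of columns
```

With the assumed count of 90 MOE parameters, the total comes to 480 columns, consistent with the "more than 400" figure in the text.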
Descriptors Selection: As a first step, the kNN-based linear optimization method based on a distance measure [4] was used for the initial selection of the most relevant QSAR parameters. As a result, 31 global and local 'inductive' descriptors and conventional 2D descriptors were selected for QSAR modeling of the Classification_002 problem (the corresponding values can be found in the attached Excel spreadsheet).

Modeling Procedure: To build a predictive QSAR model on the set of training peptides, we used the PLS approach as implemented within MOE [3]. The quality of the PLS-based model was checked with Leave-One-Out (LOO) validation. The resulting PLS solution was applied to the external peptides of the Classification_002 problem, and the model outputs were converted to Boolean class labels using a threshold value of 7.77.

[1] Cherkasov, A. Inductive Electronegativity Scale. Iterative Calculation of Inductive Partial Charges. Journal of Chemical Information and Computer Sciences, 2003, 43, 2039-2047.
Cherkasov, A., Z. Shi, Y. Li, S.M. Jones, M. Fallahi, G.L. Hammond. 'Inductive' Charges on Atoms in Proteins: Comparative Docking with the Extended Steroid Benchmark Set and Discovery of a Novel SHBG Ligand. Journal of Chemical Information and Modeling, 2005, 45, 1842-1853.
[2] Cherkasov, A. 'Inductive' Descriptors. 10 Successful Years in QSAR. Current Computer-Aided Drug Design, 2005, 1, 21-42.
[3] Molecular Operating Environment, 2005, by Chemical Computing Group Inc., Montreal, Canada.
[4] Karakoc, E., A. Cherkasov, S.C. Sahinalp. Distance Based Algorithms for Small Biomolecule Classification and Structural Similarity Search. ISMB'06, 14th Annual International Conference on Intelligent Systems for Molecular Biology, Fortaleza, Brazil, 2006.
[5] SNNS: Stuttgart Neural Network Simulator, Version 4.0, University of Stuttgart, 1995.
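The validation and thresholding steps of the modeling procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the MOE workflow used for the submission: a one-descriptor least-squares line stands in for the PLS model, the cross-validated R^2 statistic is one common way to summarize an LOO run and is not named in the text, the toy data are hypothetical, and the convention that outputs at or above 7.77 map to +1 is assumed because the text does not state the direction of the cutoff.

```python
from statistics import mean

THRESHOLD = 7.77  # cutoff stated in the modeling procedure


def fit_line(xs, ys):
    """Least-squares line y = a + b*x (a stand-in for the PLS model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def loo_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return preds


def q2(ys, preds):
    """Cross-validated R^2: 1 - PRESS / total sum of squares."""
    my = mean(ys)
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return 1.0 - press / sum((y - my) ** 2 for y in ys)


def to_class(score, threshold=THRESHOLD):
    """Assumed convention: outputs at or above the threshold map to +1."""
    return 1 if score >= threshold else -1


# Hypothetical toy data: one descriptor value and one activity per peptide.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [6.0, 7.0, 8.0, 9.0, 10.0]

cv = loo_predictions(x_train, y_train)      # LOO predictions
a, b = fit_line(x_train, y_train)           # final model on all training data
labels = [to_class(a + b * x) for x in [0.5, 1.5, 2.5]]  # external peptides
```

In this sketch the final model is refitted on the full training set before it is applied to the external peptides, and the LOO predictions serve only to gauge model quality.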
_PREDICTION
Obj_00001 +1
Obj_00002 +1
Obj_00003 -1
Obj_00004 -1
Obj_00005 -1
Obj_00006 +1
Obj_00007 -1
Obj_00008 -1
Obj_00009 +1
Obj_00010 -1
Obj_00011 -1
Obj_00012 -1
Obj_00013 +1
Obj_00014 -1
Obj_00015 +1
Obj_00016 -1
Obj_00017 -1
Obj_00018 -1
Obj_00019 -1
Obj_00020 -1
Obj_00021 +1
Obj_00022 -1
Obj_00023 -1
Obj_00024 -1
Obj_00025 -1
Obj_00026 -1
Obj_00027 -1
Obj_00028 +1
Obj_00029 -1
Obj_00030 +1
Obj_00031 +1
Obj_00032 -1
Obj_00033 -1
Obj_00034 -1
Obj_00035 -1
Obj_00036 +1
Obj_00037 -1
Obj_00038 +1
Obj_00039 -1
Obj_00040 -1
Obj_00041 -1
Obj_00042 +1
Obj_00043 -1
Obj_00044 -1
Obj_00045 -1
Obj_00046 -1
Obj_00047 +1
Obj_00048 +1
Obj_00049 +1
Obj_00050 -1
Obj_00051 +1
Obj_00052 +1
Obj_00053 -1
Obj_00054 -1
Obj_00055 -1
Obj_00056 +1
Obj_00057 -1
Obj_00058 -1
Obj_00059 +1
Obj_00060 +1
Obj_00061 -1
Obj_00062 -1
Obj_00063 +1
Obj_00064 +1
Obj_00065 +1
Obj_00066 +1
Obj_00067 +1
Obj_00068 -1
Obj_00069 -1
Obj_00070 -1
Obj_00071 -1
Obj_00072 +1
Obj_00073 +1
Obj_00074 -1
Obj_00075 -1
Obj_00076 -1