_PROBLEM CoEPrA-2006_Regression_003_Dataset_2
_GROUP_NAME Levon Budagyan
_GROUP_MEMBERS Levon Budagyan
_ADDRESS levon@molsoft.com; Levon Budagyan 3366 N. Torrey Pines Ct., La Jolla, CA, 92037, USA; Molsoft (www.molsoft.com)
_MODELING_PROCEDURE We used a combination of gapped pair counts and amino acid composition bit strings. The gapped pair count vector of a sequence, for a given gap length l >= 0, is a vector whose coordinates are indexed by ordered pairs of alphabet symbols, i.e. it has 26 x 26 coordinates, one for each pair of letters. For each pair, the corresponding component holds the number of occurrences of that ordered pair separated by the given gap. E.g. for gap length l=2 and the alphabet pair (A,A), the corresponding component holds the number of A**A subsequences in the sequence, where * stands for any symbol. Descriptor vectors were composed by concatenating the pair count vectors for gap sizes from 0 up to a maximum gap length l0 (we used l0=2). In total, this gave m = 26^2 (l0+1) components in each sequence descriptor vector (m = 2028 for l0=2). In addition, each amino acid in the peptide was encoded with a simple bit pattern: 'A'->10000.., 'B'->01000.., 'Z'->..0001, and the resulting bit vector was added to the gapped pair count descriptor. SVM regression was then applied to the constructed descriptor set. All data transformations and analysis were performed using the ICM Pro 3.4 program (http://www.molsoft.com/icm_pro.html).
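The descriptor construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the ICM Pro implementation that was actually used. The function names (`gapped_pair_counts`, `one_hot`, `descriptor`) are hypothetical, and it assumes that "added" means the bit vector is appended to the pair count vector and that all peptides have the same length, so that descriptors are of equal size.

```python
from itertools import product
import string

# 26-letter alphabet, as in the description (assumption: upper-case A..Z).
ALPHABET = string.ascii_uppercase
PAIR_INDEX = {pair: i for i, pair in enumerate(product(ALPHABET, repeat=2))}
LETTER_INDEX = {c: i for i, c in enumerate(ALPHABET)}


def gapped_pair_counts(seq, gap):
    """Count ordered letter pairs separated by exactly `gap` positions."""
    counts = [0] * len(PAIR_INDEX)  # 26 x 26 coordinates
    for i in range(len(seq) - gap - 1):
        counts[PAIR_INDEX[(seq[i], seq[i + gap + 1])]] += 1
    return counts


def one_hot(seq):
    """Encode each residue as a 26-bit pattern: 'A'->1000.., 'Z'->..0001."""
    bits = []
    for c in seq:
        pattern = [0] * len(ALPHABET)
        pattern[LETTER_INDEX[c]] = 1
        bits.extend(pattern)
    return bits


def descriptor(seq, max_gap=2):
    """Concatenate pair counts for gaps 0..max_gap, then append the bit vector."""
    vec = []
    for gap in range(max_gap + 1):
        vec.extend(gapped_pair_counts(seq, gap))
    vec.extend(one_hot(seq))  # assumption: "added" means appended
    return vec
```

For a peptide of length n, this sketch yields 26^2 (l0+1) + 26 n components. The resulting vectors could then be passed to any SVM regression implementation.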
_PREDICTION
Obj_00001 7.167
Obj_00002 7.523
Obj_00003 6.171
Obj_00004 6.589
Obj_00005 7.638
Obj_00006 8.038
Obj_00007 7.518
Obj_00008 7.623
Obj_00009 8.121
Obj_00010 8.439
Obj_00011 6.418
Obj_00012 6.870
Obj_00013 6.779
Obj_00014 7.158
Obj_00015 6.794
Obj_00016 7.062
Obj_00017 6.093
Obj_00018 6.513
Obj_00019 6.217
Obj_00020 6.121
Obj_00021 6.396
Obj_00022 6.524
Obj_00023 6.524
Obj_00024 6.934
Obj_00025 6.532
Obj_00026 6.898
Obj_00027 7.151
Obj_00028 6.285
Obj_00029 6.524
Obj_00030 7.103
Obj_00031 6.694
Obj_00032 7.947
Obj_00033 7.390
Obj_00034 8.024
Obj_00035 6.954
Obj_00036 7.043
Obj_00037 7.447
Obj_00038 7.195
Obj_00039 7.600
Obj_00040 6.707
Obj_00041 7.145
Obj_00042 7.698
Obj_00043 7.705
Obj_00044 6.523
Obj_00045 7.377
Obj_00046 6.916
Obj_00047 7.317