_PROBLEM CoEPrA-2006_Classification_003 _GROUP_NAME Levon Budagyan _GROUP_MEMBERS Levon Budagyan _ADDRESS levon@molsoft.com; Levon Budagyan 3366 N. Torrey Pines Ct., La Jolla, CA, 92037, USA; Molsoft (www.molsoft.com) _MODELING_PROCEDURE We used gapped pair counts as descriptors and an SVM classifier as the prediction method. The gapped pair count vector of a sequence for a gap length l >= 0 is a vector whose coordinates are indexed by ordered pairs of alphabet symbols, i.e. it has 26 x 26 = 676 coordinates, one for each ordered pair of letters. For each pair, the corresponding vector component contains the number of occurrences of that ordered pair in the sequence with exactly l symbols between its two letters. E.g. for gap length l=2 and alphabet pair (A,A), the corresponding vector component contains the number of occurrences of the pattern A**A in the sequence, where * stands for any symbol. Descriptor vectors were built by concatenating the pair count vectors for all gap sizes from 0 up to a maximum gap length l0 (we used l0=3). In total, each sequence descriptor vector had m = 26^2 (l0+1) components, i.e. 2704 components for l0=3. An SVM classifier with a dot (linear) kernel was used. All data transformations and analysis were performed using the ICM Pro 3.4 program (http://www.molsoft.com/icm_pro.html).
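The descriptor construction described above can be sketched in a few lines of Python. This is only an illustrative reimplementation under stated assumptions, not the ICM Pro code that produced the predictions: the names ALPHABET, PAIR_INDEX, gapped_pair_counts, descriptor and dot_kernel are hypothetical, the alphabet is assumed to be the 26 uppercase letters A-Z, and the ordering of pairs inside the vector is an arbitrary choice made here.

```python
from itertools import product
from string import ascii_uppercase

# Assumption: the 26-letter alphabet is A-Z; the pair ordering is arbitrary.
ALPHABET = ascii_uppercase
PAIR_INDEX = {pair: i for i, pair in enumerate(product(ALPHABET, repeat=2))}


def gapped_pair_counts(seq, gap):
    """Count ordered letter pairs separated by exactly `gap` symbols."""
    counts = [0] * len(PAIR_INDEX)  # 26 x 26 = 676 components
    # The second letter of a pair sits gap + 1 positions after the first.
    for i in range(len(seq) - gap - 1):
        counts[PAIR_INDEX[(seq[i], seq[i + gap + 1])]] += 1
    return counts


def descriptor(seq, max_gap=3):
    """Concatenate pair count vectors for gaps 0..max_gap (l0 in the text)."""
    vec = []
    for gap in range(max_gap + 1):
        vec.extend(gapped_pair_counts(seq, gap))
    return vec


def dot_kernel(x, y):
    """Dot (linear) kernel between two descriptor vectors."""
    return sum(a * b for a, b in zip(x, y))


if __name__ == "__main__":
    example = "AXXAYYA"  # made-up sequence, not taken from the data set
    aa = PAIR_INDEX[("A", "A")]
    print(gapped_pair_counts(example, 2)[aa])  # number of A**A patterns
    print(len(descriptor(example)))            # 26^2 * (l0 + 1) for l0 = 3
```

With l0=3 every sequence maps to a vector of the same length regardless of its own length, which is what allows a plain dot kernel to compare sequences directly.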
_PREDICTION Obj_00001 +1 Obj_00002 +1 Obj_00003 +1 Obj_00004 +1 Obj_00005 +1 Obj_00006 -1 Obj_00007 -1 Obj_00008 -1 Obj_00009 -1 Obj_00010 -1 Obj_00011 -1 Obj_00012 -1 Obj_00013 +1 Obj_00014 -1 Obj_00015 -1 Obj_00016 +1 Obj_00017 +1 Obj_00018 -1 Obj_00019 +1 Obj_00020 -1 Obj_00021 +1 Obj_00022 +1 Obj_00023 +1 Obj_00024 -1 Obj_00025 -1 Obj_00026 -1 Obj_00027 -1 Obj_00028 +1 Obj_00029 -1 Obj_00030 +1 Obj_00031 +1 Obj_00032 -1 Obj_00033 -1 Obj_00034 -1 Obj_00035 +1 Obj_00036 -1 Obj_00037 -1 Obj_00038 +1 Obj_00039 +1 Obj_00040 +1 Obj_00041 -1 Obj_00042 -1 Obj_00043 -1 Obj_00044 -1 Obj_00045 -1 Obj_00046 -1 Obj_00047 -1 Obj_00048 -1 Obj_00049 -1 Obj_00050 -1 Obj_00051 +1 Obj_00052 -1 Obj_00053 +1 Obj_00054 +1 Obj_00055 -1 Obj_00056 +1 Obj_00057 +1 Obj_00058 +1 Obj_00059 -1 Obj_00060 -1 Obj_00061 -1 Obj_00062 +1 Obj_00063 +1 Obj_00064 +1 Obj_00065 -1 Obj_00066 -1 Obj_00067 +1 Obj_00068 +1 Obj_00069 -1 Obj_00070 +1 Obj_00071 +1 Obj_00072 -1 Obj_00073 +1 Obj_00074 +1 Obj_00075 +1 Obj_00076 +1 Obj_00077 -1 Obj_00078 -1 Obj_00079 +1 Obj_00080 -1 Obj_00081 +1 Obj_00082 +1 Obj_00083 +1 Obj_00084 -1 Obj_00085 +1 Obj_00086 +1 Obj_00087 -1 Obj_00088 +1 Obj_00089 +1 Obj_00090 -1 Obj_00091 -1 Obj_00092 +1 Obj_00093 +1 Obj_00094 -1 Obj_00095 -1 Obj_00096 -1 Obj_00097 -1 Obj_00098 +1 Obj_00099 +1 Obj_00100 -1 Obj_00101 -1 Obj_00102 +1 Obj_00103 -1 Obj_00104 +1 Obj_00105 +1 Obj_00106 -1 Obj_00107 -1 Obj_00108 +1 Obj_00109 -1 Obj_00110 -1 Obj_00111 +1 Obj_00112 +1 Obj_00113 -1 Obj_00114 -1 Obj_00115 +1 Obj_00116 -1 Obj_00117 +1 Obj_00118 +1 Obj_00119 -1 Obj_00120 -1 Obj_00121 +1 Obj_00122 -1 Obj_00123 +1 Obj_00124 +1 Obj_00125 -1 Obj_00126 +1 Obj_00127 +1 Obj_00128 +1 Obj_00129 +1 Obj_00130 -1 Obj_00131 +1 Obj_00132 +1 Obj_00133 +1