_PROBLEM CoEPrA-2006_Regression_001
_GROUP_NAME Levon Budagyan
_GROUP_MEMBERS Levon Budagyan
_ADDRESS levon@molsoft.com
_MODELING_PROCEDURE We used gapped pair counts as descriptors and PLS regression as the prediction method. For a given gap length l >= 0, the gapped pair count vector of a sequence is indexed by ordered pairs of alphabet symbols, i.e. it has 26 x 26 coordinates, one for each ordered pair of letters. Each component holds the number of occurrences of that ordered pair in the sequence with exactly l symbols between its two letters. E.g. for gap length l=2 and alphabet pair (A,A), the corresponding component is the number of occurrences of the pattern A**A in the sequence, where * stands for any symbol. Descriptor vectors were built by concatenating the pair count vectors for all gap lengths from 0 up to a maximum gap length l0 (we used l0=7). In total, each sequence descriptor vector had m = 26^2 (l0+1) components, i.e. m = 5408 for l0=7. PLS regression with a dot-product kernel was used.
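The descriptor construction described above can be sketched as follows. This is a minimal illustration, not the code used for the submission: the names `ALPHABET`, `gapped_pair_counts` and `descriptor`, the use of NumPy, and the ordering of components within the vector are all assumptions made for the example. The PLS regression step is not shown.

```python
import numpy as np

# Assumed 26-letter alphabet (A-Z), giving 26 x 26 ordered pairs.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
INDEX = {c: i for i, c in enumerate(ALPHABET)}
N = len(ALPHABET)


def gapped_pair_counts(seq, gap):
    """Count ordered symbol pairs with exactly `gap` symbols between them."""
    counts = np.zeros((N, N), dtype=int)
    step = gap + 1  # distance between the two positions of a pair
    for i in range(len(seq) - step):
        counts[INDEX[seq[i]], INDEX[seq[i + step]]] += 1
    return counts.ravel()  # flatten to a vector of 26^2 components


def descriptor(seq, max_gap=7):
    """Concatenate the pair count vectors for gaps 0..max_gap (l0 in the text)."""
    return np.concatenate(
        [gapped_pair_counts(seq, g) for g in range(max_gap + 1)]
    )


if __name__ == "__main__":
    d = descriptor("AXYAZZA")
    print(d.shape)                            # 26^2 * (7 + 1) components
    print(gapped_pair_counts("AXYAZZA", 2)[0])  # occurrences of A**A
```

With this layout, component 0 of each gap block corresponds to the pair (A,A), and a dot-product kernel between two sequences is simply the inner product of their descriptor vectors.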
_PREDICTION
Obj_00001 5.192
Obj_00002 5.802
Obj_00003 4.566
Obj_00004 4.967
Obj_00005 7.693
Obj_00006 5.264
Obj_00007 4.339
Obj_00008 4.333
Obj_00009 5.876
Obj_00010 4.630
Obj_00011 6.216
Obj_00012 5.113
Obj_00013 5.636
Obj_00014 4.256
Obj_00015 6.209
Obj_00016 5.691
Obj_00017 4.906
Obj_00018 5.767
Obj_00019 4.704
Obj_00020 5.223
Obj_00021 5.196
Obj_00022 6.155
Obj_00023 5.164
Obj_00024 4.835
Obj_00025 4.617
Obj_00026 7.149
Obj_00027 6.451
Obj_00028 5.457
Obj_00029 5.203
Obj_00030 4.731
Obj_00031 5.460
Obj_00032 5.914
Obj_00033 5.555
Obj_00034 6.328
Obj_00035 5.190
Obj_00036 6.006
Obj_00037 4.616
Obj_00038 5.651
Obj_00039 5.379
Obj_00040 4.195
Obj_00041 5.900
Obj_00042 4.523
Obj_00043 6.993
Obj_00044 4.711
Obj_00045 5.903
Obj_00046 5.294
Obj_00047 4.432
Obj_00048 5.725
Obj_00049 4.786
Obj_00050 5.036
Obj_00051 5.523
Obj_00052 5.680
Obj_00053 5.637
Obj_00054 4.778
Obj_00055 4.633
Obj_00056 5.365
Obj_00057 4.977
Obj_00058 4.476
Obj_00059 7.572
Obj_00060 4.333
Obj_00061 5.865
Obj_00062 4.661
Obj_00063 5.956
Obj_00064 4.739
Obj_00065 5.665
Obj_00066 5.292
Obj_00067 6.568
Obj_00068 4.449
Obj_00069 5.446
Obj_00070 4.441
Obj_00071 5.190
Obj_00072 5.222
Obj_00073 5.057
Obj_00074 5.764
Obj_00075 6.086
Obj_00076 4.518
Obj_00077 5.461
Obj_00078 5.502
Obj_00079 4.722
Obj_00080 6.174
Obj_00081 5.929
Obj_00082 5.947
Obj_00083 4.227
Obj_00084 5.197
Obj_00085 7.489
Obj_00086 5.159
Obj_00087 5.876
Obj_00088 5.873