_PROBLEM CoEPrA-2006_Regression_002
_GROUP_NAME Levon Budagyan
_GROUP_MEMBERS Levon Budagyan
_ADDRESS levon@molsoft.com
_MODELING_PROCEDURE We used gapped pair counts as descriptors and PLS regression as the prediction method. For a given gap length l >= 0, the gapped pair count vector of a sequence has one coordinate for each ordered pair of alphabet symbols, i.e. 26 x 26 coordinates, one for each pair of letters. Each component holds the number of occurrences of that ordered pair in the sequence with exactly l symbols between its two members. E.g. for gap length l=2 and the pair (A,A), the component holds the number of A**A subsequences in the sequence, where * stands for any symbol. The descriptor vector of a sequence is the concatenation of its pair count vectors for all gap lengths from 0 up to a maximum gap length l0 (we used l0=7). In total, each descriptor vector has m = 26^2 (l0+1) components, i.e. m = 5408 for l0=7. PLS regression with a dot-product (linear) kernel was used.
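The descriptor construction described above can be sketched in Python as follows. This is a minimal illustration, not the code used for the submission: the function names, the choice of the uppercase letters A-Z as the 26-symbol alphabet, and the ordering of the coordinates are all assumptions made for the example. The dot kernel is shown only as the plain inner product of two descriptor vectors; the PLS fitting step itself is not reproduced here.

```python
from string import ascii_uppercase

# Assumed symbol set: the 26 uppercase letters A-Z (26 x 26 ordered pairs).
ALPHABET = ascii_uppercase
INDEX = {symbol: i for i, symbol in enumerate(ALPHABET)}
N = len(ALPHABET)


def gapped_pair_counts(seq, gap):
    """Count ordered symbol pairs separated by exactly `gap` symbols.

    Returns a list of N * N counts; the pair (a, b) is stored at
    position INDEX[a] * N + INDEX[b] (an assumed coordinate order).
    Symbols outside A-Z raise KeyError in this sketch.
    """
    counts = [0] * (N * N)
    # The second member of the pair sits gap + 1 positions to the right.
    for i in range(len(seq) - gap - 1):
        a, b = seq[i], seq[i + gap + 1]
        counts[INDEX[a] * N + INDEX[b]] += 1
    return counts


def descriptor(seq, max_gap=7):
    """Concatenate the pair count vectors for gaps 0..max_gap."""
    vec = []
    for gap in range(max_gap + 1):
        vec.extend(gapped_pair_counts(seq, gap))
    return vec


def dot_kernel(x, y):
    """Dot (linear) kernel: the inner product of two descriptor vectors."""
    return sum(a * b for a, b in zip(x, y))
```

With max_gap=7 the descriptor has 26^2 * 8 components, matching m = 26^2 (l0+1) for l0=7. For example, gapped_pair_counts("AXYA", 2) records one occurrence of the pattern A**A at the coordinate for the pair (A,A).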
_PREDICTION
Obj_00001 7.622
Obj_00002 7.860
Obj_00003 7.858
Obj_00004 7.818
Obj_00005 7.860
Obj_00006 8.030
Obj_00007 8.112
Obj_00008 7.820
Obj_00009 6.770
Obj_00010 6.770
Obj_00011 7.995
Obj_00012 6.770
Obj_00013 7.594
Obj_00014 7.652
Obj_00015 7.820
Obj_00016 7.879
Obj_00017 7.863
Obj_00018 6.871
Obj_00019 6.640
Obj_00020 7.551
Obj_00021 6.602
Obj_00022 7.597
Obj_00023 7.860
Obj_00024 6.602
Obj_00025 7.584
Obj_00026 7.792
Obj_00027 7.532
Obj_00028 7.820
Obj_00029 7.899
Obj_00030 8.018
Obj_00031 6.602
Obj_00032 7.860
Obj_00033 7.729
Obj_00034 7.551
Obj_00035 6.751
Obj_00036 7.869
Obj_00037 8.163
Obj_00038 7.691
Obj_00039 7.820
Obj_00040 6.770
Obj_00041 6.770
Obj_00042 7.680
Obj_00043 8.047
Obj_00044 7.839
Obj_00045 6.618
Obj_00046 8.047
Obj_00047 7.331
Obj_00048 7.860
Obj_00049 8.113
Obj_00050 6.770
Obj_00051 8.068
Obj_00052 7.373
Obj_00053 6.770
Obj_00054 8.088
Obj_00055 7.517
Obj_00056 6.580
Obj_00057 7.975
Obj_00058 7.358
Obj_00059 7.551
Obj_00060 7.820
Obj_00061 7.660
Obj_00062 8.094
Obj_00063 8.109
Obj_00064 7.820
Obj_00065 7.428
Obj_00066 7.874
Obj_00067 7.891
Obj_00068 7.023
Obj_00069 6.640
Obj_00070 7.551
Obj_00071 8.047
Obj_00072 7.551
Obj_00073 8.047
Obj_00074 7.654
Obj_00075 7.871
Obj_00076 6.574