_PROBLEM CoEPrA-2006_Classification_002

_GROUP_NAME Marco Gori

_GROUP_MEMBERS
Marco Gori
Franco Scarselli
Gabriele Monfardini
Werner Uwents

_ADDRESS
Dipartimento di Ingegneria dell'Informazione
Via Roma, 56
53100 Siena - Italy

_MODELING_PROCEDURE
The modeling procedure uses three SVM classifiers whose outputs are combined according to their margins. The output of the procedure on a given input is the output of the SVM with the largest margin.

CLASSIFIER COMBINATION
More precisely, let o1, o2 and o3 be the outputs of the three SVM classifiers, m1, m2 and m3 the absolute values of the corresponding margins, tm1, tm2 and tm3 the sums of the absolute values of the margins over the prediction set for the three classifiers, and a1, a2 and a3 the accuracies achieved on the calibration set. The output of our system is then oj, where
j = argmax_i (ai * mi / tmi)

CLASSIFIER DESCRIPTION
The three SVM classifiers use polynomial kernels of degree 2. The classifiers differ only in their inputs.

CLASSIFIER 1
The input consists of the following features:
- For each amino acid of the input peptide, the probability that the amino acid is observed at its current position in a positive peptide (target 1) of the calibration set.
(8 features, one for each amino acid)
- For each amino acid of the input peptide, the probability that the amino acid is observed at its current position in a negative peptide (target -1) of the calibration set.
(8 features, one for each amino acid)
- For each amino acid of the input peptide, the logarithm of the number of occurrences of the amino acid at its current position in the positive peptides (target 1) of the calibration set.
(8 features, one for each amino acid)

CLASSIFIER 2
The input consists of the following features:
- A flat coding of the amino acid sequence. Each amino acid is coded by an integer.
(8 features, one for each amino acid)
- The 5144 descriptors provided by the organizers of the competition.

CLASSIFIER 3
The input consists of the following features, which are computed from the Miyazawa-Jernigan substitution matrix provided by the organizers of the competition:
- For each amino acid of the peptide, the probability that the amino acid substitutes for another amino acid observed at the same position in a positive peptide of the calibration set. These values can be computed from the Miyazawa-Jernigan substitution matrix and the probabilities used by classifier 1.
(8 features, one for each amino acid)
- For each amino acid of the peptide, the probability that the amino acid substitutes for another amino acid observed at the same position in a negative peptide of the calibration set. These values can be computed from the Miyazawa-Jernigan substitution matrix and the probabilities used by classifier 1.
(8 features, one for each amino acid)
- For each amino acid of the peptide, the corresponding diagonal element of the Miyazawa-Jernigan substitution matrix.
(8 features, one for each amino acid)
- For each amino acid of the input peptide, the logarithm of the number of occurrences of the amino acid at its current position in the positive peptides of the calibration set.
(8 features, one for each amino acid)

_PREDICTION
Obj_00001 +1
Obj_00002 +1
Obj_00003 +1
Obj_00004 -1
Obj_00005 +1
Obj_00006 +1
Obj_00007 +1
Obj_00008 -1
Obj_00009 +1
Obj_00010 -1
Obj_00011 -1
Obj_00012 -1
Obj_00013 +1
Obj_00014 +1
Obj_00015 +1
Obj_00016 -1
Obj_00017 +1
Obj_00018 -1
Obj_00019 -1
Obj_00020 -1
Obj_00021 -1
Obj_00022 -1
Obj_00023 +1
Obj_00024 -1
Obj_00025 -1
Obj_00026 -1
Obj_00027 +1
Obj_00028 +1
Obj_00029 -1
Obj_00030 +1
Obj_00031 -1
Obj_00032 +1
Obj_00033 +1
Obj_00034 -1
Obj_00035 -1
Obj_00036 +1
Obj_00037 +1
Obj_00038 -1
Obj_00039 +1
Obj_00040 +1
Obj_00041 -1
Obj_00042 -1
Obj_00043 -1
Obj_00044 -1
Obj_00045 +1
Obj_00046 -1
Obj_00047 -1
Obj_00048 -1
Obj_00049 +1
Obj_00050 -1
Obj_00051 +1
Obj_00052 +1
Obj_00053 +1
Obj_00054 +1
Obj_00055 +1
Obj_00056 -1
Obj_00057 -1
Obj_00058 +1
Obj_00059 -1
Obj_00060 +1
Obj_00061 -1
Obj_00062 +1
Obj_00063 +1
Obj_00064 +1
Obj_00065 -1
Obj_00066 +1
Obj_00067 -1
Obj_00068 +1
Obj_00069 -1
Obj_00070 +1
Obj_00071 +1
Obj_00072 +1
Obj_00073 -1
Obj_00074 +1
Obj_00075 +1
Obj_00076 -1
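
The combination rule described under CLASSIFIER COMBINATION can be illustrated with a short Python sketch. It is not the code used to produce the predictions listed here. The function name combine, the use of NumPy, and the array layout (one row per classifier, one column per peptide of the prediction set) are assumptions made for the example. The sketch also assumes that each classifier has at least one non-zero margin, so that tmi is never zero.

```python
import numpy as np

def combine(outputs, margins, accuracies):
    """Pick, for each peptide, the output of the classifier with the
    largest accuracy-weighted, normalized margin (hypothetical helper).

    outputs    -- shape (n_classifiers, n_peptides), entries +1 or -1 (o_i)
    margins    -- shape (n_classifiers, n_peptides), signed SVM margins
    accuracies -- shape (n_classifiers,), calibration-set accuracies (a_i)
    """
    outputs = np.asarray(outputs)
    abs_margins = np.abs(np.asarray(margins, dtype=float))      # m_i
    totals = abs_margins.sum(axis=1, keepdims=True)             # tm_i
    weights = np.asarray(accuracies, dtype=float)[:, None]      # a_i
    scores = weights * abs_margins / totals                     # a_i * m_i / tm_i
    winner = scores.argmax(axis=0)                              # j, per peptide
    return outputs[winner, np.arange(outputs.shape[1])]         # o_j

# Made-up numbers for three classifiers and two peptides.
margins = [[0.5, -0.5], [-1.0, 0.1], [0.2, 0.3]]
outputs = np.sign(margins).astype(int)
accuracies = [0.8, 0.9, 0.7]
print(combine(outputs, margins, accuracies))
```

Dividing by tmi puts the margins of the three classifiers on a comparable scale before the accuracy weighting is applied, which matters because the classifiers are trained on inputs of very different dimension.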