Table 1 Performance of different predictive models on the training/testing dataset of 737 sequences (T737) during 10-fold cross-validation and on an independent validation dataset of 185 sequences (V185)
| Predictive Model No. | siRNA Feature Name | No. of Features | PCC on T737 (10nCV) | PCC on V185 |
|---|---|---|---|---|
| 1 | Mononucleotide composition | 4 | 0.53 | 0.54 |
| 2 | Dinucleotide composition | 16 | 0.68 | 0.64 |
| 3 | Trinucleotide composition | 64 | 0.70 | 0.66 |
| 4 | Tetranucleotide composition | 256 | 0.69 | 0.65 |
| 5 | Pentanucleotide composition | 1024 | 0.68 | 0.63 |
| 6 | Binary | 76 | 0.55 | 0.56 |
| 7 | 1+2 | 20 | 0.67 | 0.63 |
| 8 | 1+2+3 | 84 | 0.70 | 0.63 |
| 9 | 1+2+3+4 | 340 | 0.71 | 0.65 |
| 10 | 1+2+3+4+5 | 1364 | 0.71 | 0.65 |
| 11 | 1+2+3+4+6 (ASPsiPredSVM) | 416 | 0.71 | 0.65 |
| 12 | 1+2+3+4+5+6 | 1440 | 0.71 | 0.65 |
| 13 | Thermodynamic feature | 21 | 0.41 | 0.30 |
| 14 | Secondary structure | 19 | 0.24 | 0.07 |
| 15 | 13+14 | 40 | 0.35 | 0.23 |
| 16 | 11+13 | 437 | 0.71 | 0.65 |
| 17 | 11+14 | 435 | 0.71 | 0.65 |
| 18 | 11+13+14 | 456 | 0.71 | 0.65 |
| 19 | ASPsiPredmatrix | Matrix based | Developed on rules-based studies | 0.63 |
  • PCC, Pearson correlation coefficient; 10nCV, 10-fold cross-validation; T737, training/testing dataset for 10-fold cross-validation; V185, independent validation dataset. PCC is computed between actual and predicted Effmut. The training/testing dataset was used to train the different predictive models, while the independent validation dataset was not used at any stage of training or testing.
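
The feature counts and the PCC metric in Table 1 can be illustrated with a minimal Python sketch. It rests on several assumptions that the table itself does not state: an RNA alphabet of A, C, G, U; k-mer composition computed as the fraction of overlapping windows; and the 76 binary features arising from a 19-nt sequence with 4 indicator bits per position. The function names (`kmer_composition`, `binary_pattern`, `hybrid_features`, `pearson`) and the example sequence are hypothetical and are not taken from the ASPsiPred implementation.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGU"  # assumed RNA alphabet


def kmer_composition(seq, k):
    """Fraction of each of the 4**k possible k-mers over overlapping windows."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(seq) - k + 1
    for i in range(windows):
        counts[seq[i:i + k]] += 1
    return [counts[m] / windows for m in kmers]


def binary_pattern(seq):
    """Four indicator bits per position (assumed encoding: 19 nt -> 76 features)."""
    return [1 if base == nt else 0 for base in seq for nt in ALPHABET]


def hybrid_features(seq):
    """Mono- to tetranucleotide composition plus binary pattern (as in model 11)."""
    feats = []
    for k in (1, 2, 3, 4):
        feats += kmer_composition(seq, k)
    return feats + binary_pattern(seq)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


example_seq = "AUGCUAGCUAGGCUAACGU"  # hypothetical 19-nt sequence
n_features = len(hybrid_features(example_seq))  # 4 + 16 + 64 + 256 + 76
```

Under these assumptions the hybrid vector has 416 entries, matching the feature count listed for model 11. In a cross-validation setting, `pearson` would be applied to the actual and predicted Effmut values of the held-out sequences.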