Table 4 Performance of the SVM models during 10-fold cross-validation using the LOTOCV method
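| S. No. | Gene Name | No. of ASP-siRNAs (Training Dataset) | No. of ASP-siRNAs (Validation Dataset) | PCC (10nCV) | PCC (IV) |
|---|---|---|---|---|---|
| 1 | APP | 907 | 15 | 0.71 | 0.88 |
| 2 | AR | 912 | 10 | 0.71 | 0.19 |
| 3 | COL1A1 | 912 | 10 | 0.71 | 0.49 |
| 4 | COL3A1 | 903 | 19 | 0.71 | 0.34 |
| 5 | COL6A3 | 911 | 11 | 0.70 | 0.24 |
| 6 | COL7A1 | 903 | 19 | 0.71 | 0.55 |
| 7 | HTT | 883 | 39 | 0.56 | 0.28 |
| 8 | KRAS | 844 | 78 | 0.68 | 0.31 |
| 9 | KRT12 | 884 | 38 | 0.71 | 0.48 |
| 10 | KRT5 | 884 | 38 | 0.71 | 0.24 |
| 11 | KRT6a | 903 | 19 | 0.70 | 0.31 |
| 12 | KRT9 | 830 | 92 | 0.63 | 0.26 |
| 13 | LRRK2 | 901 | 21 | 0.71 | 0.26 |
| 14 | Others | 844 | 78 | 0.74 | 0.20 |
| 15 | P. Luciferase | 865 | 57 | 0.71 | 0.23 |
| 16 | PPIB | 695 | 227 | 0.53 | 0.61 |
| 17 | PRNP | 904 | 18 | 0.71 | 0.79 |
| 18 | PSEN1 | 903 | 19 | 0.43 | 0.30 |
| 19 | SNCA | 906 | 16 | 0.71 | 0.50 |
| 20 | SOD1 | 881 | 41 | 0.53 | 0.34 |
| 21 | TGFBI | 903 | 19 | 0.55 | 0.64 |
| 22 | TP63 | 884 | 38 | 0.58 | 0.33 |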
  • ASP-siRNAs targeting a particular gene were assigned to the validation dataset, while sequences targeting all other genes were assigned to the training dataset. Each model was validated on its respective gene in the independent validation set. Standard HGNC gene symbols have been used. PCC is computed between the actual and predicted Effmut. The training dataset was used to train the predictive models, while the independent validation datasets were not used in any training algorithm. S. No., serial number; 10nCV, ten-fold cross-validation; IV, independent validation. PCC, Pearson correlation coefficient.
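The gene-wise split described in the note can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout `(gene, features, efficacy)`, the helper names `leave_one_gene_out` and `pearson`, and the toy values are all assumptions, and the SVM fitting step itself is omitted. The sketch only shows how one gene at a time is held out for validation while all other genes form the training set, and how a PCC between two value lists can be computed.

```python
from collections import defaultdict
from math import sqrt


def leave_one_gene_out(records):
    """Yield (gene, train, validation) splits, holding out one gene at a time.

    `records` is a list of (gene, features, efficacy) tuples; this layout is
    a hypothetical one chosen for illustration.
    """
    by_gene = defaultdict(list)
    for rec in records:
        by_gene[rec[0]].append(rec)
    for gene in sorted(by_gene):
        validation = by_gene[gene]
        # Every record targeting a different gene goes to the training set.
        train = [rec for rec in records if rec[0] != gene]
        yield gene, train, validation


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy records with made-up values (not data from the table).
toy = [
    ("APP", [0.1], 40.0),
    ("APP", [0.4], 55.0),
    ("HTT", [0.2], 30.0),
    ("HTT", [0.9], 80.0),
    ("SOD1", [0.5], 60.0),
]

splits = list(leave_one_gene_out(toy))
for gene, train, validation in splits:
    # Training and validation sizes always sum to the full dataset size.
    print(gene, len(train), len(validation))
```

In a full workflow, a model would be fitted on each `train` list and `pearson` applied to the actual and predicted efficacies of the matching `validation` list. Because every split partitions the same dataset, the training and validation counts sum to the same total for every held-out gene, as they do in each row of the table.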