Computational protocol: Composite Biomarkers Derived from Micro-Electrode Array Measurements and Computer Simulations Improve the Classification of Drug-Induced Channel Block

Protocol publication

[…] Support vector classification (SVC) (Boser et al., ) is an adaptation of the support vector machine (SVM) method to a classification setting. Classification consists of assigning labels to inputs. The available data set, comprising both inputs and labels, is typically split into a training set used to build the classifier and a validation set used to test it. The inputs are often multidimensional and in our case correspond to the biomarkers, whether classical or composite. The labels are integers representing the classes to which the inputs are assigned; these classes are mutually exclusive, meaning each sample belongs to exactly one class. SVC is a so-called supervised method, since the labels are known, at least for the training set.

The main idea behind SVC is to maximize the margin between the inputs and the decision boundary (Boser et al., ). In the linear case, the decision boundary is a hyperplane of the input space. In general, however, a hyperplane is not sufficient to properly separate the samples according to their classes. A common way to obtain more complex decision boundaries is the so-called "kernel trick" (Schölkopf and Smola, ), which relies on a mapping from the input space to a higher-dimensional space where a separating hyperplane is more likely to exist. In the present case, the classes are "sodium antagonist," "calcium antagonist," and "potassium antagonist," associated with the labels 0, 1, and 2, respectively. Among the various possible kernels, a Gaussian kernel is employed in this work.

We used a Python implementation of SVC from the Scikit-learn machine learning library (Pedregosa et al., ), which itself relies on the LIBSVM library (Chang and Lin, ). For a given training set, a so-called classifier is built. The classifier is then called to predict the labels of the validation set samples, and the predictions are finally compared to the true labels.
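The workflow above (train/validation split, Gaussian-kernel SVC, prediction on the validation set) can be sketched as follows. This is a minimal illustration, not the protocol's actual code: the random clusters stand in for the biomarker inputs, and the split proportions and random seeds are arbitrary choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical stand-in for the biomarker inputs: 90 samples with 4 biomarkers each,
# labelled 0 (sodium antagonist), 1 (calcium antagonist), or 2 (potassium antagonist).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(30, 4)) for c in range(3)])
y = np.repeat([0, 1, 2], 30)

# Split the data set into a training set (used to build the classifier)
# and a validation set (used to test it).
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Gaussian (RBF) kernel SVC; standardizing the inputs beforehand is common practice.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))
clf.fit(X_train, y_train)

# Predict the labels of the validation samples for comparison with the true labels.
y_pred = clf.predict(X_val)
```

`kernel="rbf"` is Scikit-learn's name for the Gaussian kernel; the mapping to a higher-dimensional space is handled implicitly by the kernel trick.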
Several metrics exist to quantify prediction quality. Two are considered here: Cohen's kappa and the area under the receiver operating characteristic curve (AUC). Cohen's kappa is a single scalar designed to measure the performance of multi-class classifiers. Its value ranges from −1 (worst possible classifier) to 1 (perfect classifier), with 0 corresponding to a coin-flip classifier. The AUC is defined for each class and measures how well a classifier performs with respect to that class. Its value ranges from 0 (worst) to 1 (best), with 0.5 corresponding to a coin flip. Because the classification is repeated several times with different data set splittings, the classification metrics are summarized by their means and standard deviations. The "averaged AUC" is the average of all AUCs (one per class). Both metrics are described in detail in the Supplementary Material. We now present two different strategies for employing SVC in the context of drug classification. […]
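The repeated evaluation described above could be implemented along the following lines. The synthetic data, number of splits, and seeds are illustrative assumptions, not the protocol's actual settings; the per-class AUCs are averaged one-vs-rest, which matches the "one AUC per class" description.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.svm import SVC
from sklearn.metrics import cohen_kappa_score, roc_auc_score

# Synthetic 3-class data as a stand-in for the biomarker inputs (illustrative only).
X, y = make_classification(n_samples=150, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3, n_clusters_per_class=1,
                           random_state=0)

kappas, aucs = [], []
# Repeat the classification over several random train/validation splittings.
splitter = StratifiedShuffleSplit(n_splits=10, test_size=0.3, random_state=0)
for train_idx, val_idx in splitter.split(X, y):
    clf = SVC(kernel="rbf", probability=True).fit(X[train_idx], y[train_idx])
    y_pred = clf.predict(X[val_idx])

    # Cohen's kappa: a single scalar in [-1, 1] for the multi-class prediction.
    kappas.append(cohen_kappa_score(y[val_idx], y_pred))

    # One AUC per class (one-vs-rest), then averaged into the "averaged AUC".
    proba = clf.predict_proba(X[val_idx])
    aucs.append(roc_auc_score(y[val_idx], proba, multi_class="ovr", average="macro"))

# Summarize each metric by its mean and standard deviation over the splittings.
print(f"Cohen's kappa: {np.mean(kappas):.2f} +/- {np.std(kappas):.2f}")
print(f"averaged AUC:  {np.mean(aucs):.2f} +/- {np.std(aucs):.2f}")
```

Note that `probability=True` is needed so that `predict_proba` is available for the AUC computation.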

Pipeline specifications

Software tools Scikit-Learn, LIBSVM
Applications Miscellaneous, Neuroimaging analysis
Organisms Homo sapiens
Chemicals Diltiazem, Flecainide, Mexiletine