Computational protocol: Pre-trained convolutional neural networks as feature extractors toward improved malaria parasite detection in thin blood smear images

Protocol publication

[…] The dataset consists of 27,558 cell images with equal instances of parasitized and uninfected cells. Positive samples contained Plasmodium; negative samples contained no Plasmodium but could contain other objects, including staining artifacts and impurities. We evaluated the predictive models through five-fold cross-validation. Cross-validation was performed at the patient level to reduce model bias and generalization error. The count of cells for the different folds is shown in . The images were re-sampled to 100 × 100, 224 × 224, 227 × 227, and 299 × 299 pixel resolutions to suit the input requirements of the customized and pre-trained CNNs, and were normalized to assist faster convergence. The models were trained and tested on a Windows® system with an Intel® Xeon® CPU E5-2640v3 2.60-GHz processor, 1 TB HDD, 16 GB RAM, a CUDA-enabled Nvidia® GTX 1080 Ti 11 GB graphics processing unit (GPU), MATLAB® R2017b, Python® 3.6.3, Keras® 2.1.1 with a TensorFlow® 1.4.0 backend, and CUDA 8.0/cuDNN 5.1 dependencies for GPU acceleration.
[...] We performed statistical analyses to choose the best model for deployment. One-way analysis of variance (ANOVA) is used to determine whether a statistically significant difference exists between the means of three or more independent, unrelated groups. One-way ANOVA tests the null hypothesis H0: μ1 = μ2 = ⋯ = μk, where μi is the mean of the parameter for group i and k is the total number of groups. If the test returns a statistically significant result, H0 is rejected and the alternative hypothesis (H1) is accepted, inferring that a statistically significant difference exists between the means of at least two of the groups under study. However, this parametric test is appropriate only when the underlying data satisfy the assumptions of independence of observations, absence of significant outliers, normality of data, and homogeneity of variances.
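The patient-level cross-validation described in the protocol can be illustrated with a minimal sketch. It is not the authors' code: the function name `patient_level_folds`, the round-robin assignment of patients to folds, and the toy patient identifiers are all assumptions made for illustration. The point it shows is that splitting is done over patients rather than over individual cell images, so that cells from one patient never appear in both the training and the test set of a fold.

```python
import random
from collections import defaultdict


def patient_level_folds(patient_ids, k=5, seed=0):
    """Return k (train_idx, test_idx) pairs in which no patient spans both sets.

    patient_ids[i] is the (hypothetical) patient identifier of image i.
    """
    # Group image indices by the patient they came from.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    # Shuffle patients (not images), then deal them round-robin into k folds.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    folds = [[] for _ in range(k)]
    for i, pid in enumerate(patients):
        folds[i % k].extend(by_patient[pid])

    # Each fold serves once as the test set; the remaining folds form the training set.
    splits = []
    for held_out in range(k):
        test_idx = sorted(folds[held_out])
        train_idx = sorted(
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        )
        splits.append((train_idx, test_idx))
    return splits


# Hypothetical toy data: 10 patients with 3 cell images each.
patient_ids = [f"P{n:02d}" for n in range(10) for _ in range(3)]
splits = patient_level_folds(patient_ids, k=5)
```

Because patients contribute different numbers of cells, folds built this way generally differ in cell count, which is consistent with the protocol reporting per-fold counts. If scikit-learn is available, `sklearn.model_selection.GroupKFold` offers a comparable group-aware split.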
When the assumptions of one-way ANOVA are violated, a non-parametric alternative such as the Kruskal-Wallis H test (also called one-way ANOVA on ranks) can be used. Like one-way ANOVA, this is an omnibus test: it cannot identify which specific groups differ significantly. A post-hoc analysis is needed to identify those groups. We performed the Shapiro–Wilk test to check for data normality and Levene's test to assess the homogeneity of variances of the performance metrics across the models under study. Statistical analyses were performed using the IBM® SPSS® statistical package (IBM SPSS Statistics for Windows, Version 23.0; IBM Corp., Armonk, NY, USA). […]
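To make the two omnibus tests concrete, the sketch below computes the one-way ANOVA F statistic and the tie-corrected Kruskal-Wallis H statistic in plain Python. The authors ran their analyses in SPSS, so this is only an illustration under stated assumptions: the function names and the per-fold accuracy values are hypothetical, and the code stops at the test statistics, since p-values additionally require the F and chi-square distributions.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples (one per group)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic."""
    # Pool all observations, remembering which group each one came from.
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for _, gi in pooled[i:j]:
            rank_sums[gi] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12 / (n * (n + 1)) * sum(
        r ** 2 / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    # Divide by the tie correction factor (equals 1 when there are no ties).
    return h / (1 - tie_term / (n ** 3 - n))


# Hypothetical per-fold accuracies for three models (five folds each).
model_scores = [
    [0.941, 0.952, 0.948, 0.939, 0.950],
    [0.955, 0.957, 0.949, 0.953, 0.958],
    [0.930, 0.936, 0.941, 0.928, 0.934],
]
f_stat = one_way_anova_f(model_scores)
h_stat = kruskal_wallis_h(model_scores)
```

In practice, SciPy provides `scipy.stats.f_oneway`, `scipy.stats.kruskal`, `scipy.stats.shapiro`, and `scipy.stats.levene`, which also return p-values. A typical workflow would check normality and homogeneity of variances first and fall back to Kruskal-Wallis when either check fails.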

Pipeline specifications

Software tools: TensorFlow, SPSS
Applications: Miscellaneous, Computational neuroscience modelling
Organisms: Toxoplasma gondii
Diseases: Hematologic Diseases, Malaria