Computational protocol: In silico toxicology: computational methods for the prediction of chemical toxicity

Protocol publication

[…] Structural alerts (SAs), also called toxicophores or toxic fragments, are chemical structures that indicate or are associated with toxicity. An SA can consist of a single atom or of several connected atoms. A combination of SAs may contribute more to toxicity than a single SA. SAs are often used in rules of the form ‘if A is B then T,’ where A is an SA, B is the value of the SA, and T is the toxicity prediction with an assigned certainty level, as illustrated in the following example:

IF (chemical_substructure) IS (present) THEN (skin_sensitizer IS certain)

There are two main types of rule‐based models that we will consider: human‐based rules (HBRs) and induction‐based rules (IBRs). HBRs are derived from the knowledge of field experts or from the literature, whereas IBRs are derived computationally. HBRs are more accurate but are limited to human knowledge, which could be incomplete or biased. Moreover, updating HBRs is often impractical because it requires detailed literature analysis. In contrast, IBRs can be generated efficiently from large datasets. IBRs may propose hypotheses about associations between chemical structural properties (or their combinations) and toxicity endpoints that might not be identified through human insight. IBRs are implemented using probabilities to determine whether SAs correspond to the toxic or non‐toxic class. Hybrid rule systems that contain both IBRs and HBRs are also possible, with new rules being generated computationally.

SAs are easy to interpret and implement. They are useful in drug design for determining how drugs should be altered to reduce their toxicity. Using structure to predict toxicity also allows identifying the structure of potential metabolites. However, SAs have a number of limitations. SAs use only binary features (e.g., chemical structures are either present or absent) and only qualitative endpoints (e.g., carcinogenic or non‐carcinogenic).
SAs do not provide insights into the biological pathways of toxicity and may not be sufficient for predicting toxicity. Depending on the concurrent absence or presence of other chemical properties, toxicity may decrease or increase. The list of SAs and rules may be incomplete, which may cause a large number of false negatives (i.e., toxic chemicals predicted as non‐toxic) in predictions. The last point is particularly important. It is necessary to understand how to interpret the output of SA models. If a chemical does not include SAs or does not match any toxicity rules, this does not indicate non‐toxicity. This is especially true for HBRs, which usually include SAs or rules that indicate toxicity but not SAs or rules that indicate non‐toxicity. Therefore, in developing such models, it is necessary to ensure that the list of SAs and rules is comprehensive and that it is refined when more experimental data become available. However, there should be a balance between the comprehensiveness of the list of SAs and rules and its predictive power. If SAs and rules are diverse, they can be applied to a large number of chemicals, but this may increase false positives (i.e., non‐toxic chemicals predicted as toxic). Conversely, if they are too narrow, they can be applied only to a small group of chemicals, and this may increase false negatives (i.e., toxic chemicals predicted as non‐toxic).

An example of an SA list for skin sensitization was published in 1982 by Dupuis and Benezra. Another SA list was proposed by Ashby and Tennant in 1988 to predict carcinogenicity and mutagenicity. One of the most developed lists of carcinogenic SAs was proposed by Benigni and Bossa in the Organisation for Economic Co‐operation and Development (OECD) Quantitative Structure‐Activity Relationship (QSAR) Toolbox and in Toxtree. Recently, Benigni and Bossa published a new list of non‐genotoxic carcinogenic SAs.
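
As a minimal sketch of how a rule list of the form ‘IF alert IS present THEN endpoint IS certainty’ might be applied, the Python below screens a chemical that is represented, for simplicity, as a set of fragment labels. The rule entries, the fragment labels, and the names `ALERT_RULES`, `screen`, and `interpret` are hypothetical and serve only as illustration; a real system would match substructures against a curated alert list. An empty result is reported as inconclusive, not as non‐toxic, in line with the interpretation caveat discussed above.

```python
from typing import NamedTuple


class AlertRule(NamedTuple):
    alert: str       # label of the structural alert (hypothetical)
    endpoint: str    # toxicity endpoint the rule predicts
    certainty: str   # certainty level attached to the prediction


# Hypothetical rule list for illustration; not a validated alert set.
ALERT_RULES = [
    AlertRule("alert_A", "skin_sensitizer", "certain"),
    AlertRule("alert_B", "mutagen", "probable"),
]


def screen(fragments, rules=ALERT_RULES):
    """Return every rule whose alert is present in the chemical's fragments."""
    return [rule for rule in rules if rule.alert in fragments]


def interpret(hits):
    """Translate rule hits into a cautious, human-readable outcome."""
    if not hits:
        # No matching alert is NOT evidence of non-toxicity.
        return "inconclusive: no alert matched"
    return "; ".join(f"{h.endpoint} IS {h.certainty}" for h in hits)


if __name__ == "__main__":
    print(interpret(screen({"alert_A", "methyl"})))
    print(interpret(screen({"methyl"})))
```

Keeping `screen` and `interpret` separate makes the false‐negative caveat explicit in one place: the absence of hits is mapped to an inconclusive outcome and never to a non‐toxic call.
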
Other lists of SAs and rule‐based models were developed for hepatotoxicity, cytotoxicity, irritation/corrosion of skin and eye, and skin sensitization. Several systems (also called ‘expert systems’) provide pre‐built rule‐based models and SA lists, for example, the Oncologic Cancer Expert System (OCES), Toxtree, Derek Nexus, HazardExpert, and Meteor. Other tools that can extract SAs from datasets that contain toxic or non‐toxic chemicals have been reviewed in the literature, such as computer assisted structure elucidation (CASE), prediction of activity spectra for substances (PASS), and categorical‐structure activity relationship (cat‐SAR). Additionally, there are several approaches for extracting the longest frequent molecular substructures, such as Apriori (based on breadth‐first search) and pattern growth (based on depth‐first search). Algorithms that implement the pattern growth approach have also been reviewed in the literature, such as molecular fragment miner (MoFa), graph‐based substructure pattern mining (gSpan), fast frequent subgraph mining (FFSM), and Graph/Sequence/Tree Extraction (Gaston). Significant substructures capable of discriminating between toxic and non‐toxic chemicals can be extracted using an emerging chemical pattern approach as explained in ref. . [...]

A chemical category is a group of chemicals whose properties and toxicity effects are similar or follow a similar pattern. Chemicals in the category are also called source chemicals. The OECD Guidance on Grouping of Chemicals lists several methods for grouping, such as chemical identity and composition, physicochemical and ADME properties, mechanism of action (MoA), and chemical/biological interactions. Structural similarity is described in the OECD guidelines as the starting point for grouping, but it is also criticized for lacking a ‘scientifically supportable basis’ for grouping; it can be used if impurities or other constituents in the chemical composition would not change toxicity.
Read‐across is a method of predicting the unknown toxicity of a chemical using similar chemicals (called chemical analogs) with known toxicity from the same chemical category. Trend analysis is a method of predicting the toxicity of a chemical by analyzing toxicity trends (increase, decrease, or constant) of tested chemicals. A hypothetical example of trend analysis shows that when carbon chain length (CCL) increases, acute aquatic toxicity increases (Figure ). Here, we focus on the read‐across method. A summary of different parameters that must be considered when designing a read‐across model is depicted in Figure and explained later. Note, however, that the points discussed are similarly applicable to trend analysis.

There are two ways to develop a read‐across method: the analog approach (AN) (called one‐to‐one), which uses one or a few analogs, and the category approach (CA) (called many‐to‐one), which uses many analogs. AN may be sensitive to outliers because two analogs may have different toxicity profiles. Using many analogs for CA is useful to detect trends within a category and may increase confidence in the toxicity predictions. CA requires defining a category boundary to determine whether a chemical belongs to the category and implementing a ‘combination of predictions’ method for analogs that have conflicting toxicity profiles. Predictions can be combined using (if applicable) the minimum, maximum, mode, median, or average, or linear, quadratic, or other nonlinear combinations of the predictions. Read‐across is qualitative if the toxicity endpoint is qualitative; otherwise, it is quantitative. Also, interpolation using source chemicals surrounding the target chemical (see Figure ) is better than extrapolation from one side.
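
The difference between interpolation and extrapolation in a trend analysis can be sketched in a few lines of Python. The example below assumes a single numeric descriptor (carbon chain length) and a locally linear trend between neighbouring source chemicals; the data points, the toxicity units, and the function name `trend_predict` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from bisect import bisect_left


def trend_predict(sources, target_x):
    """Predict toxicity at target_x from (descriptor, toxicity) source pairs.

    Returns (prediction, mode), where mode is "measured", "interpolation",
    or "extrapolation". Assumes distinct descriptor values and a trend that
    is linear between neighbouring source chemicals.
    """
    pts = sorted(sources)
    if len(pts) < 2:
        raise ValueError("at least two source chemicals are required")
    xs = [x for x, _ in pts]

    if xs[0] <= target_x <= xs[-1]:
        i = bisect_left(xs, target_x)
        if xs[i] == target_x:
            return pts[i][1], "measured"
        # Target is bracketed by source chemicals on both sides.
        (x0, y0), (x1, y1) = pts[i - 1], pts[i]
        mode = "interpolation"
    else:
        # Target lies outside the tested range: only one side is covered.
        (x0, y0), (x1, y1) = (pts[0], pts[1]) if target_x < xs[0] else (pts[-2], pts[-1])
        mode = "extrapolation"

    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (target_x - x0), mode


# Hypothetical source chemicals: (carbon chain length, toxicity score).
SOURCES = [(4, 1.0), (5, 1.5), (7, 2.5), (8, 3.0), (10, 4.0)]

if __name__ == "__main__":
    print(trend_predict(SOURCES, 6))   # bracketed by CCL 5 and CCL 7
    print(trend_predict(SOURCES, 12))  # beyond the longest tested chain
```

Returning the mode alongside the value lets a caller treat extrapolated predictions with lower confidence than interpolated ones.
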
In Figure , interpolation is used with the chemical that has a CCL of 6, but extrapolation is used with the chemical that has a CCL of 12.

Identifying similar chemicals can be done in two steps: representing chemicals as feature vectors of chemical properties, and then calculating the similarity of chemicals. The first step is implemented using either binary or holographic fingerprints. A binary fingerprint is a feature vector of binary bits representing presence (1) or absence (0) of a property (e.g., presence of a methyl group). A holographic fingerprint, in contrast, uses the frequency of properties (e.g., number of methyl groups). Continuous chemical properties (e.g., melting point) can be used as well. A hierarchy of categories and subcategories can be better than a single feature vector. At each level of the hierarchy, a property is applied for category formation. Subsequently, categories are divided using another property to generate subcategories, and so on. The hierarchy can allow for investigating the significance of properties and can simplify model interpretation. An example of hierarchical categories is provided in ref. . Statistical similarity of two chemicals can be calculated using different types of distances, such as Hamming, Euclidean, cosine, Mahalanobis, or Tanimoto distance, or using linear or nonlinear relationships of the features.

There are several advantages of read‐across. Read‐across is transparent and easy to interpret and implement. It can model quantitative and qualitative toxicity endpoints, and it allows a wide range of descriptors and similarity measures to be used to express similarity between chemicals. However, there are also limitations. Statistical similarity measures do not provide biological insight into toxicity. Moreover, complex similarity measures may complicate model interpretation.
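
The two steps described above (fingerprint representation, then similarity calculation) and the subsequent combination of analog predictions can be sketched as follows. This is an illustrative sketch under stated assumptions: binary fingerprints are stored as Python sets of ‘on’ bit indices, similarity is the Tanimoto coefficient, and the fingerprints, toxicity values, threshold, and the name `read_across` are all hypothetical.

```python
from statistics import median


def tanimoto(a, b):
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def read_across(target, sources, k=3, min_similarity=0.5, combine=median):
    """Predict toxicity of `target` from its most similar source chemicals.

    sources: list of (fingerprint, toxicity) pairs with known toxicity.
    At most k analogs are used, and only those at or above min_similarity
    (a simple stand-in for a category boundary). Returns None when no
    source chemical qualifies as an analog.
    """
    ranked = sorted(sources, key=lambda s: tanimoto(target, s[0]), reverse=True)
    analogs = [s for s in ranked[:k] if tanimoto(target, s[0]) >= min_similarity]
    if not analogs:
        return None
    return combine([toxicity for _, toxicity in analogs])


# Hypothetical fingerprints (sets of on-bit indices) and toxicity values.
TARGET = {1, 2, 3, 4}
SOURCES = [
    ({1, 2, 3, 4, 5}, 2.0),   # similarity 0.8
    ({1, 2, 3}, 3.0),         # similarity 0.75
    ({1, 2, 7, 8}, 10.0),     # similarity ~0.33, excluded
    ({2, 3, 4, 9}, 4.0),      # similarity 0.6
]

if __name__ == "__main__":
    print(read_across(TARGET, SOURCES))
```

For a qualitative endpoint, the same function can be called with class labels as toxicity values and `combine=statistics.mode`, which corresponds to a majority vote among the analogs.
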
In practice, read‐across uses small datasets compared to other approaches such as QSAR because there are usually only a few analogs for a given chemical. Additionally, accuracy depends on the number and choice of analogs, similarity metrics, strength of the chemicals’ similarity, chemical properties, and category boundaries. These parameters are very subjective, mutually dependent, and endpoint‐specific, and may require expert opinions. Moreover, this approach could be inapplicable or inaccurate if analogs have conflicting toxicity profiles or the number of analog chemicals is insufficient. In these cases, the QSAR approach can be used. Read‐across has been applied to predict carcinogenicity, hepatotoxicity, aquatic toxicity, reproductive toxicity, skin sensitization, and environmental toxicity. Examples of tools implementing read‐across are the OECD QSAR Toolbox, Toxmatch, Toxtree, AMBIT, AmbitDiscovery, AIM, DSSTox, and ChemIDplus. A detailed explanation of some of these tools is available in refs. , , , , , , , . [...]

Quantitative structure–activity relationship (QSAR) is a family of models that use molecular descriptors to predict chemicals’ toxicity. It is assumed that chemicals that fit the same QSAR model may work through the same mechanism. A general QSAR model predicts toxicity ($T$) from a feature vector of chemical properties ($\theta_P$) using a function $f$:

$$T = f(\theta_P)$$

A local QSAR is generated from congeneric chemicals (i.e., similar chemicals), whereas a global QSAR is generated from diverse chemicals. Local QSARs are more accurate because they are customized for specific chemicals. However, there is an overhead to developing a local QSAR for each type of chemical. Therefore, global QSARs are more practical but may be less accurate.
Local QSARs can also provide insight into the MoA of specific chemicals, which global QSARs may overlook.

Quantitative structure–toxicity relationship (QSTR) and quantitative structure–property relationship (QSPR) models are QSAR models that predict toxicity and chemical properties, respectively. Structure–activity relationships (SARs) are used for categorical endpoints. There are different types of models in the QSAR family, as summarized in Supplementary Table S3.

There are two main steps to develop a QSAR model: generating molecular descriptors and then generating models to fit the data. Several types of molecular descriptors can be used to describe chemicals, as summarized in Supplementary Table S3. Because the number of candidate descriptors can be large, feature selection algorithms based on, for example, simulated annealing, genetic algorithms, or principal component analysis can be used. If there are only a small number of descriptors, two‐dimensional scatter plots of each descriptor versus the biological activity can help identify significant descriptors (Figure ).

There are several types of algorithms to generate QSAR models: linear models, such as those based on linear regression analysis, multiple linear regression, and partial least squares for continuous endpoints and linear discriminant analysis for categorical endpoints; nonlinear models, such as artificial neural networks or support vector machines; and data‐driven models, such as those based on decision trees, clustering, naïve Bayes, and k‐nearest neighbors. A comparison of different machine learning and regression models is provided in ref. . Linear models are simpler and, in general, require tuning fewer parameters than nonlinear models. However, many relationships between chemicals and toxicity are nonlinear. Therefore, nonlinear models are commonly used for developing QSARs.
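
As a numerical counterpart to inspecting scatter plots of each descriptor against activity, the sketch below fits a one‐descriptor linear model by ordinary least squares and ranks descriptors by the coefficient of determination (R²). It is a simplified illustration, not a full feature selection method: the descriptor names, the data values, and the functions `fit_line` and `rank_descriptors` are hypothetical, and a nonlinear relationship would score poorly here even if the descriptor were informative.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared). Assumes x and y have the same
    length and that neither is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def rank_descriptors(descriptors, activity):
    """Rank descriptors by the R^2 of their one-variable linear fit."""
    scores = {name: fit_line(values, activity)[2] for name, values in descriptors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical descriptor values and activities for four chemicals.
DESCRIPTORS = {
    "descriptor_1": [1.0, 2.0, 3.0, 4.0],
    "descriptor_2": [3.0, 1.0, 4.0, 2.0],
}
ACTIVITY = [2.1, 3.9, 6.2, 7.8]

if __name__ == "__main__":
    for name, r2 in rank_descriptors(DESCRIPTORS, ACTIVITY):
        print(f"{name}: R^2 = {r2:.3f}")
```
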
The two‐dimensional scatter plots can help identify the type of regression model, as illustrated in Figure .

Additionally, SAR landscapes are three‐dimensional plots through which one can visualize structure–activity relationships. The X–Y plane represents the molecular descriptors, and the Z‐axis represents the response. Figure shows a hypothetical example of a SAR landscape. The smooth region corresponds to chemicals that have a similar structure and similar activity, whereas the ragged region corresponds to chemicals that have a similar structure but different activity levels (also called activity cliffs). The activity cliffs are the most interesting part of the SAR landscape: they show that small structural changes correspond to large changes in activity. Additionally, they affect the performance of machine learning models because these regions are discarded as outliers, cause over‐fitting, complicate the prediction models, or increase the prediction error while generating the model.

SAR landscapes can be visualized using SAR maps. SAR maps are two‐dimensional plots of activity similarity versus structure similarity that characterize SAR landscapes through four regions, as shown in Figure . Moreover, a structure activity landscape index (SALI) and a structure activity index (SARI) can be used to analyze SAR landscapes, as explained in ref. .

Historically, one of the early QSAR models was developed in 1962 by Hansch et al., in which the logarithm of the inverse chemical concentration ($C$) was estimated using the octanol/water partition coefficient ($\pi$) and the Hammett constant ($\sigma$):

$$\log\frac{1}{C} = 4.08\pi - 2.14\pi^{2} + 2.78\sigma + 3.36$$

If the coefficient of a descriptor is positive, there is a positive relationship between the toxicity endpoint and the descriptor; otherwise, there is a negative relationship. Examples of QSARs for predicting toxicity of aromatic nitro compounds, nitrobenzene compounds, cytotoxicity of TIBO derivatives, and carcinogenicity of sulfa drugs are explained in ref. .
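
The Hansch equation quoted above can be evaluated directly, which also makes the role of each coefficient visible. The sketch below uses the coefficients exactly as given in the text; the function names and the input values are hypothetical. Because the $\pi^{2}$ term is negative, the predicted $\log(1/C)$ is parabolic in $\pi$ and peaks where the derivative with respect to $\pi$ is zero.

```python
def hansch_log_inv_c(pi, sigma):
    """Evaluate log(1/C) = 4.08*pi - 2.14*pi^2 + 2.78*sigma + 3.36."""
    return 4.08 * pi - 2.14 * pi ** 2 + 2.78 * sigma + 3.36


def optimal_pi():
    """Value of pi that maximises log(1/C) for any fixed sigma.

    Setting d/dpi (4.08*pi - 2.14*pi^2) = 0 gives pi = 4.08 / (2 * 2.14).
    """
    return 4.08 / (2 * 2.14)


if __name__ == "__main__":
    # Hypothetical descriptor values, for illustration only.
    print(hansch_log_inv_c(pi=1.0, sigma=0.0))
    print(hansch_log_inv_c(pi=1.0, sigma=1.0))
    print(optimal_pi())
```

The positive $\sigma$ coefficient means the predicted value rises linearly with $\sigma$, whereas the effect of $\pi$ changes sign on either side of the optimum, so the sign of a single coefficient is informative only for terms that enter the model linearly.
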
A discussion of the performance of QSAR models in predicting carcinogenicity, mutagenicity, and reproductive and developmental toxicity endpoints is available in ref. . Case studies on applications of QSAR to skin sensitization and developmental toxicity are available in refs. and , respectively.

There are many tools that provide pre‐built QSAR models, such as the OECD QSAR Toolbox, TOPKAT, Derek Nexus, HazardExpert, VEGA, and Meteor. Their characteristics have been summarized in the literature together with those of other QSAR‐based tools. Case studies on combining the results of different prediction tools are available in refs. , . In addition, specialized software tools for generating QSAR models, such as ADAPT and TOPKAT, include databases of toxicity data and can calculate molecular descriptors. Several stand‐alone databases have also been compiled to provide toxicity data and/or molecular descriptors, as summarized in refs. , , .

There are several advantages of QSAR models. They are easy to interpret if the descriptors are meaningful. They can model categorical and continuous toxicity endpoints and molecular descriptors, as well as toxic and non‐toxic chemicals. Using different types of descriptors allows for modeling complex endpoints. However, QSARs may not always be applicable. QSARs require a large number of chemicals in model development to achieve statistical significance. Additionally, QSARs require feature selection to identify the most significant and independent molecular descriptors, and a large number of descriptors makes the multidimensional space complex and fragmented. QSARs cannot be used for extrapolation between species, routes of exposure, or doses unless biological data are used. Moreover, QSARs may not be biologically interpretable, and they do not take dose, duration, or metabolites into consideration.

A brief description of all the tools mentioned in the IN SILICO MODELING METHODS section is available in Table . […]

Pipeline specifications

Software tools: Toxtree, Derek Nexus, HazardExpert, OECD QSAR Toolbox, Toxmatch, AMBIT, TOPKAT
Databases: ChemIDplus, DSSTox
Application: Drug design
Organisms: Homo sapiens
Diseases: Drug-Related Side Effects and Adverse Reactions