INSIGHT Partners from the SME NovaMechanics have co-authored a paper entitled ‘Read-Across Structural Analysis of PFAS Acute Oral Toxicity in Rats Powered by the Isalos Analytics Platform’s Automated Machine Learning’. The paper introduces read-across-based prediction and interpretation of the acute oral toxicity class (high or low) of per- and polyfluoroalkyl substances (PFAS) in rats, enabling the identification of structural features associated with toxicity and supporting the discovery of lower-toxicity candidates for safe-and-sustainable-by-design (SSbD) applications.

The widespread presence and environmental persistence of PFAS have prompted growing concern regarding their adverse effects on human health. Human exposure occurs through multiple routes, including everyday consumer products such as non-stick cookware, contact lenses, and food packaging, as well as environmental contamination linked to industrial processes, wastewater treatment, and solid waste management. These diverse exposure pathways facilitate PFAS entry into the human body and promote bioaccumulation, raising concerns about their potential toxicity, including hormonal imbalances, immune system disruption, and developmental effects. In response to these concerns, strict regulations have been implemented to limit PFAS use and human exposure. Although progress toward phasing out PFAS is underway, complete elimination remains a distant goal. It is therefore critical to explore alternative strategies, and efforts are increasing to harness data analysis and machine learning to elucidate PFAS toxicity mechanisms and identify lower-toxicity candidates.
The novelty of this work lies in the combination of highly accurate predictions with a fully interpretable model. Initially, we compiled data on PFAS acute oral toxicity in rats from three previously published studies (1–3), creating a more extensive database, which was enriched with Mold2 molecular descriptors. We then used the in-house Isalos automated machine learning function to optimise four ML models: k-nearest neighbours (kNN), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and a fully connected neural network (NN). The top-performing model (kNN, k = 3) was used for toxicity class predictions, achieving an accuracy of 81.5% on the external testing set.
Most importantly, the selected model supports interpretability analysis, in line with the principles of explainable artificial intelligence. A neighbour-based, read-across model is particularly suitable for structural analyses, as it facilitates the identification of structurally related compounds under ML rules while integrating computational and expert-driven interpretation. Integrating the kNN method into the read-across framework automatically generates a data-driven similarity group for each compound, consisting of its k nearest neighbours in the descriptor space. The toxicity prediction is therefore based on these algorithmically defined neighbours rather than on manually constructed similarity groups derived from human-selected criteria.
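The neighbour-based prediction described above can be illustrated with a minimal sketch. It is not the Isalos implementation: the Euclidean distance, the unscaled two-dimensional descriptors, the compound names, and the function name `read_across_predict` are all assumptions made for illustration, and the descriptor values are invented.

```python
import math
from collections import Counter

def read_across_predict(query, training_set, k=3):
    """Return (predicted_class, neighbour_names) for one query compound.

    training_set is a list of (name, descriptor_vector, toxicity_class) tuples.
    Euclidean distance is an assumption; the source only states that the
    neighbours are the k nearest in descriptor space.
    """
    # Rank all training compounds by distance to the query in descriptor space.
    ranked = sorted(training_set, key=lambda item: math.dist(query, item[1]))
    neighbours = ranked[:k]
    # Majority vote over the known toxicity classes of the k neighbours.
    votes = Counter(label for _, _, label in neighbours)
    predicted = votes.most_common(1)[0][0]
    return predicted, [name for name, _, _ in neighbours]

# Hypothetical compounds with made-up descriptor values (illustration only).
training = [
    ("cmpd_A", (0.10, 0.20), "low"),
    ("cmpd_B", (0.15, 0.25), "low"),
    ("cmpd_C", (0.90, 0.80), "high"),
    ("cmpd_D", (0.85, 0.95), "high"),
    ("cmpd_E", (0.20, 0.10), "low"),
]

predicted, group = read_across_predict((0.12, 0.18), training, k=3)
print(predicted, group)
```

The returned list of neighbour names plays the role of the data-driven similarity group: it shows which known compounds the prediction was read across from, which is what makes this kind of model straightforward to inspect.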
Model evaluation revealed a robust, balanced, and highly accurate classification model (Accuracy = 0.815, Precision = 0.758, Sensitivity = 0.735, Specificity = 0.862, F1-Score = 0.746, MCC = 0.601). Additional evaluation showed good model stability against data exclusion during 10-fold cross-validation, and a strong, meaningful relationship between the selected molecular descriptors and the toxicity endpoint (tested via y-randomisation). The SHAP analysis indicates that bulky, densely packed structures at short atomic ranges are potential markers of high toxicity, whereas a more balanced mass and electron density distribution at larger atomic ranges shows the opposite association. This study also highlighted the need to include fluorine-specific and terminal functional group descriptors in future work to enhance mechanistic interpretability. Furthermore, read-across structural analysis of representative PFAS from the testing set identified polyaromatic, heterocyclic moieties in the PFAS structure as contributors to higher toxicity, in contrast to simpler linear PFAS molecules lacking heteroatoms.
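For readers less familiar with the reported statistics, the sketch below shows how each one follows from the four cells of a binary confusion matrix, with "high toxicity" assumed to be the positive class. The counts are hypothetical and are not taken from the paper; they were chosen only because they are arithmetically consistent with the values quoted above to three decimal places.

```python
import math

def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall / true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    f1 = 2 * tp / (2 * tp + fp + fn)
    # Matthews correlation coefficient: balanced even for unequal class sizes.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }

# Hypothetical counts (assumption, not reported in the source).
metrics = classification_metrics(tp=25, fp=8, fn=9, tn=50)
rounded = {name: round(value, 3) for name, value in metrics.items()}
print(rounded)
```

An MCC of about 0.6 alongside a specificity higher than the sensitivity suggests a model that is somewhat better at recognising low-toxicity compounds than high-toxicity ones, which is worth keeping in mind when using predictions for screening.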
This work is highly relevant to SSbD because it provides a data-driven approach to identify PFAS with lower toxicity while retaining structural functionality. The framework can screen PFAS candidates computationally before synthesis or large-scale use, highlighting compounds with lower predicted toxicity for further development. The integration of interpretable machine learning ensures that design decisions are informed by mechanistic insights, such as molecular structure or the presence of heteroatoms. By identifying structural motifs that correlate with lower toxicity, this approach directly supports the SSbD goal of designing safer chemicals from the outset, reducing reliance on animal testing and minimising human and environmental health risks.
The final model was also made freely available as the INSIGHT RatTox web service and as a component of the INSIGHT Cheminformatics Suite on the Enalos Cloud Platform, promoting the democratisation of access to predictive tools for PFAS toxicity assessment and enabling researchers to explore safer chemical alternatives in a user-friendly, cloud-based environment. The INSIGHT RatTox web application is a valuable addition to the range of in-house SSbD models and platforms for nanomaterials and advanced materials development, which are recognised by the OECD as ready-to-use tools at TRL 4-6, promoting the incorporation of SSbD principles into scientific innovation (4).
Follow this link to read the full paper.
References
(1) Chen, S.; Fan, T.; Zhang, N.; Zhao, L.; Zhong, R.; Sun, G. The Oral Acute Toxicity of Per- and Polyfluoroalkyl Compounds (PFASs) to Rat and Mouse: A Mechanistic Interpretation and Prioritization Analysis of Untested PFASs by QSAR, q-RASAR and Interspecies Modelling Methods. Journal of Hazardous Materials 2024, 480, 136071. https://doi.org/10.1016/j.jhazmat.2024.136071.
(2) Da Silva, N. A. B. R.; De Melo, E. B. Analysis of Oral and Inhalation Toxicity of Per- and Polyfluoroalkylated Organic Compounds in Rats and Mice Using Multivariate QSAR. SAR and QSAR in Environmental Research 2024, 35 (10), 877–897. https://doi.org/10.1080/1062936X.2024.2417250.
(3) Lu, X.; Wang, X.; Chen, S.; Fan, T.; Zhao, L.; Zhong, R.; Sun, G. The Rat Acute Oral Toxicity of Trifluoromethyl Compounds (TFMs): A Computational Toxicology Study Combining the 2D-QSTR, Read-across and Consensus Modeling Methods. Archives of Toxicology 2024, 98 (7), 2213–2229. https://doi.org/10.1007/s00204-024-03739-w.
(4) OECD. Safe and Sustainable by Design Tools, Integrative Systems and Platforms for Nanomaterials and Nano-Enabled Products; OECD Series on the Safety of Manufactured Nanomaterials and other Advanced Materials; OECD Publishing: Paris, 2025. https://doi.org/10.1787/e411a4b7-en.





