Nguyen 2021 Brief Bioinform

Nguyen TT, Nguyen DK, Ou YY (2021) Addressing data imbalance problems in ligand-binding site prediction using a variational autoencoder and a convolutional neural network. Brief Bioinform 22:bbab277. https://doi.org/10.1093/bib/bbab277

» PMID: 34322702 Open Access

Abstract: Since 2015, a fast growing number of deep learning-based methods have been proposed for protein-ligand binding site prediction and many have achieved promising performance. These methods, however, neglect the imbalanced nature of binding site prediction problems. Traditional data-based approaches for handling data imbalance employ linear interpolation of minority class samples. Such approaches may not be fully exploited by deep neural networks on downstream tasks. We present a novel technique for balancing input classes by developing a deep neural network-based variational autoencoder (VAE) that aims to learn important attributes of the minority classes concerning nonlinear combinations. After learning, the trained VAE was used to generate new minority class samples that were later added to the original data to create a balanced dataset. Finally, a convolutional neural network was used for classification, for which we assumed that the nonlinearity could be fully integrated. As a case study, we applied our method to the identification of FAD- and FMN-binding sites of electron transport proteins. Compared with the best classifiers that use traditional machine learning algorithms, our models obtained a great improvement on sensitivity while maintaining similar or higher levels of accuracy and specificity. We also demonstrate that our method is better than other data imbalance handling techniques, such as SMOTE, ADASYN, and class weight adjustment. Additionally, our models also outperform existing predictors in predicting the same binding types. Our method is general and can be applied to other data types for prediction problems with moderate-to-heavy data imbalances.
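The "linear interpolation of minority class samples" that the abstract attributes to traditional approaches such as SMOTE can be sketched as follows. This is a minimal illustration in plain Python, not the authors' VAE method and not the reference SMOTE implementation. The function name `interpolate_minority`, the parameter `k`, and the toy data are assumptions made for this example only.

```python
import math
import random


def interpolate_minority(samples, n_new, k=3, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (hypothetical helper, SMOTE-like)."""
    rng = random.Random(seed)
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(n)
        base = samples[i]
        # k nearest minority neighbours of the base sample, excluding itself
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(base, samples[j]),
        )[:k]
        partner = samples[rng.choice(nearest)]
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + lam * (p - b) for b, p in zip(base, partner)])
    return synthetic


# Toy example (hypothetical numbers): 20 majority samples vs. 4 minority samples.
n_majority = 20
minority = [[3.0, 3.0], [3.5, 2.5], [4.0, 3.5], [3.2, 4.0]]

new_points = interpolate_minority(minority, n_new=n_majority - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority))  # 20, matching the majority class size
```

Every synthetic point is a linear combination of two existing minority samples, so it cannot leave the region those samples already span. The abstract contrasts this with a VAE that generates minority samples from learned nonlinear combinations.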

Bioblast editor: Gnaiger E

Correction: FADH2 and Complex II

FADH2 is shown as the substrate feeding electrons into Complex II (CII). This is incorrect: succinate is the substrate of CII, and its oxidation transfers 2{H+ + e-} to FAD. For details see Gnaiger (2024).
Gnaiger E (2024) Complex II ambiguities ― FADH2 in the electron transfer system. J Biol Chem 300:105470. https://doi.org/10.1016/j.jbc.2023.105470

Hydrogen ion ambiguities in the electron transfer system

Communicated by Gnaiger E (2023-10-08), last update 2023-11-10
Electron (e-) transfer linked to hydrogen ion (hydron; H+) transfer is a fundamental concept in the field of bioenergetics, critical for understanding redox-coupled energy transformations.
However, the current literature contains inconsistencies regarding H+ formation on the negative side of bioenergetic membranes, such as the matrix side of the mitochondrial inner membrane, when NADH is oxidized during oxidative phosphorylation (OXPHOS). Ambiguities arise when examining the oxidation of NADH by respiratory Complex I or succinate by Complex II.
Oxidation of NADH or succinate involves a two-electron transfer of 2{H+ + e-} to FMN or FAD, respectively. Figures that show a single electron (e-) transferred from NADH or succinate are inaccurate.
NAD+ denotes the oxidized form, whereas NAD denotes nicotinamide adenine dinucleotide irrespective of oxidation state.
The oxidation half-reaction in this H+-linked electron transfer is NADH + H+ → NAD+ + 2{H+ + e-}, with the transferred hydrogen represented as 2{H+ + e-} (Gnaiger 2023). Depicting putative H+ formation as NADH → NAD+ + H+ conflicts with the chemiosmotic coupling stoichiometries between H+ translocation across the coupling membrane and electron transfer to oxygen. Clarity in this complex field is imperative to tackle the apparent ambiguity crisis and prevent confusion, particularly given the increasing number of interdisciplinary publications on bioenergetics that concern diagnostic and clinical applications of OXPHOS analysis.
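The bookkeeping behind the oxidation half-reaction of NADH can be checked by counting hydrogens and charges on each side. The sketch below is an illustration only and is not taken from the cited work. It assumes the relative charges implied by the notation (NADH = 0, NAD+ = +1, H+ = +1, e- = -1) and counts only the one hydrogen that NADH gives up on oxidation. The names `SPECIES` and `totals` are hypothetical.

```python
# (transferable hydrogens, charge) per species.
# Assumption: relative values read off the notation, not full molecular formulas.
SPECIES = {
    "NADH": (1, 0),
    "NAD+": (0, +1),
    "H+": (1, +1),
    "e-": (0, -1),
}


def totals(side):
    """Return (hydrogen count, net charge) for a list of (coefficient, species)."""
    hydrogens = sum(n * SPECIES[s][0] for n, s in side)
    charge = sum(n * SPECIES[s][1] for n, s in side)
    return hydrogens, charge


# NADH + H+ -> NAD+ + 2{H+ + e-}
left = [(1, "NADH"), (1, "H+")]
right = [(1, "NAD+"), (2, "H+"), (2, "e-")]
print(totals(left), totals(right))  # (2, 1) (2, 1): hydrogens and charge balance

# The transferred unit 2{H+ + e-} on its own
pair = [(2, "H+"), (2, "e-")]
print(totals(pair))  # (2, 0): two hydrogens, two electrons, no net charge
```

Under these assumptions both sides carry two hydrogens and a net charge of +1, and the transferred unit 2{H+ + e-} is electrically neutral, consistent with a two-electron transfer to FMN.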