Journal of Agricultural and Food Chemistry 72.18 (2024): 10537-10547.
Bitter compounds are common in nature and among drugs. Machine learning tools have previously been developed to predict bitterness from chemical structure. However, known structures are estimated to represent only 5–10% of the metabolome; the rest remain unassigned, or “dark”. We present BitterMasS, a Random Forest classifier trained on 5414 experimental mass spectra of bitter and nonbitter compounds, which achieved precision = 0.83 and recall = 0.90 on an internal test set. Next, the model was tested on two external sets: spectra newly extracted from the literature for 106 bitter and nonbitter compounds, and additional spectra measured for 26 compounds. For these external test cases, BitterMasS exhibited 67% precision and 93% recall for the first and 58% accuracy and 99% recall for the second. The spectrum–bitterness prediction strategy was more effective than the spectrum–structure–bitterness prediction strategy and covered more compounds. These encouraging results suggest that BitterMasS can be used to predict bitter compounds in the metabolome without the need for structural assignment of individual molecules. This may enable identification of bitter compounds from metabolomics analyses, comparison of potential bitterness levels obtained by different sample treatments, and monitoring of bitterness changes over time.
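For readers less familiar with the metrics reported above, the following is a minimal sketch of how precision and recall are computed for a binary bitter/nonbitter classifier. The function name `precision_recall` and the label lists are hypothetical and purely illustrative; they are not data or code from the study.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = bitter, 0 = nonbitter."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bitter, predicted bitter
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # nonbitter, predicted bitter
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bitter, predicted nonbitter
    # Precision: share of predicted-bitter compounds that are truly bitter.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly bitter compounds that were predicted bitter.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for eight compounds (NOT taken from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High recall with lower precision, as in the external test cases, means that most bitter compounds are recovered at the cost of some nonbitter compounds being flagged as bitter.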