The detailed assessment of the composition of plant-derived products is of primary interest. The metabolites in natural extracts (NEs) constitute the metabolome, which can be divided into the core and the specialized metabolome. Plants produce specialized metabolites to ensure their survival in a competitive environment. Currently validated methods for the rigorous annotation and quantification of metabolites in NEs require reference standards. However, commercial reference standards are available for only a small fraction of the known metabolites. Given this limitation, common analytical methods for NE composition assessment focus on a few specific markers, which are often not bioactive.
Liquid chromatography coupled to mass spectrometry (LC-MS) is a method of choice for NE metabolite analysis, but annotating the data sets generated by LC-MS systems remains challenging. Dereplication, the rapid identification of previously described compounds using prior knowledge and computational tools, addresses this challenge and allows efforts to be focused on novel compounds. Within the present thesis, two resources to improve dereplication were developed. The first is the Taxonomically Informed Metabolite Annotation, which allows for better decision-making when current MS-based annotation tools suggest multiple structural candidates. The second is LOTUS, an initiative for open knowledge management in natural products research, which provides the largest collection of metabolite-taxon pairs.
In addition to annotation, semi-quantitative aspects are crucial for NE composition evaluation. They are needed to document the use of NEs as products and to assess the presence and concentration of potentially toxic compounds. Such information may also help rationalize the contribution of specific molecules to an extract’s overall bioactivity. Nevertheless, generic methods providing a semi-quantitative assessment of a large panel of metabolites are still lacking. Typically, only a dozen metabolites account for most of an extract’s mass, while hundreds are present in trace amounts.
Therefore, effective procedures providing a comprehensive analysis of the metabolome of NEs, addressing both qualitative and quantitative aspects, are needed. This work combines qualitative and semi-quantitative information in an automated manner by integrating LC-MS-based metabolite profiling with generic universal detection methods. The impact of this strategy is evaluated on public data, on data from collaborations, and on well-known plants. Its application to different research questions is illustrated, for example with flavoring plants of industrial interest such as Swertia chirayita (Roxb.) H. Karst., which contains large quantities of bitter principles. The presented workflow, integrating analytical and computational strategies, aims to make plant metabolomics research more effective for public health, food and beverage safety, and fundamental science.