{ "cells": [ { "cell_type": "markdown", "id": "f46a29cc", "metadata": {}, "source": [ "# Quick Start\n", "\n", "In this vignette we demonstrate how to use the `LAMP` Python package. The\n", "input data and reference files are located at\n", "https://github.com/wanchanglin/lamp/tree/master/examples/data." ] }, { "cell_type": "markdown", "id": "35a5cd63", "metadata": {}, "source": [ "## Setup\n", "\n", "To use `LAMP`, first import the required Python libraries, including\n", "`LAMP` itself." ] }, { "cell_type": "code", "execution_count": 1, "id": "0e95926d", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:39:53.045351Z", "iopub.status.busy": "2024-11-06T19:39:53.045351Z", "iopub.status.idle": "2024-11-06T19:39:54.257615Z", "shell.execute_reply": "2024-11-06T19:39:54.257615Z" } }, "outputs": [], "source": [ "import sqlite3\n", "import pandas as pd\n", "from lamp import anno, stats, utils" ] }, { "cell_type": "markdown", "id": "09ea0631", "metadata": {}, "source": [ "## Data Loading\n", "\n", "`LAMP` supports text files separated by commas (`,`) or tabs (`\\t`).\n", "Microsoft Excel `xlsx` files are also supported; the argument `sheet_name`\n", "indicates which sheet holds the input data. The default is 0, the\n", "first sheet.\n", "\n", "Here we use a small example data set in `tsv` format. Load it into\n", "Python and check its format:\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "e5788ad4", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:39:54.257615Z", "iopub.status.busy": "2024-11-06T19:39:54.257615Z", "iopub.status.idle": "2024-11-06T19:39:54.315499Z", "shell.execute_reply": "2024-11-06T19:39:54.315499Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " name namecustom mz mzmin mzmax rt \\\n", "0 M151T34 M150.8867T34 150.886715 150.886592 150.886863 34.152700 \n", "1 M151T40 M151.0402T40 151.040235 151.040092 151.040350 39.838172 \n", "2 M152T40 M152.0436T40 152.043607 152.043451 152.043737 40.303700 \n", "3 M153T34 M152.8838T34 152.883824 152.883678 152.883959 34.174647 \n", "4 M153T36 M153.0195T36 153.019474 153.019331 153.019633 35.785847 \n", ".. ... ... ... ... ... ... \n", "395 M283T339 M283.2646T339 283.264583 283.264341 283.264809 338.763489 \n", "396 M284T60 M284.1953T60 284.195294 284.194939 284.195536 59.593561 \n", "397 M284T108 M284.2235T108 284.223499 284.223156 284.223692 108.406389 \n", "398 M284T339 M284.268T339 284.267962 284.267634 284.268204 338.725056 \n", "399 M285T34 M284.775T34 284.775031 284.774635 284.775287 34.079641 \n", "\n", " rtmin rtmax npeaks . ... X210 X209 \\\n", "0 33.637595 35.465548 97 97 ... 4.224942e+06 3.946599e+06 \n", "1 37.556072 40.532315 95 95 ... 1.419062e+06 1.251606e+06 \n", "2 38.092678 40.909428 81 81 ... 1.203919e+05 9.970442e+04 \n", "3 33.637595 35.465548 98 98 ... 5.592065e+06 5.761380e+06 \n", "4 34.130244 36.287354 98 98 ... 7.284938e+06 1.083289e+07 \n", ".. ... ... ... .. ... ... ... \n", "395 338.398380 339.165948 94 94 ... 3.509767e+05 4.117633e+05 \n", "396 58.844217 60.107058 59 59 ... NaN NaN \n", "397 107.880510 108.971046 72 72 ... 7.477652e+04 7.482219e+04 \n", "398 338.268300 339.370098 84 84 ... 3.697604e+04 5.398264e+04 \n", "399 33.667172 35.198181 97 97 ... 3.439330e+06 3.359842e+06 \n", "\n", " X208 X207 X206 X205 X204 \\\n", "0 3.668948e+06 3.754321e+06 3.853724e+06 3.787350e+06 3.584464e+06 \n", "1 1.214826e+06 8.143028e+05 5.331963e+05 1.930928e+06 1.479001e+06 \n", "2 9.384000e+04 4.186335e+04 NaN 2.115447e+05 1.285713e+05 \n", "3 5.845419e+06 5.576013e+06 5.552878e+06 6.132789e+06 5.891378e+06 \n", "4 1.140072e+07 8.220552e+06 9.255154e+06 7.648211e+06 7.723814e+06 \n", ".. ... ... ... ... ... 
\n", "395 3.948000e+05 4.338804e+05 5.335221e+05 6.224684e+05 7.009340e+05 \n", "396 NaN NaN NaN 2.558004e+04 4.020517e+04 \n", "397 3.399667e+04 7.233564e+04 1.043879e+05 2.506785e+04 2.753769e+04 \n", "398 5.340109e+04 6.557698e+04 7.656575e+04 1.040606e+05 1.063727e+05 \n", "399 3.375577e+06 3.789056e+06 3.478506e+06 3.391588e+06 5.067802e+06 \n", "\n", " X203 X202 X201 \n", "0 3.499711e+06 3.623205e+06 4.145770e+06 \n", "1 1.076354e+06 9.293218e+05 5.298062e+05 \n", "2 9.389346e+04 7.163655e+04 4.916483e+04 \n", "3 5.418082e+06 5.036840e+06 5.733794e+06 \n", "4 5.571163e+06 5.362560e+06 9.259675e+06 \n", ".. ... ... ... \n", "395 3.005173e+05 3.133173e+05 8.204783e+05 \n", "396 NaN 3.162670e+04 5.446684e+04 \n", "397 NaN NaN NaN \n", "398 NaN 3.059370e+04 1.358056e+05 \n", "399 3.497546e+06 3.316025e+06 3.906000e+06 \n", "\n", "[400 rows x 110 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d_data = \"./data/df_pos_2.tsv\"\n", "data = pd.read_table(d_data, header=0, sep=\"\\t\")\n", "data" ] }, { "cell_type": "markdown", "id": "94f63b45", "metadata": {}, "source": [ "This data set includes a peak list and an intensity data matrix. `LAMP`\n", "requires each peak's name, m/z value and retention time. The user must\n", "indicate the column positions of the feature name, m/z value, retention\n", "time and the start of the data matrix. Here they are 1, 3, 6 and 11,\n", "respectively.\n", "\n", "Load the input data in `xlsx` format for `LAMP`:" ] }, { "cell_type": "code", "execution_count": 3, "id": "9b8f5d57", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:39:54.315499Z", "iopub.status.busy": "2024-11-06T19:39:54.315499Z", "iopub.status.idle": "2024-11-06T19:39:55.471357Z", "shell.execute_reply": "2024-11-06T19:39:55.471357Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " name mz rt QC9 QC5 \\\n", "0 M151T34 150.886715 34.152700 3.664879e+06 3.735147e+06 \n", "1 M151T40 151.040235 39.838172 7.406381e+05 7.524075e+05 \n", "2 M152T40 152.043607 40.303700 6.105241e+04 5.335546e+04 \n", "3 M153T34 152.883824 34.174647 5.141479e+06 5.496344e+06 \n", "4 M153T36 153.019474 35.785847 5.336758e+06 5.558265e+06 \n", ".. ... ... ... ... ... \n", "395 M283T339 283.264583 338.763489 7.330602e+05 8.243956e+05 \n", "396 M284T60 284.195294 59.593561 2.310932e+04 NaN \n", "397 M284T108 284.223499 108.406389 3.748444e+04 2.993283e+04 \n", "398 M284T339 284.267962 338.725056 1.161886e+05 1.476514e+05 \n", "399 M285T34 284.775031 34.079641 4.063268e+06 3.807148e+06 \n", "\n", " QC4 QC3 QC26 QC25 QC24 \\\n", "0 5.190263e+06 2.742966e+06 3.824723e+06 3.722932e+06 3.804188e+06 \n", "1 NaN 6.429245e+05 1.167016e+06 1.175981e+06 1.122533e+06 \n", "2 NaN NaN 6.875157e+04 7.807399e+04 8.943068e+04 \n", "3 8.335846e+06 3.860588e+06 5.316874e+06 5.988232e+06 5.844917e+06 \n", "4 1.118557e+07 6.876715e+06 9.967314e+06 9.073822e+06 9.328573e+06 \n", ".. ... ... ... ... ... \n", "395 NaN 1.159506e+06 4.294760e+05 4.641813e+05 4.570657e+05 \n", "396 NaN NaN 1.759336e+04 2.645392e+04 2.727266e+04 \n", "397 NaN NaN 3.175596e+04 3.879604e+04 4.299529e+04 \n", "398 NaN NaN NaN 6.753490e+04 5.436219e+04 \n", "399 4.645099e+06 2.232221e+06 4.576754e+06 4.533339e+06 4.559356e+06 \n", "\n", " ... X210 X209 X208 X207 \\\n", "0 ... 4.224942e+06 3.946599e+06 3.668948e+06 3.754321e+06 \n", "1 ... 1.419062e+06 1.251606e+06 1.214826e+06 8.143028e+05 \n", "2 ... 1.203919e+05 9.970442e+04 9.384000e+04 4.186335e+04 \n", "3 ... 5.592065e+06 5.761380e+06 5.845419e+06 5.576013e+06 \n", "4 ... 7.284938e+06 1.083289e+07 1.140072e+07 8.220552e+06 \n", ".. ... ... ... ... ... \n", "395 ... 3.509767e+05 4.117633e+05 3.948000e+05 4.338804e+05 \n", "396 ... NaN NaN NaN NaN \n", "397 ... 7.477652e+04 7.482219e+04 3.399667e+04 7.233564e+04 \n", "398 ... 
3.697604e+04 5.398264e+04 5.340109e+04 6.557698e+04 \n", "399 ... 3.439330e+06 3.359842e+06 3.375577e+06 3.789056e+06 \n", "\n", " X206 X205 X204 X203 X202 \\\n", "0 3.853724e+06 3.787350e+06 3.584464e+06 3.499711e+06 3.623205e+06 \n", "1 5.331963e+05 1.930928e+06 1.479001e+06 1.076354e+06 9.293218e+05 \n", "2 NaN 2.115447e+05 1.285713e+05 9.389346e+04 7.163655e+04 \n", "3 5.552878e+06 6.132789e+06 5.891378e+06 5.418082e+06 5.036840e+06 \n", "4 9.255154e+06 7.648211e+06 7.723814e+06 5.571163e+06 5.362560e+06 \n", ".. ... ... ... ... ... \n", "395 5.335221e+05 6.224684e+05 7.009340e+05 3.005173e+05 3.133173e+05 \n", "396 NaN 2.558004e+04 4.020517e+04 NaN 3.162670e+04 \n", "397 1.043879e+05 2.506785e+04 2.753769e+04 NaN NaN \n", "398 7.656575e+04 1.040606e+05 1.063727e+05 NaN 3.059370e+04 \n", "399 3.478506e+06 3.391588e+06 5.067802e+06 3.497546e+06 3.316025e+06 \n", "\n", " X201 \n", "0 4.145770e+06 \n", "1 5.298062e+05 \n", "2 4.916483e+04 \n", "3 5.733794e+06 \n", "4 9.259675e+06 \n", ".. ... \n", "395 8.204783e+05 \n", "396 5.446684e+04 \n", "397 NaN \n", "398 1.358056e+05 \n", "399 3.906000e+06 \n", "\n", "[400 rows x 103 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cols = [1, 3, 6, 11]\n", "# d_data = \"./data/df_pos_2.tsv\"\n", "# df = anno.read_peak(d_data, cols, sep='\\t')\n", "d_data = \"./data/df_pos_2.xlsx\" # use xlsx file\n", "df = anno.read_peak(d_data, cols, sheet_name=0)\n", "df" ] }, { "cell_type": "markdown", "id": "c6c7df1f", "metadata": {}, "source": [ "The argument `sep` will be ignored if the input data is an `xlsx` file.\n", "Data frame `df` now includes only `name`, `mz`, `rt` and intensity data\n", "matrix.\n", "\n", "## Metabolite Annotation\n", "\n", "To perform metabolite annotation, users should provide their own\n", "reference file. Otherwise, `LAMP` will use its default reference file for\n", "annotation. Here we load the default reference file for compound\n", "annotation. 
Since the input data here is in positive mode, we only\n", "use the positive part of the reference file. If `ion_mode` is empty, all reference\n", "items are used for matching." ] }, { "cell_type": "code", "execution_count": 4, "id": "9ea322fd", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:39:55.471357Z", "iopub.status.busy": "2024-11-06T19:39:55.471357Z", "iopub.status.idle": "2024-11-06T19:40:10.367112Z", "shell.execute_reply": "2024-11-06T19:40:10.367112Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " compound_name molecular_formula monoisotopic_mass \\\n", "34230 (-)-Salsoline C11H15NO2 193.110265 \n", "34231 (-)-trans-carveol C10H16O 152.120110 \n", "34232 (-)-ureidoglycolic acid C3H6N2O4 134.032730 \n", "34233 (11R)-11-hydroperoxylinoleic acid C18H32O4 312.230040 \n", "34234 (11Z,14Z)-eicosadienoylcarnitine C27H49NO4 451.366135 \n", "... ... ... ... \n", "83155 N(6),N(6),N(6)-trimethyl-L-lysine C9H21N2O2+ 189.160301 \n", "83156 nicotinic acid D-ribonucleotide C11H15NO9P+ 336.048436 \n", "83157 phosphocholine C5H15NO4P+ 184.073866 \n", "83158 S-adenosyl-L-methionine C15H23N6O5S+ 399.145060 \n", "83159 S-adenosylmethioninamine C14H23N6O3S+ 355.155232 \n", "\n", " exact_mass ion_type ion_mode \\\n", "34230 232.073425 [M+39K]+ positive \n", "34231 191.083270 [M+39K]+ positive \n", "34232 172.995890 [M+39K]+ positive \n", "34233 351.193200 [M+39K]+ positive \n", "34234 490.329295 [M+39K]+ positive \n", "... ... ... ... \n", "83155 189.159751 M+ positive \n", "83156 336.047886 M+ positive \n", "83157 184.073316 M+ positive \n", "83158 399.144510 M+ positive \n", "83159 355.154682 M+ positive \n", "\n", " smiles \\\n", "34230 COc1cc2c(cc1O)CCN[C@H]2C \n", "34231 C=C(C)[C@@H]1CC=C(C)[C@@H](O)C1 \n", "34232 NC(=O)N[C@@H](O)C(=O)O \n", "34233 CCCCCC=CC(C=CCCCCCCCC(=O)O)OO \n", "34234 CCCCC/C=C\\C/C=C\\CCCCCCCCCC(=O)OC(CC(=O)[O-])C[... \n", "... ... \n", "83155 C[N+](C)(C)CCCC[C@H](N)C(=O)O \n", "83156 O=C(O)c1ccc[n+]([C@@H]2O[C@H](COP(=O)(O)O)[C@@... \n", "83157 C[N+](C)(C)CCOP(=O)(O)O \n", "83158 C[S+](CC[C@H](N)C(=O)O)C[C@H]1O[C@@H](n2cnc3c(... \n", "83159 C[S+](CCCN)C[C@H]1O[C@@H](n2cnc3c(N)ncnc32)[C@... \n", "\n", " inchikey \\\n", "34230 YTPRLBGPGZHUPD-ZETCQYMHSA-N \n", "34231 BAVONGHXFVOKBV-ZJUUUORDSA-N \n", "34232 NWZYYCVIOKVTII-SFOWXEAESA-N \n", "34233 PLWDMWAXENHPLY-UHFFFAOYSA-N \n", "34234 OLZWDVKTOGTVLC-UTJQPWESSA-N \n", "... ... 
\n", "83155 MXNRLFUSFKVQSK-QMMMGPOBSA-O \n", "83156 JOUIQRNQJGXQDC-ZYUZMQFOSA-O \n", "83157 YHHSONZFOIEMCP-UHFFFAOYSA-O \n", "83158 MEFKEPWMEQBLKI-AIRLBKTGSA-O \n", "83159 ZUNBITIXDCPNSD-LSRJEVITSA-N \n", "\n", " inchi kegg_id hmdb_id \\\n", "34230 InChI=1S/C11H15NO2/c1-7-9-6-11(14-2)10(13)5-8(... C09640 -X- \n", "34231 InChI=1S/C10H16O/c1-7(2)9-5-4-8(3)10(11)6-9/h4... C00964 -X- \n", "34232 InChI=1S/C3H6N2O4/c4-3(9)5-1(6)2(7)8/h1,6H,(H,... C00603 HMDB0001005 \n", "34233 -X- -X- -X- \n", "34234 InChI=1S/C27H49NO4/c1-5-6-7-8-9-10-11-12-13-14... -X- -X- \n", "... ... ... ... \n", "83155 InChI=1S/C9H20N2O2/c1-11(2,3)7-5-4-6-8(10)9(12... C03793 HMDB0001325 \n", "83156 InChI=1S/C11H14NO9P/c13-8-7(5-20-22(17,18)19)2... C01185 -X- \n", "83157 InChI=1S/C5H14NO4P/c1-6(2,3)4-5-10-11(7,8)9/h4... C00588 HMDB0001565 \n", "83158 InChI=1S/C15H22N6O5S/c1-27(3-2-7(16)15(24)25)4... C00019 HMDB0001185 \n", "83159 InChI=1S/C14H23N6O3S/c1-24(4-2-3-15)5-8-10(21)... C01137 HMDB0000988 \n", "\n", " chebi_id pubchem_id lipidmaps_id \n", "34230 CHEBI:112 442356 -X- \n", "34231 CHEBI:15389 -X- -X- \n", "34232 CHEBI:15412 439269 -X- \n", "34233 CHEBI:134247 5230520 -X- \n", "34234 CHEBI:73119 -X- -X- \n", "... ... ... ... \n", "83155 CHEBI:17311 440120 -X- \n", "83156 CHEBI:15763 53477721 -X- \n", "83157 CHEBI:18132 1014 -X- \n", "83158 CHEBI:15414 16757548 -X- \n", "83159 CHEBI:15625 439415 -X- \n", "\n", "[39150 rows x 14 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ion_mode = \"pos\"\n", "ref_path = \"\" # if empty, use default reference file for matching\n", "# load reference library\n", "cal_mass = False\n", "ref = anno.read_ref(ref_path, ion_mode=ion_mode, calc=cal_mass)\n", "ref" ] }, { "cell_type": "markdown", "id": "f3c9c16a", "metadata": {}, "source": [ "The reference file must have one column: `molecular_formula` (or\n", "`formula`) if there is no column called `ion m/z` (or, `m/z`,\n", "`exact_mass`). 
The `exact_mass` column is optional. If absent, `LAMP` uses\n", "`molecular_formula` to calculate `exact_mass` based on the NIST Atomic\n", "Weights and Isotopic Compositions for All Elements. If your reference file\n", "has `exact_mass` and you still want to recalculate it from the NIST data,\n", "set `calc` to `True`. The `exact_mass` is matched against a range of\n", "`mz` values, controlled by `ppm`, in data frame `df`.\n", "\n", "Like the input data, the reference file can be an `xlsx` file. Another\n", "reference file is the HMDB database for urine:" ] }, { "cell_type": "code", "execution_count": 5, "id": "810c3a5d", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:10.367112Z", "iopub.status.busy": "2024-11-06T19:40:10.367112Z", "iopub.status.idle": "2024-11-06T19:40:10.541818Z", "shell.execute_reply": "2024-11-06T19:40:10.541818Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id molecular_formula compound_name \\\n", "0 HMDB0000001 C7H11N3O2 1-Methylhistidine \n", "1 HMDB0000002 C3H10N2 1,3-Diaminopropane \n", "2 HMDB0000005 C4H6O3 2-Ketobutyric acid \n", "3 HMDB0000008 C4H8O3 2-Hydroxybutyric acid \n", "4 HMDB0000010 C19H24O3 2-Methoxyestrone \n", "... ... ... ... \n", "1606 HMDB0012308 C8H8O3 Vanillin \n", "1607 HMDB0012322 C10H8O 2-Naphthol \n", "1608 HMDB0012325 C5H10O5 Arabinofuranose \n", "1609 HMDB0012451 C20H28O3 all-trans-5,6-Epoxyretinoic acid \n", "1610 HMDB0012467 C15H13O9S (-)-Epicatechin sulfate \n", "\n", " inchi \\\n", "0 InChI=1S/C7H11N3O2/c1-10-3-5(9-4-10)2-6(8)7(11... \n", "1 InChI=1S/C3H10N2/c4-2-1-3-5/h1-5H2 \n", "2 InChI=1S/C4H6O3/c1-2-3(5)4(6)7/h2H2,1H3,(H,6,7) \n", "3 InChI=1S/C4H8O3/c1-2-3(5)4(6)7/h3,5H,2H2,1H3,(... \n", "4 InChI=1S/C19H24O3/c1-19-8-7-12-13(15(19)5-6-18... \n", "... ... \n", "1606 InChI=1S/C8H8O3/c1-11-8-4-6(5-9)2-3-7(8)10/h2-... \n", "1607 InChI=1S/C10H8O/c11-10-6-5-8-3-1-2-4-9(8)7-10/... \n", "1608 InChI=1S/C5H10O5/c6-1-2-3(7)4(8)5(9)10-2/h2-9H... \n", "1609 InChI=1S/C20H28O3/c1-15(8-6-9-16(2)14-17(21)22... \n", "1610 InChI=1S/C15H14O9S/c16-9-3-8-5-13(24-25(20,21)... \n", "\n", " inchi_key exact_mass \n", "0 BRMWTNUJHUMWMS-LURJTMIESA-N 169.085127 \n", "1 XFNJVJPLKCPIBV-UHFFFAOYSA-N 74.084398 \n", "2 TYEYBOSBBBHJIV-UHFFFAOYSA-N 102.031694 \n", "3 AFENDNXGAFYKQO-VKHMYHEASA-N 104.047344 \n", "4 WHEUWNKSCXYKBU-QPWUGHHJSA-N 300.172545 \n", "... ... ... 
\n", "1606 MWOOGOJBHIARFG-UHFFFAOYSA-N 152.047344 \n", "1607 JWAZRIHNYRIHIV-UHFFFAOYSA-N 144.057515 \n", "1608 HMFHBZSHGGEWLO-HWQSCIPKSA-N 150.052823 \n", "1609 KEEHJLBAOLGBJZ-WEDZBJJJSA-N 316.203845 \n", "1610 WTXWEAXATVSZQX-AFYYWNPRSA-M 369.028028 \n", "\n", "[1611 rows x 6 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ref_path = \"./data/hmdb_urine_v4_0_20200910_v1.tsv\"\n", "ref = anno.read_ref(ref_path, calc=True)\n", "ref" ] }, { "cell_type": "markdown", "id": "2e2a666f", "metadata": {}, "source": [ "Next we use the HMDB reference file for compound matching. Here the function\n", "argument `ppm` controls the m/z matching tolerance (range)." ] }, { "cell_type": "code", "execution_count": 6, "id": "f5df20ee", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:10.541818Z", "iopub.status.busy": "2024-11-06T19:40:10.541818Z", "iopub.status.idle": "2024-11-06T19:40:10.588193Z", "shell.execute_reply": "2024-11-06T19:40:10.588193Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idmzmolecular_formulacompound_nameinchiinchi_keyexact_massppm_error
0M154T37154.062402C8H10O3HydroxytyrosolInChI=1S/C8H10O3/c9-4-3-6-1-2-7(10)8(11)5-6/h1...JUUBCHWRXWPFFH-UHFFFAOYSA-N154.06-3.84
1M164T119164.046774C9H8O3Phenylpyruvic acidInChI=1S/C9H8O3/c10-8(9(11)12)6-7-4-2-1-3-5-7/...BTNMPGBKDVTSJY-UHFFFAOYSA-N164.05-3.47
2M164T119164.046774C9H8O3m-Coumaric acidInChI=1S/C9H8O3/c10-8-3-1-2-7(6-8)4-5-9(11)12/...KKSDGJDHHZEWEP-SNAWJCMRSA-N164.05-3.47
3M164T119164.046774C9H8O34-Hydroxycinnamic acidInChI=1S/C9H8O3/c10-8-4-1-7(2-5-8)3-6-9(11)12/...NGSWKAQJJWESNS-ZZXKWVIFSA-N164.05-3.47
4M164T119164.046774C9H8O32-Hydroxycinnamic acidInChI=1S/C9H8O3/c10-8-4-2-1-3-7(8)5-6-9(11)12/...PMOWTIHVNWZYFI-AATRIKPKSA-N164.05-3.47
5M164T233164.046832C9H8O3Phenylpyruvic acidInChI=1S/C9H8O3/c10-8(9(11)12)6-7-4-2-1-3-5-7/...BTNMPGBKDVTSJY-UHFFFAOYSA-N164.05-3.12
6M164T233164.046832C9H8O3m-Coumaric acidInChI=1S/C9H8O3/c10-8-3-1-2-7(6-8)4-5-9(11)12/...KKSDGJDHHZEWEP-SNAWJCMRSA-N164.05-3.12
7M164T233164.046832C9H8O34-Hydroxycinnamic acidInChI=1S/C9H8O3/c10-8-4-1-7(2-5-8)3-6-9(11)12/...NGSWKAQJJWESNS-ZZXKWVIFSA-N164.05-3.12
8M164T233164.046832C9H8O32-Hydroxycinnamic acidInChI=1S/C9H8O3/c10-8-4-2-1-3-7(8)5-6-9(11)12/...PMOWTIHVNWZYFI-AATRIKPKSA-N164.05-3.12
9M164T53164.046825C9H8O3Phenylpyruvic acidInChI=1S/C9H8O3/c10-8(9(11)12)6-7-4-2-1-3-5-7/...BTNMPGBKDVTSJY-UHFFFAOYSA-N164.05-3.16
10M164T53164.046825C9H8O3m-Coumaric acidInChI=1S/C9H8O3/c10-8-3-1-2-7(6-8)4-5-9(11)12/...KKSDGJDHHZEWEP-SNAWJCMRSA-N164.05-3.16
11M164T53164.046825C9H8O34-Hydroxycinnamic acidInChI=1S/C9H8O3/c10-8-4-1-7(2-5-8)3-6-9(11)12/...NGSWKAQJJWESNS-ZZXKWVIFSA-N164.05-3.16
12M164T53164.046825C9H8O32-Hydroxycinnamic acidInChI=1S/C9H8O3/c10-8-4-2-1-3-7(8)5-6-9(11)12/...PMOWTIHVNWZYFI-AATRIKPKSA-N164.05-3.16
13M167T35167.021095C7H5NO4Quinolinic acidInChI=1S/C7H5NO4/c9-6(10)4-2-1-3-8-5(4)7(11)12...GJAWHXHKYYXBSV-UHFFFAOYSA-N167.02-4.56
14M173T36_3173.104423C8H15NO3HexanoylglycineInChI=1S/C8H15NO3/c1-2-3-4-5-7(10)9-6-8(11)12/...UPCKIPHSXMXJOX-UHFFFAOYSA-N173.11-4.45
15M174T35174.088395C8H14O4Suberic acidInChI=1S/C8H14O4/c9-7(10)5-3-1-2-4-6-8(11)12/h...TYFQFVWCELRYAO-UHFFFAOYSA-N174.09-4.67
16M181T36181.060407C6H7N5O28-Hydroxy-7-methylguanineInChI=1S/C6H7N5O2/c1-11-2-3(9-6(11)13)8-5(7)10...VHPXSVXJBWZORQ-UHFFFAOYSA-N181.062.39
17M212T39212.067866C10H12O5Vanillactic acidInChI=1S/C10H12O5/c1-15-9-5-6(2-3-7(9)11)4-8(1...SVYIZYRTOYHQRE-UHFFFAOYSA-N212.07-2.87
18M276T36276.077397C10H16N2O5SBiotin sulfoneInChI=1S/C10H16N2O5S/c13-8(14)4-2-1-3-7-9-6(5-...QPFQYMONYBAUCY-ZKWXMUAHSA-N276.08-2.16
\n", "
" ], "text/plain": [ " id mz molecular_formula compound_name \\\n", "0 M154T37 154.062402 C8H10O3 Hydroxytyrosol \n", "1 M164T119 164.046774 C9H8O3 Phenylpyruvic acid \n", "2 M164T119 164.046774 C9H8O3 m-Coumaric acid \n", "3 M164T119 164.046774 C9H8O3 4-Hydroxycinnamic acid \n", "4 M164T119 164.046774 C9H8O3 2-Hydroxycinnamic acid \n", "5 M164T233 164.046832 C9H8O3 Phenylpyruvic acid \n", "6 M164T233 164.046832 C9H8O3 m-Coumaric acid \n", "7 M164T233 164.046832 C9H8O3 4-Hydroxycinnamic acid \n", "8 M164T233 164.046832 C9H8O3 2-Hydroxycinnamic acid \n", "9 M164T53 164.046825 C9H8O3 Phenylpyruvic acid \n", "10 M164T53 164.046825 C9H8O3 m-Coumaric acid \n", "11 M164T53 164.046825 C9H8O3 4-Hydroxycinnamic acid \n", "12 M164T53 164.046825 C9H8O3 2-Hydroxycinnamic acid \n", "13 M167T35 167.021095 C7H5NO4 Quinolinic acid \n", "14 M173T36_3 173.104423 C8H15NO3 Hexanoylglycine \n", "15 M174T35 174.088395 C8H14O4 Suberic acid \n", "16 M181T36 181.060407 C6H7N5O2 8-Hydroxy-7-methylguanine \n", "17 M212T39 212.067866 C10H12O5 Vanillactic acid \n", "18 M276T36 276.077397 C10H16N2O5S Biotin sulfone \n", "\n", " inchi \\\n", "0 InChI=1S/C8H10O3/c9-4-3-6-1-2-7(10)8(11)5-6/h1... \n", "1 InChI=1S/C9H8O3/c10-8(9(11)12)6-7-4-2-1-3-5-7/... \n", "2 InChI=1S/C9H8O3/c10-8-3-1-2-7(6-8)4-5-9(11)12/... \n", "3 InChI=1S/C9H8O3/c10-8-4-1-7(2-5-8)3-6-9(11)12/... \n", "4 InChI=1S/C9H8O3/c10-8-4-2-1-3-7(8)5-6-9(11)12/... \n", "5 InChI=1S/C9H8O3/c10-8(9(11)12)6-7-4-2-1-3-5-7/... \n", "6 InChI=1S/C9H8O3/c10-8-3-1-2-7(6-8)4-5-9(11)12/... \n", "7 InChI=1S/C9H8O3/c10-8-4-1-7(2-5-8)3-6-9(11)12/... \n", "8 InChI=1S/C9H8O3/c10-8-4-2-1-3-7(8)5-6-9(11)12/... \n", "9 InChI=1S/C9H8O3/c10-8(9(11)12)6-7-4-2-1-3-5-7/... \n", "10 InChI=1S/C9H8O3/c10-8-3-1-2-7(6-8)4-5-9(11)12/... \n", "11 InChI=1S/C9H8O3/c10-8-4-1-7(2-5-8)3-6-9(11)12/... \n", "12 InChI=1S/C9H8O3/c10-8-4-2-1-3-7(8)5-6-9(11)12/... \n", "13 InChI=1S/C7H5NO4/c9-6(10)4-2-1-3-8-5(4)7(11)12... \n", "14 InChI=1S/C8H15NO3/c1-2-3-4-5-7(10)9-6-8(11)12/... 
\n", "15 InChI=1S/C8H14O4/c9-7(10)5-3-1-2-4-6-8(11)12/h... \n", "16 InChI=1S/C6H7N5O2/c1-11-2-3(9-6(11)13)8-5(7)10... \n", "17 InChI=1S/C10H12O5/c1-15-9-5-6(2-3-7(9)11)4-8(1... \n", "18 InChI=1S/C10H16N2O5S/c13-8(14)4-2-1-3-7-9-6(5-... \n", "\n", " inchi_key exact_mass ppm_error \n", "0 JUUBCHWRXWPFFH-UHFFFAOYSA-N 154.06 -3.84 \n", "1 BTNMPGBKDVTSJY-UHFFFAOYSA-N 164.05 -3.47 \n", "2 KKSDGJDHHZEWEP-SNAWJCMRSA-N 164.05 -3.47 \n", "3 NGSWKAQJJWESNS-ZZXKWVIFSA-N 164.05 -3.47 \n", "4 PMOWTIHVNWZYFI-AATRIKPKSA-N 164.05 -3.47 \n", "5 BTNMPGBKDVTSJY-UHFFFAOYSA-N 164.05 -3.12 \n", "6 KKSDGJDHHZEWEP-SNAWJCMRSA-N 164.05 -3.12 \n", "7 NGSWKAQJJWESNS-ZZXKWVIFSA-N 164.05 -3.12 \n", "8 PMOWTIHVNWZYFI-AATRIKPKSA-N 164.05 -3.12 \n", "9 BTNMPGBKDVTSJY-UHFFFAOYSA-N 164.05 -3.16 \n", "10 KKSDGJDHHZEWEP-SNAWJCMRSA-N 164.05 -3.16 \n", "11 NGSWKAQJJWESNS-ZZXKWVIFSA-N 164.05 -3.16 \n", "12 PMOWTIHVNWZYFI-AATRIKPKSA-N 164.05 -3.16 \n", "13 GJAWHXHKYYXBSV-UHFFFAOYSA-N 167.02 -4.56 \n", "14 UPCKIPHSXMXJOX-UHFFFAOYSA-N 173.11 -4.45 \n", "15 TYFQFVWCELRYAO-UHFFFAOYSA-N 174.09 -4.67 \n", "16 VHPXSVXJBWZORQ-UHFFFAOYSA-N 181.06 2.39 \n", "17 SVYIZYRTOYHQRE-UHFFFAOYSA-N 212.07 -2.87 \n", "18 QPFQYMONYBAUCY-ZKWXMUAHSA-N 276.08 -2.16 " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppm = 5.0\n", "match = anno.comp_match_mass(df, ppm, ref)\n", "match" ] }, { "cell_type": "markdown", "id": "e8165866", "metadata": {}, "source": [ "`match` gives the compound matching results. `LAMP` also provides a mass\n", "adjust option by adduct library. 
You can provide your own adducts library;\n", "otherwise `LAMP` uses its default adducts library.\n", "\n", "The format of the adducts library looks like this:" ] }, { "cell_type": "code", "execution_count": 7, "id": "65915d73", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:10.588193Z", "iopub.status.busy": "2024-11-06T19:40:10.588193Z", "iopub.status.idle": "2024-11-06T19:40:10.609426Z", "shell.execute_reply": "2024-11-06T19:40:10.609426Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
labelexact_masschargeion_mode
0[M+H]+1.0072761pos
1[M+NH4]+18.0338261pos
2[M+Na]+22.9892211pos
3[M+Mg]+23.9844931pos
4[M+K]+38.9631581pos
5[M+Fe]+55.9343881pos
6[M+Cu]+62.9290491pos
7[M+2H]+2.0151011pos
8[M+3H]+3.0229261pos
9[M-H]--1.0072761neg
10[M+35Cl]-34.9694011neg
11[M+Formate]-44.9982031neg
12[M+Acetate]-59.0138531neg
\n", "
" ], "text/plain": [ " label exact_mass charge ion_mode\n", "0 [M+H]+ 1.007276 1 pos\n", "1 [M+NH4]+ 18.033826 1 pos\n", "2 [M+Na]+ 22.989221 1 pos\n", "3 [M+Mg]+ 23.984493 1 pos\n", "4 [M+K]+ 38.963158 1 pos\n", "5 [M+Fe]+ 55.934388 1 pos\n", "6 [M+Cu]+ 62.929049 1 pos\n", "7 [M+2H]+ 2.015101 1 pos\n", "8 [M+3H]+ 3.022926 1 pos\n", "9 [M-H]- -1.007276 1 neg\n", "10 [M+35Cl]- 34.969401 1 neg\n", "11 [M+Formate]- 44.998203 1 neg\n", "12 [M+Acetate]- 59.013853 1 neg" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "add_path = './data/adducts_short.tsv'\n", "lib_df = pd.read_csv(add_path, sep=\"\\t\")\n", "lib_df" ] }, { "cell_type": "markdown", "id": "61a69dbe", "metadata": {}, "source": [ "The adducts library must have the columns `label`, `exact_mass`, `charge`,\n", "and `ion_mode`.\n", "\n", "We use this adducts file to adjust the masses:" ] }, { "cell_type": "code", "execution_count": 8, "id": "4618cd90", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:10.609426Z", "iopub.status.busy": "2024-11-06T19:40:10.609426Z", "iopub.status.idle": "2024-11-06T19:40:10.625522Z", "shell.execute_reply": "2024-11-06T19:40:10.625522Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
labelexact_masscharge
0[M+H]+1.0072761
1[M+NH4]+18.0338261
2[M+Na]+22.9892211
3[M+Mg]+23.9844931
4[M+K]+38.9631581
5[M+Fe]+55.9343881
6[M+Cu]+62.9290491
7[M+2H]+2.0151011
8[M+3H]+3.0229261
\n", "
" ], "text/plain": [ " label exact_mass charge\n", "0 [M+H]+ 1.007276 1\n", "1 [M+NH4]+ 18.033826 1\n", "2 [M+Na]+ 22.989221 1\n", "3 [M+Mg]+ 23.984493 1\n", "4 [M+K]+ 38.963158 1\n", "5 [M+Fe]+ 55.934388 1\n", "6 [M+Cu]+ 62.929049 1\n", "7 [M+2H]+ 2.015101 1\n", "8 [M+3H]+ 3.022926 1" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# If add_path is empty, LAMP uses its default adducts library\n", "add_path = \"./data/adducts_short.tsv\"\n", "lib_add = anno.read_lib(add_path, ion_mode)\n", "lib_add" ] }, { "cell_type": "markdown", "id": "b8157cd9", "metadata": {}, "source": [ "Now use the function `comp_match_mass_add` to match compounds:" ] }, { "cell_type": "code", "execution_count": 9, "id": "59fe8f04", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:10.625522Z", "iopub.status.busy": "2024-11-06T19:40:10.625522Z", "iopub.status.idle": "2024-11-06T19:40:10.772219Z", "shell.execute_reply": "2024-11-06T19:40:10.772219Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idmzmolecular_formulacompound_nameinchiinchi_keyexact_massadductppm_error
0M152T40152.043607C5H8N2O2DihydrothymineInChI=1S/C5H8N2O2/c1-3-2-6-5(9)7-4(3)8/h3H,2H2...NBAKTGXDIBVZOO-VKHMYHEASA-N152.04[M+Mg]+3.52
1M154T37154.062402C8H8O3p-Hydroxyphenylacetic acidInChI=1S/C8H8O3/c9-7-3-1-6(2-4-7)5-8(10)11/h1-...XQXPVVBIMDBYFF-UHFFFAOYSA-N154.06[M+2H]+-0.28
2M154T37154.062402C8H8O33-Hydroxyphenylacetic acidInChI=1S/C8H8O3/c9-7-3-1-2-6(4-7)5-8(10)11/h1-...FVMDYYGIDFPZAX-UHFFFAOYSA-N154.06[M+2H]+-0.28
3M154T37154.062402C8H8O3ortho-Hydroxyphenylacetic acidInChI=1S/C8H8O3/c9-7-4-2-1-3-6(7)5-8(10)11/h1-...CCVYRRGZDBSHFU-UHFFFAOYSA-N154.06[M+2H]+-0.28
4M154T37154.062402C8H8O3Mandelic acidInChI=1S/C8H8O3/c9-7(8(10)11)6-4-2-1-3-5-6/h1-...IWYDHOAUDWTVEP-ZETCQYMHSA-N154.06[M+2H]+-0.28
5M154T37154.062402C8H8O33-Cresotinic acidInChI=1S/C8H8O3/c1-5-3-2-4-6(7(5)9)8(10)11/h2-...WHSXTWFYRGOBGO-UHFFFAOYSA-N154.06[M+2H]+-0.28
6M154T37154.062402C8H8O34-Hydroxy-3-methylbenzoic acidInChI=1S/C8H8O3/c1-5-4-6(8(10)11)2-3-7(5)9/h2-...LTFHNKUKQYVHDX-UHFFFAOYSA-N154.06[M+2H]+-0.28
7M154T37154.062402C8H8O3VanillinInChI=1S/C8H8O3/c1-11-8-4-6(5-9)2-3-7(8)10/h2-...MWOOGOJBHIARFG-UHFFFAOYSA-N154.06[M+2H]+-0.28
8M157T35157.036819C4H10N2O22,4-Diaminobutyric acidInChI=1S/C4H10N2O2/c5-2-1-3(6)4(7)8/h3H,1-2,5-...OGNSCSPNOLGXSM-UHFFFAOYSA-N157.04[M+K]+-3.61
9M157T35157.036819C4H10N2O2L-2,4-diaminobutyric acidInChI=1S/C4H10N2O2/c5-2-1-3(6)4(7)8/h3H,1-2,5-...OGNSCSPNOLGXSM-VKHMYHEASA-N157.04[M+K]+-3.61
10M167T35167.021095C5H8N2O2DihydrothymineInChI=1S/C5H8N2O2/c1-3-2-6-5(9)7-4(3)8/h3H,2H2...NBAKTGXDIBVZOO-VKHMYHEASA-N167.02[M+K]+-3.83
11M174T35174.088395C9H13NOPhenylpropanolamineInChI=1S/C9H13NO/c1-7(10)9(11)8-5-3-2-4-6-8/h2...DLNKOYKMWOXYQA-VXNVDRBHSA-N174.09[M+Na]+-3.10
12M174T35174.088395C10H14OThymolInChI=1S/C10H14O/c1-7(2)9-5-4-8(3)6-10(9)11/h4...MGSRCZKZVOBKFT-UHFFFAOYSA-N174.09[M+Mg]+-3.23
13M174T35174.088395C10H14O(S)-CarvoneInChI=1S/C10H14O/c1-7(2)9-5-4-8(3)10(11)6-9/h4...ULDHMXUKGWMISQ-VIFPVBQESA-N174.09[M+Mg]+-3.23
14M174T35174.088395C8H12O42-Octenedioic acidInChI=1S/C8H12O4/c9-7(10)5-3-1-2-4-6-8(11)12/h...BNTPVRGYUHJFHN-HWKANZROSA-N174.09[M+2H]+-1.52
15M174T35174.088395C8H12O4cis-4-Octenedioic acidInChI=1S/C8H12O4/c9-7(10)5-3-1-2-4-6-8(11)12/h...LQVYKEXVMZXOAH-UPHRSURJSA-N174.09[M+2H]+-1.52
16M181T36181.060407C8H8N2O3Nicotinuric acidInChI=1S/C8H8N2O3/c11-7(12)5-10-8(13)6-2-1-3-9...ZBSGKPYXQINNGF-UHFFFAOYSA-N181.06[M+H]+-2.00
17M184T38184.097942C10H13N2Nicotine imineInChI=1S/C10H13N2/c1-12-7-3-5-10(12)9-4-2-6-11...GTQXYYYOJZZJHL-UHFFFAOYSA-N184.10[M+Na]+4.60
18M185T39_2185.082034C5H15NO4PPhosphorylcholineInChI=1S/C5H14NO4P/c1-6(2,3)4-5-10-11(7,8)9/h4...YHHSONZFOIEMCP-UHFFFAOYSA-O185.08[M+H]+4.80
19M186T36186.045606C6H14N2ON-AcetylputrescineInChI=1S/C6H14N2O/c1-6(9)8-5-3-2-4-7/h2-5,7H2,...KLZGKIDSEJWEDW-UHFFFAOYSA-N186.05[M+Fe]+3.25
20M187T38187.097642C5H15NO4PPhosphorylcholineInChI=1S/C5H14NO4P/c1-6(2,3)4-5-10-11(7,8)9/h4...YHHSONZFOIEMCP-UHFFFAOYSA-O187.10[M+3H]+4.52
21M193T40193.050761C5H14N4AgmatineInChI=1S/C5H14N4/c6-3-1-2-4-9-5(7)8/h1-4,6H2,(...QYPPJABKJHAVHS-UHFFFAOYSA-N193.05[M+Cu]+-0.70
22M200T36200.061328C7H16N2ON-AcetylcadaverineInChI=1S/C7H16N2O/c1-7(10)9-6-4-2-3-5-8/h2-6,8...RMOIHHAKNOFHOE-UHFFFAOYSA-N200.06[M+Fe]+3.39
23M201T39_1201.051849C10H10O34-Methoxycinnamic acidInChI=1S/C10H10O3/c1-13-9-5-2-8(3-6-9)4-7-10(1...AFDXODALSZRGIH-QPJJXVBHSA-N201.05[M+Na]+-1.82
24M203T36_1203.002108C9H9NOIndole-3-carbinolInChI=1S/C9H9NO/c11-6-7-5-10-9-4-2-1-3-8(7)9/h...IVYPNXXAYMYVSP-UHFFFAOYSA-N203.00[M+Fe]+-3.42
25M212T39212.067866C8H15NO3HexanoylglycineInChI=1S/C8H15NO3/c1-2-3-4-5-7(10)9-6-8(11)12/...UPCKIPHSXMXJOX-UHFFFAOYSA-N212.07[M+K]+-2.29
26M212T39212.067866C10H10O5Vanilpyruvic acidInChI=1S/C10H10O5/c1-15-9-5-6(2-3-7(9)11)4-8(1...YGQHQTMRZPHIBB-UHFFFAOYSA-N212.07[M+2H]+-0.28
27M217T37_1217.018279C10H11NOTryptopholInChI=1S/C10H11NO/c12-6-5-8-7-11-10-4-2-1-3-9(...MBBOMCVGYCRMEA-UHFFFAOYSA-N217.02[M+Fe]+-0.79
28M221T37221.012328C9H11NO2L-PhenylalanineInChI=1S/C9H11NO2/c10-8(9(11)12)6-7-4-2-1-3-5-...COLNVLDHVKWLRT-QMMMGPOBSA-N221.01[M+Fe]+-4.70
29M223T38223.008162C4H10NO6PO-PhosphothreonineInChI=1S/C4H10NO6P/c1-2(3(5)4(6)7)11-12(8,9)10...USRGIUJOYOXOQJ-GBXIJSLDSA-N223.01[M+Mg]+-4.06
30M223T40223.096863C12H14O4Monoisobutyl phthalic acidInChI=1S/C12H14O4/c1-8(2)7-16-12(15)10-6-4-3-5...RZJSUWQGFCHNFS-UHFFFAOYSA-N223.10[M+H]+1.69
31M226T44226.128007C8H18N4O2Asymmetric dimethylarginineInChI=1S/C8H18N4O2/c1-12(2)8(10)11-5-3-4-6(9)7...YDGMGEXADBMOMJ-LURJTMIESA-N226.13[M+Mg]+2.38
32M226T44226.128007C8H18N4O2Symmetric dimethylarginineInChI=1S/C8H18N4O2/c1-10-8(11-2)12-5-3-4-6(9)7...HVPFXCBJHIIJGS-LURJTMIESA-N226.13[M+Mg]+2.38
33M227T36227.066175C9H10N2O53-NitrotyrosineInChI=1S/C9H10N2O5/c10-6(9(13)14)3-5-1-2-8(12)...FBTSQILOGYXGMD-LURJTMIESA-N227.07[M+H]+-0.32
34M229T38229.069418C4H10N3O5PPhosphocreatineInChI=1S/C4H10N3O5P/c1-7(2-3(8)9)4(5)6-13(10,1...DRBBFCLWYRJSJZ-UHFFFAOYSA-N229.07[M+NH4]+-0.94
35M233T38233.043479C8H10N4O2CaffeineInChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)...RYYVLZVUVIJVGH-UHFFFAOYSA-N233.04[M+K]+-0.23
36M245T44245.045772C7H15N3O3HomocitrullineInChI=1S/C7H15N3O3/c8-5(6(11)12)3-1-2-4-10-7(9...XIGSAGMEBXLVJJ-YFKPBYRVSA-N245.05[M+Fe]+0.17
37M245T37_2245.093315C13H18O2IbuprofenInChI=1S/C13H18O2/c1-9(2)8-11-4-6-12(7-5-11)10...HEFNNWSXXWATRW-UHFFFAOYSA-N245.09[M+K]+-2.13
38M249T38249.038309C8H10N4O31,3,7-Trimethyluric acidInChI=1S/C8H10N4O3/c1-10-4-5(9-7(10)14)11(2)8(...BYXCFUMGEBZDDI-UHFFFAOYSA-N249.04[M+K]+-0.56
39M261T43260.972975C10H7NO4Xanthurenic acidInChI=1S/C10H7NO4/c12-7-3-1-2-5-8(13)4-6(10(14...FBZONXHGGPHHIY-UHFFFAOYSA-N260.97[M+Fe]+4.14
40M269T37_2269.088048C10H12N4O5InosineInChI=1S/C10H12N4O5/c15-1-4-6(16)7(17)10(19-4)...UGQMRVRMYYASKQ-KQYNXXCUSA-N269.09[M+H]+0.01
41M275T168275.201932C18H24O2EstradiolInChI=1S/C18H24O2/c1-18-9-8-14-13-5-3-12(19)10...VOXZDWNPVJITMN-ZBRFXRBCSA-N275.20[M+3H]+5.00
42M275T168275.201932C18H24O217a-EstradiolInChI=1S/C18H24O2/c1-18-9-8-14-13-5-3-12(19)10...VOXZDWNPVJITMN-SFFUCWETSA-N275.20[M+3H]+5.00
43M277T181277.217564C18H28O219-NorandrosteroneInChI=1S/C18H28O2/c1-18-9-8-14-13-5-3-12(19)10...UOUIARGWRPHDBX-CQZDKXCPSA-N277.22[M+H]+4.90
44M277T181277.217564C18H28O219-NoretiocholanoloneInChI=1S/C18H28O2/c1-18-9-8-14-13-5-3-12(19)10...UOUIARGWRPHDBX-DHMVHTBWSA-N277.22[M+H]+4.90
45M278T71278.148195C11H20N2O6SaccharopineInChI=1S/C11H20N2O6/c12-7(10(16)17)3-1-2-6-13-...ZDGJAHTZVHVLOT-YUMQZZPRSA-N278.15[M+2H]+3.44
46M279T233279.233232C18H30O2alpha-Linolenic acidInChI=1S/C18H30O2/c1-2-3-4-5-6-7-8-9-10-11-12-...DTOSIQBPPRVQHS-PDBXOOCHSA-N279.23[M+H]+4.93
47M279T233279.233232C18H28O219-NorandrosteroneInChI=1S/C18H28O2/c1-18-9-8-14-13-5-3-12(19)10...UOUIARGWRPHDBX-CQZDKXCPSA-N279.23[M+3H]+4.93
48M279T233279.233232C18H28O219-NoretiocholanoloneInChI=1S/C18H28O2/c1-18-9-8-14-13-5-3-12(19)10...UOUIARGWRPHDBX-DHMVHTBWSA-N279.23[M+3H]+4.93
49M281T287281.248903C18H32O2Linoleic acidInChI=1S/C18H32O2/c1-2-3-4-5-6-7-8-9-10-11-12-...OYHQOLUKZRVURQ-HZJYTTRNSA-N281.25[M+H]+4.97
50M281T287281.248903C18H30O2alpha-Linolenic acidInChI=1S/C18H30O2/c1-2-3-4-5-6-7-8-9-10-11-12-...DTOSIQBPPRVQHS-PDBXOOCHSA-N281.25[M+3H]+4.97
51M282T61282.070271C10H14N2O6RibothymidineInChI=1S/C10H14N2O6/c1-4-2-12(10(17)11-8(4)16)...DWRXFEITVBNRMK-JXOAFFINSA-N282.07[M+Mg]+2.10
52M282T61282.070271C10H14N2O63-MethyluridineInChI=1S/C10H14N2O6/c1-11-6(14)2-3-12(10(11)17...UTQUILVPBZEHTK-UHFFFAOYSA-N282.07[M+Mg]+2.10
53M283T37283.103695C11H14N4O51-MethylinosineInChI=1S/C11H14N4O5/c1-14-3-13-9-6(10(14)19)12...WJNGQIYEQLPJMN-IOSLPCCCSA-N283.10[M+H]+-0.00
\n", "
" ], "text/plain": [ " id mz molecular_formula compound_name \\\n", "0 M152T40 152.043607 C5H8N2O2 Dihydrothymine \n", "1 M154T37 154.062402 C8H8O3 p-Hydroxyphenylacetic acid \n", "2 M154T37 154.062402 C8H8O3 3-Hydroxyphenylacetic acid \n", "3 M154T37 154.062402 C8H8O3 ortho-Hydroxyphenylacetic acid \n", "4 M154T37 154.062402 C8H8O3 Mandelic acid \n", "5 M154T37 154.062402 C8H8O3 3-Cresotinic acid \n", "6 M154T37 154.062402 C8H8O3 4-Hydroxy-3-methylbenzoic acid \n", "7 M154T37 154.062402 C8H8O3 Vanillin \n", "8 M157T35 157.036819 C4H10N2O2 2,4-Diaminobutyric acid \n", "9 M157T35 157.036819 C4H10N2O2 L-2,4-diaminobutyric acid \n", "10 M167T35 167.021095 C5H8N2O2 Dihydrothymine \n", "11 M174T35 174.088395 C9H13NO Phenylpropanolamine \n", "12 M174T35 174.088395 C10H14O Thymol \n", "13 M174T35 174.088395 C10H14O (S)-Carvone \n", "14 M174T35 174.088395 C8H12O4 2-Octenedioic acid \n", "15 M174T35 174.088395 C8H12O4 cis-4-Octenedioic acid \n", "16 M181T36 181.060407 C8H8N2O3 Nicotinuric acid \n", "17 M184T38 184.097942 C10H13N2 Nicotine imine \n", "18 M185T39_2 185.082034 C5H15NO4P Phosphorylcholine \n", "19 M186T36 186.045606 C6H14N2O N-Acetylputrescine \n", "20 M187T38 187.097642 C5H15NO4P Phosphorylcholine \n", "21 M193T40 193.050761 C5H14N4 Agmatine \n", "22 M200T36 200.061328 C7H16N2O N-Acetylcadaverine \n", "23 M201T39_1 201.051849 C10H10O3 4-Methoxycinnamic acid \n", "24 M203T36_1 203.002108 C9H9NO Indole-3-carbinol \n", "25 M212T39 212.067866 C8H15NO3 Hexanoylglycine \n", "26 M212T39 212.067866 C10H10O5 Vanilpyruvic acid \n", "27 M217T37_1 217.018279 C10H11NO Tryptophol \n", "28 M221T37 221.012328 C9H11NO2 L-Phenylalanine \n", "29 M223T38 223.008162 C4H10NO6P O-Phosphothreonine \n", "30 M223T40 223.096863 C12H14O4 Monoisobutyl phthalic acid \n", "31 M226T44 226.128007 C8H18N4O2 Asymmetric dimethylarginine \n", "32 M226T44 226.128007 C8H18N4O2 Symmetric dimethylarginine \n", "33 M227T36 227.066175 C9H10N2O5 3-Nitrotyrosine \n", "34 M229T38 229.069418 C4H10N3O5P 
Phosphocreatine \n", "35 M233T38 233.043479 C8H10N4O2 Caffeine \n", "36 M245T44 245.045772 C7H15N3O3 Homocitrulline \n", "37 M245T37_2 245.093315 C13H18O2 Ibuprofen \n", "38 M249T38 249.038309 C8H10N4O3 1,3,7-Trimethyluric acid \n", "39 M261T43 260.972975 C10H7NO4 Xanthurenic acid \n", "40 M269T37_2 269.088048 C10H12N4O5 Inosine \n", "41 M275T168 275.201932 C18H24O2 Estradiol \n", "42 M275T168 275.201932 C18H24O2 17a-Estradiol \n", "43 M277T181 277.217564 C18H28O2 19-Norandrosterone \n", "44 M277T181 277.217564 C18H28O2 19-Noretiocholanolone \n", "45 M278T71 278.148195 C11H20N2O6 Saccharopine \n", "46 M279T233 279.233232 C18H30O2 alpha-Linolenic acid \n", "47 M279T233 279.233232 C18H28O2 19-Norandrosterone \n", "48 M279T233 279.233232 C18H28O2 19-Noretiocholanolone \n", "49 M281T287 281.248903 C18H32O2 Linoleic acid \n", "50 M281T287 281.248903 C18H30O2 alpha-Linolenic acid \n", "51 M282T61 282.070271 C10H14N2O6 Ribothymidine \n", "52 M282T61 282.070271 C10H14N2O6 3-Methyluridine \n", "53 M283T37 283.103695 C11H14N4O5 1-Methylinosine \n", "\n", " inchi \\\n", "0 InChI=1S/C5H8N2O2/c1-3-2-6-5(9)7-4(3)8/h3H,2H2... \n", "1 InChI=1S/C8H8O3/c9-7-3-1-6(2-4-7)5-8(10)11/h1-... \n", "2 InChI=1S/C8H8O3/c9-7-3-1-2-6(4-7)5-8(10)11/h1-... \n", "3 InChI=1S/C8H8O3/c9-7-4-2-1-3-6(7)5-8(10)11/h1-... \n", "4 InChI=1S/C8H8O3/c9-7(8(10)11)6-4-2-1-3-5-6/h1-... \n", "5 InChI=1S/C8H8O3/c1-5-3-2-4-6(7(5)9)8(10)11/h2-... \n", "6 InChI=1S/C8H8O3/c1-5-4-6(8(10)11)2-3-7(5)9/h2-... \n", "7 InChI=1S/C8H8O3/c1-11-8-4-6(5-9)2-3-7(8)10/h2-... \n", "8 InChI=1S/C4H10N2O2/c5-2-1-3(6)4(7)8/h3H,1-2,5-... \n", "9 InChI=1S/C4H10N2O2/c5-2-1-3(6)4(7)8/h3H,1-2,5-... \n", "10 InChI=1S/C5H8N2O2/c1-3-2-6-5(9)7-4(3)8/h3H,2H2... \n", "11 InChI=1S/C9H13NO/c1-7(10)9(11)8-5-3-2-4-6-8/h2... \n", "12 InChI=1S/C10H14O/c1-7(2)9-5-4-8(3)6-10(9)11/h4... \n", "13 InChI=1S/C10H14O/c1-7(2)9-5-4-8(3)10(11)6-9/h4... \n", "14 InChI=1S/C8H12O4/c9-7(10)5-3-1-2-4-6-8(11)12/h... 
\n", "15 InChI=1S/C8H12O4/c9-7(10)5-3-1-2-4-6-8(11)12/h... \n", "16 InChI=1S/C8H8N2O3/c11-7(12)5-10-8(13)6-2-1-3-9... \n", "17 InChI=1S/C10H13N2/c1-12-7-3-5-10(12)9-4-2-6-11... \n", "18 InChI=1S/C5H14NO4P/c1-6(2,3)4-5-10-11(7,8)9/h4... \n", "19 InChI=1S/C6H14N2O/c1-6(9)8-5-3-2-4-7/h2-5,7H2,... \n", "20 InChI=1S/C5H14NO4P/c1-6(2,3)4-5-10-11(7,8)9/h4... \n", "21 InChI=1S/C5H14N4/c6-3-1-2-4-9-5(7)8/h1-4,6H2,(... \n", "22 InChI=1S/C7H16N2O/c1-7(10)9-6-4-2-3-5-8/h2-6,8... \n", "23 InChI=1S/C10H10O3/c1-13-9-5-2-8(3-6-9)4-7-10(1... \n", "24 InChI=1S/C9H9NO/c11-6-7-5-10-9-4-2-1-3-8(7)9/h... \n", "25 InChI=1S/C8H15NO3/c1-2-3-4-5-7(10)9-6-8(11)12/... \n", "26 InChI=1S/C10H10O5/c1-15-9-5-6(2-3-7(9)11)4-8(1... \n", "27 InChI=1S/C10H11NO/c12-6-5-8-7-11-10-4-2-1-3-9(... \n", "28 InChI=1S/C9H11NO2/c10-8(9(11)12)6-7-4-2-1-3-5-... \n", "29 InChI=1S/C4H10NO6P/c1-2(3(5)4(6)7)11-12(8,9)10... \n", "30 InChI=1S/C12H14O4/c1-8(2)7-16-12(15)10-6-4-3-5... \n", "31 InChI=1S/C8H18N4O2/c1-12(2)8(10)11-5-3-4-6(9)7... \n", "32 InChI=1S/C8H18N4O2/c1-10-8(11-2)12-5-3-4-6(9)7... \n", "33 InChI=1S/C9H10N2O5/c10-6(9(13)14)3-5-1-2-8(12)... \n", "34 InChI=1S/C4H10N3O5P/c1-7(2-3(8)9)4(5)6-13(10,1... \n", "35 InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)... \n", "36 InChI=1S/C7H15N3O3/c8-5(6(11)12)3-1-2-4-10-7(9... \n", "37 InChI=1S/C13H18O2/c1-9(2)8-11-4-6-12(7-5-11)10... \n", "38 InChI=1S/C8H10N4O3/c1-10-4-5(9-7(10)14)11(2)8(... \n", "39 InChI=1S/C10H7NO4/c12-7-3-1-2-5-8(13)4-6(10(14... \n", "40 InChI=1S/C10H12N4O5/c15-1-4-6(16)7(17)10(19-4)... \n", "41 InChI=1S/C18H24O2/c1-18-9-8-14-13-5-3-12(19)10... \n", "42 InChI=1S/C18H24O2/c1-18-9-8-14-13-5-3-12(19)10... \n", "43 InChI=1S/C18H28O2/c1-18-9-8-14-13-5-3-12(19)10... \n", "44 InChI=1S/C18H28O2/c1-18-9-8-14-13-5-3-12(19)10... \n", "45 InChI=1S/C11H20N2O6/c12-7(10(16)17)3-1-2-6-13-... \n", "46 InChI=1S/C18H30O2/c1-2-3-4-5-6-7-8-9-10-11-12-... \n", "47 InChI=1S/C18H28O2/c1-18-9-8-14-13-5-3-12(19)10... 
\n", "48 InChI=1S/C18H28O2/c1-18-9-8-14-13-5-3-12(19)10... \n", "49 InChI=1S/C18H32O2/c1-2-3-4-5-6-7-8-9-10-11-12-... \n", "50 InChI=1S/C18H30O2/c1-2-3-4-5-6-7-8-9-10-11-12-... \n", "51 InChI=1S/C10H14N2O6/c1-4-2-12(10(17)11-8(4)16)... \n", "52 InChI=1S/C10H14N2O6/c1-11-6(14)2-3-12(10(11)17... \n", "53 InChI=1S/C11H14N4O5/c1-14-3-13-9-6(10(14)19)12... \n", "\n", " inchi_key exact_mass adduct ppm_error \n", "0 NBAKTGXDIBVZOO-VKHMYHEASA-N 152.04 [M+Mg]+ 3.52 \n", "1 XQXPVVBIMDBYFF-UHFFFAOYSA-N 154.06 [M+2H]+ -0.28 \n", "2 FVMDYYGIDFPZAX-UHFFFAOYSA-N 154.06 [M+2H]+ -0.28 \n", "3 CCVYRRGZDBSHFU-UHFFFAOYSA-N 154.06 [M+2H]+ -0.28 \n", "4 IWYDHOAUDWTVEP-ZETCQYMHSA-N 154.06 [M+2H]+ -0.28 \n", "5 WHSXTWFYRGOBGO-UHFFFAOYSA-N 154.06 [M+2H]+ -0.28 \n", "6 LTFHNKUKQYVHDX-UHFFFAOYSA-N 154.06 [M+2H]+ -0.28 \n", "7 MWOOGOJBHIARFG-UHFFFAOYSA-N 154.06 [M+2H]+ -0.28 \n", "8 OGNSCSPNOLGXSM-UHFFFAOYSA-N 157.04 [M+K]+ -3.61 \n", "9 OGNSCSPNOLGXSM-VKHMYHEASA-N 157.04 [M+K]+ -3.61 \n", "10 NBAKTGXDIBVZOO-VKHMYHEASA-N 167.02 [M+K]+ -3.83 \n", "11 DLNKOYKMWOXYQA-VXNVDRBHSA-N 174.09 [M+Na]+ -3.10 \n", "12 MGSRCZKZVOBKFT-UHFFFAOYSA-N 174.09 [M+Mg]+ -3.23 \n", "13 ULDHMXUKGWMISQ-VIFPVBQESA-N 174.09 [M+Mg]+ -3.23 \n", "14 BNTPVRGYUHJFHN-HWKANZROSA-N 174.09 [M+2H]+ -1.52 \n", "15 LQVYKEXVMZXOAH-UPHRSURJSA-N 174.09 [M+2H]+ -1.52 \n", "16 ZBSGKPYXQINNGF-UHFFFAOYSA-N 181.06 [M+H]+ -2.00 \n", "17 GTQXYYYOJZZJHL-UHFFFAOYSA-N 184.10 [M+Na]+ 4.60 \n", "18 YHHSONZFOIEMCP-UHFFFAOYSA-O 185.08 [M+H]+ 4.80 \n", "19 KLZGKIDSEJWEDW-UHFFFAOYSA-N 186.05 [M+Fe]+ 3.25 \n", "20 YHHSONZFOIEMCP-UHFFFAOYSA-O 187.10 [M+3H]+ 4.52 \n", "21 QYPPJABKJHAVHS-UHFFFAOYSA-N 193.05 [M+Cu]+ -0.70 \n", "22 RMOIHHAKNOFHOE-UHFFFAOYSA-N 200.06 [M+Fe]+ 3.39 \n", "23 AFDXODALSZRGIH-QPJJXVBHSA-N 201.05 [M+Na]+ -1.82 \n", "24 IVYPNXXAYMYVSP-UHFFFAOYSA-N 203.00 [M+Fe]+ -3.42 \n", "25 UPCKIPHSXMXJOX-UHFFFAOYSA-N 212.07 [M+K]+ -2.29 \n", "26 YGQHQTMRZPHIBB-UHFFFAOYSA-N 212.07 [M+2H]+ -0.28 \n", "27 MBBOMCVGYCRMEA-UHFFFAOYSA-N 217.02 
[M+Fe]+ -0.79 \n", "28 COLNVLDHVKWLRT-QMMMGPOBSA-N 221.01 [M+Fe]+ -4.70 \n", "29 USRGIUJOYOXOQJ-GBXIJSLDSA-N 223.01 [M+Mg]+ -4.06 \n", "30 RZJSUWQGFCHNFS-UHFFFAOYSA-N 223.10 [M+H]+ 1.69 \n", "31 YDGMGEXADBMOMJ-LURJTMIESA-N 226.13 [M+Mg]+ 2.38 \n", "32 HVPFXCBJHIIJGS-LURJTMIESA-N 226.13 [M+Mg]+ 2.38 \n", "33 FBTSQILOGYXGMD-LURJTMIESA-N 227.07 [M+H]+ -0.32 \n", "34 DRBBFCLWYRJSJZ-UHFFFAOYSA-N 229.07 [M+NH4]+ -0.94 \n", "35 RYYVLZVUVIJVGH-UHFFFAOYSA-N 233.04 [M+K]+ -0.23 \n", "36 XIGSAGMEBXLVJJ-YFKPBYRVSA-N 245.05 [M+Fe]+ 0.17 \n", "37 HEFNNWSXXWATRW-UHFFFAOYSA-N 245.09 [M+K]+ -2.13 \n", "38 BYXCFUMGEBZDDI-UHFFFAOYSA-N 249.04 [M+K]+ -0.56 \n", "39 FBZONXHGGPHHIY-UHFFFAOYSA-N 260.97 [M+Fe]+ 4.14 \n", "40 UGQMRVRMYYASKQ-KQYNXXCUSA-N 269.09 [M+H]+ 0.01 \n", "41 VOXZDWNPVJITMN-ZBRFXRBCSA-N 275.20 [M+3H]+ 5.00 \n", "42 VOXZDWNPVJITMN-SFFUCWETSA-N 275.20 [M+3H]+ 5.00 \n", "43 UOUIARGWRPHDBX-CQZDKXCPSA-N 277.22 [M+H]+ 4.90 \n", "44 UOUIARGWRPHDBX-DHMVHTBWSA-N 277.22 [M+H]+ 4.90 \n", "45 ZDGJAHTZVHVLOT-YUMQZZPRSA-N 278.15 [M+2H]+ 3.44 \n", "46 DTOSIQBPPRVQHS-PDBXOOCHSA-N 279.23 [M+H]+ 4.93 \n", "47 UOUIARGWRPHDBX-CQZDKXCPSA-N 279.23 [M+3H]+ 4.93 \n", "48 UOUIARGWRPHDBX-DHMVHTBWSA-N 279.23 [M+3H]+ 4.93 \n", "49 OYHQOLUKZRVURQ-HZJYTTRNSA-N 281.25 [M+H]+ 4.97 \n", "50 DTOSIQBPPRVQHS-PDBXOOCHSA-N 281.25 [M+3H]+ 4.97 \n", "51 DWRXFEITVBNRMK-JXOAFFINSA-N 282.07 [M+Mg]+ 2.10 \n", "52 UTQUILVPBZEHTK-UHFFFAOYSA-N 282.07 [M+Mg]+ 2.10 \n", "53 WJNGQIYEQLPJMN-IOSLPCCCSA-N 283.10 [M+H]+ -0.00 " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "match_1 = anno.comp_match_mass_add(df, ppm, ref, lib_add)\n", "match_1" ] }, { "cell_type": "markdown", "id": "00d07499", "metadata": {}, "source": [ "Note that this adduct library is also used to adjust the mass calculation when\n", "loading the reference file if it has a column called `ion_type`.\n", "\n", "## Correlation Analysis\n", "\n", "The next step is correlation analysis, based on the intensity data matrix across 
all\n", "peaks. Results are filtered by correlation coefficient, p-value and\n", "retention time difference. That is, a pair of peaks is kept only if it falls\n", "within a retention time window (such as 1 second), its correlation\n", "coefficient is larger than a threshold (such as 0.5) and its correlation\n", "p-value is less than a threshold (such as 0.05).\n", "\n", "`LAMP` supports two correlation methods, `pearson` and `spearman`. The\n", "parameter `positive` lets the user keep only positive correlations;\n", "otherwise both positive and negative correlations are used.\n", "\n", "Two functions, `_tic` and `_toc`, record the correlation computation time in\n", "seconds." ] }, { "cell_type": "code", "execution_count": 10, "id": "693165ab", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:10.775728Z", "iopub.status.busy": "2024-11-06T19:40:10.775728Z", "iopub.status.idle": "2024-11-06T19:40:10.780324Z", "shell.execute_reply": "2024-11-06T19:40:10.780324Z" } }, "outputs": [], "source": [ "thres_rt = 1.0\n", "thres_corr = 0.5\n", "thres_pval = 0.05\n", "method = \"spearman\" # \"pearson\"\n", "positive = True" ] }, { "cell_type": "code", "execution_count": 11, "id": "b1431924", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:10.780324Z", "iopub.status.busy": "2024-11-06T19:40:10.780324Z", "iopub.status.idle": "2024-11-06T19:40:15.177207Z", "shell.execute_reply": "2024-11-06T19:40:15.177207Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Elapsed time: 4.374748706817627 seconds.\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " name_a name_b r_value p_value rt_diff\n", "0 M151T34 M153T34 0.80 1.267076e-23 0.02\n", "1 M151T34 M155T34 0.71 1.752854e-16 0.20\n", "2 M151T34 M161T34 0.78 1.869949e-21 0.14\n", "3 M151T34 M163T34 0.69 3.239594e-15 0.20\n", "4 M151T34 M167T35 0.51 5.776482e-08 0.73\n", "... ... ... ... ... ...\n", "1783 M283T34_1 M283T34_2 0.62 4.214876e-12 0.29\n", "1784 M283T34_1 M285T34 0.82 5.937139e-26 0.08\n", "1785 M283T34_2 M285T34 0.66 7.898957e-14 0.37\n", "1786 M283T60 M284T60 0.86 1.033010e-29 0.15\n", "1787 M283T339 M284T339 0.91 4.031333e-39 0.04\n", "\n", "[1788 rows x 5 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "utils._tic()\n", "corr = stats.comp_corr_rt(df, thres_rt, thres_corr, thres_pval, method,\n", " positive)\n", "utils._toc()\n", "corr" ] }, { "cell_type": "markdown", "id": "e4675797", "metadata": {}, "source": [ "`corr` gives results of correlation coefficient(`r_value`), correlation\n", "p-values(`p_value`) and retention time difference(`rt_diff`).\n", "\n", "Based on the correlation analysis, we can extract the groups and their\n", "sizes by:" ] }, { "cell_type": "code", "execution_count": 12, "id": "941c47f6", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:15.177207Z", "iopub.status.busy": "2024-11-06T19:40:15.177207Z", "iopub.status.idle": "2024-11-06T19:40:15.413072Z", "shell.execute_reply": "2024-11-06T19:40:15.413072Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " name cor_grp_size cor_grp\n", "0 M219T35 52 M221T34::M223T34::M225T35::M226T35::M229T34::M...\n", "1 M217T35 52 M218T35::M219T34::M219T35::M221T34::M223T34::M...\n", "2 M216T35 52 M217T35::M218T35::M219T34::M219T35::M221T34::M...\n", "3 M215T35 52 M216T35::M217T35::M218T35::M219T34::M219T35::M...\n", "4 M218T35 51 M219T34::M219T35::M221T34::M223T34::M225T35::M...\n", ".. ... ... ...\n", "335 M171T180 1 M173T181\n", "336 M257T51 1 M258T51\n", "337 M163T415 1 M219T415\n", "338 M203T34 1 M229T35\n", "339 M171T119 1 M173T119\n", "\n", "[340 rows x 3 columns]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# get correlation group and size\n", "corr_df = stats.corr_grp_size(corr)\n", "corr_df" ] }, { "cell_type": "markdown", "id": "f5ff3143", "metadata": {}, "source": [ "## Summarize Results\n", "\n", "The final step builds the summary table in two formats and saves it for\n", "further analysis." ] }, { "cell_type": "code", "execution_count": 13, "id": "7328b8af", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:15.413072Z", "iopub.status.busy": "2024-11-06T19:40:15.413072Z", "iopub.status.idle": "2024-11-06T19:40:15.441625Z", "shell.execute_reply": "2024-11-06T19:40:15.441625Z" } }, "outputs": [], "source": [ "# get summary of metabolite annotation\n", "sr, mr = anno.comp_summ(df, match)" ] }, { "cell_type": "markdown", "id": "5473bc05", "metadata": {}, "source": [ "This function combines the peak table with the compound matching results and\n", "returns two results in different formats. 
`sr` is the single-row format, with one row for\n", "each peak id in the peak table `df`:" ] }, { "cell_type": "code", "execution_count": 14, "id": "c4ec8ec8", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:15.441625Z", "iopub.status.busy": "2024-11-06T19:40:15.441625Z", "iopub.status.idle": "2024-11-06T19:40:15.461149Z", "shell.execute_reply": "2024-11-06T19:40:15.461149Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " name mz rt exact_mass ppm_error \\\n", "0 M151T34 150.886715 34.152700 NaN NaN \n", "1 M151T40 151.040235 39.838172 NaN NaN \n", "2 M152T40 152.043607 40.303700 NaN NaN \n", "3 M153T34 152.883824 34.174647 NaN NaN \n", "4 M153T36 153.019474 35.785847 NaN NaN \n", ".. ... ... ... ... ... \n", "395 M283T61 283.068474 60.739869 NaN NaN \n", "396 M284T108 284.223499 108.406389 NaN NaN \n", "397 M284T339 284.267962 338.725056 NaN NaN \n", "398 M284T60 284.195294 59.593561 NaN NaN \n", "399 M285T34 284.775031 34.079641 NaN NaN \n", "\n", " molecular_formula compound_name inchi inchi_key \n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", ".. ... ... ... ... \n", "395 NaN NaN NaN NaN \n", "396 NaN NaN NaN NaN \n", "397 NaN NaN NaN NaN \n", "398 NaN NaN NaN NaN \n", "399 NaN NaN NaN NaN \n", "\n", "[400 rows x 9 columns]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sr" ] }, { "cell_type": "markdown", "id": "67fbd9ee", "metadata": {}, "source": [ "`mr` is the multiple-row format: a peak takes several rows if it matches more\n", "than one entry in the reference file:" ] }, { "cell_type": "code", "execution_count": 15, "id": "a6e8f2aa", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:15.461149Z", "iopub.status.busy": "2024-11-06T19:40:15.461149Z", "iopub.status.idle": "2024-11-06T19:40:15.479448Z", "shell.execute_reply": "2024-11-06T19:40:15.479448Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " name mz rt molecular_formula compound_name inchi \\\n", "0 M151T34 150.886715 34.152700 NaN NaN NaN \n", "1 M151T40 151.040235 39.838172 NaN NaN NaN \n", "2 M152T40 152.043607 40.303700 NaN NaN NaN \n", "3 M153T34 152.883824 34.174647 NaN NaN NaN \n", "4 M153T36 153.019474 35.785847 NaN NaN NaN \n", ".. ... ... ... ... ... ... \n", "404 M283T61 283.068474 60.739869 NaN NaN NaN \n", "405 M284T108 284.223499 108.406389 NaN NaN NaN \n", "406 M284T339 284.267962 338.725056 NaN NaN NaN \n", "407 M284T60 284.195294 59.593561 NaN NaN NaN \n", "408 M285T34 284.775031 34.079641 NaN NaN NaN \n", "\n", " inchi_key exact_mass ppm_error \n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", ".. ... ... ... \n", "404 NaN NaN NaN \n", "405 NaN NaN NaN \n", "406 NaN NaN NaN \n", "407 NaN NaN NaN \n", "408 NaN NaN NaN \n", "\n", "[409 rows x 9 columns]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mr" ] }, { "cell_type": "markdown", "id": "236bc49a", "metadata": {}, "source": [ "\n", "Now we merge the single-row results with the correlation results:\n" ] }, { "cell_type": "code", "execution_count": 16, "id": "51f1cef3", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:15.479448Z", "iopub.status.busy": "2024-11-06T19:40:15.479448Z", "iopub.status.idle": "2024-11-06T19:40:15.505860Z", "shell.execute_reply": "2024-11-06T19:40:15.505860Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " name mz rt exact_mass ppm_error \\\n", "0 M167T35 167.021095 34.882147 167.02 -4.56 \n", "1 M276T36 276.077397 36.385373 276.08 -2.16 \n", "2 M154T37 154.062402 37.183625 154.06 -3.84 \n", "3 M181T36 181.060407 35.734801 181.06 2.39 \n", "4 M174T35 174.088395 35.001130 174.09 -4.67 \n", ".. ... ... ... ... ... \n", "395 M279T50 279.159930 50.055451 NaN NaN \n", "396 M279T79 279.163910 78.758079 NaN NaN \n", "397 M282T85 282.207859 84.719202 NaN NaN \n", "398 M283T47 283.110871 46.822069 NaN NaN \n", "399 M284T108 284.223499 108.406389 NaN NaN \n", "\n", " molecular_formula compound_name \\\n", "0 C7H5NO4 Quinolinic acid \n", "1 C10H16N2O5S Biotin sulfone \n", "2 C8H10O3 Hydroxytyrosol \n", "3 C6H7N5O2 8-Hydroxy-7-methylguanine \n", "4 C8H14O4 Suberic acid \n", ".. ... ... \n", "395 NaN NaN \n", "396 NaN NaN \n", "397 NaN NaN \n", "398 NaN NaN \n", "399 NaN NaN \n", "\n", " inchi \\\n", "0 InChI=1S/C7H5NO4/c9-6(10)4-2-1-3-8-5(4)7(11)12... \n", "1 InChI=1S/C10H16N2O5S/c13-8(14)4-2-1-3-7-9-6(5-... \n", "2 InChI=1S/C8H10O3/c9-4-3-6-1-2-7(10)8(11)5-6/h1... \n", "3 InChI=1S/C6H7N5O2/c1-11-2-3(9-6(11)13)8-5(7)10... \n", "4 InChI=1S/C8H14O4/c9-7(10)5-3-1-2-4-6-8(11)12/h... \n", ".. ... \n", "395 NaN \n", "396 NaN \n", "397 NaN \n", "398 NaN \n", "399 NaN \n", "\n", " inchi_key cor_grp_size \\\n", "0 GJAWHXHKYYXBSV-UHFFFAOYSA-N 25.0 \n", "1 QPFQYMONYBAUCY-ZKWXMUAHSA-N 13.0 \n", "2 JUUBCHWRXWPFFH-UHFFFAOYSA-N 12.0 \n", "3 VHPXSVXJBWZORQ-UHFFFAOYSA-N 9.0 \n", "4 TYFQFVWCELRYAO-UHFFFAOYSA-N 9.0 \n", ".. ... ... \n", "395 NaN NaN \n", "396 NaN NaN \n", "397 NaN NaN \n", "398 NaN NaN \n", "399 NaN NaN \n", "\n", " cor_grp \n", "0 M171T34::M197T36::M209T34::M211T34::M213T34::M... \n", "1 M277T36_2::M278T36::M173T36_2::M186T36::M187T3... \n", "2 M155T38::M158T37_2::M164T36::M171T37_2::M173T3... \n", "3 M224T36::M225T35::M226T35::M227T36::M269T37_2:... \n", "4 M211T34::M213T34::M219T34::M221T34::M229T35::M... \n", ".. ... 
\n", "395 NaN \n", "396 NaN \n", "397 NaN \n", "398 NaN \n", "399 NaN \n", "\n", "[400 rows x 11 columns]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# merge summary table with correlation analysis\n", "res = anno.comp_summ_corr(sr, corr_df)\n", "res" ] }, { "cell_type": "markdown", "id": "39fa7829", "metadata": {}, "source": [ "The result data frame `res` is arranged in four parts from top to bottom:\n", "\n", " - 1st part: identified metabolites that pass the correlation analysis\n", " - 2nd part: identified metabolites that do not pass the correlation analysis\n", " - 3rd part: peaks with no identified metabolite that pass the correlation analysis\n", " - 4th part: peaks with no identified metabolite that do not pass the correlation analysis\n", "\n", "Users should focus on the first part for their further analysis." ] }, { "cell_type": "markdown", "id": "f8ebf811", "metadata": {}, "source": [ "You can save the results in different forms, such as TSV or CSV text files.\n", "You can also save all results into a `sqlite3` database and view it with\n", "[DB Browser for SQLite](https://sqlitebrowser.org/):" ] }, { "cell_type": "code", "execution_count": 17, "id": "f850a382", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:15.505860Z", "iopub.status.busy": "2024-11-06T19:40:15.505860Z", "iopub.status.idle": "2024-11-06T19:40:15.513306Z", "shell.execute_reply": "2024-11-06T19:40:15.513306Z" } }, "outputs": [], "source": [ "f_save = False # here we do NOT save results\n", "db_out = \"test.db\"\n", "sr_out = \"test_s.tsv\"" ] }, { "cell_type": "code", "execution_count": 18, "id": "1d5da41a", "metadata": { "execution": { "iopub.execute_input": "2024-11-06T19:40:15.516313Z", "iopub.status.busy": "2024-11-06T19:40:15.516313Z", "iopub.status.idle": "2024-11-06T19:40:15.522185Z", "shell.execute_reply": "2024-11-06T19:40:15.522185Z" } }, "outputs": [], "source": [ "if f_save:\n", " # save all results into a sqlite3 database\n", " 
conn = sqlite3.connect(db_out)\n", " df[[\"name\", \"mz\", \"rt\"]].to_sql(\"peaklist\",\n", " conn,\n", " if_exists=\"replace\",\n", " index=False)\n", " corr_df.to_sql(\"corr_grp\", conn, if_exists=\"replace\", index=False)\n", " corr.to_sql(\"corr_pval_rt\", conn, if_exists=\"replace\", index=False)\n", " match.to_sql(\"match\", conn, if_exists=\"replace\", index=False)\n", " mr.to_sql(\"anno_mr\", conn, if_exists=\"replace\", index=False)\n", " res.to_sql(\"anno_sr\", conn, if_exists=\"replace\", index=False)\n", "\n", " conn.commit()\n", " conn.close()\n", "\n", " # save final results\n", " res.to_csv(sr_out, sep=\"\\t\", index=False)" ] }, { "cell_type": "markdown", "id": "8c550a0e", "metadata": {}, "source": [ "## End User Usages\n", "\n", "For end users, `LAMP` provides two computation options: command line\n", "interface (CLI) and graphical user interface (GUI).\n", "\n", "To use the GUI, open a terminal and type:\n", "\n", "```bash\n", "$ lamp gui\n", "```\n", "\n", "To use the CLI, open a terminal and type the command with the required arguments,\n", "something like:\n", "\n", "```bash\n", "$ lamp cli \\\n", " --input-data \"./data/df_pos_3.tsv\" \\\n", " --sep \"tab\" \\\n", " --col-idx \"1, 2, 3, 4\" \\\n", " --add-path \"\" \\\n", " --ref-path \"\" \\\n", " --ion-mode \"pos\" \\\n", " --cal-mass \\\n", " --thres-rt \"1.0\" \\\n", " --thres-corr \"0.5\" \\\n", " --thres-pval \"0.05\" \\\n", " --method \"pearson\" \\\n", " --positive \\\n", " --ppm \"5.0\" \\\n", " --save-db \\\n", " --save-mr \\\n", " --db-out \"./res/test.db\" \\\n", " --sr-out \"./res/test_s.tsv\" \\\n", " --mr-out \"./res/test_m.tsv\"\n", "```\n", "\n", "As a best practice, you can create a bash script `.sh` (Linux\n", "and macOS) or a Windows script `.bat` to hold these CLI\n", "arguments. 
Change the parameters in these files each time you process a new\n", "data set.\n", "\n", "For example, there are `lamp_cli.sh` and `lamp_cli.bat` in\n", "https://github.com/wanchanglin/lamp/tree/master/examples. You can run them\n", "and check the results in the directory `examples/res`:\n", "\n", "- For Linux and macOS terminal:\n", "\n", " ```bash\n", " $ chmod +x lamp_cli.sh\n", " $ ./lamp_cli.sh\n", " ```\n", "\n", "- For Windows terminal:\n", "\n", " ```bash\n", " $ lamp_cli.bat\n", " ```\n", "\n", "Note that if you use `xlsx` files for the input data and reference file\n", "with the GUI or CLI, all data must be in the first sheet. If you use\n", "`LAMP` functions in your python scripts, there is no such requirement." ] } ], "metadata": { "jupytext": { "cell_metadata_filter": "-all", "main_language": "python", "notebook_metadata_filter": "-all" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.6" } }, "nbformat": 4, "nbformat_minor": 5 }