Institute for Advanced Biosciences Keio University
MathDAMP Mathematica package for differential analysis of metabolite profiles


The preceding notebook, 02-MathDAMP-Elements.nb, introduced the basic functionality of the MathDAMP package and demonstrated its core functions on a comparison of two datasets. This notebook provides a template for such a comparison; here, however, the core functions are wrapped into the DAMPTwoDatasets function for more convenient use. The same two datasets are used as in the previous notebook.
Additional notebooks from the MathDAMP package (04-MathDAMP-Outliers.nb, 05-MathDAMP-TwoGroups.nb, and 06-MathDAMP-MultipleGroups.nb) provide templates for locating outliers within a group of datasets, for comparing two groups of replicate datasets, and for comparing multiple groups of replicate datasets.

Step 1 : Loading the Data

First, the MathDAMP package has to be loaded. Please assign the path leading to the MathDAMP files to the MathDAMPPath variable.

MathDAMPPath = "/home/baran/math/ms/MathDAMP.1.0.0/" ;

<< (MathDAMPPath<>"MathDAMP.m")

MathDAMP version 1.0.0 loaded (2006/04/26)

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
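
If the package fails to load, the usual cause is a wrong path. The following is a minimal sketch of a more defensive way to load the package, not part of MathDAMP itself. It assumes Mathematica 7 or later (for FileNameJoin and FileExistsQ), and packageFile is a hypothetical variable name introduced only for this illustration.

```mathematica
(* Assumption: Mathematica 7 or later, where FileNameJoin and FileExistsQ are available. *)
(* packageFile is a hypothetical helper variable, not a MathDAMP name. *)
packageFile = FileNameJoin[{MathDAMPPath, "MathDAMP.m"}];

(* Load the package only if the file is really there; otherwise report the path that was tried. *)
If[FileExistsQ[packageFile],
  Get[packageFile],
  Print["MathDAMP.m not found at: ", packageFile]]
```

Get is the full form of the << operator used above, so a successful call has the same effect as the original loading command.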

Two datasets acquired by capillary electrophoresis coupled to a quadrupole mass spectrometer (CE-QMS) operated in selected ion monitoring (SIM) mode are used for the demonstration in this notebook. The data files are part of the MathDAMP package.

{ctrl, smpl} = DAMPImportMS[MathDAMPPath<>"/data/"<>#] &/@{"", ""} ;

Optional : Exploring the data, locating the peak of the internal standard in the reference dataset

Step 2 : Performing the Differential Analysis

The function DAMPTwoDatasets compares two datasets and returns, as a list of rules, the aligned and normalized datasets, the absolute, relative, and absolute×relative differences, and the aligned annotation tables. The DAMPNormalizeGroup function is used internally to align and normalize the datasets along with the annotation tables. It is also used internally by the functions DAMPOutliers, DAMPTwoGroups, and DAMPMultiGroups, whose usage is demonstrated in the notebooks 04-MathDAMP-Outliers.nb, 05-MathDAMP-TwoGroups.nb, and 06-MathDAMP-MultipleGroups.nb. Please refer to the MathDAMP.nb notebook for more details about the implementation of DAMPNormalizeGroup and DAMPTwoDatasets. Execute ?FunctionName to display a brief description of the respective function and its available options.

? DAMPNormalizeGroup

? DAMPTwoDatasets
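
In addition to the usage messages, the built-in Options function can list option names together with their current default values. The sketch below assumes that MathDAMP is loaded and that the package registers its defaults through Options, which is the usual Mathematica convention but is not confirmed here.

```mathematica
(* Requires MathDAMP to be loaded. Options is a built-in Mathematica function. *)
(* Assumption: the package declares its defaults via Options[...]. *)
Options[DAMPTwoDatasets]
Options[DAMPNormalizeGroup]

(* Option names only, without the default values. *)
First /@ Options[DAMPTwoDatasets]
```

If a function has no registered options, Options returns an empty list.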

The loaded datasets are preprocessed prior to differential analysis in the same way as in the 02-MathDAMP-Elements.nb notebook (baseline subtraction and noise removal).

{ppctrl, ppsmpl} = DAMPRemoveNoise[DAMPSubtractBaselines[#]] &/@{ctrl, smpl} ;
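
The command above uses a pure function (# and &) mapped over a list with /@. The toy example below shows what that idiom expands to. The symbols f, g, a, and b are placeholders that are assumed to have no values in the current session.

```mathematica
(* f, g, a, b are undefined placeholder symbols, so the structure stays visible. *)
g[f[#]] & /@ {a, b}
(* evaluates to {g[f[a]], g[f[b]]} *)

(* The preprocessing command is therefore equivalent to writing out both calls: *)
(* {ppctrl, ppsmpl} = {DAMPRemoveNoise[DAMPSubtractBaselines[ctrl]],
                       DAMPRemoveNoise[DAMPSubtractBaselines[smpl]]} *)
```

The baselines are subtracted first (inner call), and noise is removed from the result (outer call).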

Most of the options for DAMPTwoDatasets and DAMPNormalizeGroup are specified explicitly in the following command so that they can be edited easily. The annotation table for the cation mode CE-MS analysis is used. This table was assembled according to a CE-TOFMS analysis of a mixture of standard compounds. Methionine sulfone is used as the internal standard. Its short name in the annotation table, 363, is passed to the DAMPNormalizeGroup function via the InternalStandard option. The location of the peak of the internal standard will be extrapolated from the aligned annotation table. Overlaid electropherograms of the vicinities of the expected peaks of the internal standard are plotted, along with indicators of the beginning and the end of the blindly integrated regions, for visual confirmation. To specify the location of the peak explicitly, use the notation {mz,{starttime,endtime}} instead of the short name. In this case it would be {182,{13.3,13.7}} (according to the electropherogram at the end of the optional section).
The ppctrl dataset will be used as the reference dataset. To use the ppsmpl dataset as the reference dataset, set the option Reference to 2.
The annotation table is reduced to contain only items with m/z values relevant to the analyzed datasets.

rslt = DAMPTwoDatasets[ppctrl, ppsmpl, NormalizeGroupOptions -> {Reference -> 1, Al ... ngeAll, ExternalNormalizationCoefficients -> None}, ThresholdForRelative -> 0] ;






IS normalization coefficients : {1., 0.952856}
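
The explicit notation for the internal standard described above could be passed as follows. This is a sketch only: it uses just the option names mentioned in this notebook (NormalizeGroupOptions, Reference, InternalStandard, ThresholdForRelative), and every option not listed falls back to its default, which may differ from the settings used in the full command. The variable name rsltExplicitIS is hypothetical.

```mathematica
(* Requires MathDAMP to be loaded and ppctrl, ppsmpl to be defined as above. *)
(* rsltExplicitIS is a hypothetical name chosen for this illustration. *)
rsltExplicitIS = DAMPTwoDatasets[ppctrl, ppsmpl,
   NormalizeGroupOptions -> {
     Reference -> 1,                           (* ppctrl is the reference dataset *)
     InternalStandard -> {182, {13.3, 13.7}}   (* {mz, {starttime, endtime}} *)
   },
   ThresholdForRelative -> 0];
```

Setting Reference -> 2 instead would make ppsmpl the reference dataset, as noted above.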

Step 3 : Exploring the Results, Listing the Candidates

The results returned by DAMPTwoDatasets are visualized below (the parallel plot and the absolute, relative, and absolute×relative differences). The signal intensity threshold for calculating the relative difference was set to 0 this time (in contrast to the example in the 02-MathDAMP-Elements.nb notebook). Although the relative difference dataset contains a higher number of signals and smears, the absolute×relative difference dataset does not seem to be significantly affected.
Annotation is not shown on the plots below. To show the annotation or to alter the appearance of the plots, please refer to the 02-MathDAMP-Elements.nb notebook.
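
The observation that the absolute×relative result is barely affected by the zero threshold can be illustrated with toy numbers. The formulas below are one plausible definition chosen for illustration and are not taken from MathDAMP; the actual definitions are in MathDAMP.nb. All names starting with toy are hypothetical.

```mathematica
(* Toy intensities: two points near the noise floor and one genuine peak. *)
toyCtrl = {2., 1., 1000.};
toySmpl = {1., 2., 1500.};

(* Absolute difference. *)
toyAbs = toySmpl - toyCtrl
(* {-1., 1., 500.} *)

(* Assumption: relative difference scaled by the larger of the two signals. *)
toyRel = toyAbs/MapThread[Max, {toyCtrl, toySmpl}]
(* about {-0.5, 0.5, 0.333} *)

(* Product of the magnitudes: noise-level points stay small, the real peak dominates. *)
toyAbsRel = Abs[toyAbs]*Abs[toyRel]
(* about {0.5, 0.5, 166.7} *)
```

The noise-level points show the largest relative differences, but their small absolute differences keep the product far below that of the genuine peak.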

DAMPParallelPlot[NormalizedDatasets/.rslt] ;

DAMPDensityPlot[#/.rslt] &/@{Absolute, Relative, AbsoluteRelative} ;
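
The plotting commands above pick individual results out of rslt with the replacement operator /. because rslt is a list of rules. The toy list below mimics that structure with placeholder numbers in place of real datasets. Only the rule names are taken from this notebook, and toyRslt is a hypothetical variable.

```mathematica
(* Toy stand-in for the real result; the right-hand sides are placeholder numbers. *)
toyRslt = {NormalizedDatasets -> 0, Absolute -> 1, Relative -> 2, AbsoluteRelative -> 3};

(* Extract a single entry. *)
Absolute /. toyRslt
(* 1 *)

(* Extract several entries at once, as in the DAMPDensityPlot command. *)
(# /. toyRslt) & /@ {Absolute, Relative}
(* {1, 2} *)

(* List the names available in the result. *)
First /@ toyRslt
(* {NormalizedDatasets, Absolute, Relative, AbsoluteRelative} *)
```

Applying First /@ rslt to the real result should list its entries in the same way, assuming the rules sit at the top level of the list.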





For the visual confirmation of significant differences between the datasets (and for the rejection of false positives), overlaid electropherograms are plotted in descending order of significance. Below are the electropherograms of the top 12 differences from the absolute×relative difference result. The vertical dashed line indicates the position of the most significant difference according to the selected criteria.

DAMPPlotCandidates[NormalizedDatasets/.rslt, AbsoluteRelative/.rslt, PlotCount -> 12, Pl ... amOptions -> {AnnotationTable -> (AlignedAnnotationTables/.rslt)[[1]]}] ;