The first step in mass spectra processing is data resampling. It discards redundant data and brings the m_i values onto a common scale. As a result, different spectra have the same number of m values and are therefore comparable. Reducing the number of spectrum points lowers the noise and removes redundant data while preserving the spectrum shape. The common scale after conversion spans the range between the minimal and maximal m values of the spectrum. The number of points resampled from the original set is determined by the 'Binning percent' parameter, which gives the percentage of spectrum points that remain after conversion (the default value is 25). An example of data resampling is shown in Figure 1.
Figure 1. Result of data resampling for a small spectrum interval. Original data are shown as blue squares, resampled data as red circles. The 'Binning percent' for this case was set to 25.
Input: m/z - Intensity data
Output: Resampled m/z - Intensity data in the same format as input data.
Binning percent - This parameter specifies the percentage of data points that remain after resampling. The default value is 25.
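The resampling step described above can be sketched as follows. This is a minimal illustration, not the SMS implementation: the function name `resample`, the uniform spacing of the new grid, and the use of linear interpolation to obtain the new intensities are all assumptions, since the text specifies only the range of the new scale and the fraction of points kept.

```python
from bisect import bisect_right


def resample(mz, intensity, binning_percent=25.0):
    """Resample a spectrum onto an evenly spaced m/z grid.

    Assumptions (not taken from the SMS documentation): the new grid is
    uniform and intensities are obtained by linear interpolation.
    """
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    if len(mz) < 2:
        raise ValueError("at least two points are required")
    if not 0 < binning_percent <= 100:
        raise ValueError("binning_percent must be in (0, 100]")

    # Sort the points by mass so that interpolation is well defined.
    pairs = sorted(zip(mz, intensity))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    # Number of points kept after resampling (at least the two end points).
    n_out = max(2, round(len(xs) * binning_percent / 100.0))

    # The common scale spans the minimal and maximal m values.
    lo, hi = xs[0], xs[-1]
    step = (hi - lo) / (n_out - 1)

    new_mz, new_intensity = [], []
    for k in range(n_out):
        # Use the exact maximum for the last point to avoid rounding drift.
        x = hi if k == n_out - 1 else lo + k * step
        # Index of the original segment [xs[j], xs[j + 1]] that contains x.
        j = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
        x0, x1 = xs[j], xs[j + 1]
        y0, y1 = ys[j], ys[j + 1]
        t = 0.0 if x1 == x0 else (x - x0) / (x1 - x0)
        new_mz.append(x)
        new_intensity.append(y0 + t * (y1 - y0))
    return new_mz, new_intensity
```

With the default value of 25, a spectrum of 12 points is reduced to 3 points that still cover the full range from the smallest to the largest m value.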
Mass spectra data are sets of value pairs: the mass-to-charge ratio (m/z, referred to below as m, or mass, for convenience) and the corresponding signal intensity (I). On a spectrum plot, the mass corresponds to the X coordinate and the signal intensity to the Y coordinate. A typical spectrum consists of several thousand such value pairs (points). Data are stored as text files in which each pair (m_i, I_i) of mass-intensity values occupies one line, and the two values on a line are separated by a separator character. The SMS package supports several separator types: space (SSV, space-separated values), comma (CSV, comma-separated values) and tab (TSV, tab-separated values). Data files may contain comment lines, which are skipped when the file is read. A comment line must begin with the "#" character in the first position. Figure 2 shows an example data file in CSV format.
#M/Z,Intensity
-7.8602611e-005,4.1126194
2.1773576e-007,4.0764203
9.6021472e-005,4.0040221
0.00036601382,4.1186526
0.00081019477,4.0040221
0.0014285643,3.9617898
....
19742.941,4.077895
19745.564,4.0772248
19748.187,4.0772248
Figure 2. Example file with mass spectra data in CSV format.
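A reader for the file format described above might look like the sketch below. The function name `parse_spectrum` is hypothetical, and accepting any of the three separators on every line is a simplification made here; the SMS package may instead expect one separator type per file.

```python
import re

# Any run of commas, tabs or spaces is treated as one separator (an assumption).
_SEPARATOR = re.compile(r"[,\t ]+")


def parse_spectrum(text):
    """Parse mass-intensity pairs from SSV, CSV or TSV text.

    Lines that start with '#' in the first position are comments and are
    skipped, as are blank lines.
    """
    masses, intensities = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = [f for f in _SEPARATOR.split(line.strip()) if f]
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 2 values, got {len(fields)}")
        masses.append(float(fields[0]))
        intensities.append(float(fields[1]))
    return masses, intensities


# The first lines of the file shown in Figure 2.
sample = "#M/Z,Intensity\n-7.8602611e-005,4.1126194\n2.1773576e-007,4.0764203\n"
masses, intensities = parse_spectrum(sample)
```

To read a file from disk, its contents can be passed in as a string, for example `parse_spectrum(open(path).read())`, where `path` is the location of the data file.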