Binning, or creating a "histogram" (HI) (requires 1D version)

This is a new tool for reducing a complicated spectrum to a simpler form (a process also referred to as "binning" or "bucketing") for input into software that performs principal component analysis.

Background

Spectra such as 1H spectra of biofluids are so complex that it is not possible to assign all peaks, yet they still contain valuable information. To compare spectra and identify correlations, the spectra must first be reduced to a simpler, more tractable form.

This is done by segmenting the spectrum into narrow frequency regions, usually 0.04 ppm wide, and summing all points in each region.  Each region is thereby reduced to a single number, called a descriptor.  This not only reduces the number of data points to a more manageable number, but also allows for small differences in chemical shifts, such as might be caused by variations in pH.  
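
The segmenting and summing step described above can be sketched in Python. This is only an illustration, not the program's own code: the function name bin_spectrum is hypothetical, and placing the first segment edge at the highest chemical shift in the data is an assumption, so the limits written by the HI command may differ.

```python
import numpy as np

def bin_spectrum(ppm, intensity, width=0.04):
    """Sum intensities in consecutive ppm segments of the given width.

    Returns a list of (high_ppm, low_ppm, summed_intensity) tuples,
    ordered from high to low chemical shift.
    """
    ppm = np.asarray(ppm, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    top = ppm.max()  # assumed starting edge of the first segment
    # Index of the segment each point falls in, counting down from the top.
    idx = np.floor((top - ppm) / width).astype(int)
    n_bins = int(idx.max()) + 1
    # Add up the intensities of all points that share a segment index.
    sums = np.bincount(idx, weights=intensity, minlength=n_bins)
    return [(top - i * width, top - (i + 1) * width, float(sums[i]))
            for i in range(n_bins)]
```

Because every point within a segment contributes to the same sum, a peak that moves by a small fraction of the segment width usually stays in the same descriptor, which is the tolerance to small shift differences mentioned above.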

It may be desirable to eliminate some descriptors (such as those covering the residual water peak, which varies from spectrum to spectrum). The values are normalized to allow comparison of different spectra. The resulting data are output as an ASCII text file in which each line gives the chemical shift limits and the intensity of one descriptor. These data can then be analyzed by other software.
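
A minimal sketch of elimination and normalization follows. The function name, the rule that a descriptor is dropped when it overlaps an eliminated range, and the order of operations (eliminate, zero negative sums, then scale the total to 1.0) are assumptions made for illustration; the program may treat descriptors at the edge of a range differently.

```python
def eliminate_and_normalize(bins, eliminate=()):
    """Drop unwanted descriptors and scale the rest to sum to 1.0.

    bins: list of (high_ppm, low_ppm, value) tuples.
    eliminate: list of (high_ppm, low_ppm) ranges to remove.
    """
    def excluded(high, low):
        # Assumption: a descriptor is removed if it overlaps any range.
        return any(low < e_high and high > e_low for e_high, e_low in eliminate)

    # Keep the surviving descriptors, setting negative sums to zero.
    kept = [(h, l, max(v, 0.0)) for h, l, v in bins if not excluded(h, l)]
    total = sum(v for _, _, v in kept)
    if total == 0:
        raise ValueError("no positive intensity left to normalize")
    return [(h, l, v / total) for h, l, v in kept]
```

Normalizing after elimination means a variable region such as residual water does not change the relative values of the remaining descriptors.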

HI (or histogram) - This command generates a text file that lists the intensity descriptors. The beginning of a sample file is shown here:

Sample xyz
Minimum for printing = 0.00000010
DataStart
  7.40379190	  7.36473011	  0.00045174
  7.35821962	  7.31915779	  0.00225408
[etc]

The first two columns in each line of the data section of the descriptor file are the ppm limits of the descriptor. The third column is the relative sum of the intensity within those limits, a range that by default is 0.04 ppm wide. The descriptor file has the following characteristics:

All negative sums are zeroed
The sum of all descriptors is 1.0
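
For readers who want to load a descriptor file into their own analysis code, the following Python sketch reads the layout shown in the sample above. It assumes that the data section begins on the line after "DataStart" and that every data line holds exactly three whitespace-separated numbers; the function name read_descriptors is hypothetical.

```python
def read_descriptors(text):
    """Parse descriptor-file text into (ppm_limit_1, ppm_limit_2, value) tuples."""
    rows = []
    in_data = False
    for line in text.splitlines():
        if not in_data:
            # Header lines are skipped until the DataStart marker is seen.
            in_data = line.strip() == "DataStart"
            continue
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        rows.append(tuple(float(f) for f in fields))
    return rows
```

Once a file has been read, the two characteristics listed above can be checked by confirming that no value is negative and that the values add up to 1.0 within rounding error.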

By default, each descriptor is 0.04 ppm wide and the whole spectrum is binned, using all data points in the file. These defaults can be changed by reading a "template" file containing the relevant settings. This is done with the command hi file, where file is the name of the template file. A sample file is shown here:

Histogram_Template
Descriptor_Size 0.1
Include 10.0 0.0
Eliminate 9.50 9.0
Eliminate 5.00 4.00

The first line of the file must be "Histogram_Template".

The line starting "Descriptor_Size" is used to set the width of each segment, or "bin".

A line starting "Include" defines a region that is to be binned, and includes 2 chemical shift values, in ppm.  Similarly, regions to be ignored are defined on lines starting with "Eliminate".
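
The template layout described above is simple enough to read or validate with a short script. The Python sketch below is an illustration under assumptions: it treats keywords as case-sensitive, accepts only the three keywords shown in the sample, and uses a hypothetical function name, parse_template. The program itself may accept other keywords or be more lenient.

```python
def parse_template(text):
    """Read Histogram_Template text into a dictionary of settings."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "Histogram_Template":
        raise ValueError('first line must be "Histogram_Template"')
    # 0.04 ppm is the documented default descriptor size.
    settings = {"size": 0.04, "include": [], "eliminate": []}
    for ln in lines[1:]:
        key, *values = ln.split()
        if key == "Descriptor_Size":
            settings["size"] = float(values[0])
        elif key == "Include":
            settings["include"].append((float(values[0]), float(values[1])))
        elif key == "Eliminate":
            settings["eliminate"].append((float(values[0]), float(values[1])))
        else:
            raise ValueError(f"unrecognized template line: {ln}")
    return settings
```

Applied to the sample file above, this yields a descriptor size of 0.1 ppm, one included region from 10.0 to 0.0 ppm, and two eliminated regions.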

It is also possible to have the command operate on several NMR files in batch mode. To do this, all files to be processed must be copied into a separate directory, and each must have the file extension "nmr". The command "hi serial" is used to process the series of files. The resulting binned data are saved in a subdirectory of that directory called "bins". Each output file has the same base name as the original data file, but with the file extension "txt".
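
When the binned files are collected by a script afterwards, the naming rule above can be used to predict where each result will be. The Python sketch below only computes paths; it does not run the HI command, and the function name expected_bin_files is hypothetical.

```python
from pathlib import Path

def expected_bin_files(directory):
    """Map each .nmr file in a directory to its expected binned output path."""
    directory = Path(directory)
    # Output goes to <directory>/bins/<base name>.txt, per the naming rule.
    return {p: directory / "bins" / (p.stem + ".txt")
            for p in sorted(directory.glob("*.nmr"))}
```

For example, a file named sample1.nmr in the chosen directory would map to bins/sample1.txt inside that directory.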

Command line arguments:

hi min x (or hi minimum x) - where x is the minimum value for printing (reported in the output file as "Minimum for printing")

hi size x - where x is the width of each segment, or "bin", in ppm

hi log - logarithmic scaling of intensity (emphasizes small peaks)

hi nolog - linear scaling of intensity (default)

hi info n text - sets information line n to text

hi help - displays a summary of allowed commands

hi file - where file is the name of a template file containing parameters to be used in the binning process

hi file1 file2 - where file1 is the name of a template file containing parameters to be used in the binning process, and file2 is the name of a file to which the binned data will be saved.

hi serial - batch processing of all files with the .nmr file extension in the chosen directory. If a file named template.txt is present, it will be used as the template file.

Last updated: 10/15/06