Data Analysis

Our research in data analysis emphasizes combining the mathematics we have developed with biomedical research applications and delivering the resulting tools and techniques to the research community. We are developing methods for analyzing several kinds of biomedical data, with our efforts focused on four key areas: image segmentation, surface analysis, diffusion tensor imaging analysis, and biosequence analysis.

The ultimate goal of structural image analysis is to extract, compare, and represent (i.e., model) anatomical information from volumetric data. By extending our tissue classification methods, the CCB can process multiple imaging modalities and identify pathologic structures such as lesions. Segmentation methods also help identify anatomical structures in MRI. We validate these methods against expert-labeled data and explore their performance on subgroups of the data. By combining level-set segmentation approaches with atlas methods, we delineate neuroanatomical structures such as subcortical gray matter bodies. We are also examining a range of segmentation methods on expert-labeled and phantom data models, and archiving the validation results in our Knowledge Management system. By investigating how different strategies perform on various biomedical image segmentation problems, and how those choices affect the results of subsequent analyses, we are able to provide recommended practices to our Investigators as well as to other users of our tools. Finally, we are applying multi-layer level-set segmentation approaches across our methods for surface extraction.
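As an illustration of validating a segmentation against expert-labeled data, the sketch below scores the agreement between an automated labeling and an expert labeling. The text does not say which agreement measure is used; the Dice overlap coefficient shown here is an assumption, chosen because it is a common measure. The function name and the flattened label lists are hypothetical.

```python
from typing import Sequence


def dice_coefficient(automated: Sequence[int], expert: Sequence[int], label: int) -> float:
    """Dice overlap for one label between two labelings of the same voxels.

    Both inputs are flattened label volumes of equal length (an assumption
    made for this sketch). Returns 2*|A and B| / (|A| + |B|), in [0, 1].
    """
    if len(automated) != len(expert):
        raise ValueError("labelings must cover the same voxels")
    in_auto = sum(1 for value in automated if value == label)
    in_expert = sum(1 for value in expert if value == label)
    in_both = sum(1 for a, e in zip(automated, expert) if a == label and e == label)
    if in_auto + in_expert == 0:
        # Neither labeling contains the structure: treat as perfect agreement.
        return 1.0
    return 2.0 * in_both / (in_auto + in_expert)


# Toy example: six voxels, label 1 marks the structure of interest.
automated_labels = [0, 1, 1, 1, 0, 0]
expert_labels = [0, 1, 1, 0, 0, 1]
score = dice_coefficient(automated_labels, expert_labels, label=1)
```

Computing such a score separately for each subgroup of data is one simple way to compare how a method performs across subgroups.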

Diffusion Tensor Imaging Methods
Diffusion Tensor Imaging (DTI) is a relatively new technology that takes advantage of the fact that water diffusion in the brain is not isotropic in all tissues. Specifically, white matter tracts, which are organized into parallel fiber bundles, allow significantly greater diffusivity along their paths than perpendicular to them. Not only can this anisotropy distinguish white matter from gray matter, but the principal axis of diffusion, obtained by diagonalizing the diffusion tensor, can be used to trace fiber pathways in the brain. As part of this effort, we are developing a collection of tools for computing standard DTI measures (e.g., fractional anisotropy, mean diffusivity). We are also developing a novel approach to segmenting tracts within DTI data collected from the brain. Our proposed method uses a physical model of the diffusion process itself, based on the measurements made during a DTI acquisition sequence. To assess the validity of this algorithm, we will develop a digital phantom model. This model will be built from multiple data sources, including MRI, DTI, histological stains, anatomical atlases, and the expertise of experienced neuroanatomists. We will model a brain with multiple fiber tracts, and then simulate the effects of an acquisition protocol on that brain. This model will provide a validation standard similar to that provided by the Montreal Neurological Institute’s BrainWeb Phantom for structural imaging and segmentation.
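The two standard DTI measures named above can be sketched in a few lines. This is a minimal illustration using the textbook definitions, not the CCB tools themselves; the function names and the nested-list representation of the 3x3 tensor are assumptions. Because the tensor is symmetric, its Frobenius norm equals the root sum of its squared eigenvalues, so fractional anisotropy can be computed from the tensor entries without diagonalizing.

```python
import math
from typing import List

Tensor = List[List[float]]  # symmetric 3x3 diffusion tensor (assumed layout)


def mean_diffusivity(tensor: Tensor) -> float:
    """Mean diffusivity: one third of the trace of the tensor."""
    return (tensor[0][0] + tensor[1][1] + tensor[2][2]) / 3.0


def fractional_anisotropy(tensor: Tensor) -> float:
    """FA = sqrt(3/2) * ||D - MD*I||_F / ||D||_F, which ranges from 0 to 1."""
    md = mean_diffusivity(tensor)
    deviation_sq = 0.0
    norm_sq = 0.0
    for i in range(3):
        for j in range(3):
            # Subtract the isotropic part (MD on the diagonal only).
            deviation = tensor[i][j] - (md if i == j else 0.0)
            deviation_sq += deviation * deviation
            norm_sq += tensor[i][j] * tensor[i][j]
    if norm_sq == 0.0:
        return 0.0  # no measurable diffusion; define FA as 0
    return math.sqrt(1.5 * deviation_sq / norm_sq)


# Illustrative values only (mm^2/s): one fiber-like and one isotropic tensor.
fiber_like = [[1.7e-3, 0.0, 0.0], [0.0, 0.3e-3, 0.0], [0.0, 0.0, 0.3e-3]]
isotropic = [[0.7e-3, 0.0, 0.0], [0.0, 0.7e-3, 0.0], [0.0, 0.0, 0.7e-3]]
fa_fiber = fractional_anisotropy(fiber_like)
fa_isotropic = fractional_anisotropy(isotropic)
```

The fiber-like tensor, with much greater diffusivity along one axis, yields a high FA, while the isotropic tensor yields an FA near zero. Finding the principal axis for tractography would additionally require the eigenvectors, which this sketch does not compute.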

Biosequence Analysis
Alternative splicing has recently emerged as a major mechanism of functional regulation in the human genome. Previously considered an unusual event, it has been detected by many genomics studies in 40-60% of human genes. Moreover, it appears to be of central importance for the nervous system and for the temporal development of differentiated cells and tissues. However, of the more than 30,000 alternative splice events detected in human transcripts, fewer than 20% have been characterized functionally. To give biologists a basis for choosing which alternative splice forms to focus on, we are developing the tools needed to identify meaningful patterns and associations with specific tissues, developmental stages, and disease states. We demonstrate the value of interval logic and alignment query tools for analyzing alternative splicing. We are building these basic capabilities into a powerful tool that everyone can use, by integrating them into an extensible, open-source database system (PostgreSQL) that will make it relatively fast and easy for biologists to query alternative splice functions.
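To make the idea of interval logic concrete, the sketch below treats exons as coordinate intervals and uses overlap and containment tests to find exons that an isoform skips. It is an illustration only: the text does not describe the actual query operators or schema, so the half-open coordinate convention, the function names, and the example coordinates are all assumptions, and the logic is written in Python rather than as PostgreSQL queries.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates (assumed convention)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def contains(outer: Interval, inner: Interval) -> bool:
    """True if inner lies entirely within outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def skipped_exons(reference: List[Interval], isoform: List[Interval]) -> List[Interval]:
    """Reference exons inside the isoform's span that no isoform exon overlaps."""
    if not isoform:
        return []
    span = (min(start for start, _ in isoform), max(end for _, end in isoform))
    return [
        exon
        for exon in reference
        if contains(span, exon) and not any(overlaps(exon, other) for other in isoform)
    ]


# Hypothetical gene with three exons; the isoform omits the middle one.
reference_exons = [(100, 200), (300, 400), (500, 600)]
isoform_exons = [(100, 200), (500, 600)]
print(skipped_exons(reference_exons, isoform_exons))  # prints [(300, 400)]
```

The same overlap and containment predicates are the kind of primitive a database can expose as query operators, which is what would let such questions be asked across many transcripts at once.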