Research

Understanding Higher-Order Chromatin Structure

A major focus of the lab is to understand how chromosomes fold in the interphase nucleus and how this folding affects patterns of gene expression. We used the Hi-C method to study higher-order chromatin structure in human and mouse cells over the course of cellular differentiation. We observed that chromosomes appeared to be folded into a series of self-associating regions that we termed “Topological Domains” (also termed Topologically Associating Domains, or TADs). These structures are defined as regions of the genome with many interactions within the domain but few between neighboring domains. In this regard, they represent a mechanism for compartmentalizing the genome. We observed that the patterns of TADs were highly stable between cell types and appeared to be well conserved between mice and humans, suggesting that TADs are a fundamental organizing principle of mammalian genomes. Functionally, TADs appear to restrict enhancer-promoter interactions, suggesting that they play a role in the functional compartmentalization of regulatory regions. Future work in the lab will focus on understanding the mechanism of TAD formation, the consequences of TAD disruption, and the mechanisms by which TADs refine the landscape of enhancer-promoter communication.
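The defining property described above (many contacts within a domain, few between neighboring domains) can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions: the contact matrix is synthetic, the function names `toy_contact_matrix` and `insulation_score` are hypothetical, and the simple insulation-style score is a stand-in for illustration rather than the domain-calling procedure used in the lab's analyses.

```python
import numpy as np


def toy_contact_matrix(domain_sizes, within=10.0, between=1.0):
    """Build a synthetic symmetric contact matrix (assumption: idealized data).

    Bins in the same domain get a high contact count, bins in
    different domains get a low one.
    """
    n = sum(domain_sizes)
    matrix = np.full((n, n), between)
    start = 0
    for size in domain_sizes:
        matrix[start:start + size, start:start + size] = within
        start += size
    return matrix


def insulation_score(matrix, window=2):
    """Mean contact count crossing the gap just before each bin.

    For position i, average the contacts between the `window` bins
    upstream (i - window .. i - 1) and the `window` bins downstream
    (i .. i + window - 1). Low values mark candidate domain boundaries.
    Positions too close to the matrix edge are left as NaN.
    """
    n = matrix.shape[0]
    scores = np.full(n, np.nan)
    for i in range(window, n - window + 1):
        scores[i] = matrix[i - window:i, i:i + window].mean()
    return scores


# Two domains of four bins each; the true boundary lies before bin 4.
contacts = toy_contact_matrix([4, 4])
scores = insulation_score(contacts, window=2)
boundary = int(np.nanargmin(scores))
print(boundary)
```

On this idealized matrix the score is lowest at the position separating the two blocks. Real Hi-C matrices are noisy and show distance-dependent decay, so normalization and a more robust boundary caller would be needed in practice.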

Using 3D Chromatin Structure as a Genome Sequencing Tool

During the course of our work characterizing higher-order chromatin structure, we realized that the method we used, Hi-C, could be co-opted to address a fundamentally different problem: haplotype phasing. While current whole-genome sequencing strategies are adept at identifying genotypes in an individual, determining which variant alleles occur together on each parental chromosome remains challenging. This process of determining how alleles co-occur on each parental chromosome (that is, on each haplotype) is termed haplotype phasing. In the past, scientists have used a variety of methods to phase haplotypes, ranging from population-genetics-based approaches to next-generation sequencing methodologies. However, many of these methods either yield low-quality phasing information or are technically challenging and limited in scale. We were able to demonstrate that Hi-C data could be used to phase chromosome-spanning haplotypes within an individual with high resolution and accuracy. We have taken advantage of this approach to study variation in gene expression between alleles within an individual. Using this information, we and others have observed evidence of extensive allele-biased gene expression. Future work in the lab will focus on developing methodologies to further improve haplotype phasing, as well as on using Hi-C data to address additional challenges in whole-genome sequencing. The information derived from phased haplotypes can be used to understand how genetic variation contributes to differences in gene expression between individuals, and how this may be related to human health and disease.
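The idea that read pairs can link variants on the same chromosome copy can be sketched with a toy example. This is a minimal illustration under stated assumptions, not the lab's phasing method: the function name `phase_variants`, the input format, and the majority-vote propagation are all hypothetical simplifications. It assumes each read pair reports the alleles (coded 0 or 1) seen at two heterozygous sites and that most pairs come from a single chromosome copy.

```python
from collections import defaultdict, deque


def phase_variants(links, n_variants):
    """Toy phasing by majority vote over read-pair links.

    links: iterable of (site_i, allele_i, site_j, allele_j), alleles 0 or 1.
    Returns a list giving, for each site, the allele placed on
    haplotype A, or None if the site is not linked to site 0.
    """
    # Count evidence that two sites carry the same allele label on one copy.
    votes = defaultdict(int)
    for i, a, j, b in links:
        key = (min(i, j), max(i, j))
        votes[key] += 1 if a == b else -1

    # Keep only links with a clear majority.
    neighbors = defaultdict(list)
    for (i, j), vote in votes.items():
        if vote != 0:
            same = vote > 0
            neighbors[i].append((j, same))
            neighbors[j].append((i, same))

    # Propagate phase outward from site 0 (arbitrary choice of haplotype A).
    phase = [None] * n_variants
    phase[0] = 0
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j, same in neighbors[i]:
            if phase[j] is None:
                phase[j] = phase[i] if same else 1 - phase[i]
                queue.append(j)
    return phase


# Hypothetical read-pair observations; sites 0 and 3 are linked directly,
# mimicking a long-range contact, and one pair conflicts with the majority.
links = [
    (0, 0, 1, 0), (0, 1, 1, 1),                 # sites 0 and 1: same phase
    (1, 0, 2, 1),                               # sites 1 and 2: opposite phase
    (0, 0, 3, 1), (0, 1, 3, 0), (0, 0, 3, 0),   # sites 0 and 3: mostly opposite
]
phase = phase_variants(links, n_variants=5)
print(phase)
```

Site 4 has no supporting links and stays unphased, which illustrates why methods that yield only short-range links produce fragmented haplotype blocks. A real pipeline would also need to account for sequencing errors and contacts between the two chromosome copies.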