Recent segmental and gene duplications in the mouse genome

dc.contributor.authorCheung, Joseph
dc.contributor.authorWilson, Michael D
dc.contributor.authorZhang, Junjun
dc.contributor.authorKhaja, Razi
dc.contributor.authorMacDonald, Jeffrey R
dc.contributor.authorHeng, Henry H Q
dc.contributor.authorKoop, Ben F
dc.contributor.authorScherer, Stephen W
dc.date.accessioned2014-08-15T00:07:07Z
dc.date.available2014-08-15T00:07:07Z
dc.date.copyright2003en_US
dc.date.issued2003-07-09
dc.descriptionBioMed Centralen_US
dc.description.abstractBackground: The high-quality mouse genome draft sequence and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We use BLAST-based computational heuristics to identify large (≥ 5 kb) and recent (≥ 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the Mouse Genome Sequencing Consortium (MGSC) February 2002 and February 2003 assemblies. Results: We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). Of this dataset, 8.9 Mb (26%) of the duplication content consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of Mater, a maternal-effect gene essential for embryogenesis in mice. Conclusion: Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified as involved in potential sequence misassignment errors, will require further mapping and sequencing to achieve accuracy. A Genome Browser database was set up to display the identified duplication content presented in this work. These data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.en_US
dc.description.reviewstatusRevieweden_US
dc.description.scholarlevelFacultyen_US
dc.description.sponsorshipThis work was supported by the Canadian Institutes of Health Research (CIHR) and Genome Canada to S.W.S. B.F.K. is supported by the CIHR and M.D.W. is supported by the Michael Smith Foundation for Health Research (MSFHR)en_US
dc.identifier.citationCheung et al. Recent segmental and gene duplications in the mouse genome. Genome Biology 2003, 4: R47en_US
dc.identifier.urihttp://genomebiology.com/2003/4/8/R47
dc.identifier.urihttp://hdl.handle.net/1828/5557
dc.language.isoenen_US
dc.publisherBioMed Centralen_US
dc.subjectCentre for Biomedical Research
dc.subject.departmentDepartment of Biology
dc.titleRecent segmental and gene duplications in the mouse genomeen_US
dc.typeArticleen_US
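
The abstract defines a recent segmental duplication by two thresholds: an alignment of at least 5 kb with at least 90% sequence identity. The sketch below illustrates only that filtering step and is not the authors' pipeline. It assumes the alignments are already available in BLAST's standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score). The names `Hit`, `parse_tabular`, `recent_duplications`, and `percent_of_genome` are hypothetical and chosen for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple

# Thresholds taken from the abstract: large (>= 5 kb) and recent (>= 90% identity).
MIN_LENGTH_BP = 5_000
MIN_IDENTITY_PCT = 90.0


class Hit(NamedTuple):
    """One pairwise alignment (hypothetical container, not from the paper)."""
    query: str
    subject: str
    identity: float  # percent identity
    length: int      # alignment length in bp
    q_start: int
    s_start: int


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column BLAST tabular output, skipping blanks and comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        yield Hit(f[0], f[1], float(f[2]), int(f[3]), int(f[6]), int(f[8]))


def is_trivial_self_hit(hit: Hit) -> bool:
    """A sequence aligned to itself at the same position is not a duplication."""
    return hit.query == hit.subject and hit.q_start == hit.s_start


def recent_duplications(hits: Iterable[Hit]) -> List[Hit]:
    """Keep alignments that meet both the length and identity thresholds."""
    return [
        h for h in hits
        if h.length >= MIN_LENGTH_BP
        and h.identity >= MIN_IDENTITY_PCT
        and not is_trivial_self_hit(h)
    ]


def percent_of_genome(duplicated_mb: float, genome_mb: float) -> float:
    """Share of the assembly involved in duplications, as a percentage."""
    return 100.0 * duplicated_mb / genome_mb
```

With the figures quoted in the abstract, `percent_of_genome(33.6, 2695.0)` gives roughly 1.2, matching the reported proportion. A real analysis would also need to merge overlapping alignments and mask repeats, which this sketch leaves out.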

Files

Original bundle
Name: Cheung_Joseph_Genome Biol_2003.pdf
Size: 440.48 KB
Format: Adobe Portable Document Format