Wednesday, April 20, 2011

Data mining your gut

EMBL recently conducted a study that involved a large-scale data analysis of, well, the DNA in your gut. At first glance you might think this is not exactly computer science, but that changes once you look at the methods.

The article from the NY Times is a decent summary of the work: Gut Bacteria Divide People Into 3 Types

The research team had an enormous amount of data to work with after sequencing DNA fragments from samples taken from the guts of 22 people. They then mapped the fragments to the genomes of 1,511 species of bacteria that have a publicly available reference genome. Clustering the results across these 22 people revealed an interesting pattern: each person's mix of gut bacteria fell into one of three clusters.
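
To make the clustering step concrete, here is a minimal sketch of the general idea, not the study's actual pipeline. I'm assuming each person is summarized as a vector of mapped-read counts per bacterial group, that profiles are compared with Jensen-Shannon divergence, and that a brute-force k-medoids search does the grouping. The people, the group names, and the counts are all made up for illustration.

```python
import math
from itertools import combinations


def normalize(counts):
    """Turn raw mapped-read counts into relative abundances that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def k_medoids(profiles, k):
    """Exhaustive k-medoids: try every set of k medoids, keep the cheapest.

    Only practical for small samples; 22 people with k=3 is 1,540 candidates.
    """
    n = len(profiles)
    dist = [[js_divergence(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    best_cost, best_medoids = float("inf"), None
    for medoids in combinations(range(n), k):
        # Cost = total distance from each person to their nearest medoid.
        cost = sum(min(dist[i][m] for m in medoids) for i in range(n))
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    # Label each person with the index of their nearest medoid.
    labels = [min(best_medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return best_medoids, labels


# Hypothetical read counts per person for three hypothetical bacterial groups
# (columns: group_A, group_B, group_C). Invented toy data, not from the study.
people = ["p1", "p2", "p3", "p4", "p5", "p6"]
counts = [
    [80, 10, 10],
    [75, 15, 10],
    [10, 80, 10],
    [12, 78, 10],
    [10, 10, 80],
    [15, 10, 75],
]

profiles = [normalize(row) for row in counts]
medoids, labels = k_medoids(profiles, k=3)

for person, label in zip(people, labels):
    print(person, "-> cluster represented by", people[label])
```

On real data you would not know the number of clusters in advance, so you would run this for several values of k and compare the results with a cluster-quality score before settling on three.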

It will be interesting to repeat the analysis on a much larger sample of people to see if this pattern holds. I'd also be interested in seeing whether there is any relationship between these clusters and the rate of occurrence of various diseases. Finally, the study suggests that the researchers may have discovered some species of bacteria that were previously unknown. That's not very surprising, given that every one of us hosts 100 trillion microbes. There's a pretty good chance that some of our microbes are generating some interesting mutations over time.

Once again, data mining is helping medical researchers discover more about us.

