Thursday, March 12, 2009

Challenges for Theoretical Computer Science

http://www.research.att.com/~dsj/nsflist.html

This is a report maintained by Prof. David S. Johnson. I like the following paragraph, and the two "guesses" which I have made it bold:

Challenge: Protein Folding
[Drafted 6/9, Revised 6/22]

In one sense a protein is simply a sequence of amino acids. This is how proteins are encoded in our DNA and how they are initially constructed in our bodies. However, once this linear chain is first constructed, it automatically folds into a complicated shape (determined by the sequence in ways we do not yet fully understand), and it is the geometric properties of this shape that to a large extent determine how the protein functions.

The problem of predicting how a protein folds, given its amino acid sequence, is one of the premier problems of science. Once the 3-dimensional structure of a protein is known, it becomes possible to design drugs that inhibit or enhance a protein's activity by fitting into niches in the surface of the protein. It may also be possible to design new proteins with useful properties. But first we need algorithms both for determining the geometric structure from the sequence and, perhaps even more difficult, to determine sequences that give rise to desired structures (a problem similar in nature to the self-assembly problem listed below). We also need to understand how and which small changes in a gene's sequence can cause major changes in its 3-dimensional structure. Limited progress has been made, but the main challenges remain before us.

One particular theoretical challenge is that, in general, the problem of determining the minimum-energy folded structure for a protein is now known to be NP-hard. This suggests various interesting possibilities:

1) P = NP and Nature knows and exploits this (unlikely, but something with exciting consequences if we can figure out the algorithms Nature uses).

2) Although P does not equal NP, Nature can solve NP-hard problems efficiently using the massively parallel computation inherent in the protein folding process (also unlikely, but something with exciting consequences for biomolecular computing if true).

3) Evolution has given us proteins whose structure is sufficiently restricted so that the folding problem is easier to solve (this suggests trying to discover just what those restrictions are and why they work).

4) Proteins don't necessarily fold to their minimum-energy configurations, but instead just go to a very attractive local optimum in the energy landscape (in which case we also have important design criteria left to discover).

No comments:

Post a Comment