Institute for Advanced Simulation (IAS)

Folding of Top7-CFr

Top7 is a designed 93-residue protein whose topology is not found among natural proteins. It was found to be monomeric and extremely stable, and its experimentally determined structure is very close to the target of the design procedure. It was also reported that a fragment consisting of the 49 C-terminal residues of Top7 is efficiently mistranslated in E. coli. This mistranslated fragment, CFr, was found to adopt an extremely stable homodimeric structure. Both chains in the dimer have a structure similar to that of the corresponding residues in Top7.

The PDB structure for Top7-CFr (PDB id: 2gjh) contains 62 residues in each chain, of which the fragment 2–50 adopts the β–α–β–β structure of the corresponding part of Top7. This segment has been studied extensively with ProFASi. As usual, these were unbiased all-atom parallel tempering Monte Carlo simulations with random initial conformations for each replica. The 49-residue segment folds readily in such simulations. The global free energy minimum appears to be very close to the structure of one chain in the CFr dimer, as shown in the figure below. The backbone RMSD between chain A of PDB entry 2gjh and the centre of the native free energy minimum is about 2 Å.
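The page does not say how the backbone RMSD quoted above was computed. A common choice is to superimpose the two coordinate sets optimally (the Kabsch algorithm) before measuring the deviation. The following is a minimal sketch under that assumption; the function name `kabsch_rmsd` and the toy coordinates are hypothetical and are not taken from ProFASi or from the 2gjh entry.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Row i of P is assumed to correspond to row i of Q, for example
    matching backbone atoms of the same residue range.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove the translational degrees of freedom.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Force a proper rotation (determinant +1), excluding reflections.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the remaining deviation.
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))

if __name__ == "__main__":
    # Made-up coordinates, used only to exercise the function.
    a = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0],
                  [1.5, 1.5, 0.0], [0.0, 1.5, 1.0]])
    b = a + np.array([3.0, -1.0, 2.0])  # same shape, translated
    print(kabsch_rmsd(a, b))
```

Because the superposition removes rigid-body translation and rotation, the value depends only on the internal shape of the two chains. The determinant correction prevents a mirror image from being counted as a perfect match.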

The following animation shows one instance of this fragment folding in a parallel tempering MC simulation with ProFASi. This is interesting because our all-atom simulations are not small perturbations about a given structure, nor does the force field make use of any information about the native structure. The movie covers a short segment of the Markov chain generated in the simulation, just before the protein folds.


One instance of CFr folding in a parallel tempering MC simulation in ProFASi. Each frame in this movie is separated by 236000 Monte Carlo updates from the previous frame.
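For readers unfamiliar with parallel tempering, the idea can be sketched on a toy problem: several replicas are sampled at different temperatures with ordinary Metropolis updates, and neighbouring replicas periodically attempt to exchange configurations. The sketch below is not ProFASi code. The one-dimensional double-well energy, the function names, the inverse temperatures and the step sizes are all assumptions chosen for illustration.

```python
import math
import random

def energy(x):
    # Toy double-well energy with minima at x = -1 and x = +1 (assumed).
    return 5.0 * (x * x - 1.0) ** 2

def metropolis_sweep(x, beta, rng, n_steps=50, step=0.5):
    # Ordinary Metropolis updates at inverse temperature beta.
    e = energy(x)
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        if e_new <= e or rng.random() < math.exp(-beta * (e_new - e)):
            x, e = x_new, e_new
    return x

def swap_probability(beta_i, beta_j, e_i, e_j):
    # Replica exchange acceptance: min(1, exp[(beta_i - beta_j)(e_i - e_j)]).
    arg = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if arg >= 0.0 else math.exp(arg)

def parallel_tempering(betas, n_cycles, seed=0):
    """Return the states visited at betas[0] (taken to be the coldest)."""
    rng = random.Random(seed)
    # Random initial state for each replica.
    xs = [rng.uniform(-2.0, 2.0) for _ in betas]
    cold_samples = []
    for cycle in range(n_cycles):
        xs = [metropolis_sweep(x, b, rng) for x, b in zip(xs, betas)]
        # Attempt exchanges between neighbours, alternating even and odd pairs.
        for i in range(cycle % 2, len(betas) - 1, 2):
            p = swap_probability(betas[i], betas[i + 1],
                                 energy(xs[i]), energy(xs[i + 1]))
            if rng.random() < p:
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
        cold_samples.append(xs[0])
    return cold_samples

if __name__ == "__main__":
    samples = parallel_tempering([5.0, 2.0, 1.0, 0.5], n_cycles=200)
    print(sum(1 for x in samples if x > 0), "of", len(samples), "samples in the right-hand well")
```

The exchange step lets a configuration that crossed a barrier at high temperature migrate down to the low-temperature replica, which is why such simulations can start from random conformations rather than from a known structure.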

Although a large number of MC updates separate consecutive frames in the above animation, a preference for the helix and the C-terminal hairpin is visible. The N-terminal β-strand joins the hairpin only at the last stage of folding. Before that, this N-terminal strand is often folded into an extension of the native helix. These observations hold for a large number of folding events seen in our simulations. This indicates that the most likely path to the native state found by our Markov chain MC simulations passes through an unexpected non-native extension of the helix. Using simulations of fragments corresponding to different secondary structure elements, we proposed a folding mechanism that makes energetic sense in our model. The figure below (right) sketches this mechanism, which we call the "caching mechanism", using snapshots from the simulation.

Top7-CFr schematic representation of proposed pathway
Left: Structure of the Top7-CFr dimer (gray) and the free energy minimum structure (coloured) for a monomer at 274 K as seen in ProFASi simulations. Right: A sequence of events leading up to the folded structure in MC simulations. This figure illustrates what we call the "caching mechanism". The N-terminal (blue) strand, which is in contact with the C-terminal strand in the native state, spends its time as a non-native extension of the native α-helix until the native C-terminal β-hairpin forms. The native state is then formed by unfolding of the non-native helix extension and attachment of the released N-terminal residues as the third strand of the β-sheet.

The caching mechanism is a folding scenario in which a short segment of a protein chain transiently folds into a non-native secondary structure element until the native environment of the segment forms. In the case of CFr folding in our model, it depends on the chameleon-like behaviour of the N-terminal segment, which prefers a helical form when there is only a helix in its neighbourhood, and a β-strand form in the presence of the β-hairpin. This mechanism appears spontaneously in the simulations. The following smoothed version of the above animation eliminates the rapid fluctuations and makes the caching mechanism easier to see.


A smooth version of the movie above. It was obtained by taking key frames from the last part of the previous movie and creating intermediate "morph" snapshots with PyMOL. The morph snapshots do not preserve the geometrical constraints of the model and have no physical basis. But since they remove many of the random jiggles of the Monte Carlo evolution, the movie is easier on the eye and brings out the proposed caching mechanism more clearly.



The results discussed in this page have been published in the following articles.


Last change: 20.11.2009 | SL Biology