Research Overview

New Enzymes from Heme Proteins 

Nature’s diverse heme-binding proteins carry out a vast array of tasks critical to life, from transporting gases and electrons to participating in plant-animal warfare.  Although nature selected these proteins to perform their native functions, many also have ‘promiscuous’ capabilities that are not used in the natural setting but that can fuel future functional evolution. We use this diversity of natural heme proteins and their non-natural promiscuous functions as the basis for generating new enzymes in the laboratory. We use directed evolution to create new enzymes that catalyze non-biological reactions with high efficiencies and selectivities. 

Nitrene Transfer

A transition-metal nitrenoid is an important reactive intermediate involved in a variety of carbon‒nitrogen bond-forming processes that are pivotal to synthesis of nitrogen-containing functional molecules. Nitrene transfer processes are widely used in synthetic chemistry, but were unknown in biology until recently. Chemists typically use expensive and unsustainable transition-metal catalysts based on Rh, Ru and Ir to accomplish stereoselective transformations via nitrene transfer. Our enzymes, on the other hand, can use earth-abundant iron.

We have repurposed cytochromes P450 to catalyze a variety of asymmetric nitrene transfer processes, including aziridination, alkene aminohydroxylation, and C‒H amination. Importantly, the catalytic activities and selectivities of these genetically encoded enzymes can be tuned easily using directed evolution to achieve unprecedented reactivities and chemo-, site- and stereoselectivities. 

Some Papers to Start with: 

“An enzymatic platform for the asymmetric amination of primary, secondary and tertiary C(sp3)‒H bonds” Y. Yang, I. Cho, X. Qi, P. Liu, F. H. Arnold. Nature Chemistry, in press.

“Enantioselective Aminohydroxylation of Styrenyl Olefins Catalyzed by an Engineered Hemoprotein” I. Cho, C. K. Prier, Z. Jia, R. K. Zhang, T. Görbe, F. H. Arnold. Angewandte Chemie, January 2, 2019. 10.1002/anie.201812968.

“Enantioselective, Intermolecular Benzylic C-H Amination Catalysed by an Engineered Iron-Haem Enzyme” C. K. Prier, R. K. Zhang, A. R. Buller, S. Brinkmann-Chen, F. H. Arnold. Nature Chemistry, May 29, 2017. 10.1038/NCHEM.278. Selected by the Editorial Board of Synfacts for its important insights as “Synfact of the month” (09/2017).

“Enantioselective Enzyme-Catalyzed Aziridination Enabled by Active-Site Evolution of a Cytochrome P450” C. C. Farwell, R. K. Zhang, J. A. McIntosh, T. K. Hyster, F. H. Arnold. ACS Central Science, April 22, 2015. 10.1021/acscentsci.5b00056.

“Enantioselective Intramolecular C-H Amination Catalyzed by Engineered Cytochrome P450 Enzymes in vitro and in vivo” J. A. McIntosh, P. S. Coelho, C. C. Farwell, Z. J. Wang, J. C. Lewis, T. R. Brown, F. H. Arnold. Angewandte Chemie International Edition, July 24, 2013. 10.1002/anie.201304401.

Carbene Transfer

The formation of new carbon–carbon bonds is critically important to novel compound discovery and manufacturing. However, achieving high chemo-, regio-, and stereoselectivity in chemical synthesis is often problematic. In our lab, we leverage the ability of hemoproteins to form reactive iron-carbenoid intermediates to construct new carbon–carbon bonds through carbene transfer reactions. With directed evolution, we engineer these biocatalysts to achieve levels of activity and selectivity that are often superior to their synthetic counterparts. 

We have been able to engineer a variety of heme proteins, including cytochromes P450, cytochromes c, myoglobins, hemoglobins, and protoglobins, to perform carbon–carbon bond forming reactions that are not known to occur in biological systems. Examples include cyclopropanation, cyclopropenation, bicyclobutanation, and even carbene C–H insertion. We also evolved the first biocatalysts to form C–Si and C–B bonds. 

Some Papers to Start with: 

“Enantiodivergent α-Amino C-H Fluoroalkylation Catalyzed by Engineered Cytochrome P450s” J. Zhang, X. Huang, R. K. Zhang, F. H. Arnold. J. Am. Chem. Soc., June 12, 2019. 10.1021/jacs.9b04344.

“Enzymatic assembly of carbon–carbon bonds via iron-catalysed sp3 C–H functionalization” R. K. Zhang, K. Chen, X. Huang, L. Wohlschlager, H. Renata, F. H. Arnold. Nature, December 19, 2018. 10.1038/s41586-018-0808-5.

“Enzymatic construction of highly strained carbocycles” K. Chen, X. Huang, S. B. J. Kan, R. K. Zhang, F. H. Arnold. Science, April 6, 2018. 10.1126/science.aar4239.

New Uses for Tryptophan Synthase

We believe that enzymes can tackle some of the biggest challenges in synthetic chemistry. Recently we have begun to engineer enzymes to produce valuable building blocks for the synthesis of bioactive compounds. Noncanonical amino acids (ncAAs) are important components of natural and artificial products; they are also useful tools for chemical biology. Enzymes can produce ncAAs with pristine enantioselectivity, without the need for protecting groups or expensive reagents.  

To create a scalable enzymatic platform for synthesis of Trp analogs, we identified a TrpB subunit of the tryptophan synthase (TrpS) complex and engineered it to function as an independent enzyme.

This stand-alone enzyme has served as the basis for engineering a panel of catalysts that have been used to make more than 70 different ncAAs (and counting)!

We are now expanding the substrate scope of TrpB to create a general platform for synthesis of noncanonical amino acids. 

Some Papers to Start with: 

“Tailoring tryptophan synthase for selective quaternary carbon bond formation” M. Dick, N. S. Sarai, M. W. Martynowycz, T. Gonen, F. H. Arnold. Journal of the American Chemical Society, submitted.

“Unlocking reactivity of TrpB: A general biocatalytic platform for synthesis of tryptophan analogues” D. K. Romney, J. Murciano-Calles, J. E. Wehrmuller, F. H. Arnold. Journal of the American Chemical Society, July 14, 2017. 10.1021/jacs.7b05007.

“Engineered biosynthesis of β-alkyl tryptophan analogs” C. E. Boville, R. A. Scheele, P. Koch, S. Brinkmann-Chen, A. R. Buller, F. H. Arnold. Angewandte Chemie, September 14, 2018. 10.1002/anie.201807998.

“A panel of TrpB biocatalysts derived from tryptophan synthase through the transfer of mutations that mimic allosteric activation” J. Murciano-Calles, D. K. Romney, S. Brinkmann-Chen, A. R. Buller, F. H. Arnold. Angewandte Chemie, August 11, 2016. 10.1002/anie.201606242R1.

“Directed evolution of the tryptophan synthase β-subunit for stand-alone function recapitulates allosteric activation” A. R. Buller, S. Brinkmann-Chen, D. K. Romney, M. Herger, J. Murciano-Calles, F. H. Arnold. Proceedings of the National Academy of Sciences, November 9, 2015. 10.1073/pnas.1516401112.

Computational Tools for Protein Engineering

Machine learning-guided directed evolution

Directed protein evolution can be thought of as a search on a fitness landscape, where every protein sequence is assigned some “fitness” value based on the level of its performance in a desired task. In directed evolution, we generally navigate this landscape with a single-step uphill walk: we make a library of mostly single mutations, screen for the best variant, fix the mutation, then continue the search. However, this approach cannot effectively explore mutations that are only beneficial in the context of another, a phenomenon known as epistasis. Epistasis can be explored by Combinatorial Site-Saturation Mutagenesis (CSSM), in which multiple positions in a protein are mutated simultaneously. Such libraries are usually prepared in a single tube with 20^n possible variants, where n is the number of amino acid residues that are simultaneously randomized. As more sites are mutated, exponentially more samples need to be tested to ensure rare beneficial combinations are captured. Thus, in the absence of a very high-throughput screen, exhaustive exploration of combinatorial libraries quickly becomes experimentally impractical, and the engineer must settle for very local improvements.
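To make the scaling concrete, here is a minimal sketch of how library size and screening effort grow with the number of randomized sites. It assumes every amino acid combination is equally likely to be sampled, which real codon libraries only approximate, and the function names are illustrative rather than taken from any published workflow.

```python
import math


def library_size(n_sites: int, alphabet: int = 20) -> int:
    """Number of distinct variants when n_sites positions are fully randomized."""
    return alphabet ** n_sites


def clones_for_coverage(n_variants: int, p: float = 0.95) -> int:
    """Clones to screen so that any one given variant is sampled at least once
    with probability p, assuming uniform sampling with replacement.

    Solves 1 - (1 - 1/V)^N >= p for N.
    """
    return math.ceil(math.log(1.0 - p) / math.log1p(-1.0 / n_variants))


if __name__ == "__main__":
    for n in range(1, 5):
        size = library_size(n)
        print(f"{n} site(s): {size} variants, "
              f"{clones_for_coverage(size)} clones for 95% coverage")
```

Under these assumptions, the number of clones needed for 95% coverage is roughly three times the library size, so each additional randomized site multiplies the screening burden by about twenty.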

We are using modern machine learning technologies to navigate this sequence space more efficiently and effectively. By sequencing and screening subsets of these combinatorial libraries, we can use machine learning to predict improved variants from a non-optimal subset, reaching fitness optima more often than with a simple uphill walk. Our team continues to improve this workflow and incorporate it into more of our evolutionary efforts. We are also developing other tools for modern data-driven protein engineering.
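The idea of learning from a screened subset can be sketched as follows. This is a toy illustration and not the lab's actual pipeline: the fitness landscape is synthetic, the one-hot encoding and closed-form ridge regression are stand-in choices, and all names (`one_hot`, `fit_ridge`, the subset size of 400) are assumptions made for the example.

```python
import itertools

import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_SITES = 3  # hypothetical three-site combinatorial library (20^3 variants)


def one_hot(variant: str) -> np.ndarray:
    """Encode a variant as a concatenation of per-site one-hot vectors."""
    x = np.zeros(N_SITES * len(AMINO_ACIDS))
    for site, aa in enumerate(variant):
        x[site * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return x


def fit_ridge(X: np.ndarray, y: np.ndarray, alpha: float = 1.0):
    """Closed-form ridge regression; weights model deviations from mean fitness."""
    y_mean = y.mean()
    A = X.T @ X + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(A, X.T @ (y - y_mean))
    return w, y_mean


rng = np.random.default_rng(0)
variants = ["".join(v) for v in itertools.product(AMINO_ACIDS, repeat=N_SITES)]
X_all = np.array([one_hot(v) for v in variants])

# Synthetic fitness (an assumption): additive effects plus a pairwise
# epistatic term between the first two sites.
additive = rng.normal(size=X_all.shape[1])
pairwise = rng.normal(scale=0.5, size=(len(AMINO_ACIDS), len(AMINO_ACIDS)))
idx = np.array([[AMINO_ACIDS.index(a) for a in v] for v in variants])
true_fitness = X_all @ additive + pairwise[idx[:, 0], idx[:, 1]]

# "Sequence and screen" a random subset, then train on it.
train = rng.choice(len(variants), size=400, replace=False)
w, b = fit_ridge(X_all[train], true_fitness[train])
pred = X_all @ w + b

# Rank the unscreened variants by predicted fitness and propose the top ten.
unscreened = np.setdiff1d(np.arange(len(variants)), train)
top = unscreened[np.argsort(pred[unscreened])[::-1][:10]]
for i in top:
    print(variants[i], round(float(pred[i]), 2), round(float(true_fitness[i]), 2))
```

A linear model on one-hot features captures only additive effects, so it cannot represent the epistatic term in this toy landscape; models that can express interactions between sites would be the natural next step.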

Some Papers to Start with: 

“Machine-learning-guided directed evolution for protein engineering” K. Yang, Z. Wu, F. H. Arnold. Nature Methods, July 15, 2019. 10.1038/s41592-019-0496-6.

“Machine-Learning-Assisted Directed Protein Evolution with Combinatorial Libraries” Z. Wu, S. B. J. Kan, R. D. Lewis, B. J. Wittmann, F. H. Arnold. PNAS, April 12, 2019. 10.1073/pnas.1901979116. Highlighted in Synfacts (10.1055/s-0039-1689894).

Gaussian Processes

Our lab has also demonstrated the utility of Gaussian processes for optimizing proteins, particularly those for which directed evolution is challenging (e.g., membrane proteins). Predictive models can be trained on the fitnesses of screened variants in order to predict the fitnesses of unscreened ones. Using this approach, we have improved thermostability, ligand binding, and catalytic activity in cytochromes P450. Most recently, we designed new channelrhodopsins (ChRs) that localize correctly to the plasma membrane in mammalian cells and have a broad range of characteristics desirable for optogenetic applications. The engineered ChRs, for example, exhibit much higher sensitivity to light activation and enable minimally invasive optogenetics.
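As a rough illustration of training on screened variants to predict unscreened ones, the sketch below implements Gaussian process regression with NumPy. The Hamming-distance kernel, the additive toy fitness function, the hyperparameter values, and the upper-confidence-bound selection rule are all assumptions chosen for the example. They are not a description of the models used in the papers listed here.

```python
import numpy as np


def hamming_kernel(A: np.ndarray, B: np.ndarray, length_scale: float = 4.0) -> np.ndarray:
    """k(a, b) = exp(-HammingDistance(a, b) / length_scale) on integer-encoded sequences."""
    dist = (A[:, None, :] != B[None, :, :]).sum(axis=2)
    return np.exp(-dist / length_scale)


def gp_predict(X_train, y_train, X_test, noise=1e-4, length_scale=4.0):
    """Posterior mean and standard deviation of a GP with a constant mean."""
    y_mean = y_train.mean()
    K = hamming_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_s = hamming_kernel(X_test, X_train, length_scale)
    L = np.linalg.cholesky(K)  # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = K_s @ alpha + y_mean
    v = np.linalg.solve(L, K_s.T)
    var = 1.0 - np.sum(v ** 2, axis=0)  # k(x, x) = 1 for this kernel
    return mean, np.sqrt(np.clip(var, 0.0, None))


rng = np.random.default_rng(1)
SEQ_LEN, N_AA = 8, 20

# Toy additive fitness (an assumption): each site/residue pair has a random effect.
site_effects = rng.normal(size=(SEQ_LEN, N_AA))


def toy_fitness(X: np.ndarray) -> np.ndarray:
    return site_effects[np.arange(SEQ_LEN), X].sum(axis=1)


X_train = rng.integers(N_AA, size=(30, SEQ_LEN))  # "screened" variants
y_train = toy_fitness(X_train)
X_pool = rng.integers(N_AA, size=(200, SEQ_LEN))  # "unscreened" candidates

mean, std = gp_predict(X_train, y_train, X_pool)
ucb = mean + 2.0 * std  # favor candidates that are promising or uncertain
best = int(np.argmax(ucb))
print("next variant to test:", X_pool[best], "predicted:", round(float(mean[best]), 2))
```

Because the model returns an uncertainty alongside each prediction, the next round of experiments can balance testing variants predicted to be good against testing variants the model knows little about.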

Some Papers to Start with: 

“Machine learning-guided channelrhodopsin engineering enables minimally-invasive optogenetics” C. N. Bedbrook, K. Yang, J. E. Robinson, V. Gradinaru, F. H. Arnold. Nature Methods, in press.

“Machine Learning to Design Integral Membrane Channelrhodopsins for Efficient Eukaryotic Expression and Plasma Membrane Localization” C. N. Bedbrook, K. K. Yang, A. J. Rice, V. Gradinaru, F. H. Arnold. PLoS Computational Biology, October 23, 2017. 10.1371/journal.pcbi.1005786.

“Navigating the Protein Fitness Landscape With Gaussian Processes” P. A. Romero, A. Krause, F. H. Arnold. Proceedings of the National Academy of Sciences, December 31, 2012. 10.1073/pnas.1215251110.