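Advisor: 楊安綏
National Taiwan University: Institute of Biochemical Sciences
Author: 張弘儒 (Chang, Hung-Ju)
2014-11-26, 2018-07-06, 2014-11-26, 2018-07-06, 2014
http://ntur.lib.ntu.edu.tw//handle/246246/261645

Abstract (translated from the Chinese): Although studies have established the importance of loop regions in protein folding, the sequence variability of loops remains a major challenge for traditional approaches that rely on the limited structural information available in databases, particularly in identifying the loop residues that decisively affect structural stability. We therefore sought to address this problem with a strategy that combines comprehensive mutagenesis, next-generation sequencing, and high-throughput functional analysis. The large volume of data obtained in this way allows the sequence-structure-function relationships of targeted protein regions to be explored effectively. The study targets more than ten loop regions on a single chain variable fragment (scFv) designed with antibody consensus sequences. Unlike earlier strategies that applied single-site saturation mutagenesis at each sequence position, this work builds phage display libraries by comprehensive mutagenesis, mutating six to seven amino acids in a target loop simultaneously. Earlier studies held that the important features of loops were their consensus sequences or sequences with a preference for forming turns; the results here indicate that the sequences that truly matter are those involved in forming the protein's tertiary structure. The different groups of variants were analyzed by both computation and experiment, and most of them show better thermal stability and expression levels than the wild type. In summary, this study provides a new strategy for exploring sequence-function relationships not yet found in nature.

Abstract: Despite growing evidence for the importance of loops in protein folding, the high variability of loop sequence composition remains the major challenge for traditional studies, which rely on limited structural information to determine the functionally key residues within loop regions. This conundrum can be resolved by a new strategy that combines comprehensive mutational analysis, next generation sequencing (NGS), and high throughput functional assays. The large volume of data resulting from this strategy helps us explore the sequence-structure-function relationships in local protein regions effectively. The 6 CDRs and 10 non-CDR loops in the variable domains of a model single chain variable fragment (scFv), which has been optimized with the consensus sequence approach, are used as our study platform. Unlike previous studies based on numerous sets of single-site saturation mutagenesis, we constructed synthetic scFv libraries in which 6-7 amino acids within the target loop region are mutated simultaneously.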
The results indicate that the essential sequences within the loop regions are dictated by the residues involved in tertiary interactions, rather than by the consensus sequences or by sequences with high turn propensities. Groups of variants whose sequence features differ from the wild type sequence were characterized by computational or experimental methods, and they show significant improvements in thermal stability and expression level. This study presents a new strategy for exploring the protein fitness landscape and possible sequence-function relationships unseen in nature.

Table of contents:
Oral Defense Committee Certification
Acknowledgements
Chinese Abstract
ABSTRACT
Highlights
Chapter 1 Introduction
  1.1 The trends of protein engineering
    1.1.0 Overview of protein engineering
    1.1.1 Rational design
    1.1.2 Directed evolution
    1.1.3 Next generation sequencing and protein fitness landscape
  1.2 Experimental model
    1.2.1 Single chain variable fragment (scFv)
    1.2.2 Loops and turns in scFv
  1.3 Experimental design
Chapter 2 Result and Discussion
  2.1 Selection of functional scFv variants from 24 Phage-Displayed libraries by VEGF and Protein A panning
  2.2 Thermal Stability Assessment of Functional scFv Variants with HTTI experiments
  2.3 Amino Acid Sequence Preferences of the Loops in Functional scFvs
  2.4 Comparison of Sequence and Structural Features among the corresponding loops in the VH and VL domain
    2.4.1 Similar sequence and structural features shared by the corresponding loops in the VH and VL domain
    2.4.2 Difference between corresponding loops in the VH and VL domain
  2.5 The comparison of Loop-Sequence features from NGS, Sanger Sequencing, and Antibody Consensus Sequence in the non-CDR loop regions
  2.6 Thermal Stability Measurements of VEGF-Binding scFv Variants with HTTI
  2.7 The un-interchangeable hydrogen bonding network connected by polar loop residues which cap the bottom of the lower hydrophobic core
  2.8 The CDR regions affect the scFv folding and stability
  2.9 The Structural Differences in scFv variants and their correlation with stability enhancements
  2.10 The sequence features of another model scFv CR6261
  2.11 The stability enhancement Ile-H15 is a deleterious mutation for the wild type scFv Av1.2
  2.12 Difference between the Protein fitness landscapes of template scFv Av1.2 and VHL1 thermal stable scFv variant ALKQSI
Chapter 3 Conclusion
References
Material and methods
  a. Phage scFv library construction
  b. Functional scFv selection
  c. 454 deep sequencing of scFv variants
  d. Information content from sequence profiles
  e. High throughput thermal inactivation (HTTI) measurement
  f. Expression level and thermal inactivation measurements of secreted soluble scFv from a CDR library culture
  g. Expression and purification scFv protein
  h. Consensus sequences alignment of huVH3 family
  i. Differential Scanning Calorimetry measurements
  j. Relative phage expression ratios for phage-displayed libraries
  k. Sequence LOGO generation
FIGURES
  Figure 1. Concise description for the experimental procedure
  Figure 2. Composite structure for the template scFv Av1.2 binding Protein L, Protein A, and VEGF
  Figure 3. Comparisons of Phage-Display Selections for Protein A Binding and VEGF Binding
  Figure 4. Thermal inactivation measurements for scFv-VEGF binding and for scFv-protein A binding
  Figure 5. Sequence determinants in Av1.2 scFv loop regions
  Figure 6. Superimposed VH and VL domains of the Av1.2 scFv
  Figure 7. Comparisons of NGS profiles, Sanger sequence profiles and consensus sequence profiles from natural sequence database
  Figure 8. Thermal stability measurements of the functional scFv variants selected from each of non-CDR loop libraries
  Figure 9. Sequence preferences for the hydrogen-bonded networks capping the lower hydrophobic core of the variable domains
  Figure 10. Expression levels of the soluble scFv libraries and thermal inactivation measurements of the scFvs from the CDR libraries
  Figure 11. Prism analysis for the key residues which are important for the backbone conformation of Av1.2 non-CDR loop regions
  Figure 12. Optimization of CR6261 non-CDR loop regions
  Figure 13. The stability enhancement Ile-H15 is a deleterious mutation to the template scFv Av1.2
  Figure 14. Sequence features of template scFv Av1.2 and VHL1 thermal stable variant ALKQSI
APPENDIX 1
  Figure A1. The template Av1.2 scFv and the locations of the diversified sequence segments in the phage-displayed libraries
  Figure A2. Panning results of Av1.2 phage-displayed non-CDR loop libraries against VEGF binding
  Figure A3. Panning results of Av1.2 phage-displayed CDR libraries against Protein A binding
  Figure A4. Relative phage expression ratios of Av1.2 phage-displayed CDR and non-CDR loop libraries
  Figure A5. The sequence of control model scFv CR6261 and the locations of the diversified sequence segments in the phage-displayed libraries
  Figure A6. Phylogenesis analysis of VHL1 NGS data
  Table A1. Statistics of nucleotide distribution in synthetic NNK sequences
  Table A2. Complexities of Av1.2 phage-displayed CDR and non-CDR loop libraries
  Table A3. The number of reads from the NGS data
  Table A4. Number of sequences for the CDR sequence preferences
  Table A5. HTTI measurements of scFv variants
  Table A6. Stability enhancements emerged from HTTI-filtered scFv variants and their correlated q(j,i) values
  Table A7. Low or rare consensus sequences in scFv CR6261
  Table A8. The number of structural fragments collected from database

4681186 bytes
application/pdf
Thesis public release date: 2014/03/08
Thesis usage rights: paid licensing authorized (royalties returned to the university)
Keywords: phage display; next-generation sequencing; high-throughput functional analysis; single chain variable fragment antibody
Title: Loop sequence features and stability determinants in antibody variable domains (抗體多變域中環狀區的序列特徵與其穩定性決定因子)
thesis
http://ntur.lib.ntu.edu.tw/bitstream/246246/261645/1/ntu-103-D96b46016-1.pdf