# Expression of the nicotinic acetylcholine receptor extracellular domain in *E. coli* 


## Introduction

nAChRs are the molecular targets of neonicotinoid insecticides. The adverse effects of neonicotinoids on non-target insects led to restrictions on their use and an eventual ban in the EU, which highlights the need for more selective compounds. The first step towards compounds that are effective against pests but spare beneficial insects is to understand the interactions between neonicotinoids and their targets, based on knowledge of the structure of the agonist binding site.

### Structural basis of acetylcholine and agonist binding
Crystal structures of AChBP, the mollusc homologue of the nAChR extracellular domain (ECD), bound to imidacloprid [@ihara2008; @talley2008] and genetic analysis of resistant insect strains [@liu2005] highlighted the importance of residues in both the principal and the complementary face of the ligand binding site. The key determinants of acetylcholine binding are well known (cross-ref). Imidacloprid forms similar interactions with the conserved residues in the binding pocket (cross-ref): cation-pi interactions with the aromatic amino acids, a hydrogen bond mediated by a water molecule and van der Waals interactions with loop E residues. These are shared with other nAChR agonists. Some interactions between imidacloprid and AChBP have been proposed to be unique, such as a hydrogen bond with Gln 55 of loop D. However, this Gln is not conserved and corresponds to a basic residue in many insect nAChR subunits, so this interaction may be present in some, but not all, insect receptors. The importance of this residue was highlighted by the genetic analysis of naturally occurring resistant strains: mutation of the basic arginine to threonine at the corresponding position of the $\beta1$ nAChR subunit of *Myzus persicae* confers imidacloprid resistance [@bass2011]. 

Although AChBP is a useful model for studying nAChR-drug interactions, the amino acid sequences of the ligand binding pockets of AChBP and nAChRs differ (Appendix A), including at positions that appear to be key determinants of drug sensitivity. High-resolution structures of nAChRs are therefore needed to fully understand the basis of neonicotinoid selectivity and the chemical space in which improved selectivity might be achieved. Structures of the $\alpha4\beta2$ receptor [@morales-perez2016] and of the ECDs of $\alpha2$ [@kouvatsos2016], mouse $\alpha1$ [@dellisanti2007] and human $\alpha9$ [@zouridakis2014] are known. Structures of the bacterial nAChR homologues GLIC [@hilf2009] and ELIC [@hilf2008], as well as of the related pentameric GABA [@miller2014] and glycine [@huang2015] receptors, have also been described. However, no crystal structure of an insect receptor is available. This may reflect the difficulty of expressing and purifying folded, soluble recombinant protein [@rosano2014], which is the essential first step towards structural characterisation. 

## Biological systems for recombinant protein expression 
Structural analysis of any protein requires high-level expression of stable and correctly folded protein [ref]. An array of biological systems is available for recombinant protein expression, but no single system gives the best production of all proteins. Finding a suitable system is therefore often a matter of trial and error. Commonly used systems include isolated mammalian or insect cell lines and whole organisms such as yeast, fungi or algae. *E. coli* is one of the most commonly used host organisms for heterologous expression of soluble proteins [@berman2000] owing to its simplicity, low cost, ease of manipulation and well-characterised transcription, translation and protein folding processes. According to the Protein Data Bank, almost 75 % of all structures have been derived from proteins expressed in *E. coli* [@berman2000], including those of GLIC [@hilf2009] and ELIC [@hilf2008]. In addition, expression of functional AChBPs [@abraham2016] and of full-length human $\alpha7$ [@tillman2016] in *E. coli* cells has been reported. 

(ref:e-coli-structure) **The *E. coli* cell.** *E. coli* is a Gram-negative bacterium shielded by a capsule and enclosed by two membranes separated by the periplasmic space. The main structures in the cytosol are the nucleoid containing the genetic material, inclusion bodies typically containing protein aggregates, and ribosomes for protein translation.

```{r e-coli-structure-label, fig.cap="(ref:e-coli-structure)", fig.scap='The \\textit{E. coli} cell.', fig.align='center', echo=FALSE}
```

<!-- ### Expression strains of *E. coli* -->

<!-- There is a number of *E. coli* strains used in molecular biology for protein expression, most of which are descendants of K-12 and B *E. coli* strains [@daegelen2009, bachmann1996]. K-12 cells were first isolated in the early 20th century in California [@bachmann1996]. The origin of B-cells is unknown, but it has been in use since the early 20th century in multiple labs [reviewed in @daegelen2009]. The mutation of B-cells led to generation of multiple strains, one of which is BL21 [@daegelen2009].  -->
<!-- Over the years, strains used for protein expression have been extensively mutagenized to allow recombinant protein expression and to aid proteins' stability and folding. A major development in the generation of expression strains was a development of BL21(DE3). DE3 is a genomic sequence derived from bacteriophage. It was integrated into the chromosomal sequence of BL21 to produce BL21(DE3). DE3 containing bacteriophage T7 RNA polymerase system. Under normal conditions, the expression of T7 is inhibited by a repressor. However, T7 can be produced, should an inducer, such as Isopropyl $\beta$-D-1-thiogalactopyranoside (IPTG) be added to the growth medium. IPTG binds to the repressor, allowing native *E. coli* T3 RNA polymerase to transcribe the T7 polymerase. T7 polymerase can then transcribe any gene under the control of T7 promoter in the expression vector (section below).  -->

<!-- There is a number of other genomic alterations introduced into BL21(DE3) cells to increase their efficiency in recombinant protein production. To increase stability of the plasmid DNA, genes involved in DNA restriction and methylation processes (hsdS and dmc methylase) have been altered. Whereas deletions and mutation of genes encoding for some proteases, for example outer membrane bound OmpT and cytoplasmic Lon, should reduce proteolysis of expressed recombinant proteins. BL21(DE3) cells have been further modified to generate more specialised strains to accommodate expression of problematic proteins. For example, Rosetta BL21(DE3) pLysS reduced background expression of the recombinant protein. and is an attractive system for the expression of toxic proteins. Whereas Origami B allows for formation of disulphide bonding in the cytoplasm of *E. coli*, therefore can be used for the production of disulphide-bond rich proteins in the cytoplasm.  -->

<!-- ### Plasmid DNA -->
<!-- Molecular biologists take advantage of genetically engineered expression plasmids to produce the protein of interest in *E. coli* cells. Expression vectors have been synthesised based on native plasmids. Native plasmids are bacterial small, circular and double stranded extra-chromosomal pieces of DNA. They encode for elements not necessary for bacterial survival, but beneficial in certain conditions. For example, they may contain an antibiotic resistant gene, allowing cells to proliferate in the presence of an antibiotic. They also contain an origin of replication sequence. This sequence is recognisable by the native bacterial replication machinery, which copies the entire plasmid DNA during growth and cell-replication. This ensures plasmid is retained in the cells and passed to the daughter cell.  -->

<!-- Expression plasmids take advantage of these two features. They also contain additional sequences enabling insertion of genes encoding for the protein of interest into the plasmid. Specifically, there are multiple restriction sites. These sites are recognisable by the restriction enzymes some of which cleave the double-stranded circular DNA producing a linear DNA fragment containing overhangs. A gene of interest with complementary overhangs can then be incorporated into the linearised plasmid. This process requires the presence of T4 ligase enzyme, which two pieces of linearised DNA, joins complementary ends to produce a circular plasmid containing a gene of interest.  -->

<!-- There are also elements necessary for the gene translation, such as bacteriophage-derived T7 promoter. T7 promoter is a substrate for T7 polymerase. These two elements allow for inducible translation of the gene downstream of the T7 promoter upon addition of IPTG (ref to prev section). -->

<!-- Most expression plasmids contain these four elements: the origin of replication, the antibiotic resistance, multiple cloning site and T7 promoter. Some expression plasmids contain sequences enabling protein targeting and/or purification -->

<!-- In Gram negative bacteria, such as *E. coli*, proteins can be targeted into 4 different locations: the inner or outer membrane, the periplasmic space or out of the cell. The process of protein targeting is regulated by the signal sequences. Signal sequences are short sequences of amino acids, typically 16-32, at the N-terminus of the protein, recognisable by the intracellular translocation machinery [@perlman1983]. DNA encoding for these sequences can be utilised in the expression vector, to ensure the recombinant protein is destined to the appropriate sub-cellular compartment. This is critical for successful expression of some proteins. For example, membrane proteins must be targeted to the inner membrane, whereas proteins requiring the formation of disulphide bridges must be targeted to the periplasm of *E. coli*, where oxidation of cysteines takes place [@bardwell1991]. There are two commonly used periplasmic space signal sequences - OmpT and pelB [@lei1987]. pelB is native to *Erwinia carotovora*. It is a 22-amino acid sequence: *MKY*LLPTAAAGLLLLAAQ**PAMA**. As many other signal sequences, it contains 3 basic residues at the N-terminus (italic), a string of hydrophobic amino acids, and a cleavage site (bold) [@perlman1983]. The cleavage site is recognisable by the membrane bound signal peptidase [@pugsley1993], releasing the downstream peptide into the periplasmic space. Although not native to *E. coli*, pelB directs tagged proteins to the periplasm of this bacterium [@yoon2010], via the Sec translocation pathway (Figure \@ref(fig:folding-e-coli-label). -->

<!-- (ref:folding-e-coli) **Targeting and folding of periplasmic recombinant proteins in *E. coli*.** The premature protein containing pelB signal sequence (red) is targeted to the Sec translocation system. The co-translational translocation of the proteins occurs using the energy [@daniels1981]. Partially folded protein is released into the periplasmic space. Partially folded protein can be: aggregated in the inclusion bodies (1), degraded by proteases such as DegP or Prc (2), folded into the correct tertiary (and/or quaternary) structure with the aid of chaperones such as Skp and FkpA (3) or covalently modified by formation of disulphide bonds. Cysteines can be oxidised by DSbA (4), whereas incorrectly formed disulphide bonds reduced by DsbC (5). -->

<!-- ```{r folding-e-coli-label, fig.cap="(ref:folding-e-coli)", fig.scap='Targeting and folding of periplasmic recombinant proteins in \\textit{E. coli}.', fig.align='center', echo=FALSE} -->
<!-- knitr::include_graphics("fig/results5/png/secretion_ecoli.png") -->
<!-- ``` -->

<!-- Another feature of many expression vectors are affinity tags. These are sequences either on N or C terminus of cDNA encoding for the protein of interest. Upon expression, they allow for selective purification of the tagged recombinant protein. There are a number of tags available, including large globular proteins such as 40.6 kDa maltose binding protein (MBP) and small several amino acids long stretches of amino acids, such as histidine-tag (HIS-tag). HIS-tag is a sequence encoding for 6-10 histidines. Upon expression, these histidines bind with high affinity to divalent cations. This binding affinity is used in the process of purification. Some tags, sucgh as MBP provide not only means of protein purification, but also aid solubility and stability of proteins, therefore are an important feature of expression vectors [@lebendiker2011]. -->

## Strategies used to express pentameric ligand-gated ion channels in *E. coli*

Despite the intrinsic difficulties of expressing multimeric and membrane proteins in *E. coli*, this host has been used to express and purify large quantities of folded nAChRs and related proteins. These include AChBP [@abraham2016], ELIC [@nys2016; @hilf2008], full-length human $\alpha7$ nAChR [@tillman2016] and the ECD of rat $\alpha7$ [@fischer2001]. The strategies employed in these studies highlight ways of overcoming the major difficulties faced when expressing recombinant proteins in *E. coli*. 

Eukaryotic secretory and membrane proteins are targeted to the ER and Golgi, where they mature (Figure \@ref(fig:turnover-label)) before being sent to their correct locations. As part of this maturation, a vast number of these proteins undergo post-translational modification [@khoury2011], such as the addition of carbohydrates (glycosylation) or the formation of disulphide bonds. These modifications can contribute to the stability of the protein and aid its folding [@xu2015]. The *E. coli* secretory system is much simpler, so the nature, frequency and mechanism of post-translational modifications differ from those in eukaryotic cells [@dell2010]. A number of strategies are employed to overcome this problem. For example, proteins that require disulphide bonds can be targeted to the periplasmic space of *E. coli*, where these bonds can form owing to the oxidising environment and the presence of disulphide oxidoreductases such as DsbA [ref]. To enable the formation of disulphide bonds in recombinant human nAChR, $\alpha7$ was targeted to the periplasmic space of *E. coli* with the pelB sequence [@tillman2016]. 

Targeting to the periplasmic space may also have other advantages. The periplasmic environment is less crowded than the cytoplasm because it contains fewer proteins, and the likelihood of proteolysis of the recombinant protein is therefore reduced [@makrides1996]. This may make proteins more stable in the periplasm than in the cytoplasm. Expression in the periplasm may therefore be advantageous even for proteins that do not require disulphide bonds, such as ELIC [@hilf2008; @nys2016].

One of the major obstacles to producing a recombinant protein in *E. coli* is that many proteins tend to form insoluble aggregates [@peternel2011]. This issue can be addressed in several ways. First, the recombinant protein can be fused to another protein that is native to the host. MBP is an *E. coli* periplasmic protein [@bedouelle1983]. N-terminal fusion of the protein of interest to MBP increases the solubility and stability of proteins expressed in both the cytoplasm and the periplasm [@raran-kurussi2015], including those rich in disulphide bonds [@planson2003]. The MBP fusion strategy was employed by @fischer2001 to express the $\alpha7$ ECD and by @hilf2008 to express ELIC. Second, a fragment rather than the full-length protein can be produced. This is a common approach when studying the soluble domains of transmembrane proteins, such as the ECD containing the agonist-binding pocket of nAChRs. The ECD is soluble, so it may be easier to express and purify than the full-length protein containing hydrophobic sections [ref]. Importantly, heterologously expressed ECDs can form pentameric assemblies and folded binding sites [@kouvatsos2016; @dellisanti2007]. The ECD of $\alpha7$ was expressed in *E. coli* by @fischer2001. Third, the coding sequence can be modified to introduce amino acid substitutions, some of which may favour expression. For example, mutation of 2 amino acids to increase solubility can increase expression by 264-fold [@dale1994]. Mutation of a single amino acid in the $\alpha7$ ECD sequence increases the stability and solubility of the expressed protein [@tsetlin2002]. A more extensively altered version of the $\alpha7$ ECD was generated by @zouridakis2009 (Figure \@ref(fig:alpha7-seq-mutant-label)). These mutations increased the solubility and production yield of the protein.

(ref:alpha7-seq-mutant) **Sequences of ECD nAChR variants with increased solubility.** Sequences of the ECDs used in this study: human $\alpha7$ (hu a7^), honey bee $\alpha5$ (Ap a5^) and *C. elegans* ACR-21 (Cel ACR-21) were mutated (residues in red) based on the mutant version of $\alpha7$ (mut-10) [@zouridakis2009]. In addition, the Cys-loops of the honey bee and *C. elegans* subunits were replaced with the more soluble Cys-loop sequence of *Aplysia* AChBP.

<!-- and alignment of wild-type and mutated amino acid sequence of human $\alpha7$ ECD. Introduction of these mutations, resulted in increased expression levels of folded protein in yeast. Mutated amino acids are   underlined in red. Cysteines forming disulphide bonds are in yellow boxes, whereas TrpB and Tyr involved in interaction with the agonist in grey boxes. Sequence alignment taken from . -->

```{r alpha7-seq-mutant-label, fig.cap="(ref:alpha7-seq-mutant)", fig.scap = "Sequences of ECD nAChR variant with increased solubility", fig.align='center', echo=FALSE}
```

In summary, there is no universal protocol for the expression of folded and stable recombinant proteins, particularly for a soluble subdomain of a membrane protein with a complex quaternary (pentameric) structure. It is therefore common practice to try several approaches, optimising the *E. coli* growth and expression conditions along the way. Years of research have produced solubility enhancers, targeting signal sequences and other approaches that allow the expression of complex molecules, such as nAChRs and structurally related bacterial proteins. For example, to express and purify ELIC from *E. coli*, @hilf2008 tagged ELIC with MBP and targeted it to the periplasmic space with pelB. pelB is native to *Erwinia carotovora* [@lei1987]. It is a 22-amino acid sequence: *MKY*LLPTAAAGLLLLAAQ**PAMA**. Like many other signal sequences, it contains a short N-terminal region carrying a basic residue (italic), a string of predominantly hydrophobic amino acids, and a cleavage site (bold) [@perlman1983]. The cleavage site is recognised by the membrane-bound signal peptidase [@pugsley1993], which releases the downstream peptide into the periplasmic space. Although not native to *E. coli*, pelB directs tagged proteins to the periplasm of this bacterium [@yoon2010] via the Sec translocation pathway. Despite these advances, the successful expression of nAChRs in *E. coli* has been achieved in only a handful of cases. The inability to produce nAChRs in high quantity and quality hinders their structural analysis and the understanding of the molecular basis of selectivity of important agricultural compounds, such as neonicotinoids. 
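
The three-part organisation of the pelB signal peptide quoted above can be checked with a few lines of Python. This is a minimal sketch and not part of the experimental work: the region boundaries (first 3 and last 4 residues) follow the italic and bold marking in the text, while the residue classes in `BASIC` and `HYDROPHOBIC` are a simplified classification assumed here for illustration.

```python
# pelB signal peptide as quoted in the text (one-letter amino acid code).
PELB = "MKYLLPTAAAGLLLLAAQPAMA"

# Simplified residue classes (assumption made for this sketch).
BASIC = set("KRH")
HYDROPHOBIC = set("AVLIMFW")


def describe_signal_peptide(seq, n_len=3, c_len=4):
    """Split a signal peptide into N-terminal, hydrophobic and cleavage regions."""
    n_region = seq[:n_len]
    h_region = seq[n_len:-c_len]
    c_region = seq[-c_len:]
    hydrophobic = sum(aa in HYDROPHOBIC for aa in h_region)
    return {
        "length": len(seq),
        "n_region": n_region,
        "n_basic_residues": sum(aa in BASIC for aa in n_region),
        "h_region": h_region,
        "h_hydrophobic_fraction": hydrophobic / len(h_region),
        "c_region": c_region,
    }


summary = describe_signal_peptide(PELB)
for key, value in summary.items():
    print(f"{key}: {value}")
```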


## Aim

The aim of this chapter is to develop an *E. coli* based expression platform for insect nAChRs to enable characterisation of the ligand binding site and determination of structural features underpinning their interactions with neonicotinoids. 

As the first step, the expression and purification of human $\alpha7$ was initiated as a test bed. This receptor was chosen because it forms homopentameric receptors in which the recombinant expression of a single subunit can potentially drive functional expression of the ECD.

To enable expression of proteins in *E. coli* cells, the necessary DNA elements were cloned into the expression vector (Figure \@ref(fig:construct-diagram-label)). These elements are a START codon, pelB, a sequence encoding a string of 10 histidines (DecaHIS), maltose binding protein (MBP) and a 3C cleavage site. 

<!-- 1. Start codon for initiation of translation2. pelB signal sequence  -->
<!-- 2. DecaHIS for purification of the proteins -->
<!-- 3. 2 amino acid long linker (amino acid sequence (PM) -->
<!-- 4. Solubility enhancer MBP -->
<!-- 5. 3 amino acid long linker (amino acid sequence PGS) -->
<!-- 4. Protease cleavage site 3C  -->
<!-- 5. Extracellular domain of the human $\alpha$ nAChR -->
<!-- 6. 8 amino acid linker (GEVEQPLE) -->
<!-- 7. 2GSC subunit of a pentameric protein -->
<!-- 8. Stop codon to terminate translation -->

(ref:construct-diagram) **Schematic diagram of the DNA construct used for the expression of $\alpha7$ ECD in *E. coli*.**

```{r construct-diagram-label, fig.cap="(ref:construct-diagram)", fig.scap = "Schematic diagram of the DNA construct used for the expression of $\\alpha7$ ECD in \\textit{E. coli}.", fig.align='center', echo=FALSE}
```

The START codon initiates protein translation, pelB targets the protein to the periplasm, and DecaHIS and MBP enable purification on a nickel or dextrin chromatography column, respectively. MBP also acts as a solubility enhancer, whereas 3C is a cleavage site that enables removal of pelB-HIS-MBP-3C from the downstream peptide upon treatment with the appropriate protease. The sequence encoding the $\alpha7$ extracellular domain was cloned downstream of these elements. The ligand binding domain was expressed without the transmembrane domain for several reasons. First, this study is concerned with the structure of the ligand binding site, which is contained in the ECD of the receptor. Second, the ECD is potentially soluble and therefore easier to express and purify than the full-length protein containing hydrophobic sections [ref]. Although ECDs can form pentameric assemblies, sequences within the TM region of the receptor may also be important for receptor assembly [@wang1996a]. To account for this, the ECD was flanked by a sequence encoding 2GSC, a single subunit of a pentameric protein. 
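
The order and role of the elements described above can be written down as a small data structure. The sketch below is only an illustration of the construct layout: the `Element` class and the `split_at_cleavage_site` helper are hypothetical names, linkers are omitted, and the split follows the simplified description in the text, in which the whole pelB-HIS-MBP-3C portion is treated as the removable tag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    role: str


# Elements in N- to C-terminal order, as described in the text (linkers omitted).
CONSTRUCT = [
    Element("pelB", "targets the protein to the periplasm"),
    Element("DecaHIS", "purification on a nickel column"),
    Element("MBP", "solubility enhancer; purification on a dextrin column"),
    Element("3C", "protease cleavage site"),
    Element("alpha7-ECD", "ligand binding domain of interest"),
    Element("2GSC", "soluble subunit of a pentameric protein"),
]


def split_at_cleavage_site(elements, site="3C"):
    """Return (tag, product) element names on either side of the cleavage site."""
    names = [element.name for element in elements]
    index = names.index(site)
    return names[: index + 1], names[index + 1:]


tag, product = split_at_cleavage_site(CONSTRUCT)
print("removed on cleavage:", "-".join(tag))
print("retained product:   ", "-".join(product))
```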

(ref:2gsc) **Comparison of the pentameric soluble bacterial protein 2GSC and the transmembrane domain of the nicotinic acetylcholine receptor.** 2GSC is a four-helical protein that assembles into a pentameric bundle (a). Its general architecture and dimensions closely resemble those of the nAChR transmembrane domain (b). Images and dimensions were derived in UCSF Chimera (PDB codes: 2GSC, and 2BG9 for muscle nAChR). Distances were measured between the most distal atoms on the polar ends of the structures.

```{r 2gsc-label, fig.cap="(ref:2gsc)", fig.scap="Comparison of the pentameric soluble bacterial protein 2GSC and the transmembrane domain of the nicotinic acetylcholine receptor", fig.align='center', echo=FALSE}
```

2GSC is a cytosolic protein endogenous to the Gram-negative bacterium *Xanthomonas campestris*. Its structure was solved by X-ray crystallography following expression in and purification from *E. coli* (Figure \@ref(fig:2gsc-label)) [@lin2006]. 2GSC is a four-helical protein that assembles into pentameric bundles. Its overall architecture is similar to that of the nAChR membrane-spanning domain (Figure \@ref(fig:2gsc-label)); however, 2GSC is soluble. It is therefore hypothesised that oligomerisation of 2GSC could aid the assembly of pentameric nAChR ECDs.

It should be noted that although this chapter describes the expression of the human $\alpha7$ receptor, the expression of several other genes was also tested. 

Two further ECDs, namely honey bee $\alpha5$ and *C. elegans* ACR-21 subunit were cloned. Their expression was driven from plasmids containing 2GSC, as well as two other proteins: 1VR4 and 2GUV. 1VR4 and 2GUV are bacterial proteins of unknown function, shown to form pentamers in *E. coli*. Out of 9 constructs tested, the results from the $\alpha7$ were the most promising.

<!-- ```{r tesing-constructs, echo=FALSE} -->
<!-- library(kableExtra) -->
<!--  library(dplyr) -->

<!-- testconstructs <- data.frame( -->
<!--   Construct = c("$\\alpha7$-1VR4", "$\\alpha7$2GUV", "ACR21-2GSC", "ACR21-1VR4", "ACR21-2GUV", "$\\alpha5$-2GSC", "$\\alpha5$-1VR4", "$\\alpha5$-2GUV"), -->
<!--   Expression = c("", "~10 % purified. Remainer lost after 1st spin", "", "", "", "", "", ""), -->
<!--   Purification = c("", "", "", "", "", "", "", ""), -->
<!--   Gel = c("ND", "ND", "ND", "ND", "Aggregates", "ND", "ND", "ND") -->
<!-- ) -->

<!-- testconstructs %>% -->
<!--    mutate_all(linebreak) %>% -->
<!--  kable("latex", align = "l", booktabs = TRUE, escape = F, -->
<!--        col.names = linebreak(c("Construct", "Expression", "Purification", "Gel\npurification")), -->
<!--        caption = 'Expression of human, honeybee and C. elegans nAChR extracellular domains in E. coli', -->
<!--        ) %>% -->
<!--    kable_styling(position = "center", full_width = FALSE, latex_options = "hold_position") -->
<!-- ``` -->


## Results

### Generation of the vector for the expression of human $\alpha7$ nAChR in *E. coli* periplasm

The expression vector was generated in two steps. First, pelB-HIS-MBP-3C was cloned, followed by $\alpha7$-2GSC. For simplicity, the pelB-HIS-MBP-3C sequence will be referred to as MBP-3C.

MBP-3C was PCR amplified (Figure \@ref(fig:pet27-mbp-label), Table \@ref(tab:MBP-amplification)) with primers carrying flanking SalI and NdeI sites (Table \@ref(tab:primer-seq1)) from the vector used for expression of ELIC [@hilf2008]. The purified PCR product and the destination pET27 vector were sequentially digested with SalI and NdeI restriction enzymes to enable ligation (Figure \@ref(fig:pet27-mbp-label)). Ligation and colony selection were performed to generate pET27-pelB-HIS-MBP-3C (pET27-MBP-3C for short), suitable for expression of proteins in the periplasm. Successful cloning was provisionally indicated by NdeI and SalI digestion of the purified plasmid, which produced the expected backbone and insert DNA fragments. Positive clones were amplified and sequenced using primers flanking the insert (Appendix \@ref(fig:pelb-3c-lbl)). A single nucleotide substitution (C to A) occurred between pelB and DecaHIS, but this mutation is silent: both the GCC and the GCA codon encode alanine (Appendix E).
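
The reasoning that the GCC to GCA substitution leaves the protein unchanged can be illustrated with a codon lookup. The sketch below uses a deliberately small excerpt of the standard genetic code (the four alanine codons plus three others for contrast); the function name `classify_substitution` is hypothetical.

```python
# Excerpt of the standard genetic code: all four alanine codons,
# plus a few other codons for comparison.
CODON_TABLE = {
    "GCT": "Ala",
    "GCC": "Ala",
    "GCA": "Ala",
    "GCG": "Ala",
    "ACC": "Thr",
    "CCC": "Pro",
    "GAC": "Asp",
}


def classify_substitution(ref_codon, alt_codon, table=CODON_TABLE):
    """Label a codon change as synonymous (silent) or non-synonymous."""
    ref_aa = table[ref_codon.upper()]
    alt_aa = table[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


# The substitution observed between pelB and DecaHIS.
result = classify_substitution("GCC", "GCA")
print("GCC -> GCA:", result)

# A first-position change in the same codon, for contrast.
print("GCC -> ACC:", classify_substitution("GCC", "ACC"))
```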

(ref:pet27-mbp) **Generation of the vector for the expression of proteins in the periplasm of *E. coli*.** (a) Cartoon representation of the PCR amplification of the gene, indicating the restriction sites of the enzymes used for DNA digestion. The pelB-HIS-MBP-3C sequence was amplified from the pET26-GLIC vector, gel excised and purified. The PCR fragment digested with SalI and NdeI was cloned into digested pET27. (b) Agarose gel of the digested PCR template (pET26-GLIC) (1), PCR products (2), pET27 vector backbone (3) and the cloned pET27-pelB-HIS-MBP-3C expression vector (4 and 5) against a DNA ladder (M). The sizes of the generated DNA fragments in bp are given under the DNA bands. The locations of the restriction sites within the DNA fragments are indicated in (a).

```{r pet27-mbp-label, fig.cap="(ref:pet27-mbp)", fig.scap = "Generation of the vector for the expression of proteins in the periplasm of \\textit{E. coli}.", fig.align='center', echo=FALSE}
```

$\alpha7$-ECD-2GSC was cloned into the pET27-MBP-3C expression vector (Figure \@ref(fig:pet27-hu-ligation-label)). A pBMH plasmid containing the $\alpha7$-ECD-2GSC coding sequence was synthesised. The Hu$\alpha7$-2GSC gene was PCR amplified with primers carrying non-complementary overhangs containing SalI and NheI restriction sites (Table \@ref(tab:primer-seq1) and Table \@ref(tab:human-lgd-amplification)). The PCR product and pET27-MBP-3C were sequentially digested with SalI and NheI restriction enzymes. The purified DNA fragments were ligated, colonies were selected and plasmid DNA was purified. The DNA was then analytically digested with SalI and NheI and sent for sequencing. The cloned DNA sequence was error free (Appendix F). The resulting pET27-MBP-3C-$\alpha7$-ECD-2GSC vector was used for the expression of $\alpha7$-ECD.
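
Restriction cloning of this kind only works if the chosen enzymes cut at the primer-encoded sites and nowhere inside the insert. A simple scan for recognition sequences, sketched below, is one way to check this before ordering primers. The recognition sequences of SalI, NdeI and NheI are standard; the `toy_insert` sequence is invented for illustration and is not the sequence used in this study.

```python
# Recognition sequences (5' to 3'). All three are palindromic,
# so scanning one strand is sufficient.
SITES = {"SalI": "GTCGAC", "NdeI": "CATATG", "NheI": "GCTAGC"}


def find_sites(seq, sites=SITES):
    """Return the 0-based start positions of each recognition sequence in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


# Hypothetical amplicon: NheI site, short coding stretch, SalI site.
toy_insert = "GCTAGC" + "ATGGCAAAACTG" + "GTCGAC"

hits = find_sites(toy_insert)
for enzyme, positions in hits.items():
    print(enzyme, positions)
```

An enzyme with more than one hit, or with a hit away from the ends of the amplicon, would cut within the insert and is unsuitable for that cloning step.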

(ref:pet27-hu-ligation) **Generation of the vector for the expression of ligand binding domain of human $\alpha7$ nAChR.** Schematic representation (a) and DNA agarose gel (b) of generation of expression vector. $\alpha7$ was PCR amplified using primers flanked with restriction enzyme recognition sites, digested and cloned into digested pET27-pelB-3C vector.

```{r pet27-hu-ligation-label, fig.cap= "(ref:pet27-hu-ligation)", fig.scap= "Generation of the vector for the expression of ligand binding domain of human $\\alpha7$ nAChR.", fig.align='center', echo=FALSE}
```

### Expression of $\alpha7$-ECD chimera in *E. coli*

The pET27-pelB-3C-$\alpha7$-ECD-2GSC vector was used to express the chimeric protein in *E. coli* cells. To enable protein expression, *E. coli* cells were transformed with the expression plasmid and grown in the presence of the antibiotic kanamycin. Bacteria were grown in LB medium, which contains the nutrients needed to support bacterial growth. The conditions under which bacteria are grown can be modified to optimise protein expression (PhD thesis of Ben Yarnall, data not shown). These factors were investigated with respect to the induced expression of the $\alpha7$ ECD chimera. In the first condition, the transformed *E. coli* culture was grown at 37 &deg;C until OD600nm≈1.0, at which point 0.5 mM IPTG was added and expression was allowed to proceed for 6 hours. This method allowed high levels of protein expression over a short period of time. 

In parallel, a culture was grown at 37 &deg;C until OD600nm≈0.5, the temperature was then lowered to 18 &deg;C and growth was allowed to proceed until OD600nm≈1 (approximately XX minutes). At this point, 0.2 mM IPTG was added and the culture was incubated overnight at 18 &deg;C.  
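
The two IPTG concentrations translate into stock volumes through the usual dilution relation $C_1 V_1 = C_2 V_2$. The sketch below works this through for a 1 L culture; the 1 M stock concentration is an assumption made for illustration, and the small increase in culture volume caused by adding the stock is neglected.

```python
def stock_volume_ml(final_mm, culture_ml, stock_mm=1000.0):
    """Volume of stock (mL) needed to reach final_mm (mM) in culture_ml (mL).

    Uses C1 * V1 = C2 * V2 and ignores the volume added by the stock itself.
    stock_mm defaults to 1000 mM (a 1 M stock), which is an assumption.
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_mm * culture_ml / stock_mm


CULTURE_ML = 1000.0  # 1 L culture

high_iptg_ml = stock_volume_ml(0.5, CULTURE_ML)  # induction at 37 degrees C
low_iptg_ml = stock_volume_ml(0.2, CULTURE_ML)   # induction at 18 degrees C

print(f"0.5 mM IPTG: add {high_iptg_ml:.2f} mL of stock")
print(f"0.2 mM IPTG: add {low_iptg_ml:.2f} mL of stock")
```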

Pre- and post-induction samples were collected for both growth and IPTG induction conditions (Section \@ref(samples)) and mixed with denaturing sample buffer. Proteins present in the bacterial cell extracts before and after IPTG induction were resolved by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) and visualised by Coomassie staining (Figure \@ref(fig:hua7-WB-label)). 

Addition of IPTG should lead to the expression of the $\alpha7$ construct. Indeed, a distinct band of 84 kDa can be seen on the gel. The size of this band corresponds to the predicted size of expressed coding sequence from the $\alpha7$ construct. 
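
A quick sanity check on a predicted band size is the rule of thumb that an average amino acid residue contributes roughly 110 Da. The sketch below applies this approximation in both directions. It is a back-of-envelope estimate only: the residue count of the construct is not stated in the text, and an accurate mass would be computed from the actual sequence.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # common approximation, not an exact value


def approx_mass_kda(n_residues):
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def approx_residue_count(mass_kda):
    """Approximate number of residues for a protein of the given mass in kDa."""
    return round(mass_kda * 1000.0 / AVERAGE_RESIDUE_MASS_DA)


band_kda = 84.0  # band size reported in the text
residues = approx_residue_count(band_kda)

print(f"A band of {band_kda:.0f} kDa corresponds to roughly {residues} residues")
print(f"Back-calculated mass: {approx_mass_kda(residues):.1f} kDa")
```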

To authenticate the band as the protein of interest, a Western blot of the samples run on SDS-PAGE was performed (Section \@ref(western)), using antibodies with high affinity for the N-terminal DecaHIS tag (Section \@ref(abs)). As seen in Figure \@ref(fig:hua7-WB-label), no protein was detected in the pre-induction sample, whereas addition of the IPTG inducer resulted in production of a HIS-tagged protein of the expected size of ~84 kDa. 

The Western blot was examined closely to establish which condition produced the highest amount of the recombinant protein. Samples run on the SDS-PAGE gel were volume-normalised, so the intensities of the bands on the Western blot can be compared. Expression of the $\alpha7$ chimera at 37 &deg;C with 0.5 mM IPTG was highest 2 hours after induction and decreased over time. Interestingly, multiple bands were detected in the sample collected after overnight expression at the lower temperature and lower IPTG concentration. This suggests that HIS-tagged proteins of several sizes are present in the sample, including truncated or proteolysed ones. Based on band intensities, the highest level of expression was achieved after overnight expression at low temperature with a low IPTG concentration.

(ref:hua7-WB) **Expression of the $\alpha7$ ECD chimera protein in *E. coli*.** Coomassie stained SDS-PAGE gels (a) and corresponding Western Blot (b) of proteins obtained from lysed whole cell samples. *E. coli* transformed with the $\alpha7$ ECD-containing plasmid were grown in 1 L of LB. Protein expression was induced with either 0.5 mM IPTG and proceeded at 37 &deg;C, or with 0.2 mM IPTG and proceeded at 18 &deg;C. Samples of the cell suspension were taken prior to induction (pre-induction) and at 2, 4 and 6 hours after induction with the high IPTG concentration. Alongside, a sample was taken after overnight (16 hour) expression driven by the low IPTG concentration (0.2 mM). Samples were prepared as described in Section \@ref(samples).

```{r hua7-WB-label, fig.cap="(ref:hua7-WB)", fig.scap= "Expression of the $\\alpha7$ ECD chimera protein in \\textit{E. coli}.",  fig.align='center', echo=FALSE}

Following overnight protein expression at 18 &deg;C induced with 0.2 mM IPTG, and 6-hour expression at 37 &deg;C induced with 0.5 mM IPTG, the protein was purified (Section \@ref(purification-general-methods)). Samples were collected at each stage of purification and run on an SDS-PAGE gel. Briefly, purification was a three-step process. The cells were pelleted and broken down by sonication to release their contents. Homogenised cells were then spun down (low-speed spin) to remove unbroken cells, nucleic acids, organelles and large insoluble cellular particles, such as inclusion bodies (the precipitate was collected as the Whole Cell sample). The supernatant from this low-speed spin was then spun at 100 000g to precipitate cellular organelles. $\alpha7$ ECD in the 100 000g soluble fraction was subsequently resolved using solid phase Ni^2+^-NTA IMAC purification (Section \@ref(his)). Briefly, the soluble fraction was incubated with Ni^2+^-NTA resin for 2 hours at 4 &deg;C to allow binding of the expressed HIS-tagged $\alpha7$ ECD chimera to the beads. The mixture was decanted into the chromatography column. Unbound proteins were collected in the flow-through, before the beads were washed three times. The wash fractions were pooled and run as "Wash" on the gel. The resin-bound protein was then eluted by washing the column with 5 mL of 0.2 mM imidazole, which displaces the HIS-tagged protein from the immobilised nickel by competition. The eluted proteins were collected in two eluate fractions (Eluate 1 and 2).

Representative samples from each of the fractions indicated above were resolved on the SDS-PAGE gel (Figure \@ref(fig:expression-conditions-test-label)) and the presence of recombinant protein was detected by Western Blot with anti-HIS-tag antibodies (Figure \@ref(fig:expression-conditions-test-label) b). Since most of the purification features are common to expression at 18 and 37 &deg;C, some general comments are made first. The total purified protein obtained under the two conditions is then compared. 

The Whole Cell sample is the precipitate collected after the low-speed spin of the sonicated cells. As expected, a large number of proteins of various sizes were present in this sample, as visible on the Coomassie stained SDS-PAGE gel. There is a high intensity band of 84 kDa (the size corresponding to the expression product of the $\alpha7$ ECD chimera) on both the Coomassie stained SDS-PAGE gel and the Western Blot. This suggests that, following harvest, either the cells were not broken up entirely and the recombinant protein was retained intracellularly, or the protein was present in inclusion bodies. To account for this, the sonication step was extended from 6 to 8 minutes in subsequent experiments. 

"Flow through" and "Wash" were samples collected during the first two steps of IMAC, and are expected to contain proteins with no or weak affinity for the Ni^2+^-NTA resin. Indeed, a large number of proteins of various sizes can be seen on the Coomassie stained gel, particularly in the "Flow through". Additionally, immunoreactive protein of the expected $\alpha7$ ECD chimera size is present in both the Flow through and Wash, suggesting that it failed to bind to the Ni^2+^-NTA resin with high affinity. This could indicate that an insufficient amount of resin was present, so that not all HIS-tagged protein was able to bind. Alternatively, the HIS-tag may have been buried in the tertiary and/or quaternary structure of the protein and thus not accessible for interaction. Therefore, the amount of resin used for purification of protein from 1 L of culture was increased from 0.5 mL to 1 mL in subsequent experiments. Additionally, the incubation of the soluble fraction with resin was extended from 2 hours to overnight.

Eluate samples are expected to contain proteins with high affinity for the Ni^2+^-NTA resin. However, no immunoreactive protein was detected in the eluate following expression at 37 &deg;C induced with 0.5 mM IPTG. In contrast, there is a band on the Western Blot in the eluate collected after expression at 18 &deg;C and 0.2 mM IPTG which, based on its size of 84 kDa, corresponds to the $\alpha7$ ECD chimera. Thus, based on both the expression and purification results, more protein is expressed and successfully purified following extended expression at 18 &deg;C.

(ref:expression-conditions-test) **The effects of the temperature and inducer concentration on the expression of $\alpha7$ nAChR chimera in *E. coli*.** SDS-PAGE gel (a) and corresponding Western Blot (b) of samples collected during the purification of $\alpha7$ nAChR chimera following side-by-side expression of the protein under two different conditions. Expression was carried out as described in Figure \@ref(fig:hua7-WB-label). Following expression, cells were harvested and homogenised. Protein purification proceeded as described in methods (ref). The expected size of monomeric $\alpha7$ nAChR chimera is 84 kDa.

```{r expression-conditions-test-label, fig.cap = "(ref:expression-conditions-test)", fig.scap = "The effects of the temperature and inducer concentration on the expression of $\\alpha7$ nAChR chimera in \\textit{E. coli}.", fig.align='center', out.height = '60%', echo=FALSE}

### Purification of the $\alpha7$ ECD chimera protein

The expression and purification process was repeated with the modified conditions, that is: (1) lower IPTG concentration and lower temperature during expression, (2) extended sonication time, (3) overnight equilibration of the soluble fraction with resin and (4) an increased amount of resin. 

To determine whether the modified conditions have an effect on the protein purification efficiency, samples were collected during the purification procedure. Following expression of the protein at 37 &deg;C induced with 0.2 mM IPTG, cells harvested from 1 L of culture were sonicated and centrifuged at 16000g. The centrifugation precipitate was run on the SDS-PAGE gel (Whole cells). The supernatant was spun down again at 100000g to collect a soluble fraction (Load), which was subsequently incubated with 1 mL of Ni^2+^-NTA resin (binding capacity of...mg) overnight. The mix was decanted onto the chromatography column. This was followed by 3 washes in 10 mL of buffer and elution with 1 mL of 0.2 mM imidazole-containing buffer, generating 5 distinct eluate fractions from the Ni^2+^-NTA IMAC. Collected samples were prepared and run on the SDS-PAGE gel (Figure \@ref(fig:hua7-expression-gel-label)). 

This showed that a band of the expected size of 84 kDa was present in the eluate samples, suggesting successful purification of the $\alpha7$-ECD chimera together with a few contaminants, of which the 50 kDa species was the most prominent, as judged by staining intensity. 

An intensely stained band corresponding to the size of the $\alpha7$-ECD chimera is also seen in the Whole cells and Load fractions, suggesting that a significant proportion of the induced protein was lost during the centrifugation steps. 

<!-- In comparison to the previous results, the intensity of $\alpha7$ ECD chimera protein bands is greater here, indicating a higher amount of protein present. This suggests that the introduced protocol changes do indeed improve the purification efficiency. However, since these changes were all introduced at once, it is difficult to state which ones were the most beneficial.  -->

(ref:hua7-expression-gel) **Coomassie stained SDS-PAGE gel of samples collected during purification of $\alpha7$ ECD chimera protein.** *E. coli* cells were grown in 1 L of TB. Protein expression was induced with 0.2 mM IPTG and proceeded overnight.

```{r hua7-expression-gel-label, fig.cap="(ref:hua7-expression-gel)", out.height = '30%', fig.scap = "Coomassie stained SDS-PAGE gel of samples collected during purification of $\\alpha7$ ECD chimera protein.", fig.align='center', echo=FALSE}

During preparation of samples for SDS-PAGE, samples are heated, resulting in protein denaturation and dissociation of multimeric complexes into individual subunits (ref). Therefore, based on the SDS-PAGE results obtained, it is impossible to state whether the expressed $\alpha7$ ECD chimera is monomeric or pentameric. It is crucial for the recombinant protein to form multimeric complexes, because the nAChR ligand binding sites are at the interface of two neighbouring subunits. A Blue native PAGE gel of non-denatured and non-reduced samples was therefore run to allow separation of proteins based on their mass and charge. The bands were visualised using Coomassie stain.

The eluates collected (Figure \@ref(fig:hua7-expression-gel-label)) were pooled. Two samples were prepared, one of which was boiled for ~ 5 minutes to denature multimeric proteins into their constituent subunits. The boiled sample should contain proteins only in the monomeric form, whereas the un-boiled sample should contain proteins in their native state. The boiled sample was cooled and run on the native gel together with the un-boiled one, with the aim of determining whether any high molecular weight bands are selectively present in the un-boiled sample (Figure \@ref(fig:native-gel-label)). 

The boiled sample contains a single band of ~ 55 kDa. There is a corresponding band in the un-boiled sample. Clear and strong staining is also evident at the top of the lane containing the un-boiled sample. This could represent a multimeric form of the $\alpha7$ ECD chimera. The size of a pentamer is 420 kDa (five 84 kDa subunits), therefore it is possible that the electrophoresis was not run for long enough to allow the protein to enter the gel. Alternatively, the staining could represent protein aggregates.

<!-- Protein samples run on SDS-PAGE gel undergo denaturation. To determine if $\alpha7$ ECD chimera is purified as a monomer or oligomer, purified samples were run on a native gel.  -->

<!-- Samples were boiled and unboiled. Boiling on the protein leads to denaturation, hence this sample should complain monomeric proteins exclusively. In contrast, unboiled samples could contain monomeric and oligomeric proteins. Alongside, bovine serum albumin (BSA) was run as a size marker. BSA is a monomeric protein of 66.5 kDa- similar to the size of a monomeric $\alpha7$ ECD chimera. It also forms dimers on native gel of 127 kDa [].  -->

<!-- Comparing boiled and unboiled samples on the native gel, there is no difference in the distance travelled by bands present. Both samples produced bands. These bands are between the monomeric and dimeric BSA. This means, their size is larger than 66.5 kDa, but smaller than 127 kDa. These bands probably represent a monomeric version of the protein of 76 kDa.  -->
<!-- Interestingly, unboiled sample contains a staining at the well entry. This may suggest there are protein aggregates, or alternatively a protein of large size which has not entered the gel. The latter is unlikely, as the boiled sample does not contain such staining. Therefore it is likely that expressed protein forms aggregates.  -->

(ref:native-gel) **Coomassie stained Native Blue PAGE gel of $\alpha7$ ECD chimera eluates.** Boiled and un-boiled samples of eluted $\alpha7$ ECD chimera protein (Figure \@ref(fig:hua7-expression-gel-label)) were run on a native non-denaturing gel alongside molecular weight markers (Mw) and an un-boiled Bovine Serum Albumin (BSA) sample of 66.5 kDa.

```{r native-gel-label, fig.cap="(ref:native-gel)", fig.scap='Coomassie stained Native Blue PAGE of denatured and native eluate samples collected following the $\\alpha7$ ECD chimera purification', fig.align='center',out.width='30%',echo=FALSE}

Size-exclusion chromatography, also known as gel filtration, is a complementary method for assessing the size of a protein. The advantage of this method over PAGE is that the size estimation is much more accurate and the separation range is much greater (in this case, 10 - 600 kDa). This procedure uses a column filled with a matrix containing pores of defined sizes. Loaded proteins travel through the column at a rate that depends on their molecular weight. There is an inverse relationship between molecular weight and elution volume: smaller molecules enter the pores, travel more slowly and are eluted later from the column than larger ones. Proteins are detected by absorbance spectroscopy, because their aromatic residues (chiefly tryptophan and tyrosine) absorb at 280 nm. The result is a chromatogram of absorbance against eluted volume (Figure \@ref(fig:standard-curve-gel-filtration-label)). 

To estimate the size of proteins present in a sample, a standard curve was generated (Section \@ref(calibration)). Homogeneous solutions of proteins of known sizes were run to derive their elution profiles. The proteins used were trypsin (23.3 kDa), chicken serum albumin (47.5 kDa) and bovine serum albumin (BSA; 66.5 kDa), together with blue dextran, which is too large to enter the column pores and so marks the void volume. The peak position as a function of volume eluted was derived and normalised to the peak position of the void (i.e. material which does not enter the column pores, but passes straight through). The normalised peak positions for blue dextran, BSA, chicken serum albumin and trypsin were 0, 5.25, 6.90 and 7.75, respectively (Figure \@ref(fig:protein-standard-label)). These values were plotted on a graph of log Mw against normalised peak position to produce a straight line with the equation $y = -0.17x + 2.78$, where $y$ is the log molecular weight of the protein and $x$ is the normalised peak position. This equation was then used to calculate the size of the proteins present in the $\alpha7$ ECD chimera sample. 
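
A minimal sketch of how such a calibration line could be obtained in R, assuming an ordinary least-squares fit of $\log_{10}$(Mw) on normalised peak position for the three protein standards quoted above. The fitting procedure actually used is not stated here, and the object and function names (`standards`, `estimate_mw_kda`) are illustrative only. Under this assumption the fitted slope is close to the reported $-0.17$, while the intercept may differ slightly from 2.78.

```r
# Hypothetical data frame built from the standards quoted in the text.
# Blue dextran (normalised position 0) marks the void and is left out of the fit.
standards <- data.frame(
  protein  = c("BSA", "chicken serum albumin", "trypsin"),
  mw_kda   = c(66.5, 47.5, 23.3),
  norm_pos = c(5.25, 6.90, 7.75)  # peak position minus void peak position
)

# Ordinary least-squares fit of log10(Mw) against normalised peak position
fit <- lm(log10(mw_kda) ~ norm_pos, data = standards)
slope     <- unname(coef(fit)["norm_pos"])
intercept <- unname(coef(fit)["(Intercept)"])

# Convert a normalised peak position into an estimated molecular weight (kDa)
estimate_mw_kda <- function(norm_pos, slope, intercept) {
  10^(slope * norm_pos + intercept)
}

# Example query; 6.0 is an arbitrary normalised position chosen for illustration
estimate_mw_kda(6.0, slope, intercept)
```

With only three standards the line is loosely constrained, so sizes estimated from it are best read as approximate.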

<!-- {r figure-1, fig.cap="(ref:native-gel)", fig.scap='Commasie stained NAtive Blue PAGE of denatured and native elaute samples collected following the $\alpha7$ ECD chimera purification', fig.align='center',out.width='\\textwidth',echo=FALSE} -->

(ref:standard-curve-gel-filtration) **Calibration curve for molecular weight determination by gel filtration.** 1 mL of standard proteins were applied to the column. Blue dextran was used to determine the void volume.

```{r standard-curve-gel-filtration-label, fig.cap="(ref:standard-curve-gel-filtration)", out.width= '80%', fig.scap= "Calibration curve for molecular weight determination by gel filtration." , fig.align='center', echo=FALSE}

To prepare samples, protein was expressed in 1 L of growth medium and purified using the optimised protocol. Eluate samples collected following Ni^2+^-NTA IMAC were pooled and concentrated to 500 $\mu$L. The final concentration of the sample was 3 mg / mL, as measured by spectroscopy. This sample was run on the SDS-PAGE gel (Figure \@ref(fig:gel-filtration-eluate-label) a). 

(ref:filtration-gel) **Expression and purification of $\alpha7$ ECD chimera for size-exclusion chromatography**. SDS-PAGE gel of samples collected during protein expression and purification.

```{r filtration-gel-label, fig.cap="(ref:filtration-gel)", fig.scap="Expression and purification of $\\alpha7$ ECD chimera for size-exclusion chromatography", fig.align='center', echo=FALSE}

A protein of the desired size of 84 kDa was present, as well as 50 and 25 kDa bands. The sample was run through the gel filtration column and the elution profile was derived (Figure \@ref(fig:gel-filtration-eluate-label) b). The highest peak was eluted at 16.20 mL. Normalised to the void, this is 6.72 mL, which equates to 42.6 kDa. There are also two small peaks, one eluted at 22 mL and the other at 26.70 mL, corresponding to proteins below 10 kDa. A small peak can also be seen at ~ 10 mL, which overlaps with the void peak and may represent aggregated proteins. 
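
As a rough check of this estimate against the calibration line $y = -0.17x + 2.78$ quoted earlier (a worked calculation using the rounded coefficients, so only approximate):

$$\log_{10} M_w = -0.17 \times 6.72 + 2.78 \approx 1.64, \qquad M_w \approx 10^{1.64} \approx 43\ \text{kDa}$$

The small difference from the 42.6 kDa quoted is presumably due to rounding of the fitted coefficients. The same normalisation implies a void peak at $16.20 - 6.72 = 9.48$ mL.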

(ref:gel-filtration-eluate) **Estimation of size of proteins present following $\alpha7$ ECD chimera expression and purification.** SDS-PAGE gel (a) and gel filtration elution profile (b) of concentrated $\alpha7$ ECD chimera eluate.

```{r gel-filtration-eluate-label, fig.cap="(ref:gel-filtration-eluate)", fig.scap = "Estimation of size of proteins present following $\\alpha7$ ECD chimera expression and purification", fig.align='center', echo=FALSE}

<!-- ```{r expression-table, echo=FALSE} -->
<!-- library(kableExtra) -->
<!-- library(dplyr) -->
<!-- phusion_components <- data.frame( -->
<!--   Construct = c("Hua7-2GUV", "Hua7-1VR4", "Ap a5-2GSC", "Ap a5-2GUV", "Ap a5-1VR4", "ACR-21-2GSC", "ACR-21-2GUV", "ACR-21 1VR4"),  -->
<!-- Expression = c("YES", "YES", "YES", "YES", "YES", "YES", "YES", "YES"),  -->
<!-- IMAC yield = c("", "", "", "", "", "", "", ""), -->
<!-- Column yield = ("", "", "", "", "", "", "", "")) -->

<!-- phusion_components %>% -->
<!-- mutate_all(linebreak) %>%  -->
<!-- kable("latex", booktabs = T, escape = F,  -->
<!-- caption = "Expression and purification of ECD of nAChRs in *E. coli*".) %>% -->
<!--   kable_styling(latex_options = "hold_position") -->
<!-- ``` -->


## Discussion

This chapter aims to determine whether *E. coli* BL21(DE3) cells are appropriate for the expression of the ECD of nAChRs. This was done with a view to characterising candidate neonicotinoid binding sites. Difficulties in the expression and purification of recombinant proteins hinder their structural analysis and hence the identification of molecular interactions between the target and the ligand.

There are several host systems available for the production of nAChRs and related proteins. For example, yeast cells have been successfully used to express assembled and folded LBDs of mammalian $\alpha2$ [@kouvatsos2016], $\alpha1$ [@dellisanti2007] and $\alpha9$ [@zouridakis2014] nAChRs, as well as the mollusc AChBP, a structural surrogate of the nAChR LBD [@hilf2008; @hilf2009]. Functional mammalian $\alpha4\beta2$ receptors were successfully expressed in, and subsequently purified from, both mammalian [@morales-perez2016] and insect cell lines [@kouvatsos2014]. *E. coli* is an attractive alternative due to its relatively low cost and ease of manipulation. The successful expression of folded AChBPs [@abraham2016] and full length human $\alpha7$ [@tillman2016] has been achieved in *E. coli* cells.

The suitability of *E. coli* as an expression system for nAChR ECD was tested by expressing and purifying human $\alpha7$ ECD - chimera protein.

### Expression and purification of $\alpha7$ ECD chimera yields product of the correct size

The initial experiments were carried out to determine whether, and under what conditions, expression of the $\alpha7$ ECD - chimera can be achieved. Two conditions were tested: rapid expression at 37 &deg;C and 0.5 mM IPTG, and slower expression at 18 &deg;C and 0.2 mM IPTG. Cells expressing the $\alpha7$ LBD - chimera were collected from both conditions. Pre- and post-induction samples were run on the SDS-PAGE gel and Coomassie stained to resolve and visualise proteins (Figure \@ref(fig:hua7-WB-label)). The $\alpha7$ LBD - chimera was authenticated by Western Blot using anti-HIS antibodies. A clear band of 84 kDa was present in the induced samples, confirming successful expression. The greater intensity of the band from 18 &deg;C and 0.2 mM IPTG suggests that this is a favourable condition for the expression of the $\alpha7$ LBD - chimera. Lower temperature and IPTG concentration were also beneficial for the expression of another $\alpha7$ LBD construct [@abraham2016].
In addition to the band representing the $\alpha7$ LBD - chimera, there was also induction of a 50 kDa protein which, based on its immunoreactivity, is likely a proteolytic fragment carrying the HIS-tag. 

<!-- It was the thickness at 2 hrs and smallest at 6 hours. This suggests the protein is either degraded, or the expression of $\alpha7$ LBD - chimera induced with high IPTG concentration was toxic to cells. The toxicity could be detected by measuring the optical density of the culture. The decreasing values would suggests $\alpha7$ LBD - chimera is toxic. Toxicity of cell due to the expression of eukaryotic proteins is a common problem during the protein expression [dumon-seignovert2004]. This may be a result of aggregation of large amount of mRNA in the cell which saturates the translation and translocation system [@wang2011a]. Sample from the overnight expression induced by 0.2 mM was collected 16 hrs after the expression was induced. A band of the size corresponding to the $\alpha7$ ECD - chimera was detected. In comparison to the 2 hrs fast expression band, it was thicker, suggesting induction of expression with 0.2 mM IPTG and expression at lower temperature give rise to higher expression yields. Temperature and IPTG concentration are important determinants of the soluble protein expression [@san-miguel2013]. Low temperature and IPTG concentrations were also beneficial in expression of another $\alpha7$ ECD chimera [@abraham2016].  -->

Western Blot with anti-HIS antibodies detected HIS-tagged protein of a size consistent with the expressed $\alpha7$ ECD - chimera. However, purification results showed that the proportion of induced protein available for binding was low. Expressed protein was lost during the purification procedure: some was precipitated following centrifugation of the sonicated cells (Figure \@ref(fig:expression-conditions-test-label), Whole Cell sample), suggesting the formation of inclusion bodies. In addition, some expressed protein failed to bind to the nickel resin, potentially due to misfolding or aggregation. The expression of the remaining constructs was even more challenging, with a smaller proportion of the ECD - chimera purified.

### Analysis of the quaternary structure

To determine whether the protein was purified as a pentamer, a Native-PAGE gel was run, which enables separation of folded and assembled proteins. Two samples were prepared: one containing proteins denatured by boiling and the other containing non-denatured, un-boiled proteins. Clear staining of high molecular weight protein was observed in the un-boiled sample, but not in the boiled sample, suggesting purification of multimeric proteins. However, the presence of high molecular weight protein was not confirmed by gel filtration. Therefore, further experiments are needed to investigate whether the expressed $\alpha7$ ECD chimera forms pentameric structures. These could include binding of radiolabelled ligands, such as $\alpha$-bungarotoxin [@barnard1971; @carbonetto1979; @clarke1985].

<!-- ### Expressed protein is prone to the proteolytic cleavage.  -->

<!-- Additional information can be extracted about the stability of expressed proteins by the analysis of sizes of protein present in the eluates following $\alpha7$ ECD - chimera expression and purification. SDS-PAGE and Western Blot revealed there is a number of HIS-tagged proteins of smaller than $\alpha7$ ECD chimera purified. Since HIS-tag is on the N-terminus, these bands represent the N-terminal portions of the protein. The major band detected is of the size of ~ 50 kDa, which corresponds to the size of the HIS-MBP. This may suggest that there is a proteolytic site in the MBP-$\alpha7$ ECD linker sequence, or at the N-teminal site of the $\alpha7$ ECD. However, due to the poor availability of substrate sequences for *E. coli* proteases, this is difficult to predict. An alternatively explanation for the C-terminal proteolytic degradation is that $\alpha7$ ECD-2GSC does not fold correctly and is unstable. 2GSC is a bacterial 4-helical structure forming pentamers [@lin2006]. The crystal structure was derived from proteins expressed and purified in *E. coli* cytoplasm, whereas in this study it was targeted to the periplasm [@lin2006]. Targeting to the periplasmic space may have a detrimental effect on the stability and solubility of the protein (@latifi2015), leading to protein degradation. It would be therefore beneficial to further optimise the purification protocol to increase stability of the protein, for example by growth of transformed *E. coli* cells with $\alpha7$ ligands to stabilise $\alpha7$ ECD. An addition of heat shock inducing compounds to promote formation of chaperones could be also beneficial [@tillman2016].  -->

In summary, this chapter validates the use of *E. coli* as a system for the expression of $\alpha7$ ECD and highlights the need for further optimisation to improve stability and purification efficiency of the recombinant protein. 

<!-- Analysis of the proteins by the SDS-PAGE requires denaturation of proteins prior to loading samples on the gel. Therefore all proteins are in the monomeric state. To establish whether eluates collected post expression contain proteins of various sizes, native-SDS-PAGE was run. This method allows for the analysis of eluate samples without the necessity of denaturing. 2 samples were prepared: one boiled (hence containing denatured, monomeric proteins), and the un-boiled (hence containing non-denatured, potentually multimeric proteins). The shift in the distance migrated by proteins on the gel were examined. No differences were observed, suggesting all protein is monomeric state. -->

<!-- In conclusion, this chapter validates the  -->
<!-- ###Finally, some changes to the protocol to be made and future directions  -->

<!-- 1. Resticted flexibility due to tagging to other proteins. - MBP was used as a tag during AChBP exprression. It waspurified as a pentmer, therefore shpuld not interfere with pentamer formation. 2GSC is on the C-terminal end of the protein. C-terminal of $\alpha7$ ECD is linked to the N-terminal end of the 2GSC by a 8 amino acid linker. 2GSC has similar architecture to the transmembrane domain of the nAChRs, however there are also some differences. For example, Therefore tagging $\alpha7$ to 2GSC may have restricted the flexibility to $\alpha7$ ECD resulting in insufficient interactions between the subunits and inability of these to form pentamers.  -->

<!-- 2. Misfolded proteins will not form pentamers - Why would they be missfolded? May imprtant sequence for oligomerisation in the ECD [@sumikawa1992; @sumikawa1994; @kreienkamp1995]. However, there may also be some sequences within the TM domain [@wang1996] of $\alpha7$ nAChRs important for oligomerisation. These are missing in this constructr. whilst the expression of ECD of some nachrs was successful [@dellisanti2007; @zouridakis2014], the expression of full lenght $\alpha7$ has only been reported [@tillman2016]. Expression of alpha and delta subunits in alpha-delta heterodimers. Expression of ECD of alpha 1 with other muscle subunits folds but does not associate with delta subunit [@wang1996a]. Inclusion of M1 seq only was enough to rescue oligomerisation of alpha1 with delta.  -->
<!-- Folding of nAChRs is a complex process, with as many as 55 potential interacting partners. Maturation with many interacting partners through ER-Gogi process. In *E. coli* it is released to the periplasm as it is being synthesised. Chaperons identified: Skp captures unfolded proteins and aids their correct maturation, also other proteins including PPIases, SurA, FkpA, among others [reviewed in @baneyx2004].  -->

<!-- Another way of assesing the size of proteins present in the sample is by gel-filtration. Size of the protein is In this method, protein sample is loaded onto the porous-gel filled column. The time taken for the protein to travel across the column is inversly correlated to the molecular weight.  which allows for estimation of the molecular weight of proteins present in the sample. The major protein species of the size of 50 kDa. proteolytic cleavage ?  -->

<!-- Detection of proteins of sizes exceeding pentameric form of ECD of alpha7 fused to MBP also found here [@fischer2001].  -->

<!-- Note that periplasmic signal sequences destabilisies proteins [@singh2013].  -->

<!-- Better to use native signal sequences such as MA [@samant2014]. -->

<!-- Further experiemts are needed to determine whether this construct is suitable for generation of folded pentameric $\alpha7$ ECD. One would be binding to bgtx.  -->

<!-- Strategies to stabilise the protein - expression of chaperon proteins and ligands - for example cholie and sorbitol used succesfully to express full lenght nAChR [@] -->
<!-- ###Expression and purification of  $\alpha7$ ECD chimera yields product of the correct monomeric size -->

<!-- 1. Modify the protocol - purification at 4 $deg;C thruoghpout to slow down the rate of proteolysis - PROBLEM OF DEGRADATION - test this system first! -->
<!-- 2. Use several costructure containing many different nachrs subuntis and screen for the one that can be expressed in *E. coli*. Expression in *E. coli* [@jia2016].  -->
<!-- 1. Try different construct - try expressing HPSD, MBP, $\alpha7$ in isolation to determine what effect they have on *E. coli*, whether high quantity of folded proteins can be generated. Then combine and see whether tagging to each has an effect. -->
<!-- 2. Use defined protocols for the expression of nAChRs - some of which include *E. coli*. However, these had problems too  -->
<!-- Alternative may be to Try different host - most structure come from proteins expressed in yeast because they are capable of post-transcriptional modification, such as glycosylation imp in processing of nAChRs.  -->

<!--  did similar things - fused rat $\alpha7$ ECD to three stuctures assembling into pentamers. No stable proteins and all deposited into inclusion bodies. -->
<!-- @utkin2001 @fischer2001 ECD of $\alpha7$ some pentamers,but mostly aggregated proteins.  -->
<!-- Column exceeds  -->
<!-- Also aggregates - This is a common problem when eucaryotic proteins expressed in *E. coli*. These two factors often need to be optimised, because recombinant proteins tend to misfold and form aggregates in the inclusion bodies or in the cytoplasm and periplasm of *E. coli*. MBP was used as a tag for expression improvements for many proteins in *E. coli* []. 2GSC were expressed to high levels in *E. coli* cells for structure analysis [@lin2006], therefore $\alpha7$ ECD may be the issue here.  -->

<!-- ###Protein is prone to degradation -->

<!-- ###*E. coli* may not be suitable for expression of nAChR - why?  -->
<!-- Folding -->
<!-- Processing -->
<!-- Other systems should be tested  -->
<!-- Alternatively different construct  -->

<!-- More soluble version of nAChR is also available. @zouridakis2009 introduced several changes to the  -->
<!-- Also modification of the genetic code to mutate AAs may improve solubility. -->

<!-- MBP fusion $\alpha7$ ECD [@fischer2001] and ELIC [@hilf2008] and glutathione S‐transferase (GST) $\alpha7$ ECD [@utkin2001]. -->

<!-- 2. Expression of AChBP in cytoplasm of BL21(DE3) cells [@abraham2016] -->

<!-- 3. ELIC expressed in C43 E. coli cells. ELIC fused to MBP on the N-terminal end [@nys2016] -->

<!-- 4. ELIC: His-Tag, MBP, 3C protease site and the ELIC sequence [@hilf2008]. -->

<!-- 1. Expression of the full length alpha7 nachr receptor from T7 promoter plasmid. In RosettaBL21(pLys) cells. With pelB sequence [@tillman] -->

<!-- Alternatively, the linker may not provide sufficient solubility and flexibility for the $\alpha7$ ECD to fold properly, resulting in protein degradation. $\alpha7$ ECD and MBP are linked by an 8 amino acid long sequence: Gly-Glu-Val-Glu-Gln-Pro-Leu-Glu. This sequence was used for the production of ELIC [@hilf2008], and contains features shared among many other synthetic linkers used in molecular biology and native linkers naturally occurring in biological systems [@chen2013]. Linkers connecting recombinant proteins can play a major role in the efficiency of the expression (reviewed in @chen2013), therefore they are commonly optimised for different fusion proteins. This includes the optimisation of the amino acid number and sequence to alter linker's primary and the secondary structure. -->