The impact of Hfq-mediated sRNA-mRNA interactome on the virulence of enteropathogenic Escherichia coli

Description

condition, the non-activating cultures were diluted 1:100 into Dulbecco's modified Eagle medium (DMEM, Biological Industries) and grown at 37°C without shaking to an OD600 of 0.3. Epithelial cells (HeLa or HEK293) were grown in DMEM supplemented with 10% fetal calf serum (FCS; Biological Industries) and antibiotics (penicillin-streptomycin solution; Biological Industries). A cell line stably expressing GFP was generated using a GFP-expressing lentivirus as described (69).

Supplementary
NE8878
E2348/69 containing cesT-flagx3 translational fusion; constructed by lambda red knockout of tir-cesT intergenic region (IGR) and cesT CDS using tetA-sacB cassette, followed by a pop-out of the latter using a tir-cesT IGR and cesT CDS:flagx3 insert created by PCR and ligation. Source: This study.

NE8810
E2348/69 cesT*; constructed by lambda red knockout of tir-cesT IGR and cesT CDS using tetA-sacB cassette, followed by a pop-out of the latter using a mutated tir-cesT IGR and cesT CDS insert created by PCR and ligation.

Northern blots
For RNA extraction, cultures in the indicated conditions were centrifuged at 4°C and the pelleted cells were resuspended in 50µl 10 mM Tris-HCl (pH 7.5) containing 1 mM EDTA.
Lysozyme was added to 0.5 mg/ml, and samples were subjected to three freeze-thaw cycles.
RNA was extracted using TRI Reagent (Sigma) according to the manufacturer's instructions. RNA samples (10 µg) were denatured for 10 min at 70°C in 65% formamide, separated on 7 M urea, 6% polyacrylamide gels in 44.5 mM Tris-base, 44.5 mM boric acid and 2 mM EDTA pH 8.0, and transferred to Zeta-Probe membrane (Bio-Rad) by electroblotting. The membranes were hybridized with specific [32P] end-labelled DNA probes. The probe sequences are listed in Supplementary Table S6.

RNA extraction and real time PCR
Total RNA was extracted using TRI Reagent (Sigma) according to the manufacturer's instructions. RNA (1.5 µg) was treated with RQ1 DNase I (Promega) at a concentration of 1 U/µg RNA for 30 min at 37°C. DNase I was inactivated by adding 1 µl of stop solution and heating the samples for 15 min at 65°C. DNA digestion was verified by PCR, using primers #1952 and #1953. cDNA was synthesized using a qPCRBIO high-quality cDNA synthesis kit (PCR Biosystems) and quantified by real-time PCR using iTaq Universal SYBR Green Supermix (Bio-Rad) with a CFX96 Real-Time System (Bio-Rad) according to the manufacturer's instructions. The level of 16S rRNA (rrsB) was used to normalize the expression data for mgrR and cesT. The relative amount of cDNA was calculated by the standard curve method (ΔΔCq); the standard curve was obtained by PCR of serially diluted genomic DNA, and the data were analyzed using Bio-Rad CFX Maestro software.
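The standard-curve normalization described above can be sketched as follows. This is a minimal illustration, not the Bio-Rad CFX Maestro implementation: the function names, the use of log10 of the dilution as the x-axis and the example Cq values are all assumptions made for the sake of the example.

```python
def fit_standard_curve(log10_quantities, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept.

    The inputs would come from a serial dilution of genomic DNA
    (hypothetical values are used in the usage note below).
    """
    n = len(log10_quantities)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cq(cq, slope, intercept):
    """Interpolate the relative template quantity for a measured Cq."""
    return 10 ** ((cq - intercept) / slope)


def normalized_expression(target_cq, reference_cq, target_curve, reference_curve):
    """Target quantity (e.g. mgrR or cesT) divided by reference quantity (16S rRNA)."""
    target = quantity_from_cq(target_cq, *target_curve)
    reference = quantity_from_cq(reference_cq, *reference_curve)
    return target / reference
```

For instance, a hypothetical dilution series with log10 quantities 0, 1, 2, 3 and Cq values 30, 27, 24, 21 gives a slope of -3 and an intercept of 30, so a sample with Cq 24 is assigned a relative quantity of 100.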

Western blots
The OD600 of the bacterial cultures was measured, and similar amounts of cells were pelleted, resuspended in 1X Laemmli sample buffer (Bio-Rad) and boiled for ten minutes. The extracts were analyzed by western blotting, using the primary antibodies listed in Table S12.
The loading amounts were further normalized by the total protein amount as recorded from stain-free gels imaging (Bio-Rad) or by Coomassie or Ponceau red staining.

Determination of GFP fluorescence intensity
Bacteria were grown in DMEM as indicated above. The cultures were then washed and suspended in phosphate buffered saline (PBS). GFP fluorescence intensity was then measured (485-nm excitation and 510-nm emission) using a Spark 10M microplate reader (Tecan) and normalized to the optical density (OD600). Each experiment was performed in triplicate, and the mean of the normalized fluorescence intensities was calculated.
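The normalization above amounts to dividing each fluorescence reading by its OD600 and averaging the replicates. A minimal sketch follows; the function names and the example readings are hypothetical, and blank subtraction is not modelled because the text does not describe it.

```python
from statistics import mean


def normalized_fluorescence(fluorescence, od600):
    """GFP fluorescence divided by the optical density of the same well."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return fluorescence / od600


def mean_normalized(replicates):
    """Mean normalized fluorescence over (fluorescence, OD600) replicate pairs."""
    return mean([normalized_fluorescence(f, od) for f, od in replicates])
```

With three hypothetical replicates (300, 0.3), (600, 0.6) and (450, 0.3), the normalized values are 1000, 1000 and 1500, and `mean_normalized` returns their average.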

Infections and microscopy analysis
HeLa cells were seeded in 24-well plates (Nunc).

Bacterial Attachment assay:
HEK-293T cells stably expressing GFP were infected with bacteria. At 3 h post infection, the infected cells were washed to remove unattached bacteria and detached host cells, and the levels of the remaining mCherry (bacteria) and GFP (cells) were measured. The mCherry/GFP ratio was used as a readout for bacterial attachment.

Genome annotation
RIL-seq exploits genome annotation of genomic features at a nucleotide resolution (16, 25).
For EPEC annotation, genes were initially taken from the gene.dat files of the EPEC E2348/69 genome version 19, covering its chromosome (NC_011601.1) and the three plasmids it contains: NC_011602.1, NC_011603.1 and EU580135 (taken from NCBI). To these we have added the sRNAs we identified to be homologous to the ones known in E. coli K-12 MG1655 (28, 29).
For these added sRNAs, the gene name was adapted from the K-12 MG1655 orthologue (Supplementary Table S4). Also added are the newly identified Pas sRNAs. (The Pas sRNAs were discovered in a run of RIL-seq in which they were not included specifically, but as intergenic or antisense features.) The first and last genes were identified (74), and 5'/3'UTR regions were assigned as follows: if there was a homologous UTR from K-12 MG1655, its annotation was used (named 5UTR or 3UTR, respectively); otherwise, 100 nucleotides upstream of the AUG (or fewer if there was a gene end in this range) were designated as the 5'UTR (named EST5UTR, for estimated 5'UTR). Similarly, 100 nucleotides downstream of the stop codon were designated as the 3'UTR (named EST3UTR). The GFF file that includes all genes used for RIL-seq data analysis can be found in Supplementary Table S8.
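The fallback rule for estimated 5'UTRs can be illustrated with a short sketch. It is not the code used to build the GFF file: the function name, the 1-based inclusive coordinates and the handling of the minus strand by mirroring are assumptions, and genome circularity is ignored.

```python
EST_UTR_LENGTH = 100  # nucleotides, as stated in the text


def estimated_5utr(start, strand, upstream_gene_end=None, length=EST_UTR_LENGTH):
    """Return (utr_start, utr_end) for an estimated 5'UTR, or None if no room.

    start: first nucleotide of the start codon (for '-' this is the gene's
        highest coordinate).
    upstream_gene_end: nearest boundary of a neighbouring gene on the
        upstream side, if one lies close by (hypothetical input).
    """
    if strand == "+":
        utr_end = start - 1
        utr_start = max(1, start - length)
        if upstream_gene_end is not None:
            # Shorten the UTR so it does not run into the neighbouring gene.
            utr_start = max(utr_start, upstream_gene_end + 1)
    elif strand == "-":
        utr_start = start + 1
        utr_end = start + length
        if upstream_gene_end is not None:
            utr_end = min(utr_end, upstream_gene_end - 1)
    else:
        raise ValueError("strand must be '+' or '-'")
    if utr_start > utr_end:
        return None
    return utr_start, utr_end
```

For a plus-strand gene starting at position 1000 with no close neighbour, the estimated 5'UTR spans 900 to 999; if a neighbouring gene ends at 950, it is shortened to 951 to 999.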
For the RNA-seq analysis, we were interested in the differential expression of the individual genes, and therefore only considered reads that were mapped to genes. The annotations were extracted accordingly from the above-described GFF file (Supplementary Table S8) to generate the GFF file that was used for the RNA-seq analysis. It is of note that in the RNA-seq data analysis, reads with ambiguous mapping, overlapping two adjacent annotations, were excluded.
The GFF file was constructed in-house for the EPEC E2348/69 genome based on BioCyc version 19.0 (74), covering the chromosome (NC_011601.1) and two of the plasmids it contains (NC_011602.1 and NC_011603.1). Genes encoded on the smaller plasmid EU580135 were inserted manually. New sRNAs were also inserted as explained above.
Each line in the file contains the following information:

Generation of network images
The network images were generated using Cytoscape (75). The networks include the interactions listed in the summary sheet of Supplementary Table S3. We included in the networks only interactions (edges) that passed the statistical filter in at least two of the triplicate libraries of the indicated condition (activating/non-activating condition). Self-edges were removed.
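The edge filter described above can be sketched as follows. The input layout (a tuple holding the two RNAs and the number of libraries in which the interaction passed the statistical filter) and the RNA names are assumptions for illustration; they do not reflect the actual format of Supplementary Table S3.

```python
def filter_edges(interactions, min_libraries=2):
    """Keep edges supported by at least `min_libraries` libraries; drop self-edges.

    interactions: iterable of (rna1, rna2, n_significant_libraries) tuples.
    Returns a list of (rna1, rna2) pairs.
    """
    kept = []
    for rna1, rna2, n_libraries in interactions:
        if rna1 == rna2:
            continue  # self-edges are removed
        if n_libraries >= min_libraries:
            kept.append((rna1, rna2))
    return kept
```

Given the hypothetical records `("sRNA_A", "mRNA_1", 3)`, `("sRNA_A", "mRNA_2", 1)` and `("sRNA_A", "sRNA_A", 3)`, only the first pair is kept.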
To obtain only the virulence-associated interactions, we selected all nodes representing RNAs mapped to accessory genome regions. We define accessory genome regions as those residing on plasmids, prophages and integrative elements listed in Table S2 of Iguchi et al. (33). Then, the fragments found to be interacting with these RNAs according to RIL-seq were added to the selection. All interactions identified for this set of nodes were included in the virulence-associated networks. Note that this selection means that an interaction of two "core" nodes may be included, as long as each of them also interacts with an accessory genome node.
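The selection procedure above can be sketched as a two-step operation on an edge list. The function name and the placeholder node names are hypothetical; in practice the selection was presumably made within Cytoscape, which this sketch does not reproduce.

```python
def virulence_subnetwork(edges, accessory_nodes):
    """Return the edges among accessory nodes and their direct neighbours.

    edges: list of (node_a, node_b) pairs.
    accessory_nodes: nodes mapped to accessory genome regions.
    """
    selected = set(accessory_nodes)
    accessory = set(accessory_nodes)
    # Step 1: add every node that interacts with an accessory node.
    for a, b in edges:
        if a in accessory or b in accessory:
            selected.update((a, b))
    # Step 2: keep all interactions whose two endpoints are both selected.
    return [(a, b) for a, b in edges if a in selected and b in selected]
```

With edges acc1-core1, core1-core2, core2-acc2 and core2-core3, and accessory nodes acc1 and acc2, the core1-core2 edge is retained because both core nodes also interact with an accessory node, whereas core2-core3 is dropped.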

Supplementary Figures
Fig. S1: EPEC hfq deletion mutant exhibits unstable BFP expression. (A) The flow of construction and analysis of the hfq mutants. Upper panel: Two independent hfq deletion mutants, designated C1 and C2, were constructed and deposited to frozen stocks (indicated by red-cap tubes). Bacteria from the frozen stocks were streaked on a plate and a colony of each mutant was transformed with a plasmid expressing hfq (pZS*hfq) to create strains C1+ and C2+ (upward arrow). Colonies derived from the C1 and C2 mutants, designated C1a-d and C2a-c, were further cultured in LB (downward arrow). All derivatives of C1 and C2 were deposited to frozen stocks. Lower panel: For analysis, bacteria from the frozen stocks were streaked and colonies were inoculated into LB, followed by sub-culturing in DMEM to obtain cultures grown under non-activating and activating conditions, respectively.
(B-C) Analysis of hfq mutants and derivatives by western blots. (B) The levels of two representative proteins in the bacteria were tested: Tir, the major T3SS effector, and BfpA, the pilin subunit of BFP. Tir and BfpA levels were compared between wild type, hfq mutants (C1, C2) and the respective complemented mutants (C1+, C2+). Total protein extracts from cultures grown under non-activating (NA) and activating (AC) conditions were subjected to western blot analysis with anti-Tir and anti-BfpA antibodies. As a loading control (Control) we used total protein separated on SDS-PAGE (Bio-Rad stain-free gel). Molecular size markers are shown on the right-hand side. The results show that in wild-type EPEC the expression of both Tir and BfpA is higher under the activating condition than under the non-activating condition. In the hfq mutants the levels of Tir and BfpA are increased in cultures grown under both conditions. Upon hfq complementation, the expression levels of Tir were restored to the level seen in the wild-type bacteria. In contrast, the BfpA amount in the complemented strains dropped below detection levels. Given that BfpA overexpression was shown to cause envelope stress (17), our results hint that during the construction of the complemented strains, suppressor mutants that eliminate BfpA expression become dominant in the culture. (C) To test whether suppressor mutations arise during growth of the hfq mutants, we repeated the BfpA analysis using several isolated colonies derived from the C1 and C2 cultures (C1a-d and C2a-c, see (A)). Indeed, six of the seven tested colonies had lost BfpA expression, reinforcing the premise that the hfq mutant has acquired suppressor mutations that eliminated BfpA expression. (D-E) Whole genome sequencing (WGS) of hfq mutants. To assess whether the tested derivatives of the hfq mutants contain suppressor mutations, we extracted DNA from the cultures and subjected it to WGS.
The steps until DNA extraction for each of the strains used are shown. Tables detail the genotype differences compared to the wild type and their frequency in each sample, according to the WGS analysis. The WGS revealed two types of suppressor mutations: (i) A frameshift mutation in the perA gene, which encodes the positive regulator of BFP. Sequence alignment of perA from the wild type and the Δhfq C1 mutant, shown in (E), indicates a frameshift mutation in the perA gene due to deletion of one nucleotide within a stretch of eight T residues (highlighted in pink). This mutation was present in 35% of the reads while the other 65% showed the wild-type sequence, indicating that C1 is a mixed culture containing the wild type and a suppressor variant. Analysis of four isolated colonies derived from the C1 mutant (C1a-d) showed that one exhibits the wild-type genotype (C1b), while all the others contain only the mutated perA. This result is in agreement with the blot shown in (C).
(ii) Curing of the pMAR2 plasmid that contains all the BFP genes as well as the perABC operon involved in BFP regulation. The C2 mutant had very few reads mapped to pMAR2, and DNA extracted from all three derivatives, C2a-c, completely lacked pMAR2 sequences. Taken together, our results reinforce previous reports suggesting that Hfq functions to repress BfpA expression (17). Furthermore, our data suggest that BfpA overexpression due to lack of Hfq is unfavorable in EPEC, possibly by inducing envelope stress (17), leading to rapid accumulation of secondary events that eliminate BFP expression altogether. Thus, to avoid possible variabilities stemming from loss of BFP expression we have used a derivative of the C1 strain, in which we removed KmR, as our Δhfq strain for transcriptome analysis.

Targets under both activating and non-activating conditions were analyzed, the most statistically significant motif was taken, and its E-value is reported. The motif was further tested to see if it complements a sub-sequence of GlnZ by searching a match to the motif on the reverse complement sequence of the sRNA, using MAST (78) (pink region in GlnZ sequence). The motif is presented 5' to 3'. (B) Alignment of glnA 3′UTRs of selected Enterobacterial genera is shown below. Conserved nucleotides are marked by * with shading. The end of the glnA ORF is conserved (light blue) as well as the suggested binding site (pink). The stop codon of the K-12 MG1655 glnA sequence is in lower case letters for reference. The Rho-independent terminator is highlighted in grey. Arrows indicate the arms of the terminator stem. Note that the first arm of the stem has a variable location.

Fig. S5. Common motifs in the target sets of newly identified sRNAs and MgrR are complementary to the putative sRNA sequence
As described in Supplementary Fig. S4, MEME (38) was used to identify a common sequence motif in the target sequences of newly identified sRNAs, and the motif was considered only if it was found to complement a sub-sequence of the sRNA. The latter was done by searching a match to the motif on the reverse complement sequence of the sRNA, using MAST (78), followed by manual evaluation. Putative binding sites on the sRNA are in bold pink. The motif is presented 5' to 3' and its E-value is reported. (A) Newly predicted sRNAs encoded in the core genome. (B) Newly identified sRNAs encoded in the accessory genome. (C) The common motif extracted for MgrR target sequences.

For each sRNA the targets identified under each growth condition were divided into two groups: Shared: targets that were found to interact with the sRNA in both the activating and non-activating condition; Specific: targets that interacted with the sRNA only in the respective condition. Colored bar heights represent the relative fractions of S-chimeras associated with each sRNA, while the numbers indicate the number of different interactions (i.e., targets). For all three sRNAs most S-chimeras represent targets identified in the two conditions.

Fig. S8. MgrR inhibits CesT translation in a dose-dependent manner
(A) Total protein was extracted from cultures of wild type EPEC, the single mutants ΔmgrR and cesT*, and the double mutant cesT*ΔmgrR, grown under the activating condition. Proteins were then analyzed by western blot using antibodies raised against Tir, CesT and intimin. A stain-free gel of total protein was used as a loading control. (B-D) Wild type EPEC containing a chromosomal Flag-tagged cesT was transformed with plasmids expressing MgrR and LacI (pZE12-MgrR, pREP4). Bacteria were statically grown overnight in LB, sub-cultured in DMEM, and grown for 2 h to an OD600 of ~0.1. Next, cultures were treated with increasing concentrations of IPTG (final concentrations of 0, 0.01, 0.05, 0.1 or 0.5 mM) and grown for an additional 2 h to an OD600 of ~0.45. As a control, a culture of the same strain that did not carry the MgrR- and LacI-expressing plasmids was grown to the same growth phase without IPTG treatment. Culture samples were used for total protein and total RNA extraction. Note that the different lanes/bars are the same for B-D and their annotation appears below D. (B) CesT-Flag levels analyzed by western blotting. Coomassie staining of the total proteins was used as a loading control. Increasing IPTG concentrations are indicated by a black triangle. Protein size is indicated. (C) Quantitation of MgrR levels in total RNA using quantitative PCR. The value of the MgrR level in the control sample (black column) was set as 1. Presented are the ratios between MgrR levels in each of the samples, normalized by 16S rRNA levels, and the control sample.