Convenient synthesis and delivery of a megabase-scale designer accessory chromosome empower biosynthetic capacity

The design principle for selecting accessory genes and promoters

We categorized the 1715 accessory genes from the pan-genome of 1011 yeasts23 into two groups: orthologous and non-homologous. Among the orthologous genes, we selected 359 genes that share > 80% similarity with S288C, as determined by bi-directional best BLAST hit (BBH) analysis.29 Among the non-homologous genes, we selected 183 genes that show no significant similarity to S288C using the standard BLAST web service and that are present in more than five of the 1011 isolates.
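The bi-directional best-hit criterion can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the hit tuples, the gene and reference names, and the functions `best_hits` and `reciprocal_best_hits` are hypothetical, and percent identity is assumed to stand in for the similarity measure.

```python
from typing import Dict, Iterable, Tuple

# (query, subject, percent_identity, bit_score); all values below are made up.
Hit = Tuple[str, str, float, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, Tuple[str, float]]:
    """For every query, keep the subject with the highest bit score."""
    best: Dict[str, Tuple[str, float, float]] = {}
    for query, subject, identity, score in hits:
        if query not in best or score > best[query][2]:
            best[query] = (subject, identity, score)
    return {q: (s, ident) for q, (s, ident, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit],
                         reverse: Iterable[Hit],
                         min_identity: float = 80.0) -> Dict[str, str]:
    """Return accessory-gene -> reference-gene pairs that are each other's
    best hit and exceed the identity threshold (assumed to be > 80%)."""
    fwd = best_hits(forward)   # accessory gene searched against the reference
    rev = best_hits(reverse)   # reference gene searched against accessory genes
    pairs: Dict[str, str] = {}
    for gene, (ref, identity) in fwd.items():
        partner = rev.get(ref)
        if identity > min_identity and partner is not None and partner[0] == gene:
            pairs[gene] = ref
    return pairs


# Hypothetical example data.
forward_hits = [
    ("acc1", "refA", 91.0, 800.0),
    ("acc1", "refB", 85.0, 300.0),
    ("acc2", "refA", 88.0, 500.0),
    ("acc3", "refC", 72.0, 400.0),
]
reverse_hits = [
    ("refA", "acc1", 91.0, 800.0),
    ("refA", "acc2", 88.0, 500.0),
    ("refC", "acc3", 72.0, 400.0),
]
orthologs = reciprocal_best_hits(forward_hits, reverse_hits)
```

In this toy input only `acc1` passes: `acc2` is not the best reverse hit of `refA`, and `acc3` falls below the identity threshold.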

According to the type of accessory gene, we also classified the promoters into two categories: endogenous promoters and synthetic promoters. The promoter of each orthologous accessory gene was derived from the promoter of the corresponding gene in S288C. For the non-homologous accessory genes, to avoid repeated use of the same synthetic promoter, we drew on the most extensive synthetic promoter library available, which encompasses a pool of 100 million synthetic promoters.31 Specifically, we chose the top 183 promoters in the library ranked by transcriptional strength.

Construction of strains with modified centromeres

The S. cerevisiae BY4742 was used as the starting strain, in which a Cas9 uniform target site XT2 and a URA3 gene were inserted adjacent to the centromeres of all 16 chromosomes. Construction of the 16 chromosome-modified strains was carried out in three steps (Supplementary information, Fig. S3). First, sgRNA arrays targeting chromosomes 1–4 and the corresponding donor fragments were introduced into BY4742. For each chromosome, ~1 μg of the chromosome targeting cassette HR1–4 (containing 400 bp homology arms for each centromere, URA3 cassette, and CRISPR protospacer sequence XT2: GGTGTAACGTAGACTCACAGTGG) and sgRNA array plasmid pRS42H-sgChr1–4 were co-transformed into S. cerevisiae BY4742 cells harboring the constitutively expressed Cas9 plasmid (pRS415-Cas9) using a standard lithium acetate transformation protocol.60 The transformed cells were plated on the synthetic dropout medium (SC-Ura-Leu+HYG) without uracil and leucine. Positive colonies were verified by PCR and Sanger sequencing. Second, positive colonies were grown in 5 mL of SC-Ura-Leu liquid medium at 30 °C for 24 h, and HR5–8 and sgRNA array plasmid pRS42B-sgChr5–8 were introduced by standard lithium acetate transformation. The transformed cells were plated on the synthetic dropout medium (SC-Ura-Leu+Ble) without uracil and leucine, and the positive colonies were verified by PCR and Sanger sequencing, and so forth. Similarly, BY4741-Chr9–16XT2-URA was constructed. sgRNA array for wtChr1–16 and Cas9 plasmid (pRS415-Cas9) were introduced into BY4742-Chr1–8XT2-URA and BY4741-Chr9–16XT2-URA, respectively. Third, after selection for correctly transformed colonies, BY4742-Chr1–8XT2-URA (harboring sgRNA array for wtChr1–16) and BY4741-Chr9–16XT2-URA (harboring pRS415-Cas9) were plated on the synthetic dropout medium (SC-Ura-Leu) at 30 °C, respectively. Each colony of them was selected and added into 5 mL of YPD liquid medium to mate. 
Then, PCR verification was performed to select the correct BY4742-Chr1–16XT2-URA colony. Following the same process, XT1 (GCGGGATGGTGTCCCCAGGGCGG) and FCY1 were integrated at the centromeres of the 16 chromosomes in BY4741 to construct the BY4741XT1-FCY strain.

Haploidization by genome elimination in yeast

The CRISPR/Cas9-induced DSBs at all centromeres were triggered by co-culturing haploid strains BY4741 and BY4742XT2-URA, which harbor Cas9 (pRS415-Cas9) and gRNA plasmids (pRS42H-sgXT2), respectively. Yeasts with different mating types were incubated in corresponding selective media overnight. One milliliter of yeast culture with different mating types was harvested and washed twice with sterile ddH2O. Two hundred microliters of each culture were then added to 5 mL YPD (in which each strain grew well) and incubated at 30 °C with shaking at 220 rpm for 8 h. Cells from 1 mL of the mating solution were subsequently harvested and washed with sterile ddH2O twice. Ten microliters of the solution and 200 µL of sterile ddH2O were mixed together, plated onto selective medium (SC-Leu+Hyg+5-FOA), and then incubated at 30 °C for 2–3 days.

PCRTag analysis of haploidized strain

To verify the elimination of BY4741XT1-FCY1 and BY4742XT2-URA3 chromosomes, we used the junctions between FCY1 or URA3 and chromosomes at the BY4741XT1-FCY1 or BY4742XT2-URA3 centromere as specific PCRTags by colony PCR to distinguish the chromosome-modified and WT strains. Briefly, a single yeast colony was resuspended in 10–40 µL of 20 mM NaOH and placed in the thermocycler. Yeast colony suspensions were boiled at 95 °C for 5 min and then cooled at 4 °C for at least 5 min before PCR was performed. 1 µL of yeast lysate was used as a template in a 10 µL 2× Rapid Taq Master Mix (Vazyme P222-AA) with 0.25 µM of primers. PCR program: 95 °C, 5 min; 30× (95 °C, 20 s; 55 °C, 90 s; 72 °C, 1 min); 72 °C, 5 min; 4 °C, hold. PCR products were separated on a 1% agarose gel containing ethidium bromide. 2 kb Plus II DNA Ladder (TransGen BM121-01) was used as a molecular weight standard.
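As a quick check on the cycling program above, its nominal duration can be computed from the listed steps. This is only a sketch: the constant and function names are hypothetical, and ramp times and the final 4 °C hold are ignored.

```python
# Cycling program as stated in the text; times are in seconds.
INITIAL_DENATURATION_S = 5 * 60          # 95 °C, 5 min
CYCLES = 30
CYCLE_STEPS = [
    ("denaturation", 95, 20),            # 95 °C, 20 s
    ("annealing", 55, 90),               # 55 °C, 90 s
    ("extension", 72, 60),               # 72 °C, 1 min
]
FINAL_EXTENSION_S = 5 * 60               # 72 °C, 5 min


def program_seconds() -> int:
    """Nominal run time, excluding ramping and the 4 °C hold."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total_minutes = program_seconds() / 60
```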

DNA content measurement by flow cytometry

First, a single yeast colony was picked and inoculated into 5 mL of YPD or SC medium and cultured until reaching the logarithmic phase. A volume of 3 mL of the culture was centrifuged at 5000 rpm for 2 min to collect the cells, which were washed twice with ddH2O. The cell pellet was resuspended in 1 mL of ddH2O, and the optical density at 600 nm (OD600) of the cell suspension was measured. The haploid and diploid cell suspensions were diluted with ddH2O to OD600 of 1 (~10⁷ cells/mL). For fixation, 10⁷ cells were treated with 70% ethanol at room temperature for 1 h. The cells were then centrifuged at 5000 rpm for 2 min and the cell pellets were resuspended in 50 mM sodium citrate buffer. After centrifugation, the cell pellets were collected and resuspended in 975 μL of sodium citrate buffer, and 25 μL of RNase A (10 mg/mL) was added, followed by incubation at 50 °C for 1 h. Subsequently, 50 μL of proteinase K (20 mg/mL) was added and incubated at 50 °C for 1 h. A 1:10 dilution of SYBR dye in Tris-EDTA buffer (pH 8.0) was prepared, and 20 μL of the diluted SYBR dye was added to the above system for dark staining at room temperature for 1 h. Triton X-100 was added to a final concentration of 0.25%, and the sample was vortexed. The cell suspension was transferred to a 12 mm × 75 mm tube, and the cell size and total DNA content were measured using a flow cytometer to determine the ploidy of the cells. The distributions of FITC-A were processed to identify the two main density peaks corresponding to the cell populations in G1 and G2 phases. Flow cytometry was performed using a BD Aria III. The data were analyzed with FlowJo (Treestar).
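The dilution to OD600 of 1 follows the usual C1·V1 = C2·V2 relation. The sketch below assumes a 1 mL final volume and uses the conversion of OD600 1 to roughly 10⁷ cells/mL given in the text; the function names are hypothetical.

```python
from typing import Tuple


def dilution_volumes(measured_od: float,
                     target_od: float = 1.0,
                     final_volume_ml: float = 1.0) -> Tuple[float, float]:
    """Return (suspension_ml, ddH2O_ml) needed to reach target_od.

    The 1 mL final volume is an assumption made for illustration.
    """
    if measured_od < target_od:
        raise ValueError("suspension is already more dilute than the target")
    suspension_ml = final_volume_ml * target_od / measured_od
    return suspension_ml, final_volume_ml - suspension_ml


def approx_cells(od600: float, volume_ml: float,
                 cells_per_ml_at_od1: float = 1e7) -> float:
    """Rough cell count, using OD600 1 ~ 1e7 cells/mL as stated in the text."""
    return od600 * volume_ml * cells_per_ml_at_od1


# A suspension measured at OD600 = 4 needs 0.25 mL of cells plus 0.75 mL of ddH2O.
cells_ml, water_ml = dilution_volumes(4.0)
```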

Evaluation of the mating ability using Tester a and Tester α

Two different mating type strains lacking HIS1, referred to as Tester a and Tester α, were used. Both strains were selected and cultured overnight in 5 mL YPD at 30 °C with shaking at 220 rpm. Two hundred microliters of both culture solutions were plated onto SD solid medium and recorded as test plates. After haploidization, the strains were copied onto test plates, respectively. The SD solid medium plates were cultured at 30 °C for 24 h.

Construction of the iterative assembly vectors

Iterative assembly parts were constructed in this study to facilitate genetic manipulation. Five distinct target sites for the gRNA–Cas9 complex, each consisting of a 23 bp sequence containing a protospacer adjacent motif, were designed for iterative assembly. The gRNA expression cassettes, namely gRNA-S1, gRNA-S2, gRNA-S3, gRNA-S4, and gRNA-S5, corresponding to the five target DNA sites (S1 site: cggtggacttcggctacgtaggg, S2 site: gctgttcgtgtgcgcgtcctggg, S3 site: acttgaagattctttagtgtagg, S4 site: cgccgctccgagggccgcacggg, and S5 site: gttgcaaatgctccgtcgacggg), were generated using PCR and overlap-PCR techniques. The selective marker genes HIS3 (located on the plasmid backbone), LEU2 (located on the plasmid backbone), URA3, and LYS2 were amplified from plasmids pRS413, pRS415, pRS416, and the BY4742 genome, respectively. The homologous arm fragment HR (~400 bp), which was similar to the plasmid homologous arm sequence, was amplified using the PCC1 plasmid as the template.
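The layout of the listed target sites can be checked programmatically. The sketch below assumes that each 23 bp site is a 20 nt protospacer followed by a 3 nt NGG motif at its 3' end; the text states only that the sites contain a protospacer adjacent motif, so this split and the function name `split_target` are assumptions.

```python
import re
from typing import Tuple

# Target sites copied from the text.
TARGET_SITES = {
    "S1": "cggtggacttcggctacgtaggg",
    "S2": "gctgttcgtgtgcgcgtcctggg",
    "S3": "acttgaagattctttagtgtagg",
    "S4": "cgccgctccgagggccgcacggg",
    "S5": "gttgcaaatgctccgtcgacggg",
}

NGG = re.compile(r"^[ACGT]GG$")


def split_target(site: str) -> Tuple[str, str]:
    """Split a 23 nt site into (protospacer, PAM); raise if the layout is unexpected."""
    seq = site.upper()
    if len(seq) != 23 or set(seq) - set("ACGT"):
        raise ValueError(f"expected 23 unambiguous bases, got {site!r}")
    protospacer, pam = seq[:20], seq[20:]
    if not NGG.match(pam):
        raise ValueError(f"no NGG motif at the 3' end of {site!r}")
    return protospacer, pam


parts = {name: split_target(seq) for name, seq in TARGET_SITES.items()}
# Distinct protospacers are a precondition for addressing each site separately.
all_distinct = len({p for p, _ in parts.values()}) == len(parts)
```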

Construction of initial fragments using TAR assembly in S. cerevisiae

To facilitate the future assembly of the synAC, sets of five or six neighboring fragments were introduced into S. cerevisiae. These fragments were used to construct 32 initial fragments (~32 kb) using TAR assembly. To construct the initial fragment plasmids, functional vectors containing iterative assembly parts and homologous arms (500 bp homologous arms designed to be added to the ends of DNA fragments) were pre-constructed. The NEBuilder HiFi DNA Assembly Master Mix (NEB) was employed to assemble the vectors and iterative parts into the pre-constructed functional vectors. Subsequently, the pre-constructed vectors were amplified using the KOD One PCR Master Mix (TOYOBO), a DNA polymerase premix. Five or six linear fragments (100–200 ng of each fragment) and the pre-constructed functional vector (~100 ng) were co-transformed into BY4741XT1-FCY and BY4742XT2-URA yeast strains according to the design outlined in Supplementary information, Data S2.

DNA assembly via HAnDy

Single colonies were inoculated into 5 mL of SC medium and grown overnight at 30 °C with shaking at 250 rpm until the OD600 reached a range of 4–5. Approximately equal amounts of two neighboring haploid cells (~200 μL) with opposite mating types were co-cultured in 5 mL of fresh YPD medium. Mating occurs when haploid cells with opposite mating types are co-cultured. The mating process involves spontaneous assembly of DNA fragments and programmed haploidization facilitated by the orthogonal-cut CRISPR/Cas9 system in diploid cells. After co-culturing for 8–12 h, a volume of 0.5 mL of the culture was centrifuged at 5000 rpm for 2 min to collect the cells, which were washed twice with ddH2O. The cell pellet was resuspended with ddH2O and diluted to an OD600 of 1 (~10⁷ cells/mL), and 20 μL of the diluted solution was plated on a selective medium. The plates were then incubated at 30 °C for 1–2 days. Colonies that grew on the selective medium were picked and verified for successful assembly through PCR analysis of the newly formed junctions and chromosome elimination. The colony PCR validation was performed as described previously. For each assembly, PCR verification was carried out on the junctions at both ends of each fragment and haploidized yeast. Positive colonies were picked to further validate the DNA content by flow cytometry. The positive haploid colonies confirmed by PCR sequencing were inoculated and grown in 3 mL of SC liquid medium until saturation at 30 °C to eliminate the plasmid harboring the haploidization system. After 24 h, 1 µL of the culture was plated and grown on SC selective medium.

PFGE

The PFGE protocol was modified from a previous report.61 A single colony was inoculated into 5 mL of YPD and grown overnight with shaking at 30 °C. One milliliter of the overnight culture was transferred to a tube and centrifuged at 1200× g for 2 min at room temperature. Cells were washed twice in solution I (0.05 M EDTA, 0.01 M Tris, pH 7.5) and resuspended in 150 µL of solution I with 10 µL of zymolyase (2 mg/mL zymolyase 20 T, 10 mM sodium phosphate, pH 7.5). Cells were placed in a 42 °C heat block. Two hundred and fifty microliters of agarose solution (1% (w/v) low-melting temperature agarose, 0.125 M EDTA, pH 7.5) was preincubated at 42 °C and mixed with cells by pipetting with a wide-bore pipette tip in the tube. The tube was placed on ice immediately, and 400 µL of LET (0.5 M EDTA, 0.01 M Tris, pH 7.5) was added. The tube was incubated overnight (8–10 h) at 37 °C and placed on ice for 10–20 min, and then the agarose plug in the tube was transferred to a 15 mL Falcon tube. Four hundred microliters of NDS (0.5 M EDTA, 0.01 M Tris, pH 7.5, 1% (w/v) sodium lauryl sarcosine, 2 mg/mL proteinase K) was added and incubated overnight at 50 °C. The tubes were placed on ice for 10 min. The NDS was exchanged with solution I and rocked/swirled gently at room temperature for 1 h, and the wash was repeated three more times. Plugs were stored in fresh solution I at 4 °C. The electrophoresis was performed in one stage, and the gel was prepared with 1% low-melting agarose and 1× Tris/borate/EDTA (TBE) buffer. The electrophoresis conditions were as follows: switch time of 60–120 s, run time of 20 h, angle of 120°, and voltage of 6 V/cm. For the second PFGE analysis, the agarose plugs were removed from the agarose gel and digested for 2–3 h with a restriction enzyme (NEB). The conditions of the PFGE program were set as follows: the voltage was 6 V/cm at an angle of 120°, the switch time ranged from an initial value of 60 s to a final value of 120 s, the temperature was 10 °C, and the total time was 22 h. Using WT BY4742 genome agarose plugs as a marker, the sizes of the assembled large DNA constructs (200 kb–1.5 Mb) could be validated.

Whole-genome sequencing and RNA-seq

RNA-seq workflow is as follows: yeast cells harboring synAC and the control strain harboring pRS416 were cultured overnight in 3 mL of SC-Ura medium at 30 °C. The cultures were added to 10 mL of fresh SC-Ura medium and incubated until the OD600 reached ~0.8. Three parallel samples were set for each yeast strain harboring the synAC. The RNA extraction was conducted according to the standard procedure. The samples were tested using the BGISEQ-500 platform, and each sample produced an average of 6 GB of data. The average alignment rate of the sample against the reference genome was 95.79%. The sequencing data is called raw reads or raw data, and then quality control (QC) of the raw reads is performed to determine whether the sequencing data is suitable for subsequent analysis. The filtered sequencing clean data were aligned to the reference genome using the Hisat2 (v2.0.1) software for short-read alignment, with default parameters.62 To distinguish between the expression of endogenous and accessory genes, we employed a sequence alignment approach to select specific 30-bp tags within each accessory gene for specific differentiation of endogenous and accessory genes, following the previous study.63 For whole-genome sequencing, the strain samples were prepared and analyzed according to the standard protocols.19
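The idea of gene-specific 30-bp tags can be illustrated with a simple k-mer comparison. This is a hedged sketch rather than the procedure of the cited study: it assumes that a tag qualifies when it has no exact match on either strand of the endogenous sequences, and the function names and toy sequences are hypothetical.

```python
from typing import Iterable, List, Set

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """All overlapping k-mers of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def specific_tags(accessory: str, endogenous: Iterable[str], k: int = 30) -> List[str]:
    """k-mers of the accessory gene that are absent from both strands of the
    endogenous sequences (exact matching only, an assumption of this sketch)."""
    background: Set[str] = set()
    for seq in endogenous:
        background |= kmers(seq, k) | kmers(revcomp(seq), k)
    return [accessory[i:i + k]
            for i in range(len(accessory) - k + 1)
            if accessory[i:i + k] not in background]


# Toy example with k = 5 so the result is easy to follow by eye.
tags = specific_tags("ACGTACGTTT", ["ACGTACGA"], k=5)
```

Reads overlapping such tags can then be assigned to the accessory copy rather than to the endogenous homolog.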

DNA delivery via HAnDy

Single colonies of assembly hosts and recipient strains were inoculated into 5 mL of SC medium and grown overnight at 30 °C with shaking at 250 rpm until the OD600 reached 4–5. A volume of 0.3 mL of the culture was centrifuged at 5000 rpm for 2 min to collect the cells, which were washed three times with 1 mL ddH2O. Approximately equal amounts of haploids with opposite mating types were co-cultured in 5 mL of fresh YPD medium containing 2% galactose and 3% raffinose instead of 2% glucose. The initial OD600 of the co-culture was set to 0.3. After co-culturing for ~12 h, 2 mL of the culture was centrifuged at 5000 rpm for 2 min to collect the cells, which were washed twice with ddH2O. The cell pellet was resuspended with ddH2O and diluted to an OD600 of 1 (~10⁷ cells/mL), and 20 μL of the diluted solution was plated on a selective medium containing 1 mg/mL 5-FOA. The plates were then incubated at 30 °C for 1–2 days. Colonies that grew on the selective medium were screened and picked for verification. Colony PCR validation was performed as described above. Positive colonies were picked to further validate the DNA content by flow cytometry and whole-genome sequencing.

Serial dilution assays on various types of medium

Yeasts were incubated in 5 mL of liquid SC medium (2% glucose, 0.2% dropout mixture, 6.72 g/L yeast nitrogen base) overnight at 30 °C with rotation at 220 rpm, after which 200 µL of the culture was transferred to 5 mL of SC medium and grown at 30 °C with rotation at 220 rpm to an OD600 of 0.5. The cultures were serially diluted in 10-fold increments in ddH2O, and ~5 µL of each dilution was spotted from the lowest to the highest concentration on the corresponding selective solid SC medium. These plates were incubated for 3–5 days at 30 °C or at other temperatures. For the sole carbon source culture, glucose (2%) was replaced by the corresponding carbon source (2%). For the osmotic pressure and heavy metal conditions, the SC medium was supplemented with 1.5 M NaCl, 1.5 M KCl, 225 mM LiCl and 10% ethanol, and 0.2 mM Cd(NO3)2, respectively. For the temperature tolerance conditions, routine SC medium was used.
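The 10-fold dilution series can be written out explicitly. The sketch below assumes five dilution steps, which the text does not specify, and the function name is hypothetical.

```python
from typing import List


def serial_dilution(start_od: float = 0.5, fold: int = 10, steps: int = 5) -> List[float]:
    """OD600 of each tube in the series, starting from the undiluted culture.

    The number of steps is an assumption; the text gives only the fold change.
    """
    return [start_od / fold ** i for i in range(steps)]


series = serial_dilution()
# Spotting runs from the lowest to the highest concentration, as in the text.
spotting_order = list(reversed(series))
```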

Untargeted metabolomic sample preparation and extraction

The preparation and extraction of metabolomic samples were performed following standard protocols with modifications.51 Briefly, the yeast strain was cultured in 5 mL of SC medium and incubated with shaking at 30 °C for 24 h. Subsequently, 5 mL of culture sample was collected and centrifuged at 1200 × g for 5 min at 4 °C. The cell pellet was washed twice with Milli-Q water and immediately submerged in a prechilled solution of 60% (v/v) methanol/water to quench the reaction rapidly. After a 30-s incubation at 40 °C, the samples were centrifuged at 4000 × g for 5 min at 4 °C to collect the cell pellets. The cells were then washed twice with phosphate-buffered saline at 4 °C, followed by a final wash with Milli-Q water to remove any residual culture medium. The cell pellets were collected by centrifugation at 4000 × g for 5 min at 4 °C. To extract the metabolites, 700 μL of an extraction solvent containing an internal standard (methanol:acetonitrile:water = 4:2:1, v/v/v) was added to the cell pellets. The mixture was vigorously shaken for 1 min and placed in a –20 °C freezer for 2 h. Subsequently, the samples were centrifuged at 25,000 × g and 4 °C for 15 min. The supernatant (600 μL) was carefully transferred to a new EP tube, followed by freeze-drying. The dried samples were then reconstituted in 180 μL of a methanol:water solution (1:1, v/v) and vortexed for 10 min until complete dissolution. The reconstituted samples were centrifuged at 25,000 × g and 4 °C for 15 min. The supernatant was transferred to a new EP tube and stored at –80 °C until further analysis.

UPLC-MS/MS analysis

For this experiment, we used a Waters UPLC I-Class Plus (Waters, USA) coupled to a Q Exactive high-resolution mass spectrometer (Thermo Fisher Scientific, USA) to separate and detect metabolites. Chromatographic separation was performed on a Waters ACQUITY UPLC BEH C18 column (1.7 μm, 2.1 mm × 100 mm, Waters, USA), and the column temperature was maintained at 45 °C. The mobile phase consisted of 0.1% formic acid (A) and acetonitrile (B) in positive mode and 10 mM ammonium formate (A) and acetonitrile (B) in negative mode. The gradient conditions were as follows: 0–1 min, 2% B; 1–9 min, 2%–98% B; 9–12 min, 98% B; 12–12.1 min, 98%–2% B; and 12.1–15 min, 2% B. The flow rate was 0.35 mL/min, and the injection volume was 5 μL. The mass spectrometry (MS) conditions were as follows: the Q Exactive (Thermo Fisher Scientific, USA) was used to perform primary and secondary MS data acquisition. The full scan range was 70–1050 m/z with a resolution of 70,000, and the automatic gain control (AGC) target for MS acquisitions was set to 3,000,000 with a maximum ion injection time of 100 ms. The top 3 precursors were selected for subsequent MS/MS fragmentation with a maximum ion injection time of 50 ms and a resolution of 17,500, and the AGC target was 100,000. The stepped normalized collision energy was set to 20 eV, 40 eV and 60 eV. The electrospray ionization (ESI) parameters were set as follows: the sheath gas flow rate was 40, the Aux gas flow rate was 10, the positive-ion mode spray voltage (|kV|) was 3.80, the negative-ion mode spray voltage (|kV|) was 3.20, the capillary temperature was 320 °C, and the Aux gas heater temperature was 350 °C.
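The gradient program above can be encoded as a list of breakpoints to look up the mobile phase composition at any time point. The sketch assumes linear ramps between the stated breakpoints; the names `GRADIENT` and `percent_b` are hypothetical.

```python
from typing import List, Tuple

# (time in min, % B) breakpoints taken from the gradient in the text.
GRADIENT: List[Tuple[float, float]] = [
    (0.0, 2.0),
    (1.0, 2.0),
    (9.0, 98.0),
    (12.0, 98.0),
    (12.1, 2.0),
    (15.0, 2.0),
]


def percent_b(t_min: float) -> float:
    """% B at time t_min, assuming linear interpolation between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-15 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole program")


midpoint = percent_b(5.0)  # halfway through the 1-9 min ramp
```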

Metabolite ion peak extraction and metabolite identification

After the offline MS data were imported into Compound Discoverer 3.2 (Thermo Fisher Scientific, USA) and analyzed in combination with the BMDB (BGI metabolome database), the mzCloud database, and the ChemSpider online database, a data matrix containing information such as metabolite peak areas and identification results was obtained. The matrix was then further analyzed and processed.

Software: Compound Discoverer v.3.2. Parameters: parent ion mass deviation, <5 ppm; fragment ion mass deviation, <10 ppm; retention time deviation, <0.2 min.
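The identification tolerances listed above correspond to simple numeric checks. The following sketch shows how a ppm mass deviation and a retention time window are typically evaluated; it does not reproduce the internals of Compound Discoverer, and the function names and m/z values are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def mass_match(observed_mz: float, theoretical_mz: float, limit_ppm: float) -> bool:
    """True when the absolute deviation is below limit_ppm
    (<5 ppm for parent ions and <10 ppm for fragment ions in the text)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < limit_ppm


def rt_match(observed_min: float, reference_min: float, limit_min: float = 0.2) -> bool:
    """True when the retention time deviation is below limit_min."""
    return abs(observed_min - reference_min) < limit_min


# Hypothetical parent ion: 0.0005 m/z off at m/z 200 is about 2.5 ppm.
parent_ok = mass_match(200.0005, 200.0, limit_ppm=5.0)
```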

Untargeted LC-MS/MS data processing and analysis

Data preprocessing

The result file exported from Compound Discoverer was imported into MetaX for data preprocessing and further analysis. Data preprocessing included the following: (1) normalizing the data by probabilistic quotient normalization (PQN) to obtain relative peak areas;64 (2) applying QC-based robust LOESS signal correction (QC-RLSC) to correct the batch effect;65 (3) removing metabolites with a coefficient of variation >30% in their relative peak area in QC samples. PQN is a sample normalization method that improves comparability between samples in two steps: (a) obtain an overall reference vector by analyzing the ion intensity distribution in each sample; (b) calculate the correction coefficient between each sample and the reference vector and use it to correct that sample.
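The two PQN steps and the QC-based CV filter can be sketched in a few lines. This is a minimal illustration and not the MetaX implementation: it assumes that the reference vector is the feature-wise median across samples and that the correction coefficient is the median quotient; the function names and data are hypothetical.

```python
from statistics import mean, median, stdev
from typing import List

Matrix = List[List[float]]  # rows are samples, columns are metabolite features


def pqn_normalize(samples: Matrix) -> Matrix:
    """Probabilistic quotient normalization against a median reference vector."""
    n_features = len(samples[0])
    # Step 1: reference vector (assumed here: feature-wise median over samples).
    reference = [median(row[j] for row in samples) for j in range(n_features)]
    normalized: Matrix = []
    for row in samples:
        # Step 2: the median of the feature-wise quotients is the correction coefficient.
        quotients = [row[j] / reference[j] for j in range(n_features) if reference[j] > 0]
        factor = median(quotients)
        normalized.append([value / factor for value in row])
    return normalized


def cv_filter(qc_samples: Matrix, max_cv: float = 0.30) -> List[int]:
    """Indices of features whose CV across QC samples does not exceed max_cv."""
    keep: List[int] = []
    for j in range(len(qc_samples[0])):
        values = [row[j] for row in qc_samples]
        m = mean(values)
        if m > 0 and stdev(values) / m <= max_cv:
            keep.append(j)
    return keep


# Three hypothetical samples that differ only by an overall dilution factor.
raw = [[10.0, 20.0, 30.0], [20.0, 40.0, 60.0], [30.0, 60.0, 90.0]]
normalized = pqn_normalize(raw)
kept = cv_filter([[100.0, 10.0], [110.0, 30.0], [90.0, 20.0]])
```

After normalization the three toy samples coincide, which is the intended effect when samples differ only in overall concentration.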

QC-RLSC is an effective data correction method in metabolomics, and the method is able to correct experimental sample signals by local polynomial regression fitting signal correction based only on the QC sample.

QC

Principal component analysis (PCA) is an unsupervised pattern recognition method for the statistical analysis of multidimensional data. Through orthogonal transformation, a group of variables that may be correlated are converted into a group of linear unrelated variables, which are called principal components after the transformation. This method is used to study how a few principal components can reveal the internal structure between multiple variables while keeping the original variable information. Log transformation and Pareto scaling were mainly used to compute principal components. The PCA plot reflects the real distribution of samples and is mainly used to observe the separation trend between sample groups and whether there are abnormal samples, as well as to reflect the variability between groups and within groups from the original data. For QC samples, the better the QC sample aggregate, the more stable the instrument is, and the better the reproducibility of the collected data.
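The preprocessing applied before computing principal components, log transformation followed by Pareto scaling, can be sketched as below. The base-2 logarithm is an assumption for this step, and the function name `log_pareto` is hypothetical; Pareto scaling here means centering each variable and dividing by the square root of its standard deviation.

```python
import math
from statistics import mean, stdev
from typing import List

Matrix = List[List[float]]  # rows are samples, columns are metabolites


def log_pareto(matrix: Matrix) -> Matrix:
    """Log2-transform, mean-center, and Pareto-scale each column."""
    logged = [[math.log2(value) for value in row] for row in matrix]
    scaled_columns = []
    for column in zip(*logged):
        center, spread = mean(column), stdev(column)
        # Constant columns are only centered to avoid division by zero.
        scale = math.sqrt(spread) if spread > 0 else 1.0
        scaled_columns.append([(value - center) / scale for value in column])
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical peak areas for three samples and two metabolites.
scaled = log_pareto([[1.0, 4.0], [4.0, 4.0], [16.0, 4.0]])
```

Compared with unit-variance scaling, Pareto scaling damps high-intensity metabolites less aggressively, so large fold changes keep more weight in the principal components.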

Metabolite functional annotation

Taxonomic and functional annotation of the identified metabolites is a good way to understand the properties of different metabolites. The Human Metabolome Database contains chemical, molecular biology/biochemical and clinical information on metabolites and supports metabolic pathway searches and spectral searches. KEGG pathways form the core of the KEGG database. Numerous metabolic pathways and the relationships among them can be found in this database. In organisms, different metabolites act together to exert their biological functions. The functional annotation of pathways was performed through the KEGG pathway database to determine the main biochemical metabolic pathways associated with the metabolites.

Screening the differences between groups

Partial least squares-discriminant analysis (PLS-DA) is a supervised statistical method that reflects the differences between classification groups better than unsupervised methods. It uses partial least squares regression to establish a model between metabolite expression and sample categories in order to predict sample categories. Additionally, variable importance in projection (VIP) was used to measure the impact strength and explanatory power of each metabolite expression pattern for the classification and discrimination of each group of samples, which helps screen metabolic biomarkers. Generally, a VIP value > 1 indicates that a metabolite has a significant effect on distinguishing the sample categories. After log2 transformation of the data, a PLS-DA model was established between the comparative analysis groups (two groups of samples) with Pareto scaling, and 7-fold cross validation was used to validate the model. OPLS-DA combines orthogonal signal correction (OSC) with PLS-DA. As an extension of PLS-DA, it decomposes the X matrix information into a part related to Y and a part unrelated to Y and removes the information irrelevant to classification, which reduces the complexity of the model without reducing its predictive ability and thereby enhances its explanatory power.

MRM analysis of metabolite production

Yeast cultures were pelleted by centrifugation at 3500× g for 5 min at 12 °C, and 150 μL aliquots of supernatant were removed for analysis. Metabolites were analyzed by LC-MS/MS using a Waters Acquity UPLC coupled to a Waters Xevo TQ-XS mass spectrometer. Chromatography was performed using an ACQUITY UPLC BEH C18 column (2.1 mm × 100 mm, 1.7 μm; Waters) with 0.1% (v/v) formic acid in water as mobile phase solvent A and 0.1% (v/v) formic acid in acetonitrile as solvent B. The column was operated with a constant flow rate of 0.3 mL/min at 30 °C and a sample injection volume of 5 μL. Chromatographic separation was performed using the following gradient: 0.00–0.50 min, 5% B; 0.50–3.00 min, 5%–24% B; 3.00–4.50 min, 24%–95% B; 4.50–7.00 min, 95% B; 7.01–10.00 min, 5% B. The LC eluent was directed to the MS from 0.01 to 10.00 min, with ESI operating in positive mode, a desolvation temperature of 500 °C, a gas flow rate of 11 L/min, and a nebulizer pressure of 40 psi.
