Joint epigenome profiling reveals cell-type-specific gene regulatory programmes in human cortical organoids

Cell culture and organoids generationMouse embryonic stem cell culture

The mouse embryonic stem (mES) cell E14TG2a line was obtained from the American Type Culture Collection (ATCC, CRL-1821). Cells were cultured at 37 °C (5% CO2) on 0.1% gelatin (Millipore, ES-006-B) coated 10-cm dishes (Falcon, 35300) in sterile-filtered (Millipore, SCGPS05RE) Glutamax DMEM medium (Gibco, 31966047) supplemented with 15% heat-inactivated FBS (ThermoFisher, 16141079), 1% penicillin–streptomycin (Gibco, 15140122), 1% MEM (Gibco, 11140035), 0.2% β-mercaptoethanol (Gibco, 31350010) and 0.1% LIF (Merck, ESG1106). Medium was changed every day and cells were split using Accutase (ThermoFisher, A1110501) every second day to a density of 3 × 105 cells.

Human induced pluripotent stem cell maintenance

The human induced pluripotent stem (hiPS) cells used for the generation of cortical organoids and 3DRAM-seq were provided by the Helmholtz Zentrum München iPSC core facility (ISFi001-A hiPS cell line)61. For the MPRA in cortical organoids, we used the previously generated hiPS cell line CRTDi004-A (https://hpscreg.eu/cell-line/CRTDi004-A), which was derived from a healthy donor62. The cells were cultured at 37 °C (5% CO2) on plates (StemCell Technologies, 38016) coated with Matrigel (Corning, 354277) in mTeSR plus medium (StemCell Technologies, 100-0276), and passaged as colonies using ReLeSR (StemCell Technologies, 05872) or Gentle Cell Dissociation reagent (StemCell Technologies, 07174). Before each round of human cortical organoid production, hiPS cell cultures were tested for mycoplasma contamination using a LookOut Mycoplasma PCR Detection kit (Sigma-Aldrich, MP0035) and validated for pluripotency markers by immunohistochemical staining using a Human Pluripotent Stem Cell 3-Color Immunocytochemistry kit (R&D Systems, SC021). Chromosome analysis of the HMGU1 fixed cell suspension was performed by the Cytogenetics Laboratory of the Cell Guidance Systems, and 20 metaphases were analysed.

3D human cortical organoid generation

For the generation of 3D human cortical organoids (3D-hCOs), hiPS cells were cultured at 37 °C (5% CO2) on 10-cm dishes (Corning, 353803) coated with Matrigel (Corning, 354277) in mTeSR plus medium (StemCell Technologies, 100-0276) to 80–90% confluency. The day before the culture reached the correct confluency, hiPS cells were pre-treated with 1% dimethyl sulfoxide (Sigma-Aldrich, D2650), and the mTeSR plus medium was switched to complete Essential 8 medium (Life Technologies, A1517001). After 24 h, the hiPS cell colonies were dissociated to single cells using Gentle Cell Dissociation reagent (StemCell Technologies, 07174), and resuspended in complete Essential 8 medium supplemented with 10 µM of the ROCK inhibitor Y-27632 (Sigma-Aldrich, SCM075) to prevent cell death. In total, 1 × 104 single cells were seeded in one well of an AggreWell 800 plate (StemCell Technologies, 34815) pre-treated with 500 µl of Anti-Adherence Rinsing Solution (StemCell Technologies, 07010). The plate was then centrifuged at 100g for 3 min at room temperature to distribute the cells into the microwells. hiPS-cell-derived embryo bodies were formed within the microwells after 24 h and were collected by firmly pipetting the medium up and down with a cut 1 ml pipetting tip and collected on a 40 µm cell strainer (VWR, 734-2760). The embryo bodies were transferred to ultra-low attachment 10-cm dishes (Corning, 3262) and cultured in Essential 6 medium (Life Technologies, A1516401) supplemented with 2.5 µM dorsomorphin (StemCell Technologies, 72102), 10 µM SB-431542 (StemCell Technologies, 72232) and 2.5 µM XAV-939 (Tocris, 3748) for the first 5 days of culture. Medium was changed daily, except for day 1. At day 7, embryo bodies were embedded in a drop of Matrigel (Corning, 354234) using Organoid Embedding Sheet (StemCell Technologies, 08579). Matrigel-embedded embryo bodies were cultured in differentiation medium (without vitamin A) containing a 1:1 mixture of DMEM/F-12 (Gibco, 11330-032) and neurobasal medium (Gibco, 21103-049) supplemented with 0.5% N2 supplement (Life Technologies, 17502-048), 0.025% Insulin (Sigma-Aldrich, I9278), 1% B-27 Supplement minus vitamin A (Life Technologies, 12587010), 1% GlutaMAX supplement (Life Technologies, 35050-061), 0.5% MEM-NEAA (Life Technologies, 1140-050), 1% penicillin–streptomycin (Gibco, 15140122) and 0.1% β-mercaptoethanol (Gibco, 31350010) for 4 days with one change of medium 2 days after embedding. After day 4, Matrigel-embedded embryo bodies were cultured in differentiation medium (with vitamin A) containing a 1:1 mixture of DMEM/F-12 and neurobasal medium supplemented with 0.5% N2 supplement, 0.025% insulin, 1% B-27 supplement (Life Technologies, 17504044), 1% GlutaMAX supplement, 0.5% MEM-NEAA, 1% penicillin–streptomycin and 0.1% β-mercaptoethanol with changes of medium every 3–4 days. 3D-hCOs were collected after 45 days and dissociated using a Papain-based Neural Tissue Dissociation kit (Miltenyi Biotec, 130-092-628) following the manufacturer’s dissociation protocol.

Sectioning and immunohistology

Cortical organoids were washed twice in PBS for 5 min, fixed in freshly prepared 4% formaldehyde in PBS at room temperature for 20 min, washed twice in PBS for 5 min and cryoprotected in 30% sucrose in PBS at 4 °C until they sank. Subsequently, cortical organoids were embedded in Tissue Tek OCT compound (Science Services, SA62550-01), snap frozen on dry ice and finally cryosectioned (~16 μm) using a CryoStar NX70 (ThermoFisher). Sections were collected on Superfrost Plus adhesive microscope slides (ThermoFisher, J1800AMNZ) and stored at −80 °C until further use. For immunohistochemistry, the sections were hydrated in PBS and incubated for 1 h at room temperature in PBS blocking buffer containing 5% horse serum (Sigma-Aldrich, H0146), 1% BSA (Thermo Scientific, 15260-037) and 0.3% Triton X-100 (Sigma Aldrich, X100). Staining was performed overnight at 4 °C either with anti-PAX6 (1:100 dilution; BioLegend, 901301) and anti-EOMES (1:150 dilution; R&D Systems, AF6166) antibodies for wild-type organoids or with anti-RFP (1:1,000 dilution; Rockland, 200-101-379) and anti-GFP (1:1,000; Abcam, ab13970) for co-electroporated organoids. All antibodies were diluted in PBS blocking buffer. Sections were washed three times for 10 min with 0.1% Triton X-100 in PBS followed by secondary staining with either donkey anti-sheep-A488 (Thermo Scientific, A32794) and donkey anti-rabbit-A555 (Thermo Scientific, A32794) or goat anti-chicken Alexa Fluor 488 (Thermo Scientific, A-11039) and donkey anti-goat Alexa Fluor Plus 555 (Thermo Scientific, A32816) all diluted in blocking buffer (1:1,000). Sections were washed, stained with 4′,6-diamidino-2-phenylindole (DAPI) and finally mounted using Fluoromount-G (Invitrogen, 00-4958-02). All images were acquired using a Zeiss LSM 710 confocal microscope.

Cell fixation

Cells from dissociated 3D-hCOs, electroporated organoids or mES cells were fixed at a concentration of 1 × 106 cells per ml with 1% formaldehyde (ThermoFisher, 28906) in PBS for 10 min at room temperature with slow rotation. To quench the reaction, glycine (ThermoFisher, 15527-013) was added to a final concentration of 0.2 M followed by incubation for 5 min at room temperature with slow rotation. Thereafter, cells were spun down at 500g for 5 min at 4 °C and washed once with PBS containing 1% BSA (Sigma-Aldrich, B6917) and 0.1% RNAsin plus RNase inhibitor (Promega, N261A). Fixed cells from mES cells were directly used for 3DRAM-seq, whereas fixed cells from 3D-hCOs were first subjected to immunoFACS before use.

ImmunoFACS

ImmunoFACS was performed as previously described (https://www.protocols.io/view/immunofacs-b2a2qage/)3 with minor modifications, which included the addition of 0.5× complete, EDTA-free protease inhibitor cocktail (Roche, 11873580001) to all buffers. SOX2-PE (1:20; BD Biosciences, 562195), PAX6-Alexa Fluor 488 (1:40; BD Biosciences, 561664) and EOMES-eFluor660 (1:20; Thermo Scientific, 50-4877-41) antibodies were used. Cell sorting was carried out on a FACSAria Fusion (BD Biosciences; laser: 405 nm, 488 nm, 561 nm and 640 nm) or a FACSAria III (BD Biosciences; laser: 405 nm, 488 nm, 561 nm and 633 nm) using a 100 µm nozzle. After sorting, cells were either directly used for 3DRAM-seq, MPRA library preparation or RNA was extracted using a Quick-RNA FFPE Miniprep kit (Zymo Research, R1008) with Zymo-Spin IC columns (Zymo Research, C1004-250). FACS plots were generated using FlowJo.

Real-time quantitative PCR

Reverse transcription was performed using Maxima H Minus Reverse Transcriptase (ThermoFisher, EP0751) with Oligo(dT)18 primer (ThermoFisher, SO132) according to the manufacturer’s instructions. Transcripts were quantified using Luna Universal quantitative PCR (qPCR) master mix (New England BioLabs, M3003X) with the appropriate primers (Supplementary Data 1) either on a Roche LightCycler 480 or on an Applied Biosystems QuantStudio 6 Flex Real-Time PCR system.

3DRAM-seqGeneration of biotinylated methylation controls

Methylation controls were generating by first mixing 10 µl of fully methylated pUC19 DNA (Zymo Research, D5017) with 10 µl of unmethylated lambda DNA (Promega, D1521). Control DNA was GpC methylated using the methyltransferase M.CviPI (New England BioLabs, M0227), purified with 1× AMPure XP beads (Agencourt, A63881) and sheared to ~550 bp using a Covaris S220 sonicator. Sticky ends were biotinylated by incubating the sheared DNA for 6 h at 37 °C with DNA polymerase I (New England BioLabs, M0210) and a nucleotide mix containing biotin-14-dATP (Life Technologies, 195245016) in DpnII buffer (New England BioLabs, R0543S) followed by 1× AMPure XP bead purification and quantification using a Qubit dsDNA HS Assay kit (ThermoFisher, Q32851).

RNA isolation and library preparation

To generate gene expression data, RNA from approximately 2.5 × 104 fixed mES cells or 105 fixed-sorted RGCs and IPCs was isolated using a Quick-RNA FFPE Miniprep kit (Zymo Research, R1008) in combination with Zymo-Spin IC columns (Zymo Research, C1004-250) according to the manufacturer’s instructions starting from the tissue-dissociation step. Yield was quantified using a Qubit RNA HS Assay kit (ThermoFisher, Q32852) and a high RNA quality (RIN > 8) was verified using a Bioanalyzer High Sensitivity RNA 6000 Pico kit (Agilent, 5067-1513). Next, 100 ng of mES cell RNA and around 60 ng of RGC and IPC RNA was used for RNA library generation using a NEBNext Single Cell/Low Input RNA Library Prep kit (New England BioLabs, E6420) according to the manufacturer’s instructions.

Generation of 3DRAM-seq libraries

To generate 3DRAM-seq libraries, which enables the simultaneous measurement of DNA methylation, accessibility and the 3D genome, approximately 2.5 × 105 mES cells or between 1.5 and 2 × 105 immunoFACS-sorted human RGCs and IPCs were used. Cells were first lysed with 0.2% Igepal-CA630 (Sigma-Aldrich, I3021) for 10 min at room temperature, washed once with 1× GpC buffer (New England BioLabs, M0227S) containing 1% BSA (Sigma-Aldrich, B6917) and subsequently incubated for 3 h at 37 °C in a reaction mix containing 60 U M.CviPI (New England BioLabs, M0227S) and 0.6 mM SAM (New England BioLabs, B9003). During the incubation period, the reaction was substitution with 8 U M.CviPI and 1 µl of 32 mM SAM every hour. Nuclei were washed, permeabilized with 0.5% SDS (Invitrogen, AM9823) quenched with 1.5% Triton-X-100 (Sigma-Aldrich, X100) and digested with 400 U DpnII (New England BioLabs, R0543) overnight at 37 °C. Subsequently, sticky ends were filled by incubating the nuclei for 4 h at room temperature with DNA polymerase I (New England BioLabs, M0210) and a nucleotide mix containing biotin-14-dATP (Life Technologies, 195245016) in DpnII buffer. Proximity ligation was performed for at least 6 h at 16 °C using T4 DNA ligase (New England BioLabs, M0202). Thereafter, nuclei and chromatin were digested using 200 µg proteinase K (New England BioLabs, P8107) with 1% SDS followed by reverse crosslinking overnight at 68 °C with 0.5 M NaCl, purification by ethanol precipitation and shearing to ~550 bp DNA fragments using a Covaris S220 sonicator. To remove biotinylated ATPs and repair the sticky ends, the sheared DNA was incubated with T4 DNA polymerase (New England BioLabs, M0203) and non-biotinylated nucleotides for 4 h at 20 °C. Approximately 0.01% of biotinylated methylation controls were added to the sample, and bisulfite conversion was performed using an EZ DNA Methylation-Gold kit (Zymo Research, D5005) followed by construction of the sequencing library using a Accel-NGS Methyl-Seq DNA Library kit (Swift Bioscience, 30024, now xGen Methyl-Seq DNA Library Prep IDT, 10009860) according to the manufacturer’s instructions until the adapter ligation step. After this step, biotin pulldown was performed using MyOne Streptavidin T1 beads (ThermoFisher, 65602) followed by 5 washes with washing buffer containing 0.05% Tween-20 (Sigma-Aldrich, P9416) and 2 additional washes with low-TE water. To increase library complexity, on-bead final library amplification was performed in five separate reactions using EpiMark Hot Start Taq (New England BioLabs, M0490) with Methyl-Seq Indexing primers (Swift Bioscience, 36024; now IDT, 10009965 or 10005975) with the following PCR program: 95 °C for 30 s; (95 °C for 15 s, 61 °C for 30 s, 68 °C for 80 s) ×10–11; 68 °C for 5 min; hold at 10 °C. The different reactions were pooled, streptavidin T1 beads were pelleted on a magnetic rack and the prepared libraries within the supernatant were purified using 0.65× AMPure XP beads (Agencourt, A63881) to reach an average fragment size of approximately 500 bp. A detailed version of the protocol can be found at protocols.io63.

Bisulfite amplicon sequencing of M.CviPI-treated DNA

To optimize the M.CviPI incubation time, unfixed and fixed mES cells were lysed, washed as described above and incubated with 60 U M.CviPI (New England BioLabs, M0227S) and 0.6 mM SAM (New England BioLabs, B9003) at 37 °C for 10 min up to 4 h. The reactions were substituted with 8 U M.CviPI and 1 µl of 32 mM SAM every hour. Thereafter, nuclei were digested, reverse crosslinked, purified and bisulfite converted as described above. The bisulfite-converted DNA was amplified using EpiMark Hot Start Taq and target specific primers with the following PCR program: 95 °C for 30 s; (95 °C for 30 s, 58 or 52 °C for 30 s, 68 °C for 90 s) ×40; 68 °C for 5 min; hold at 10 °C. Different amplicons of one sample were pooled to equal molarity and purified using 1× AMPure XP beads (Agencourt, A63881). Sequencing libraries were generated from 50 µg purified DNA using a Nextera XT DNA Library Preparation kit (Illumina, FC-131-1024) with half of the recommend reaction volume and 5 min incubation at 55 °C. Final amplification was performed using NEBNext Ultra II Q5 master mix (New England BioLabs, M0544S) with sample-specific indexing primers using the following PCR conditions: 98 °C for 30 s; (98 °C for 10 s, 65 °C for 90 s) ×5; 65 °C for 5 min; hold at 10 °C. Subsequently, the generated libraries were purified using 1.2× AMPure XP beads.

Target-specific primers and indexing primers can be found in Supplementary Data 1.

MPRA design and plasmid pool generation

The designed MPRA plasmid pool included 500 scrambled control sequences, which had matched GC content and were pre-screened to minimize the presence of expressed TF motifs and 2,737 DARs that interact with a differentially expressed gene in at least one cell type in human organoids as well as 267 MER130 or UCON31 TEs. Additionally, we added 2,372 enhancer sequences for which only the corresponding motif sequence was iteratively mutated (100 permutations with similar GC content, lowest motif score selected). DARs were centred on the accessibility peak and resized to 266 bp, and nucleotide sequences were extracted using the ‘BSgenome.Hsapiens.UCSC.hg38’ R package. To facilitate barcode–CRE association, we added a 4 bp tag at the beginning of each WT/control (TCAG) or Mut (GTCA) sequence.

The MPRA plasmid pool was generated as previously described3, and a detail protocol can be found at https://www.protocols.io/view/mpra-plasmid-pool-preparation-bxchpit6/. In brief, 300 bp single-stranded oligonucleotides were synthesized (Twist Bioscience), and degenerated barcodes as well as KpnI/EcoRI restriction sites were added using two separate PCRs. PCR products were introduced into the pMPRA1 (ref. 64; Addgene, plasmid 49349) backbone through Gibson assembly and transformed into ElectroMAX Stbl4 competent cells (ThermoFisher, 11635018) using Gene Pulser/MicroPulser electroporation cuvettes with a 0.1 cm gap (parameters: 1.8 kV, 25 µF, 200 Ω). Transformed bacteria were immediately resuspended in 1 ml of warm SOC medium and a 1:10 dilution of the bacteria was distributed on 10 plates of LB agar containing 100 µg µl−1 carbenicillin. Transformant number was estimated on a 1:100,000 diluted counting plate, and the required number of colonies to achieve the targeted library complexity was scraped for plasmid purification (Qiagen, 27104). The purified plasmids were digested using KpnI/EcoRI and ligated with an insert containing the minimal promoter and mScarlet-I. The purified ligation product was transformed into Escherichia coli as described above, scraped, and the final plasmid library was purified using a EndoFree Plasmid Maxi kit (Qiagen, 12362). Single CRE constructs were synthesized (Twist Bioscience) directly with the Gibson overhangs as well as KpnI/EcoRI restriction sites and cloned as described above. The resulting construct were transformed into NEB Turbo Competent E. coli (NEB, C2984I) and purified using an EndoFree Plasmid Maxi kit (Qiagen, 12362).

All primers and DNA blocks used are listed in Supplementary Data 1.

MPRA and CRE barcode association library generation

MPRA libraries were prepared as previously described3 with minor modifications. In brief, RNA and DNA from fixed immunoFACS-sorted cells were extracted using a Quick-DNA/RNA Microprep Plus kit (Zymo Research, D7005) according to the manufacturer’s instructions. Purified RNA was treated with TURBO DNase (ThermoFisher, AM1907) and reverse transcribed with Maxima H Minus RT (ThermoFisher, EP0753) using Oligo(dT)18 Primer (ThermoFisher, SO132). cDNA was purified using 1.5× AMPure XP magnetic beads (Agencourt, A63881). For both the DNA and cDNA libraries, unique molecular identifiers (UMIs) were added by PCR (98 °C for 30 s; (98 °C for 10 s, 65 °C for 30 s, 72 °C for 1 min) × 3; 72 °C for 3 min, and hold at 4 °C) using the primers RV_univ_MPRA and FWD_mScar_Tn7_10UMI_3 (0.5 µM each). P7 and P5 dual indexing sequencing adaptors (0.1 µM each) were attached separately by first amplifying the library with Ad2.X (ref. 65) and P5NEXTPT5 primers using the following PCR program: 98 °C for 30 s; (98 °C for 10 s, 65 °C for 90 s) × (10× for DNA or 12× for cDNA); 72 °C 5 min, and hold at 4 °C. PCR products were purified and amplified using Ad2.X and P5NEXT_SX primers for additional PCR cycles (minimum 12) determined by qPCR and using one-tenth of the first PCR product as input. All PCRs were performed in 1× NEBNext Ultra II Q5 master mix (New England BioLabs, M0544), reactions were split into two separate reaction tubes to increase library complexity and PCR products were pooled and purified using 0.8× to 1.2× AMPure XP magnetic beads. Final libraries were quantified using Qubit (ThermoFisher) and Bioanalyzer 2100 (Agilent). For the CRE barcode association library, 5 ng of the plasmid pool without minimal promoter and mScarlet-I was used to attach P5 and P7 dual indexing sequencing adaptors in two separate PCRs. Both PCRs were performed using NEBNext Ultra II Q5 master mix (New England BioLabs, M0544) with RV_univ_MPRA + FWD_CRS_Tn7 (0.5 µM each; PCR conditions: 98 °C for 30 s; (98 °C for 10 s, 65 °C for 30 s and 72 °C for 3 min) × 3; 72 °C for 3 min, and hold at 4 °C) and P5NEXT_SX + Ad2.X (0.1 µM each; PCR conditions: 98 °C for 30 s; (98 °C for 10 s, 65 °C for 90 s) × 10; 72 °C for 5 min, and hold at 4 °C), respectively.

Generation of hiPS cell line for MPRA

The hiPS cell line CRTDi004-A (Human Pluripotent Stem Cell Registry) was generated from previously published foreskin fibroblasts (termed Theo) of a consenting healthy donor66. Isolation of cells and reprogramming to hiPS cells was approved by the ethics council of the Technische Universität Dresden (EK169052010, EK386102017). Theo fibroblasts were reprogrammed at the CRTD Stem Cell Engineering Facility at Technische Universität Dresden using a CytoTune-iPS 2.0 Sendai Reprogramming kit (ThermoFisher, A16517) according to the supplier’s recommendations for transduction. Following transduction with the Sendai virus, cells were cultured on irradiated CF1 mouse embryonic fibroblasts (ThermoFisher, A34180) in KOSR-based medium (80% DMEM/F12, 20% KnockOut Serum Replacement, 2 mM l-glutamine, 1% nonessential amino acids, 0,1 mM 2-mercaptoethanol, all from ThermoFisher, 11330-032, 10828028, 25030149, 11140050 and 31350010, respectively) supplemented with 10 ng ml–1 human FGF2 (StemCell Technologies, 78003). Individual iPS cell colonies were mechanically picked, expanded as clonal lines and adapted to Matrigel (Corning, 354277), mTeSR1 and ReLeSR (both StemCell Technologies, 85850 and 05872, respectively) conditions after several passages. Master and working hiPS cell stocks were established from the clone with the best morphology.

To characterize the newly generated CRTDi004-A hiPS cell line, the following tests were performed: for flow cytometry analysis of pluripotency, Alexa Fluor 488 anti-Oct3/4, PE anti-Sox2, V450-SSEA-4, and Alexa Fluor 647 anti Tra-1-60 (all from BD Biosciences, 560253, 560291, 561156 and 560122, respectively) were used according to the manufacturer’s recommendations. Three germ layer differentiation was performed as previously described67, and resulting cells were stained using a 3-Germ Layer Immunocytochemistry kit (ThermoFisher, A25538) according to the manufacturer’s instructions. For endoderm, SOX17 primary antibody (Abcam, ab84990) followed by Alexa Fluor 488 goat anti-mouse IgG (ThermoFisher, A32723) was used. qPCR with reverse transcription for pluripotency and tri-lineage spontaneous differentiation was performed according to the instruction manual of the human ES cell Primer Array (Takara Clontech). Standard G banding karyotyping was done in collaboration with the Institute of Human Genetics, Jena University Hospital, Germany, and 20 metaphases were analysed.

Electroporation of human cortical organoids

Human cortical organoids were generated following a previously reported protocol68 using the hiPS cell line CRTDi004-A cultured in standard conditions (37 °C, 5% CO2) on Matrigel-coated plates (Corning, 354277) and approved by the ethics council of the Technische Universität Dresden (SR-EK-456092021). Electroporation of the MPRA library was carried out 2 days after the first slicing on day 45 of organoid culture. Organoids were transferred to a 6 cm ultra-low-attachment dish (Eppendorf, 30701011) containing Tyrode’s solution (Sigma-Aldrich, T2145). Using a glass microcapillary (Sutter Instrument, BF120-69-10), 0.2–0.5 µl of the plasmid DNA (either MPRA library or an equal molar mix of single CRS constructs with CAG-GFPnls control plasmids3) at a final concentration of 1 µg µl–1 diluted in 0.1% Fast Green solution (in dH2O) were injected into areas depicting ventricular morphology. Injections were carried out using a microinjector (World Precision Instruments, SYS-PV820) on continuous setting. Up to five ventricles were injected per organoid. A total of 35–40 organoids were processed per replicate depending on the size and number of ventricular structures. After injection, organoids were transferred into an electroporation chamber containing Tyrode’s solution and electroporated with 5 pulses applied at 38 V for 50 ms each at intervals of 1 s (Harvard Bioscience, BTX ECM 830). Subsequently, the electroporated organoids were returned to culture medium and incubated for 72 h before further processing. Organoids were dissociated using a MACS neural tissue dissociation kit P (Miltenyi Biotec, 130-092-628) with a reduced incubation time of enzyme mix 1 (6 min at 37 °C) and omitting the incubation with enzyme mix 2.

Library quality control and sequencing

Libraries were quantified by qPCR using a NEBNext Library Quant kit (New England BioLabs, E7630), and the size distribution of the obtained libraries was assessed using an Agilent 2100 Bioanalyzer. Sequencing was performed on a NextSeq550 or NovaSeq6000. Sequencing statistics are listed in Supplementary Table 2.

Bioinformatics analysisMapping and analysis of gene expression

RNA sequencing libraries were mapped and deduplicated using STAR69 with default settings. DESeq2 (ref. 70) was used to calculate fragments per kilobase of transcript per million mapped read (FPKM) and differential expressed gene values (FDR < 0.05). Gene body coverage and transcriptomic distribution was computed using RSeQC71.

Mapping of 3DRAM-seq

For mapping of 3DRAM-seq results, we used an adapted TAURUS-MH9 pipeline, which includes read splitting based on the ligation junction, mapping with Bismark and improved quality control. Next, 100 bp paired-end reads were first trimmed using Trim Galore with the following parameters: --nextseq 30 --clip_R1 1 --clip_R2 15 --length 20. Subsequently, reads where aligned using Bismark72 with Bowtie2 in single-end mode and the post-bisulfite adapter tagging option (--pbat) for the reverse read. To recover chimeric reads resulting from the proximity ligation step, and are therefore not aligned during the previous step, unmapped reads from the previous step were split at adjacent DpnII cutting sites (GATTGATT, GATTGATC for forward and AATCAATC, GATCAATC for reverse strand, including variations where endogenous C is not methylated and thus converted to T), separately aligned and subsequently merged with the non-chimeric reads. The unique read pairs were transformed into restriction fragment end (fend) coordinates, converted into ‘misha’ tracks and imported into the corresponding genomic database (mm10 or hg38). Methylation levels in both, CpG and GpC context, were calculated on uniquely mapped reads using the Bismark methylation extractor and coverage2cytosine function with the --nome-seq option on, ensuring that only cytosines in the correct context are considered. Only 5× for individual replicates or 10× for merged replicates covered cytosines were considered for further analysis. Data were mapped using the mm10 genome for mES cells and the hg38 genome for RGCs and IPCs.

Mapping of external datasets

To compare 3DRAM-seq results with comparable multiomics datasets (Methyl-3C, Methyl-HiC and WGBS), raw data were downloaded and processed using the adapted TAURUS-MH with dataset-specific modifications mainly during the trimming step. For Methyl-HiC10, the parameters --clip_R1 1 and --clip_R2 1 were used to account for the pre-bisulfite adapter ligation step, which does not introduce low complexity tails. To account for the random primer amplification step and therefore template switch of Methyl-3C reads were trimmed with -a AGATCGGAAGAGCACACGTCTGAAC -a2 AGATCGGAAGAGCGTCGTGTAGGGA --clip_R1 16 --clip_R2 16 --three_prime_clip_R1 3 --three_prime_clip_R2 3 to remove the low-complexity 5′ tail induced by the Adaptase and random primer sequence and adapter from the 3′ prime end. Additionally, read 1 instead of read 2 was flagged using –pbat during the alignment steps. WGBS27 data were trimmed and directly aligned using Bismarck with Bowtie2 in PE mode.

ChIP–seq and ATAC–seq datasets were uniformly processed using the ENCODE ChIP–seq or ATAC–seq pipeline, respectively, whereas the Hi-C data were processed as previously described5. DHS and MNase-seq data were directly downloaded from Encode.

Estimation of bisulfite conversion efficiency

The efficiency of bisulfite conversion was estimated through the CpG methylation of unmethylated lambda DNA using Bismark in paired-end mode with the –nome-seq option. The detection rate of methylated cytosines both in the CpG and GpC context was determined by fully methylated pUC19 DNA as well as in situ GpC methylated lambda DNA. In all cases, we observed methylation above 98%, indicating a false negative rate of less than 2%.

Co-accessibility and co-methylation analysis at single-molecule resolution

Sequencing reads were first split per chromosome into individual data frames containing only relevant fields such as read name, read pair identity and exact coordinates of methylation calls. We used the FST R package to reduce memory footprint of the full dataset and to facilitate downstream processing and to further improve read and write speed. To interrogate accessibility and/or methylation at multiple loci, bed files containing coordinates for the desired parameters were prepared (for example CTCF motifs overlapping ChIP–seq peaks). These regions were centred on, for example, transcription factor motifs (filter intervals), at which methylation and/or accessibility calls will be computed.

To search for reads containing methylation calls that fell within the region of interest, a binary search was executed to identify the closest filter interval existing in genomic space for each read. Next, the absolute distance (in base pairs) between each methylation call and the nearest filter interval was computed. Following that, a user-defined window (100 bp) was used as a threshold, at which methylation calls that lie within this distance were retained, whereas the rest were discarded. Overall, only reads that contained at least one methylation call were retained. The search and filter process was repeated separately and iteratively for read pairs 1 and read pairs 2. The final result was obtained by merging based on full read name to ensure only read pairs overlapping both filter intervals were kept.

To determine paired co-accessibility patterns, the average accessibility (based on GpC methylation) was calculated separately for read 1 and read 2 in a chosen window centred on the feature of interest, such as CTCF motif, and separated by a minimum distance. The resulting two-column matrix was then used as input for k-means clustering, clusters were reordered on the basis of their mean value for consistency and the matrix was plotted using the R package ComplexHeatmap. All subsequent analysis was performed using exactly the same cluster assignments.

To test whether there was a dependency between accessibility in read 1 and read 2, we used the Fisher exact test on the 2 × 2 contingency matrix, and we report the odds ratio and P values. This approach aims to test whether the null hypothesis (accessibility at read 1 and read 2 are independent events) can be rejected. The analysis for Fig. 5j,k was performed analogous to the CTCF-based analysis in Fig. 3b. First, we filtered LHX2–SOX2 or NEUROG2–EOMES motifs, retaining only those that overlapped with a GpC peak (based on bulk accessibility in RGCs or IPCs, respectively). Next, we identified all read pairs for which read 1 overlapped with one of the motifs (for example, LHX2) and read 2 overlapped with the other motif (for example, SOX2). We then measured the average accessibility per read within a 50 bp window for reads that are separated by at least 100 bp but not more than 300 bp. This distance cut-off is different from our measurements of long-range interactions associated with CTCF loops because we wanted to determine whether these pairs of TFs interact directly or co-bind on chromatin synergistically at closer distances.

Visualization of linear marks at genomic features and GO term enrichment

Average enrichment plots and heatmaps of DNA methylation, chromatin accessibility or ChIP–seq in windows centred around the genomic feature were visualized using SeqPlots73. Functional enrichment analysis was performed using Cluster profiler and visualized with enrichPlot74.

Identification and characterization of DARs and differential methylated regions

Accessible peaks based on GpC methylation were identified using the gNOMeHMM package26 with default settings (q value ≤ 0.05), which resulted in 67,177 peaks for mES cells, 39,738 peaks for RGCs and 54,334 peaks for IPCs. Accessible peaks of RGCs and IPCs were merged to generate a common peak set (66,280 peaks). DARs and differential methylated regions (DMRs) were identified in the common peak set using methylKit75 with following settings: lo.count=10, hi.perc=99.9, overdispersion=“MN”, test=“Chisq”, qvalue=0.05. Peaks within promoter regions (±5 kb from the TSS) were associated with their nearest TSS, whereas distal peaks were associated with genes within the same TAD displaying the highest Hi-C score with a minimal and maximum distance of 5 kb and 2 Mb, respectively.

TF motif analysis

For motif-based analysis, we used the JASPAR2022 core vertebrate database and excluded all TFs that were not expressed in our data (FPKM < 1). TF factor motif enrichment was either calculated using the CreateMotifMatrix function from the Signac package76 or using the monaLisa package40. Motifmatchr was then used to identify TF motifs within genomic regions (p.cutoff = 0.0005) and to centre the region around them.

Repetitive element analysis

Localization of repetitive elements for the hg38 genome were obtained from RepeatMasker and repeats classified as satellite, simple_repeat, tRNA, rRNA, snRNA, srpRNA or low_complexity were removed. Individual repeats were associated to genes as described for DARs.

Hi-C data processing

The filtered fend-transformed read pairs obtained from the TAURUS-MH pipeline were converted into tracks and imported into the genomic databases. Normalization was performed using the Shaman package (https://tanaylab.bitbucket.io/shaman/index.html), and Hi-C scores were calculated using a kNN strategy on the pooled replicates as previously described5 with a kNN of 100. For visualization, fend-transformed read pairs were converted into .hic files using Juicer pre and displayed using Juicebox77. HiCRep78 was used to calculate reproducibility between biological replicates and datasets.

Contact probability, insulation, TAD boundary calling and average TAD contact enrichment

Contact probability as a function of the genomic distance was calculated as previously described5. To define insulation based on observed contacts, we used the insulation score5,79, which was calculated on the pooled contact map at 1 kb resolution within a region of ±250 kb and was multiplied by (−1). TAD boundaries were then defined as the local 2 kb maxima in regions where the insulation score was above the 90% quantile of the genome-wide distribution. Differential TAD boundaries were identified as previously described5 using genome-wide normalized insulation scores. To calculate insulation and contact enrichment within TADs, their coordinates were extended upstream and downstream by the TAD length, and this distance was split into 100 equal bins. The observed versus expected enrichment ratio was calculated in each resulting 100 × 100 grid (per TAD) and the average enrichment was plotted per bin. Average DNA methylation and accessibility levels were calculated for each of these 100 bins per TAD and are represented as the mean ± 0.25 quantiles.

Compartments and compartment strength

The dominant eigenvector of the contact matrices (250 kb bins) were computed as previously described80 using scripts available at https://github.com/dekkerlab/cworld‐dekker/. Compartment strength was determined by the log2 ratio of observed versus expected contacts (intrachromosomal separated by at least 10 Mb) either between domains of the same (A–A, B–B) or different types (A–B), as previously described5 and represents the ratio between the sum of observed contacts within the A and B compartments and the sum of intercompartment contacts (AA + BB)/(AB + BA).

Aggregated and individual contact strength at pairs of genomic features

Contact enrichment ratios between pairs of genomic features, such as motif-centred differential accessible regions, were calculated using two complementary approaches3. First, Hi-C maps were aggregated to calculate the log2 ratio of the observed versus expected contacts within a window centred on the pair of interest. In addition, the average enrichment ratio of the contact strength in the centre of the window (central nine bins) versus each of the corners was calculated. Second, to analyse the heterogeneity of the data and the contribution of individual pairs, we extracted the kNN-based Hi-C score in a 10 kb window centred around each of the pairs separately and represented the data as a scatterplot or boxplot. Significance was then calculated using the Wilcoxon rank test.

MPRA CRE–barcode association

For CRE–barcode association, 75 bp pair-end reads were trimmed using cutadapt with the following parameters: -m 12 -a GAATTCATCTGGTA -G GACCGGATCAACT -u 1 --discard-untrimmed. Next, 150 bp paired-end reads were first filtered using cutadapt (-m 12 -a GAATTCATCTGGTACCTCGGTTCACGCAATG -G ^CCAGGACCGGATCAACT -u 1 --discard-untrimmed --action=none --interleaved | cutadapt -g GAATTCATCTGGTACCTCGGTTCACGCAATG -G ^CCAGGACCGGATCAACT --discard-untrimmed --action=none –interleaved). Subsequently, forward and reverse reads were individually trimmed using -l 12 for the barcode, -g ^CCAGGACCGGATCAACT–discard-untrimmed for the forward CRE reads or -g GAATTCATCTGGTACCTCGGTTCACGCAATG --discard-untrimmed -l 105 for the reverse CRE read. Trimmed fastq reads for both 75 bp and 150 bp pair-end reads were separated based on a 5′ 4 bp identifier (GTCA or TCAG) and CRE–barcode association was performed separately on wild-type, mutant sequences using MPRAflow81. The resulting pickle libraries were merged to increase the number of recovered CREs and filtered for promiscuous barcodes.

MPRA data processing

The 150 bp paired-end reads from the cell-type-specific DNA and RNA libraries were trimmed using cutadapt with the following parameters: -m 12:10 -e 0.4 -u 1 -a GAATTCTCATTAC -A TCGACCGCAAGTTGG --discard-untrimmed. Read 1 was additionally trimmed with -l 12. Count tables for RNA and DNA reads were generated using MPRAflow (--bc-length 12 and --mpranalyze). MPRAanalyse82 was used to calculate the MPRA signal (mad.score) and to identify significant active enhancers (mad.score BH adjusted P value ≤ 0.1). For comparison of replicates, the normalized DNA/RNA read counts and ratio of sums were calculated as previously described83.

Statistic and reproducibility

No statistical methods were used to predetermine sample sizes, but our sample sizes are similar to those reported in previous publications3,5,6,7,8,9,10. No data were excluded from the analyses. Data collection and analysis were not performed blind to the conditions of the experiment.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

留言 (0)

沒有登入
gif