The scRNA-seq and scBCR-seq data for human and mouse B cells were obtained from previously published studies [15, 19, 20]. This study used three scRNA-seq and scBCR-seq datasets from the Gene Expression Omnibus (GEO) repository and the ArrayExpress database: peritoneal B cells from C57BL/6 wild-type mice in three different age groups (E-MTAB-10081), peripheral blood B cells from healthy children (GSE166489), and peripheral blood B cells from healthy adults (GSE165080). The mouse peritoneal B cell samples (three in total) were collected from peritoneal tissues at Day 6, Week 10, and Month 15. The healthy children (six samples in total) had an average age of 8.4 ± 10 years, and the healthy adults (eight samples in total) had an average age of 34 ± 15 years. The accession numbers, basic sample information, and data composition are shown in Supplementary Tables 1–4.
Screening of B cells
The shared raw scBCR-seq data table for each sample was analyzed as follows:
1. Removal of non-functional sequences: sequences with "is_cell" marked as "FALSE"; sequences with "high_confidence" marked as "FALSE"; sequences whose "chain" is not IGH, IGK, or IGL; and sequences whose "productive" value is "None" or "FALSE".
2. Counting of B cells carrying a single BCR: (1) H + K; (2) H + L.
3. Counting of B cells carrying dual BCRs: (1) H + L1 + L2; (2) H + K1 + K2; (3) H + K + L; (4) others: H1 + H2 + K, H1 + H2 + K + L, H1 + H2 + L, H1 + H2 + K1 + K2, etc.
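The screening steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the "barcode" column name, the helper names, and the rule that a cell lacking either a heavy or a light chain counts as "non-pairing" are assumptions made for the example.

```python
from collections import defaultdict

ALLOWED_CHAINS = {"IGH", "IGK", "IGL"}

# Pairings with exactly one heavy chain, keyed by (kappa count, lambda count).
ONE_HEAVY_PAIRINGS = {
    (1, 0): "H+K",
    (0, 1): "H+L",
    (0, 2): "H+L1+L2",
    (2, 0): "H+K1+K2",
    (1, 1): "H+K+L",
}
SINGLE_BCR = {"H+K", "H+L"}


def is_true(value):
    """Interpret 'TRUE'/'True'/'true' as True; 'FALSE', 'None', etc. as False."""
    return str(value).strip().lower() == "true"


def keep_contig(row):
    """Step 1: keep only functional IGH/IGK/IGL sequences from called cells."""
    return (
        is_true(row["is_cell"])
        and is_true(row["high_confidence"])
        and row["chain"] in ALLOWED_CHAINS
        and is_true(row["productive"])
    )


def classify_pairing(chains):
    """Steps 2 and 3: name the chain pairing of one cell."""
    heavy = chains.count("IGH")
    kappa = chains.count("IGK")
    lam = chains.count("IGL")
    # Assumption: a cell without a heavy or without a light chain is non-pairing.
    if heavy == 0 or kappa + lam == 0:
        return "non-pairing"
    if heavy == 1:
        return ONE_HEAVY_PAIRINGS.get((kappa, lam), "others")
    return "others"  # e.g. H1+H2+K, H1+H2+K+L


def bcr_type(pairing):
    """Collapse a pairing label into single, dual, or non-pairing BCR."""
    if pairing == "non-pairing":
        return "non-pairing BCR"
    return "single BCR" if pairing in SINGLE_BCR else "dual BCR"


def classify_cells(rows):
    """Filter contig rows, group chains by cell barcode, classify each cell."""
    chains_by_barcode = defaultdict(list)
    for row in rows:
        if keep_contig(row):
            chains_by_barcode[row["barcode"]].append(row["chain"])
    return {bc: classify_pairing(ch) for bc, ch in chains_by_barcode.items()}
```

Cells whose sequences are all removed in step 1 do not appear in the result at all. Whether such cells should instead be reported as non-pairing is a design choice the text does not specify.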
Data analysis process and quality control
Quality control of the scRNA-seq data for each sample was first performed using R (version 4.2.3) and the Seurat package (version 5.1.0):
1. Filtering of low-quality cells: for mice, cells with more than 13% mitochondrial content, fewer than 500 UMIs, or fewer than 200 features were removed; for children, the thresholds were 5% mitochondrial content, 1000 to 20,000 UMIs, and 500 features; for adults, the thresholds were 5% mitochondrial content, 500 UMIs, and 500 features.
2. The DoubletFinder software package (version 2.0.4) was used to identify possible doublets, which were filtered out and not included in the analysis.
3. Based on the barcode of each cell within each sample, B cells that met the quality control criteria for both scRNA-seq and scBCR-seq were selected. This ensured that only B cells with complete data for both RNA sequencing and B cell receptor sequencing were included in the subsequent analysis.
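As a rough illustration of the three quality control steps, the sketch below applies per-cohort thresholds and then intersects barcodes across the two modalities. It is written in Python for brevity and is not the Seurat or DoubletFinder code used in the study. The field names (mito_pct, n_umis, n_features), the function names, and the reading of the stated thresholds as retention bounds (inclusive at the boundary) are assumptions. Doublet detection itself is not reproduced: the doublet barcodes are simply passed in.

```python
# Per-cohort thresholds taken from the text; None means no upper bound is stated.
QC_THRESHOLDS = {
    "mouse": {"max_mito_pct": 13.0, "min_umis": 500, "max_umis": None, "min_features": 200},
    "child": {"max_mito_pct": 5.0, "min_umis": 1000, "max_umis": 20000, "min_features": 500},
    "adult": {"max_mito_pct": 5.0, "min_umis": 500, "max_umis": None, "min_features": 500},
}


def passes_qc(cell, cohort):
    """Return True if a cell lies within the cohort's QC bounds (assumed inclusive)."""
    t = QC_THRESHOLDS[cohort]
    if cell["mito_pct"] > t["max_mito_pct"]:
        return False
    if cell["n_umis"] < t["min_umis"]:
        return False
    if t["max_umis"] is not None and cell["n_umis"] > t["max_umis"]:
        return False
    return cell["n_features"] >= t["min_features"]


def select_paired_barcodes(rna_cells, cohort, bcr_barcodes, doublet_barcodes=frozenset()):
    """Keep barcodes that pass scRNA-seq QC, are not doublets, and have scBCR-seq data."""
    rna_ok = {c["barcode"] for c in rna_cells if passes_qc(c, cohort)}
    return (rna_ok - set(doublet_barcodes)) & set(bcr_barcodes)
```

Using a set intersection keeps the selection independent of the order in which the scRNA-seq and scBCR-seq filters are run.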
Clustering and transcription factor expression analysis of single and dual BCR B1 and B2 cells in single-cell transcriptome data
The single-cell transcriptomic data after quality control were analyzed using R (version 4.2.3) and the Seurat software package (version 4.4.0). Principal component analysis (PCA) was performed, and principal components were selected using the JackStraw algorithm and the PCElbow plotting function. The selected principal components were then subjected to Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and clustering analysis. UMAP maps high-dimensional data onto a two-dimensional or three-dimensional space, allowing cellular clusters to be visualized and distinguished.
Using the markers provided in the raw data, the pre-B, B1, and B2 subgroups of cells from mice at Day 6, Week 10, and Month 15 were annotated as follows: (1) pre-B cell markers include genes such as Vpreb3, Sdc4, and Sys1; (2) B1 cell markers include genes such as Bhlhe41 (also known as Plzf), Cd9, and Zbtb32; (3) B2 cell markers include genes such as Fcer2a and Ighd. However, due to the extremely low proportion of B1 cells in human peripheral blood, we were unable to cluster a B1 cell subgroup for analysis. For the data from children and adults, cells were divided based on the expression of differentially expressed genes: (1) naive B cell markers: IL4R; (2) memory B cell markers: IGHA1, CRIP2, ITGB1; (3) marginal B cell markers: FCRL5, FGR. The cell clusters and gene expression profiles of humans and mice are shown in Supplementary Fig. 1.
B cell subgroup information was visualized with ggplot in the following steps: 1. The B cell subgroups identified by their BCR pairings were mapped to the clusters: (1) H + K; (2) H + L; (3) H + L1 + L2; (4) H + K1 + K2; (5) H + K + L; (6) others; (7) non-pairing. 2. The types of BCR B cells were mapped to the clusters: (1) single BCR; (2) dual BCR; (3) non-pairing BCR.
An additional analysis compared transcription factor expression between single and dual BCR B cells across the three age groups in mice.
Statistical analyses
Data are presented as mean ± SD. Statistical analysis was performed using IBM SPSS Statistics 26 software. For the comparison of continuous variables between two groups, an independent samples t-test was used. For the comparison of categorical variables, a chi-square test was employed. A p-value of less than 0.05 was considered to indicate statistical significance. Data visualization was performed using R Studio (version 3.3.3), Origin (version 2022), and GraphPad Prism (version 5) software.
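For readers who want to check a categorical comparison by hand, the following Python sketch computes the mean ± SD summary and a Pearson chi-square test for a 2 x 2 table (for example, single versus dual BCR cell counts in two groups). It is not the SPSS procedure used in the study. The function names are hypothetical, no continuity correction is applied, and the p-value formula holds only for one degree of freedom, i.e. a 2 x 2 table.

```python
import math
import statistics


def mean_sd(values):
    """Return (mean, sample standard deviation) for a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


def chi_square_2x2(table):
    """Pearson chi-square test for a 2 x 2 table of counts, without continuity correction.

    table is ((a, b), (c, d)); returns (chi2, p_value) with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A result would be called significant under the criterion above when the returned p-value is below 0.05.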