Last updated: 2026-01-27

Checks: 6 1

Knit directory: paed-airway-allTissues/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: uncommitted changes

The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

Seed: set.seed(20230811)

The command set.seed(20230811) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: relative

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Repository version: 378760b

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 378760b. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .RData
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    analysis/.DS_Store
    Ignored:    analysis/naiveB_top10_markers_whitebg.png
    Ignored:    data/.DS_Store
    Ignored:    data/RDS/
    Ignored:    figure/.DS_Store
    Ignored:    output/.DS_Store
    Ignored:    output/CAM_results/
    Ignored:    output/CSV/.DS_Store
    Ignored:    output/CSV_v2/.DS_Store
    Ignored:    output/CSV_v2/Adenoids/.DS_Store
    Ignored:    output/CSV_v2/BAL/.DS_Store
    Ignored:    output/CSV_v2/Bronchial_brushings/.DS_Store
    Ignored:    output/CSV_v2/Nasal_brushings/.DS_Store
    Ignored:    output/CSV_v2/Tonsils/.DS_Store
    Ignored:    output/DGE/.DS_Store
    Ignored:    output/DGE/RUVseq_TonsilAtlas/.DS_Store
    Ignored:    output/DGE/RUVseq_TonsilAtlas_k1/
    Ignored:    output/DGE/RUVseq_earlyAIR_Tonsils/.DS_Store
    Ignored:    output/DGE/Tonsils/RUV_Tonsils_Double negative T/
    Ignored:    output/RDS/
    Ignored:    output/plots/

Untracked files:
    Untracked:  All_metadata_combinedplots.Rhistory
    Untracked:  analysis/03_Batch_Integration.Rmd
    Untracked:  analysis/Age_proportions.Rmd
    Untracked:  analysis/Age_proportions_AllBatches.Rmd
    Untracked:  analysis/All_Batches_QCExploratory_v2.Rmd
    Untracked:  analysis/All_Umaps_and_marker_plots.Rmd
    Untracked:  analysis/All_metadata.Rmd
    Untracked:  analysis/Annotation_BAL.Rmd
    Untracked:  analysis/Annotation_Bronchial_brushings.Rmd
    Untracked:  analysis/Annotation_Nasal_brushings.Rmd
    Untracked:  analysis/BatchCorrection_Adenoids.Rmd
    Untracked:  analysis/BatchCorrection_Nasal_brushings.Rmd
    Untracked:  analysis/BatchCorrection_Tonsils.Rmd
    Untracked:  analysis/Batch_Integration_&_Downstream_analysis.Rmd
    Untracked:  analysis/Batch_correction_&_Downstream.Rmd
    Untracked:  analysis/Cell_cycle_regression.Rmd
    Untracked:  analysis/Clustering_Tonsils_v2.Rmd
    Untracked:  analysis/Cross-tissue_analysis.Rmd
    Untracked:  analysis/DGE_RUVSeq_Adenoids.Rmd
    Untracked:  analysis/DGE_RUVSeq_Bronchial_brushings.Rmd
    Untracked:  analysis/DGE_RUVSeq_Nasal_brushings.Rmd
    Untracked:  analysis/DGE_earlyAIR_tAtlas_Naïve_Bcells_Adenoids.Rmd
    Untracked:  analysis/DGE_earlyAIR_tAtlas_Naïve_Bcells_activated.Rmd
    Untracked:  analysis/DGE_earlyAIR_tAtlas_Plasma_Bcells.Rmd
    Untracked:  analysis/DGE_earlyAIR_vs_TonsilAtlas.Rmd
    Untracked:  analysis/Explore_census.Rmd
    Untracked:  analysis/Lineage_subclustering_Bcells.Rmd
    Untracked:  analysis/Lineage_subclustering_Myeloid_cells.Rmd
    Untracked:  analysis/Lineage_subclustering_Tcells.Rmd
    Untracked:  analysis/Lineages_subclustering.Rmd
    Untracked:  analysis/Master_metadata.Rmd
    Untracked:  analysis/Pediatric_Vs_Adult_Atlases.Rmd
    Untracked:  analysis/Preprocessing_Batch1_Nasal_brushings.Rmd
    Untracked:  analysis/Preprocessing_Batch2_Tonsils.Rmd
    Untracked:  analysis/Preprocessing_Batch3_Adenoids.Rmd
    Untracked:  analysis/Preprocessing_Batch4_Bronchial_brushings.Rmd
    Untracked:  analysis/Preprocessing_Batch5_Nasal_brushings.Rmd
    Untracked:  analysis/Preprocessing_Batch6_BAL.Rmd
    Untracked:  analysis/Preprocessing_Batch7_Bronchial_brushings.Rmd
    Untracked:  analysis/Preprocessing_Batch8_Adenoids.Rmd
    Untracked:  analysis/Preprocessing_Batch9_Tonsils.Rmd
    Untracked:  analysis/TonsilsVsAdenoids.Rmd
    Untracked:  analysis/cell_cycle_regression.R
    Untracked:  analysis/testing_age_all.Rmd
    Untracked:  code/DGE_utils.R
    Untracked:  code/pseudobulk_analysis.R
    Untracked:  data/Cell_labels_Gunjan_v2/
    Untracked:  data/Cell_labels_Mel/
    Untracked:  data/Cell_labels_Mel_v2/
    Untracked:  data/Cell_labels_Mel_v3/
    Untracked:  data/Cell_labels_modified_Gunjan/
    Untracked:  data/Gene_sets/
    Untracked:  data/Hs.c2.cp.reactome.v7.1.entrez.rds
    Untracked:  data/Raw_feature_bc_matrix/
    Untracked:  data/cell_labels_Mel_v4_Dec2024/
    Untracked:  data/celltypes_Mel_GD_v3.xlsx
    Untracked:  data/celltypes_Mel_GD_v4_no_dups.xlsx
    Untracked:  data/celltypes_Mel_modified.xlsx
    Untracked:  data/celltypes_Mel_v2.csv
    Untracked:  data/celltypes_Mel_v2.xlsx
    Untracked:  data/celltypes_Mel_v2_MN.xlsx
    Untracked:  data/celltypes_for_mel_MN.xlsx
    Untracked:  data/col_palette.xlsx
    Untracked:  data/earlyAIR_sample_sheets_combined.xlsx
    Untracked:  figure/DGE_Tonsils_CD8.Rmd/
    Untracked:  metadata_export.rds
    Untracked:  metadata_export1.rds
    Untracked:  naiveB_top10_markers.png
    Untracked:  naiveB_top10_markers_whitebg.png
    Untracked:  output/CSV/All_tissues.propeller.xlsx
    Untracked:  output/CSV/Bronchial_brushings/
    Untracked:  output/CSV/Bronchial_brushings_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/
    Untracked:  output/CSV/G000231_Neeland_Adenoids.propeller.xlsx
    Untracked:  output/CSV/G000231_Neeland_Bronchial_brushings.propeller.xlsx
    Untracked:  output/CSV/G000231_Neeland_Nasal_brushings.propeller.xlsx
    Untracked:  output/CSV/G000231_Neeland_Tonsils.propeller.xlsx
    Untracked:  output/CSV/Nasal_brushings/
    Untracked:  output/CSV_v3/B_cell_lineage/
    Untracked:  output/CSV_v3/Epithelial_lineage/
    Untracked:  output/DGE/Adenoids/
    Untracked:  output/DGE/Adenoids_B_memory/
    Untracked:  output/DGE/Adenoids_CD4/
    Untracked:  output/DGE/Adenoids_CD8/
    Untracked:  output/DGE/Adenoids_Cycling_GCB/
    Untracked:  output/DGE/Adenoids_IFN/
    Untracked:  output/DGE/Bronchial_brushings/
    Untracked:  output/DGE/DGE_Age_summary_BAL.xlsx
    Untracked:  output/DGE/Nasal_brushings/
    Untracked:  output/DGE/RUV/
    Untracked:  output/DGE/RUVseq_earlyAIR_Adenoids/
    Untracked:  output/DGE/RUVseq_earlyAIR_Bronchial_brushings/
    Untracked:  output/DGE/RUVseq_earlyAIR_Nasal_brushings/
    Untracked:  output/DGE/TonsilAtlas/
    Untracked:  output/DGE/Tonsils/DGE_Age_summary_Tonsils.xlsx
    Untracked:  output/DGE/Tonsils/RUVTonsils_Memory B cells/
    Untracked:  output/DGE/Tonsils/RUV_Tonsils_Memory B cells/
    Untracked:  output/DGE/Tonsils/Tonsils_Naïve B cells/
    Untracked:  output/DGE/Tonsils_CD4/
    Untracked:  output/DGE/Tonsils_CD8/
    Untracked:  output/DGE/Tonsils_IFN/
    Untracked:  paed_sub_L1_cells.rds

Unstaged changes:
    Deleted:    02_QC_exploratoryPlots.Rmd
    Deleted:    02_QC_exploratoryPlots.html
    Modified:   analysis/00_AllBatches_overview.Rmd
    Modified:   analysis/01_QC_emptyDrops.Rmd
    Modified:   analysis/02_QC_exploratoryPlots.Rmd
    Modified:   analysis/Adenoids.Rmd
    Modified:   analysis/Adenoids_v2.Rmd
    Modified:   analysis/Age_modeling.Rmd
    Modified:   analysis/Age_modelling_Adenoids.Rmd
    Modified:   analysis/Age_modelling_Tonsils.Rmd
    Modified:   analysis/AllBatches_QCExploratory.Rmd
    Modified:   analysis/BAL.Rmd
    Modified:   analysis/BAL_v2.Rmd
    Modified:   analysis/Bronchial_brushings.Rmd
    Modified:   analysis/Bronchial_brushings_v2.Rmd
    Modified:   analysis/DGE_TonsilAtlas.Rmd
    Modified:   analysis/DGE_Tonsils_CD8.Rmd
    Modified:   analysis/DGE_analysis_George.Rmd
    Modified:   analysis/DGE_earlyAIR_tAtlas_CD4_TCM.Rmd
    Modified:   analysis/DGE_earlyAIR_tAtlas_CD4_TFH.Rmd
    Modified:   analysis/DGE_earlyAIR_tAtlas_CD4_TN.Rmd
    Modified:   analysis/DGE_earlyAIR_tAtlas_CD8_TF.Rmd
    Modified:   analysis/DGE_earlyAIR_tAtlas_CD8_TN.Rmd
    Modified:   analysis/DGE_earlyAIR_tAtlas_DZtoLZ_GCB.Rmd
    Modified:   analysis/DGE_earlyAIR_tAtlas_Memory_Bcells.Rmd
    Modified:   analysis/DGE_earlyAIR_tAtlas_Naïve_Bcells.Rmd
    Modified:   analysis/Lineage_subclustering_Epithelial_cells.Rmd
    Modified:   analysis/Nasal_brushings.Rmd
    Modified:   analysis/Nasal_brushings_v2.Rmd
    Modified:   analysis/Subclustering_Adenoids.Rmd
    Modified:   analysis/Subclustering_BAL.Rmd
    Modified:   analysis/Subclustering_Bronchial_brushings.Rmd
    Modified:   analysis/Subclustering_Nasal_brushings.Rmd
    Modified:   analysis/Subclustering_Tonsils.Rmd
    Modified:   analysis/Tonsils.Rmd
    Modified:   analysis/Tonsils_v2.Rmd
    Modified:   analysis/index.Rmd
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c0.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c1.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c10.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c11.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c12.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c13.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c14.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c15.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c16.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c17.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c2.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c3.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c4.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c5.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c6.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c7.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c8.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/REACTOME-cluster-limma-c9.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c0.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c1.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c10.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c11.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c12.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c13.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c14.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c15.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c16.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c17.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c2.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c3.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c4.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c5.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c6.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c7.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c8.csv
    Modified:   output/CSV/BAL_Marker_gene_clusters.limmaTrendRNA_snn_res.0.4/up-cluster-limma-c9.csv
    Modified:   output/CSV_v2/G000231_Neeland_Nasal_brushings.propeller.xlsx
    Modified:   output/DGE/RUVseq_TonsilAtlas/DGE_RUVSeq_Age_TonsilAtlas.xlsx
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 TCM/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 TCM/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 TCM/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 TFH/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 TFH/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 TFH/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 TN/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 TN/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 TN/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 Treg-eff/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 Treg-eff/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 Treg-eff/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 Treg/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 Treg/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 Treg/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 effector/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 effector/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD4 effector/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD8 TF/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD8 TF/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD8 TF/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD8 TN/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD8 TN/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/CD8 TN/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Cycling GCB/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Cycling GCB/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Cycling GCB/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DGE_RUVSeq_Age_Tonsils.xlsx
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DGE_RUVSeq_Tonsils_all_results.rds
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ G2Mphase/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ G2Mphase/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ G2Mphase/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ GCB/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ GCB/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ GCB/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ early  Sphase/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ early  Sphase/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ early  Sphase/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ late Sphase/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ late Sphase/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZ late Sphase/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZtoLZ GCB transition/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZtoLZ GCB transition/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/DZtoLZ GCB transition/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Double negative T/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Double negative T/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Double negative T/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Early GC-committed NBC/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Early GC-committed NBC/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Early GC-committed NBC/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Early MBC/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Early MBC/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Early MBC/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Early PC precursor/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Early PC precursor/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Early PC precursor/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Follicular dendritic cells/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Follicular dendritic cells/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Follicular dendritic cells/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/GC-commited metabolic activation/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/GC-commited metabolic activation/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/GC-commited metabolic activation/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Gamma delta T/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Gamma delta T/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Gamma delta T/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Memory B cells/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Memory B cells/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Memory B cells/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Monocytes_macrophages/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Monocytes_macrophages/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Monocytes_macrophages/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Naïve B cell-IFN/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Naïve B cell-IFN/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Naïve B cell-IFN/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Naïve B cells activated/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Naïve B cells activated/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Naïve B cells activated/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Naïve B cells/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Naïve B cells/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Naïve B cells/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Plasma B cells/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Plasma B cells/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Plasma B cells/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Pre-BCRi II/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Pre-BCRi II/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/Pre-BCRi II/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/T-IFN/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/T-IFN/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/T-IFN/up_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/TFH-LZ-GC/all_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/TFH-LZ-GC/down_res.csv
    Modified:   output/DGE/RUVseq_earlyAIR_Tonsils/TFH-LZ-GC/up_res.csv
    Deleted:    test

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/Lineage_subclustering_Epithelial_cells.Rmd) and HTML (docs/Lineage_subclustering_Epithelial_cells.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File	Version	Author	Date	Message
Rmd	378760b	Gunjan Dixit	2026-01-22	Updated lineage subclustering
html	378760b	Gunjan Dixit	2026-01-22	Updated lineage subclustering

Load libraries

suppressPackageStartupMessages({
  library(BiocStyle)
  library(tidyverse)
  library(here)
  library(dplyr)
  library(Seurat)
  library(clustree)
  library(paletteer)
  library(viridis)
  library(ggforce)
  library(ggridges)
  library(kableExtra)
  library(RColorBrewer)
  library(data.table)
  library(dplyr)
  library(cowplot)
  library(ggplot2)
  library(paletteer)
  library(patchwork)
  library(harmony)
  library(BiocParallel)
  library(circlize)
  library(presto)
  library(gtools)
})

Epithelial cells

paed_sub <- readRDS(here("output/RDS/Lineages_RDS_combined/SEU_Epithelial_Cells.rds"))

if (!"umap.sub" %in% names(paed_sub@reductions)) {
  
  paed_sub <- paed_sub |>
    FindVariableFeatures() |>
    ScaleData() |>
    RunPCA()
  
  gc()
  
paed_sub[["RNA"]]
paed_sub[["RNA"]] <- JoinLayers(paed_sub[["RNA"]])
gc()

  paed_sub <- RunUMAP(paed_sub, dims = 1:30, reduction = "pca",
                      reduction.name = "umap.sub")
  
  paed_sub <- paed_sub |>
    FindNeighbors(dims = 1:30, reduction = "pca") |>
    FindClusters(resolution = 0.05)
  gc()
  
  saveRDS(paed_sub, here("output/RDS/Lineages_RDS_combined/SEU_Epithelial_Cells.rds")) 
  } else {
  message("skipping processing")
}

skipping processing

paed_sub$tissue <- factor(
  paed_sub$tissue,
  levels = c("Tonsils", "Adenoids", "Nasal_brushings",
             "Bronchial_brushings", "BAL")
)

Previous labels

Excluding counts less than 100 to avoid too many labels in the plot

#sort(table(paed_sub$cell_labels_v2), decreasing = T)

k <- names(which(table(paed_sub$cell_labels_v2) > 50))
paed_sub$cell_labels_v22 <- ifelse(
    as.character(paed_sub$cell_labels_v2) %in% k,
    as.character(paed_sub$cell_labels_v2),
    NA
)

DimPlot(
    paed_sub,
    group.by = "cell_labels_v22",
    reduction = "umap.sub",
    label = TRUE,
    repel = TRUE,
    label.size = 4.5,
    shuffle = TRUE, raster = F)

#table( paed_sub$cell_labels_v2,paed_sub$RNA_snn_res.0.5)

opt_res <- "RNA_snn_res.0.05"  
n <- nlevels(paed_sub$RNA_snn_res.0.05)
paed_sub$RNA_snn_res.0.05 <- factor(paed_sub$RNA_snn_res.0.05, levels = seq(0,n-1))
paed_sub$seurat_clusters <- NULL
paed_sub$cluster <- paed_sub$RNA_snn_res.0.05
Idents(paed_sub) <- paed_sub$cluster

DimPlot(paed_sub, reduction = "umap.sub", raster =F, repel=T, label = T)

DimPlot(paed_sub, reduction = "umap.sub", raster =F, repel=T, label = T, split.by = "tissue")

Level 1 labels

new_levels <- c(
  "0" = "Non-ciliated Epithelial",
  "1" = "Ciliated Epithelial",
  "2" = "Ciliated Epithelial",
   "3" = "Secretory Epithelial",
  "4" =    "Non-ciliated Epithelial",
  "5" =  "Non-ciliated Epithelial",
    "6" ="Erythroid cells",
"7" = "Ciliated Epithelial",
"8" = "Non-ciliated Epithelial"
)
paed_sub <- RenameIdents(paed_sub, new_levels)
paed_sub$cell_labels <- NULL
paed_sub$cell_labels_l1 <- Idents(paed_sub)
paed_sub$cell_labels_l1 <- factor(
  Idents(paed_sub),
  levels = c(
    "Non-ciliated Epithelial",
    "Secretory Epithelial",
    "Ciliated Epithelial",
    "Erythroid cells"
  )
)

Idents(paed_sub) <- paed_sub$cell_labels_l1

DimPlot(paed_sub, reduction = "umap.sub", raster =F, repel=T, label = T)

Adding CellTypist info

files  <- c(
  paeds  = here("../EarlyAir_paper/output/CellTypist_results/cellTtypist_Epithelial_Paed_Covid19.csv"),
  immune = here("../EarlyAir_paper/output/CellTypist_results/cellTtypist_Epithelial_Immune_low.csv"),
  lung_airway   = here("../EarlyAir_paper/output/CellTypist_results/cellTtypist_Epithelial_Lung_Airway.csv"),
   lung_atlas   = here("../EarlyAir_paper/output/CellTypist_results/cellTtypist_Epithelial_Lung_atlas.csv"),
  tonsil = here("../EarlyAir_paper/output/CellTypist_results/cellTtypist_Epithelial_Human_Tonsil.csv")
)

for (nm in names(files)) {
  df <- read.csv(files[nm])
  paed_sub[[paste0("celltypist_", nm)]] <- df$predicted_labels
}

plot_celltype_heatmap <- function(
  seu,
  cluster_col = "cluster",
  celltype_col,
  min_n = 100,
  title,
  scale = "row",
  fontsize = 10
) {
  seu@meta.data %>%
    dplyr::count(.data[[cluster_col]], .data[[celltype_col]]) %>%
    dplyr::filter(n >= min_n) %>%
    tidyr::pivot_wider(
      names_from = .data[[celltype_col]],
      values_from = n,
      values_fill = 0
    ) %>%
    as.data.frame() %>%
    tibble::column_to_rownames(cluster_col) %>%
    t() %>%
    pheatmap::pheatmap(
      scale = scale,
      main = title,
      fontsize = fontsize
    )
}

plot_celltype_heatmap(paed_sub,celltype_col = "celltypist_paeds",title = "CellTypist Predictions (Paed Covid Study) by Cluster")

plot_celltype_heatmap(paed_sub,celltype_col = "celltypist_lung_airway",title = "CellTypist Predictions (Lung Atlas) by Cluster")

plot_celltype_heatmap( paed_sub,celltype_col = "celltypist_lung_atlas",title = "CellTypist Predictions (Lung Atlas) by Cluster")

plot_celltype_heatmap(paed_sub,celltype_col = "celltypist_tonsil", title = "Azimuth Predictions (Tonsil Atlas) by Cluster")

plot_ct <- function(obj, col) {
  k <- names(which(table(obj[[col]][,1]) > 100))
  DimPlot(
    obj,
    group.by = col,
    reduction = "umap.sub",
    cells = colnames(obj)[obj[[col]][,1] %in% k],
    label = TRUE, repel = TRUE, label.size = 2.5,
    shuffle = TRUE, raster = FALSE
  )
}

plot_ct(paed_sub, "celltypist_paeds") + ggtitle("Paed Covid Study")

plot_ct(paed_sub, "celltypist_immune") + ggtitle("Immune low dataset")

plot_ct(paed_sub, "celltypist_lung_airway") + ggtitle("Lung Airway Study")

plot_ct(paed_sub, "celltypist_lung_atlas") + ggtitle("Lung Atlas")

#plot_ct(paed_sub, "celltypist_tonsil") + ggtitle("Tonsil Atlas")

plot_ct_bar <- function(obj, col) {
  k <- names(which(table(obj[[col]][,1]) > 100))
  
  df <- obj@meta.data %>%
    filter(get(col) %in% k) %>%   
    count(cluster, !!sym(col)) %>%
    group_by(cluster) %>%
    mutate(frac = n / sum(n)) %>%
    ungroup()
  
  ggplot(df, aes(x = cluster, y = frac, fill = !!sym(col))) +
    geom_col(position = "stack") +
    scale_y_continuous(labels = scales::percent) +
    labs(x = "Cluster", y = "Fraction of cells", fill = col) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}

plot_ct_bar(paed_sub, "celltypist_paeds") + ggtitle("Paed Covid Study")

plot_ct_bar(paed_sub, "celltypist_immune") + ggtitle("Immune low dataset")

plot_ct_bar(paed_sub, "celltypist_lung_airway") + ggtitle("Lung Airway study")

plot_ct_bar(paed_sub, "celltypist_lung_atlas") + ggtitle("Lung Atlas")

plot_ct_bar(paed_sub, "celltypist_tonsil") + ggtitle("Tonsil Atlas")

features <- list(
  Epithelial_core = c("EPCAM", "KRT8", "KRT18", "KRT19"),
  Basal_cells = c("KRT5", "KRT14", "TP63"),
  Ciliated_cells = c("FOXJ1", "PIFO", "DNAH5", "DNAI1"),
  Secretory_goblet = c("MUC5AC", "SPDEF", "AGR2"),
  Secretory_club = c("SCGB1A1", "SCGB3A1", "SCGB3A2"),
  Ionocytes = c("FOXI1", "CFTR", "ASCL3"),
  Tuft_cells = c("POU2F3", "TRPM5", "AVIL"),
  Squamous_epithelial = c("KRT4", "KRT13", "KRT16"),
  Cycling_epithelial = c("MKI67", "TOP2A", "HMGB2"),
  EMT_like = c("VIM", "COL1A1", "FN1")
)

for (g in features) {
  print(
    FeaturePlot(
      paed_sub,
      features  = g,
      reduction = "umap.sub",
      raster    = FALSE,
      label     = TRUE
      #split.by  = "tissue"
    ) #
  )
}

Warning: The `slot` argument of `FetchData()` is deprecated as of SeuratObject 5.0.0.
ℹ Please use the `layer` argument instead.
ℹ The deprecated feature was likely used in the Seurat package.
  Please report the issue at <https://github.com/satijalab/seurat/issues>.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
generated.

Warning: The following requested variables were not found: KRT18

for (g in features) {
  print(
    FeaturePlot(
      paed_sub,
      features  = g,
      reduction = "umap.sub",
      raster    = FALSE,
      label     = TRUE,
      split.by  = "tissue",
      min.cutoff = "q10",
      max.cutoff = "q90"
    )
  )
}

Warning: The following requested variables were not found: KRT18

Warning: All cells have the same value (0.282118808412447) of "SCGB3A1"

Warning: All cells have the same value (0.0869732528804846) of "SCGB3A2"

Warning: All cells have the same value (0.0707303427496486) of "TRPM5"

Ciliated Epithelial

ciliated <-  subset(paed_sub, idents = "Ciliated Epithelial")
keep_cells <- paed_sub$cell_labels_l1 %in% "Ciliated Epithelial"
ciliated@assays$RNA@cells@.Data <- paed_sub@assays$RNA@cells@.Data[keep_cells, ]
ciliated

An object of class Seurat 
18076 features across 24034 samples within 1 assay 
Active assay: RNA (18076 features, 2000 variable features)
 3 layers present: data, counts, scale.data
 4 dimensional reductions calculated: pca, umap.unintegrated, umap.l1, umap.sub

  ciliated <- ciliated %>%
  FindVariableFeatures() %>%
  ScaleData() %>%
  RunPCA()

Finding variable features for layer counts

Centering and scaling data matrix

Warning: Different features in new layer data than already exists for
scale.data

PC_ 1 
Positive:  CTSC, KRT7, EPAS1, CREB3L1, AQP5, ADAM28, VEGFA, SERPINB3, RHOV, RDH10 
       CXCL1, SLC6A14, CYP2F1, KLK11, SPDEF, CEACAM6, PRSS8, SCNN1B, GPRC5A, KIAA1324 
       MUC5B, NR4A1, UPK1B, S100A16, TSPAN8, ECE1, FCGBP, SLC4A4, TCIM, XBP1 
Negative:  SYT5, SCGB2A1, SLC28A2, CD200R1, EGLN3, LAPTM5, SYT8, IFITM10, IFT57, WDR97 
       ADH7, ADGRB2, HSD17B13, APOD, FAM229B, DEUP1, MORN2, ANLN, FOXN4, CDC20 
       E2F7, NMU, PLK4, NEK2, ALDH3A1, SHISA8, CDKN2A, TMEM200C, AZU1, VIM 
PC_ 2 
Positive:  ISG15, IFI6, OAS3, TAP1, BST2, IFITM3, IDO1, DUOXA2, ISG20, IFIT2 
       OAS1, CFB, DUOX2, CMPK2, XAF1, WARS, IFI44L, HERC6, OAS2, IFIT1 
       GBP1, TAP2, UBE2L6, USP18, MX1, OASL, HELZ2, MX2, STAT2, IFI44 
Negative:  MT-ND3, CHST9, SELENOP, UGT2A1, MT-CYB, C12orf75, FAM216B, DSTN, GSTA2, PNRC1 
       CASC4, CPD, C1orf141, PROS1, PTGFR, TMEM123, SEC14L3, PPIL6, EIF1, TSC22D1 
       IFT57, MXRA7, CYP51A1, CP, MT-ND2, ADH1C, MLF1, EIF4G2, RHOQ, UFM1 
PC_ 3 
Positive:  BTG3, CDC20B, TMEM106C, ANLN, USP13, POGLUT3, NEK2, MCIDAS, PLK4, FOXN4 
       HES6, E2F7, CEP78, DEUP1, CDK1, CDC20, BUB1, LYZ, HYLS1, CCNO 
       POC1A, STIL, SHISA8, RANBP9, ISYNA1, TUBG1, H2AFZ, EPS8L3, CCDC14, COCH 
Negative:  SLC34A2, G0S2, VSTM2L, ZFP36, FYB2, TDRP, CEBPD, C8orf34, KSR1, PTMS 
       BTG2, RHOB, SCGB1A1, OXTR, LBH, DDIT4, SLC9A3R2, FGF14, FCGBP, LAMB3 
       CXCL8, HES4, NUPR1, PLAUR, SCGB3A1, PAMR1, IRF7, METRN, VIM, NFKBIA 
PC_ 4 
Positive:  LRRC26, CCNO, MCIDAS, SHISA8, FOXN4, HES6, CDC20B, ANLN, CDC20, E2F7 
       CDT1, BUB1, POC1A, PLK4, ASF1B, USP13, PCSK6, WDR62, RNASE1, NEK2 
       POGLUT3, TACC3, TMEM200B, HYLS1, EZH2, CIT, COCH, TOP2A, HELLS, ISYNA1 
Negative:  FAM216B, MORN2, STOML3, AGR3, SPATS2L, LAP3, PARP14, TSPAN19, TMEM123, CASC4 
       PPIL6, B3GNT5, IFT57, HACD4, CSTB, ACTB, GBP3, CABCOCO1, C12orf75, MLF1 
       DNAJA1, APOD, NT5C3A, LAMP3, TEX26, IFIT1, MYCBP, OAS2, SLFN5, OPTN 
PC_ 5 
Positive:  SCGB3A1, FLNB, SCGB1A1, FGF14, FYB2, C8orf34, PROS1, TDRP, SLC34A2, PPP1R9A 
       METRN, CYP2F1, KCNA1, C16orf89, CD4, FMO5, VAV3, MUC5B, MSMB, TSPAN11 
       IRX2, CEACAM6, PTGER4, SPATS1, PTGFR, IRX5, NKX2-1, LRMP, LAMB3, KLK11 
Negative:  CD36, SNCA, TCIM, NOS2, SLC5A8, SLC26A2, SLC39A8, SERPINB4, FMO3, IER3 
       SGK1, ADH7, HBEGF, AREG, GDF15, HSD17B13, FOS, SLC25A25, A4GALT, PTGS2 
       ARG2, MUC2, CA12, DUSP1, NTRK2, SELP, HMGCS2, ATP13A5, EGR1, FHL2

  gc()

             used    (Mb) gc trigger    (Mb) limit (Mb)   max used    (Mb)
Ncells    4501262   240.4   10623544   567.4         NA   10623544   567.4
Vcells 1495582400 11410.4 2219594182 16934.2     163840 2219521738 16933.7

  ciliated <- RunUMAP(ciliated, dims = 1:30, reduction = "pca", reduction.name = "umap.c")

Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session

12:01:56 UMAP embedding parameters a = 0.9922 b = 1.112

12:01:56 Read 24034 rows and found 30 numeric columns

12:01:56 Using Annoy for neighbor search, n_neighbors = 30

12:01:56 Building Annoy index with metric = cosine, n_trees = 50

0%   10   20   30   40   50   60   70   80   90   100%

[----|----|----|----|----|----|----|----|----|----|

**************************************************|
12:01:57 Writing NN index file to temp file /var/folders/q8/kw1r78g12qn793xm7g0zvk94x2bh70/T//RtmpOWBeH2/filef98948c5e872
12:01:57 Searching Annoy index using 1 thread, search_k = 3000
12:02:02 Annoy recall = 100%
12:02:02 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
12:02:02 Initializing from normalized Laplacian + noise (using RSpectra)
12:02:03 Commencing optimization for 200 epochs, with 1043194 positive edges
12:02:03 Using rng type: pcg
12:02:09 Optimization finished

  meta_data_columns <- colnames(ciliated@meta.data)
  columns_to_remove <- grep("^RNA_snn_res", meta_data_columns, value = TRUE)
  ciliated@meta.data <- ciliated@meta.data[, !(colnames(ciliated@meta.data) %in% columns_to_remove)]
  resolutions <- seq(0.1, 0.5, by = 0.1)
  ciliated <- FindNeighbors(ciliated, dims = 1:30, reduction = "pca")

Computing nearest neighbor graph
Computing SNN

  ciliated <- FindClusters(ciliated, resolution = resolutions )

Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 24034
Number of edges: 932370

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9685
Number of communities: 12
Elapsed time: 1 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 24034
Number of edges: 932370

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9590
Number of communities: 14
Elapsed time: 2 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 24034
Number of edges: 932370

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9514
Number of communities: 19
Elapsed time: 2 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 24034
Number of edges: 932370

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9450
Number of communities: 22
Elapsed time: 2 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 24034
Number of edges: 932370

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9394
Number of communities: 23
Elapsed time: 2 seconds

clustree(ciliated, prefix = "RNA_snn_res.")

DimPlot(ciliated, group.by = "RNA_snn_res.0.1", reduction = "umap.c", label = TRUE, label.size = 2.5, repel = TRUE, raster = FALSE )

DimPlot(ciliated, reduction = "umap.c", label = TRUE, label.size = 2.5, repel = TRUE, raster = FALSE, split.by = "tissue" )

DimPlot(ciliated, group.by = "donor", reduction = "umap.c", label = TRUE, label.size = 2.5, repel = TRUE, raster = FALSE, split.by = "tissue" )

Warning: ggrepel: 22 unlabeled data points (too many overlaps). Consider
increasing max.overlaps

Warning: ggrepel: 18 unlabeled data points (too many overlaps). Consider
increasing max.overlaps

Warning: ggrepel: 6 unlabeled data points (too many overlaps). Consider
increasing max.overlaps

opt_res <- "RNA_snn_res.0.1"  
n <- nlevels(ciliated$RNA_snn_res.0.1)
ciliated$RNA_snn_res.0.1 <- factor(ciliated$RNA_snn_res.0.1, levels = seq(0,n-1))
ciliated$seurat_clusters <- NULL
ciliated$cluster <- ciliated$RNA_snn_res.0.1
Idents(ciliated) <- ciliated$cluster

k <- names(which(table(ciliated$celltypist_lung_airway) > 100))
ciliated$celltypist_lung2 <- ifelse(ciliated$celltypist_lung_airway %in% k,
                                  ciliated$celltypist_lung_airway, NA)

DimPlot(
    ciliated,
    group.by = "celltypist_lung2",
    reduction = "umap.c",
    label = TRUE,
    repel = TRUE,
    label.size = 2.5,
    shuffle = TRUE
)

plot_ct_bar(ciliated, "celltypist_paeds")

plot_ct_bar(ciliated, "celltypist_lung_airway")

plot_ct_bar(ciliated, "celltypist_lung_atlas")

#plot_ct_bar(ciliated, "celltypist_immune")
#plot_ct_bar(ciliated, "celltypist_tonsil")
#plot_ct_bar(ciliated, "predicted.celltype.l2") # Tonsil Atlas level 2 
plot_ct_bar(ciliated, "cell_labels_v2") # previous labels

Renaming idents

new_levels <- c(
  "0" = "Ciliated Epithelial",
  "1" = "Ciliated Epithelial",
  "2" = "Ciliated Epithelial",
   "3" = "Ciliated Epithelial",
  "4" =    "Deuterosomal",
  "5" =  "Ciliated Epithelial",
    "6" ="Ciliated Epithelial",
"7" = "Ciliated Epithelial",
"8" = "Ciliated Epithelial",
"9" = "Ciliated Epithelial",
"10" =  "Ciliated Epithelial",
"11" =  "Ciliated Epithelial"
)
ciliated <- RenameIdents(ciliated, new_levels)

ciliated$cell_labels_l2 <- Idents(ciliated)
#ciliated$cell_labels_l3 <- Idents(ciliated)

table(ciliated$cell_labels_l2, ciliated$tissue)

                     
                      Tonsils Adenoids Nasal_brushings Bronchial_brushings
  Ciliated Epithelial      53      236           12633                7369
  Deuterosomal              0        8            2063                  68
                     
                        BAL
  Ciliated Epithelial  1586
  Deuterosomal           18

df_counts <- ciliated@meta.data |>
  count(cell_labels_l2, tissue, name = "n")
df_totals <- df_counts |>
  group_by(cell_labels_l2) |>
  summarise(total = sum(n), .groups = "drop")

ggplot(df_counts, aes(x = cell_labels_l2, y = n, fill = tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(data = df_totals,
            aes(x = cell_labels_l2, y = total, label = total),
            vjust = -0.3, size = 3, inherit.aes = FALSE) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(x = "cluster at l2",
       y = "Number of cells",
       fill = "Tissue")

ciliated <- FindSubCluster(ciliated, cluster = "Ciliated Epithelial", graph.name = "RNA_snn", resolution = 0.1, subcluster.name = "E_sub")

Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 21877
Number of edges: 858245

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9702
Number of communities: 11
Elapsed time: 2 seconds

Idents(ciliated) <- ciliated$E_sub

#levels(Idents(paed_sub)) <- mixedsort(levels(Idents(paed_sub)))
#levels(paed_sub$T_sub) <- mixedsort(levels(paed_sub$T_sub))
DimPlot(ciliated, reduction = "umap.c", raster =F, repel=T, label = T, group.by = "E_sub")

ciliated$cell_labels_l3 <- ciliated$E_sub

Idents(ciliated) <- factor(
  Idents(ciliated),
  levels = c(
    "Deuterosomal",
    "Ciliated Epithelial_0",
    "Ciliated Epithelial_1",
    "Ciliated Epithelial_2",
    "Ciliated Epithelial_3",
    "Ciliated Epithelial_4",
    "Ciliated Epithelial_5",
    "Ciliated Epithelial_6",
    "Ciliated Epithelial_7",
    "Ciliated Epithelial_8",
    "Ciliated Epithelial_9",
    "Ciliated Epithelial_10"
  )
)
ciliated$cell_labels_l3 <- Idents(ciliated)

DE markers

The marker genes for this subclustering can be found here-

Ciliated_population_subclusters

features.no.mt <- rownames(paed_sub)[!grepl("^MT-", rownames(paed_sub))]
gc()

             used    (Mb) gc trigger    (Mb) limit (Mb)   max used    (Mb)
Ncells    4544925   242.8   10623544   567.4         NA   10623544   567.4
Vcells 1493661024 11395.8 2219594182 16934.2     163840 2219521738 16933.7

paed_sub.markers <- FindAllMarkers(
  ciliated,
  only.pos = TRUE,
  min.pct = 0.25,
  logfc.threshold = 0.25,
  features = features.no.mt
)
gc()

             used    (Mb) gc trigger    (Mb) limit (Mb)   max used    (Mb)
Ncells    4621613   246.9   10623544   567.4         NA   10623544   567.4
Vcells 1493997208 11398.3 3196391621 24386.6     163840 3196082098 24384.2

top10 <- paed_sub.markers %>%
    group_by(cluster) %>%
    top_n(n = 10, wt = avg_log2FC) %>%
    ungroup() %>%
    distinct(gene, .keep_all = TRUE) %>%
    arrange(cluster, desc(avg_log2FC))


best.wilcox.gene.per.cluster <- paed_sub.markers |>
  group_by(cluster) |>
  slice_max(avg_log2FC, n = 1) |>
  select(cluster, gene)
best.wilcox.gene.per.cluster

# A tibble: 12 × 2
# Groups:   cluster [12]
   cluster                gene    
   <fct>                  <chr>   
 1 Deuterosomal           LINGO1  
 2 Ciliated Epithelial_0  G0S2    
 3 Ciliated Epithelial_1  LGI1    
 4 Ciliated Epithelial_2  HSD17B13
 5 Ciliated Epithelial_3  LY6D    
 6 Ciliated Epithelial_4  OASL    
 7 Ciliated Epithelial_5  KCNA1   
 8 Ciliated Epithelial_6  PPARGC1A
 9 Ciliated Epithelial_7  STATH   
10 Ciliated Epithelial_8  NDRG1   
11 Ciliated Epithelial_9  SLAMF7  
12 Ciliated Epithelial_10 SELP

table(ciliated$cell_labels_l3)


          Deuterosomal  Ciliated Epithelial_0  Ciliated Epithelial_1 
                  2157                   4535                   3611 
 Ciliated Epithelial_2  Ciliated Epithelial_3  Ciliated Epithelial_4 
                  3124                   2261                   1668 
 Ciliated Epithelial_5  Ciliated Epithelial_6  Ciliated Epithelial_7 
                  1639                   1183                   1086 
 Ciliated Epithelial_8  Ciliated Epithelial_9 Ciliated Epithelial_10 
                  1044                    870                    856

cluster_colors <- paletteer::paletteer_d("pals::glasbey")[factor(top10$cluster)]

DotPlot(ciliated,    
        features = unique(top10$gene),
        group.by = "E_sub",
        cols = c("azure1", "blueviolet"),
        dot.scale = 3, assay = "RNA") +
    RotatedAxis() +
    FontSize(y.text = 8, x.text = 12) +
    labs(y = element_blank(), x = element_blank()) +
    coord_flip() +
    theme(axis.text.y = element_text(color = cluster_colors)) +
    ggtitle("Top 10 marker genes per cluster (Seurat)")

Warning: Vectorized input to `element_text()` is not officially supported.
ℹ Results may be unexpected or may change in future versions of ggplot2.

out_markers <- here("output",
            "CSV_v3","Epithelial_lineage",
            paste("Marker_genes_Reclustered_ciliated.",opt_res, sep = ""))

dir.create(out_markers, recursive = TRUE, showWarnings = FALSE)

for (cl in unique(paed_sub.markers$cluster)) {
  cluster_data <- paed_sub.markers %>% dplyr::filter(cluster == cl)
  file_name <- here(out_markers, paste0("G000231_Neeland_", cl, ".csv"))
  if (!file.exists(file_name)) {
  write.csv(cluster_data, file = file_name)
  }
}

Pairwise DE testing for Ciliated Epithelial cells

ids <- unique(ciliated$cell_labels_l3)
res_list <- list()

for (i in 1:(length(ids) - 1)) {
  for (j in (i + 1):length(ids)) {

    id1 <- ids[i]
    id2 <- ids[j]

    res <- FindMarkers(
      ciliated,
      ident.1 = id1,
      ident.2 = id2,
      only.pos = TRUE,
      min.pct = 0.25,
      logfc.threshold = 0.5,
      features = features.no.mt
    )

    top_genes <- res %>%
      filter(pct.1 > 0.1 & pct.2 > 0.1) %>%
      arrange(p_val_adj, desc(avg_log2FC)) %>%
      head(10) %>%
      rownames()

    res_list[[paste(id1, "vs", id2, sep = "_")]] <- res

    cat("\n\n####", paste(id1, "vs", id2), "\n\n")

    print(
      VlnPlot(
        ciliated,
        features = top_genes,
        idents = c(id1, id2),
        pt.size = 0
      ) + patchwork::plot_annotation(title = paste(id1, "vs", id2))
    )

    kable(
      head(res, 50),
      caption = paste("Top 50 markers for", id1, "vs", id2)
    )
  }
}

Ciliated Epithelial_8 vs Ciliated Epithelial_3

Ciliated Epithelial_8 vs Ciliated Epithelial_1

Ciliated Epithelial_8 vs Ciliated Epithelial_4

Ciliated Epithelial_8 vs Deuterosomal

Ciliated Epithelial_8 vs Ciliated Epithelial_0

Ciliated Epithelial_8 vs Ciliated Epithelial_2

Ciliated Epithelial_8 vs Ciliated Epithelial_7

Ciliated Epithelial_8 vs Ciliated Epithelial_6

Ciliated Epithelial_8 vs Ciliated Epithelial_5

Ciliated Epithelial_8 vs Ciliated Epithelial_10

Ciliated Epithelial_8 vs Ciliated Epithelial_9

Ciliated Epithelial_3 vs Ciliated Epithelial_1

Ciliated Epithelial_3 vs Ciliated Epithelial_4

Ciliated Epithelial_3 vs Deuterosomal

Ciliated Epithelial_3 vs Ciliated Epithelial_0

Ciliated Epithelial_3 vs Ciliated Epithelial_2

Ciliated Epithelial_3 vs Ciliated Epithelial_7

Ciliated Epithelial_3 vs Ciliated Epithelial_6

Ciliated Epithelial_3 vs Ciliated Epithelial_5

Ciliated Epithelial_3 vs Ciliated Epithelial_10

Ciliated Epithelial_3 vs Ciliated Epithelial_9

Ciliated Epithelial_1 vs Ciliated Epithelial_4

Ciliated Epithelial_1 vs Deuterosomal

Ciliated Epithelial_1 vs Ciliated Epithelial_0

Ciliated Epithelial_1 vs Ciliated Epithelial_2

Ciliated Epithelial_1 vs Ciliated Epithelial_7

Ciliated Epithelial_1 vs Ciliated Epithelial_6

Ciliated Epithelial_1 vs Ciliated Epithelial_5

Ciliated Epithelial_1 vs Ciliated Epithelial_10

Ciliated Epithelial_1 vs Ciliated Epithelial_9

Ciliated Epithelial_4 vs Deuterosomal

Ciliated Epithelial_4 vs Ciliated Epithelial_0

Ciliated Epithelial_4 vs Ciliated Epithelial_2

Ciliated Epithelial_4 vs Ciliated Epithelial_7

Ciliated Epithelial_4 vs Ciliated Epithelial_6

Ciliated Epithelial_4 vs Ciliated Epithelial_5

Ciliated Epithelial_4 vs Ciliated Epithelial_10

Ciliated Epithelial_4 vs Ciliated Epithelial_9

Deuterosomal vs Ciliated Epithelial_0

Deuterosomal vs Ciliated Epithelial_2

Deuterosomal vs Ciliated Epithelial_7

Deuterosomal vs Ciliated Epithelial_6

Deuterosomal vs Ciliated Epithelial_5

Deuterosomal vs Ciliated Epithelial_10

Deuterosomal vs Ciliated Epithelial_9

Ciliated Epithelial_0 vs Ciliated Epithelial_2

Ciliated Epithelial_0 vs Ciliated Epithelial_7

Ciliated Epithelial_0 vs Ciliated Epithelial_6

Ciliated Epithelial_0 vs Ciliated Epithelial_5

Ciliated Epithelial_0 vs Ciliated Epithelial_10

Ciliated Epithelial_0 vs Ciliated Epithelial_9

Ciliated Epithelial_2 vs Ciliated Epithelial_7

Ciliated Epithelial_2 vs Ciliated Epithelial_6

Ciliated Epithelial_2 vs Ciliated Epithelial_5

Ciliated Epithelial_2 vs Ciliated Epithelial_10

Ciliated Epithelial_2 vs Ciliated Epithelial_9

Ciliated Epithelial_7 vs Ciliated Epithelial_6

Ciliated Epithelial_7 vs Ciliated Epithelial_5

Ciliated Epithelial_7 vs Ciliated Epithelial_10

Ciliated Epithelial_7 vs Ciliated Epithelial_9

Ciliated Epithelial_6 vs Ciliated Epithelial_5

Ciliated Epithelial_6 vs Ciliated Epithelial_10

Ciliated Epithelial_6 vs Ciliated Epithelial_9

Ciliated Epithelial_5 vs Ciliated Epithelial_10

Ciliated Epithelial_5 vs Ciliated Epithelial_9

Ciliated Epithelial_10 vs Ciliated Epithelial_9

Non-ciliated subclustering

non_ciliated <-  subset(paed_sub, idents = "Non-ciliated Epithelial")
keep_cells <- paed_sub$cell_labels_l1 %in% "Non-ciliated Epithelial"
non_ciliated@assays$RNA@cells@.Data <- paed_sub@assays$RNA@cells@.Data[keep_cells, ]
non_ciliated

An object of class Seurat 
18076 features across 36140 samples within 1 assay 
Active assay: RNA (18076 features, 2000 variable features)
 3 layers present: data, counts, scale.data
 4 dimensional reductions calculated: pca, umap.unintegrated, umap.l1, umap.sub

  non_ciliated <- non_ciliated %>%
  FindVariableFeatures() %>%
  ScaleData() %>%
  RunPCA()

Finding variable features for layer counts

Centering and scaling data matrix

Warning: Different features in new layer data than already exists for
scale.data

PC_ 1 
Positive:  ALOX15, FAT2, MT1X, MKI67, MT-ATP6, UGT2A1, KRT5, SNCA, TOP2A, CD44 
       TNC, KIFC1, NUSAP1, TYMS, BIRC5, FOXM1, CIT, TPX2, PHF19, SPC24 
       STMN1, MYBL2, RBP1, AURKB, HLF, HJURP, KIF2C, NOTCH1, TACC3, CDK1 
Negative:  DUOXA2, DUOX2, C15orf48, ISG15, SPNS2, SECTM1, OAS1, GPRC5A, LCN2, DUSP5 
       CMPK2, PDZK1IP1, MX2, CFB, IDO1, RSAD2, WARS, LMO7, OAS3, OAS2 
       OASL, ISG20, PRSS22, GBP1, CEACAM5, IFIT1, CD38, IFI6, CXCL10, IFIT2 
PC_ 2 
Positive:  BPIFB1, MUC5AC, TSPAN8, LYZ, PSCA, MSMB, PROM1, BPIFA1, TFF3, MSLN 
       FCGBP, RAB37, CEACAM5, RNASE4, PTPRN2, ANG, SERPINB11, PDE8B, CHAD, SYT7 
       LCN2, HMGCS2, AZGP1, TFF1, FBP1, SLC9A3, CAPN9, ANPEP, GSTA1, BDKRB2 
Negative:  MKI67, NUSAP1, TOP2A, KIFC1, TK1, TPX2, AURKB, BIRC5, IQGAP3, KIF2C 
       FOXM1, TYMS, SPC24, CIT, RRM2, ANLN, HIST1H1B, MYBL2, BUB1, HJURP 
       CDK1, ASF1B, KIF23, TACC3, TROAP, CCNB2, STMN1, LMNB1, ZWINT, PHF19 
PC_ 3 
Positive:  KRT78, MAL, TMPRSS11B, ECM1, PRSS27, TMPRSS11E, RHCG, SPRR3, KRT13, IL36A 
       RNASE7, SCEL, HSPB8, SPRR1B, HOPX, CRNN, NCCRP1, CNFN, A2ML1, CAMK2N1 
       EMP1, SPRR2A, KRT80, CRCT1, SPINK5, SPRR2D, S100A8, VSIG10L, S100A12, MUC21 
Negative:  IDO1, WARS, IFI6, IFITM3, TAP1, TRIM22, IFI44L, MX1, LAP3, IFI44 
       XAF1, ISG20, OAS2, IFIT2, PARP14, MX2, IFIT1, CMPK2, MT2A, IFIT3 
       APOL1, HERC6, GBP1, IFI30, TAP2, OAS3, GBP4, SERPING1, CXCL10, CX3CL1 
PC_ 4 
Positive:  BPIFB1, ALPL, LCN2, CEACAM6, MUC5AC, CEACAM5, MACC1, STEAP4, MSLN, MSMB 
       RAB37, LYZ, PSCA, PI3, ALDH3B1, FBP1, C15orf48, SLURP2, NQO1, FCGBP 
       CFB, CD24, BPIFA1, PROM1, RNASE4, MAL2, RAMP1, ME1, AZGP1, TSPAN8 
Negative:  TNC, NUPR1, KRT5, TNS4, KRT17, S100A2, CX3CL1, FOSB, EGR1, CXCL11 
       BTG2, HBEGF, FGFR3, FAT2, NOTCH1, SERPINB5, CXCL10, IGFBP5, ATF3, KRT6A 
       JUN, SOCS1, ZFP36, CLCA4, EGR3, TRIB1, ARL4D, HAPLN3, CD44, BCAM 
PC_ 5 
Positive:  IFI6, TMPRSS11B, MAL, ECM1, MUC5B, RNASE7, RHCG, KRT78, ISG15, CRNN 
       IL36A, PRSS27, CSF3, SLC9A3, XAF1, EPS8L1, MT2A, CRCT1, TFF3, OAS3 
       TMPRSS11E, CIITA, TNFAIP2, AURKB, IQGAP3, CAMK2N1, NUSAP1, VNN3, MKI67, MX2 
Negative:  PRDX1, ADH7, MAL2, ANXA1, SGK1, UGT2A1, AKR1C3, UPK1B, GCLC, DSTN 
       CSTA, TPD52L1, TM4SF1, DSG2, TSC22D1, AKR1C2, CASC4, GPX2, CCDC80, MT-ATP6 
       RCN1, FABP5, HSPA1B, ATP1B3, ALOX15, ASS1, S100A4, MT-ND3, PRKAR2B, GCLM

  gc()

             used    (Mb) gc trigger    (Mb) limit (Mb)   max used    (Mb)
Ncells    4730528   252.7   15705738   838.8         NA   15705738   838.8
Vcells 2016416182 15384.1 3196391621 24386.6     163840 3196082098 24384.2

  non_ciliated <- RunUMAP(non_ciliated, dims = 1:30, reduction = "pca", reduction.name = "umap.nc")

12:06:58 UMAP embedding parameters a = 0.9922 b = 1.112

12:06:58 Read 36140 rows and found 30 numeric columns

12:06:58 Using Annoy for neighbor search, n_neighbors = 30

12:06:58 Building Annoy index with metric = cosine, n_trees = 50

0%   10   20   30   40   50   60   70   80   90   100%

[----|----|----|----|----|----|----|----|----|----|

**************************************************|
12:07:00 Writing NN index file to temp file /var/folders/q8/kw1r78g12qn793xm7g0zvk94x2bh70/T//RtmpOWBeH2/filef989591187b1
12:07:00 Searching Annoy index using 1 thread, search_k = 3000
12:07:07 Annoy recall = 100%
12:07:07 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
12:07:08 Initializing from normalized Laplacian + noise (using RSpectra)
12:07:08 Commencing optimization for 200 epochs, with 1562686 positive edges
12:07:08 Using rng type: pcg
12:07:18 Optimization finished

  meta_data_columns <- colnames(non_ciliated@meta.data)
  columns_to_remove <- grep("^RNA_snn_res", meta_data_columns, value = TRUE)
  non_ciliated@meta.data <- non_ciliated@meta.data[, !(colnames(non_ciliated@meta.data) %in% columns_to_remove)]
  resolutions <- seq(0.1, 0.5, by = 0.1)
  non_ciliated <- FindNeighbors(non_ciliated, dims = 1:30, reduction = "pca")

Computing nearest neighbor graph
Computing SNN

  non_ciliated <- FindClusters(non_ciliated, resolution = resolutions )

Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 36140
Number of edges: 1240015

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9513
Number of communities: 11
Elapsed time: 3 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 36140
Number of edges: 1240015

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9347
Number of communities: 13
Elapsed time: 4 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 36140
Number of edges: 1240015

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9235
Number of communities: 15
Elapsed time: 4 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 36140
Number of edges: 1240015

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9142
Number of communities: 18
Elapsed time: 4 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 36140
Number of edges: 1240015

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9064
Number of communities: 21
Elapsed time: 4 seconds

clustree(non_ciliated, prefix = "RNA_snn_res.")

DimPlot(non_ciliated, group.by = "RNA_snn_res.0.1", reduction = "umap.nc", label = TRUE, label.size = 2.5, repel = TRUE, raster = FALSE )

DimPlot(non_ciliated, reduction = "umap.nc", label = TRUE, label.size = 2.5, repel = TRUE, raster = FALSE, split.by = "tissue" )

#DimPlot(non_ciliated, group.by = "donor", reduction = "umap.nc", label = TRUE, label.size = 2.5, repel = TRUE, raster = FALSE, split.by = "tissue" )

opt_res <- "RNA_snn_res.0.1"  
n <- nlevels(non_ciliated$RNA_snn_res.0.1)
non_ciliated$RNA_snn_res.0.1 <- factor(non_ciliated$RNA_snn_res.0.1, levels = seq(0,n-1))
non_ciliated$seurat_clusters <- NULL
non_ciliated$cluster <- non_ciliated$RNA_snn_res.0.1
Idents(non_ciliated) <- non_ciliated$cluster

k <- names(which(table(non_ciliated$celltypist_lung_airway) > 150))
non_ciliated$celltypist_lung2 <- ifelse(non_ciliated$celltypist_lung_airway %in% k,
                                  non_ciliated$celltypist_lung_airway, NA)

DimPlot(
    non_ciliated,
    group.by = "celltypist_lung2",
    reduction = "umap.nc",
    label = TRUE,
    repel = TRUE,
    label.size = 2.5,
    shuffle = TRUE
)

plot_ct_bar(non_ciliated, "celltypist_paeds")

plot_ct_bar(non_ciliated, "celltypist_lung_airway")

plot_ct_bar(non_ciliated, "celltypist_lung_atlas")

plot_ct_bar(non_ciliated, "celltypist_immune")

plot_ct_bar(non_ciliated, "celltypist_tonsil")

#plot_ct_bar(non_ciliated, "predicted.celltype.l2") # Tonsil Atlas level 2 
plot_ct_bar(non_ciliated, "cell_labels_v2") # previous labels

Checking for known markers-

features <- list(

  ## Basal lineage
  Basal_core = c("KRT5", "KRT14", "TP63", "KRT15", "ITGA6"),
  Basal_KHDRBS2 = c("KHDRBS2", "COL17A1", "ITGA6"),
  Basal_cycling = c("MKI67", "TOP2A", "CENPF"),

  ## Squamous / Hillock
  Hillock = c("KRT13", "KRT4", "CLCA2"),
  Squamous = c("KRT13", "KRT4", "IVL", "SPRR1A", "SPRR2A"),

  ## Secretory lineage
  Secretory = c("SCGB1A1", "KRT8", "KRT18"),
  Secretory_GALNT4 = c("GALNT4", "SCGB1A1"),
  Duct = c("KRT19", "KRT8"),
  Goblet = c("MUC5AC", "SPDEF", "AGR2"),
  Goblet_BPIFA2 = c("BPIFA2", "MUC5AC"),
  Goblet_PLAU = c("PLAU", "MUC5AC"),
  Goblet_inflammatory = c("CXCL8", "IL1B", "MUC5AC"),
  Transit_epithelial = c("KRT8", "KRT18", "LGALS3"),

  ## Rare epithelial types
  Ionocyte = c("FOXI1", "CFTR", "ASCL3"),
  Brush = c("POU2F3", "TRPM5", "GFI1B")
)

for (g in features) {
  print(
    FeaturePlot(
      non_ciliated,
      features = g,
      reduction = "umap.nc",
      raster = FALSE, repel = T, label = T, ncol = 3
    )
  )
}

Renaming idents

new_levels <- c(
  "0" = "Basal_Club_Goblet",
  "1" = "Basal_Club_Goblet",
  "2" = "Basal_Club_Goblet",
   "3" = "Basal_Club_Goblet",
  "4" =    "Proliferating Epithelial",
  "5" =  "Basal_Club_Goblet",
    "6" ="Basal_Club_Goblet",
"7" = "Duct",
"8" = "Squamous",
"9" = "Ionocytes",
"10" =  "Squamous"
)
non_ciliated <- RenameIdents(non_ciliated, new_levels)
non_ciliated$cell_labels_l2 <- Idents(non_ciliated)
non_ciliated$cell_labels_l2 <- factor(
  Idents(non_ciliated),
  levels = c(
    "Basal_Club_Goblet",
    "Proliferating Epithelial",
    "Duct",
    "Squamous",
    "Ionocytes"
  )
)

non_ciliated$cell_labels_l3 <- non_ciliated$cell_labels_l2
Idents(non_ciliated) <- non_ciliated$cell_labels_l2

table(non_ciliated$cell_labels_l2, non_ciliated$tissue)

                          
                           Tonsils Adenoids Nasal_brushings Bronchial_brushings
  Basal_Club_Goblet             84       59           31921                 119
  Proliferating Epithelial       1        3            1961                  88
  Duct                           0        0             976                   4
  Squamous                     212       37              51                 199
  Ionocytes                      0        0             173                  13
                          
                             BAL
  Basal_Club_Goblet           27
  Proliferating Epithelial     3
  Duct                         2
  Squamous                   206
  Ionocytes                    1

df_counts <- non_ciliated@meta.data |>
  count(cell_labels_l2, tissue, name = "n")
df_totals <- df_counts |>
  group_by(cell_labels_l2) |>
  summarise(total = sum(n), .groups = "drop")

ggplot(df_counts, aes(x = cell_labels_l2, y = n, fill = tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(data = df_totals,
            aes(x = cell_labels_l2, y = total, label = total),
            vjust = -0.3, size = 3, inherit.aes = FALSE) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(x = "cluster at l2",
       y = "Number of cells",
       fill = "Tissue")

non_ciliated@meta.data %>%
    count(cell_labels_l2, donor) %>%
    group_by(cell_labels_l2) %>%
    mutate(prop = n / sum(n)) %>%  # proportion per L2
    ungroup() %>%
    ggplot(aes(x = cell_labels_l2, y = prop, fill = donor)) +
    geom_col(width = 0.9) +
    scale_y_continuous(labels = scales::percent) +
    scale_fill_viridis_d(option = "turbo", guide = guide_legend(ncol = 2)) +
    theme_classic() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    labs(
        x = NULL,
        y = "Proportion",
        fill = "Donor",
        title = "Donor composition per cluster"
    )

Subclustering basal/club/goblet

non_ciliated <- FindSubCluster(non_ciliated, cluster = "Basal_Club_Goblet", graph.name = "RNA_snn", resolution = 0.1, subcluster.name = "E_sub")

Idents(non_ciliated) <- non_ciliated$E_sub
non_ciliated$cell_labels_l3 <- non_ciliated$E_sub

non_ciliated$cell_labels_l3 <- factor(
  non_ciliated$cell_labels_l3,
  levels = c(
    "Basal_Club_Goblet_0",
    "Basal_Club_Goblet_1",
    "Basal_Club_Goblet_2",
    "Basal_Club_Goblet_3",
    "Basal_Club_Goblet_4",
    "Basal_Club_Goblet_5",
    "Proliferating Epithelial",
    "Duct",
    "Squamous",
    "Ionocytes"
  )
)

Idents(non_ciliated) <- non_ciliated$cell_labels_l3

DimPlot(non_ciliated, reduction = "umap.nc", raster =F, repel=T, label = T)

DE markers Non-ciliated

The marker genes for this subclustering can be found here-

Non-ciliated_population_subclusters

features.no.mt <- rownames(paed_sub)[!grepl("^MT-", rownames(paed_sub))]
gc()

             used    (Mb) gc trigger    (Mb) limit (Mb)   max used    (Mb)
Ncells    4719955   252.1   15705738   838.8         NA   15705738   838.8
Vcells 2016983967 15388.4 3196391621 24386.6     163840 3196082098 24384.2

paed_sub.markers <- FindAllMarkers(
  non_ciliated,
  only.pos = TRUE,
  min.pct = 0.25,
  logfc.threshold = 0.25,
  features = features.no.mt
)
gc()

             used    (Mb) gc trigger    (Mb) limit (Mb)   max used    (Mb)
Ncells    4704943   251.3   15705738   838.8         NA   15705738   838.8
Vcells 2016826519 15387.2 3835749945 29264.5     163840 3825797871 29188.6

top10 <- paed_sub.markers %>%
    group_by(cluster) %>%
    top_n(n = 10, wt = avg_log2FC) %>%
    ungroup() %>%
    distinct(gene, .keep_all = TRUE) %>%
    arrange(cluster, desc(avg_log2FC))


best.wilcox.gene.per.cluster <- paed_sub.markers |>
  group_by(cluster) |>
  slice_max(avg_log2FC, n = 1) |>
  select(cluster, gene)
best.wilcox.gene.per.cluster

# A tibble: 5 × 2
# Groups:   cluster [5]
  cluster                  gene   
  <fct>                    <chr>  
1 Basal_Club_Goblet        CCDC190
2 Proliferating Epithelial SPC25  
3 Duct                     DMBT1  
4 Squamous                 IL36A  
5 Ionocytes                PDE1C

FeaturePlot(
  non_ciliated,
  features = best.wilcox.gene.per.cluster$gene,
  reduction = "umap.nc",
  raster = FALSE,
  repel = TRUE,
  ncol = 1,
  label = TRUE,
  split.by = "tissue"
)

## Seurat top markers
cluster_colors <- paletteer::paletteer_d("pals::glasbey")[factor(top10$cluster)]

DotPlot(non_ciliated,    
        features = unique(top10$gene),
        group.by = "cell_labels_l3",
        cols = c("azure1", "blueviolet"),
        dot.scale = 3, assay = "RNA") +
    RotatedAxis() +
    FontSize(y.text = 8, x.text = 12) +
    labs(y = element_blank(), x = element_blank()) +
    coord_flip() +
    theme(axis.text.y = element_text(color = cluster_colors)) +
    ggtitle("Top 10 marker genes per cluster (Seurat)")

marker_counts <- paed_sub.markers %>%
    filter(avg_log2FC > 1 & p_val_adj < 0.05) %>%  
    count(cluster, name = "n_markers") %>%
    arrange(desc(n_markers))

print(marker_counts)

                   cluster n_markers
1 Proliferating Epithelial       595
2                 Squamous       532
3                Ionocytes       258
4                     Duct        96
5        Basal_Club_Goblet        39

ggplot(marker_counts, aes(x = reorder(cluster, n_markers), y = n_markers)) +
    geom_col(fill = "steelblue", alpha = 0.8) +
    coord_flip() +
    labs(title = "High-confidence markers per subcluster (logFC > 1)",
         x = "Subcluster", y = "Number of markers") +
    theme_minimal()

## Pairwise DE markers (Non-ciliated) {.tabset}

ids <- unique(non_ciliated$cell_labels_l3)
res_list <- list()

for (i in 1:(length(ids) - 1)) {
  for (j in (i + 1):length(ids)) {

    id1 <- ids[i]
    id2 <- ids[j]

    res <- FindMarkers(
      ciliated,
      ident.1 = id1,
      ident.2 = id2,
      only.pos = TRUE,
      min.pct = 0.25,
      logfc.threshold = 0.5,
      features = features.no.mt
    )

    top_genes <- res %>%
      filter(pct.1 > 0.1 & pct.2 > 0.1) %>%
      arrange(p_val_adj, desc(avg_log2FC)) %>%
      head(10) %>%
      rownames()

    res_list[[paste(id1, "vs", id2, sep = "_")]] <- res

    cat("\n\n##", paste(id1, "vs", id2), "\n\n")

    print(
      VlnPlot(
        ciliated,
        features = top_genes,
        idents = c(id1, id2),
        pt.size = 0
      ) + patchwork::plot_annotation(title = paste(id1, "vs", id2))
    )

    kable(
      head(res, 50),
      caption = paste("Top 50 markers for", id1, "vs", id2)
    )
  }
}

out_markers <- here("output",
            "CSV_v3","Epithelial_lineage",
            paste("Marker_genes_Reclustered_Non-ciliated.",opt_res, sep = ""))

dir.create(out_markers, recursive = TRUE, showWarnings = FALSE)

for (cl in unique(paed_sub.markers$cluster)) {
  cluster_data <- paed_sub.markers %>% dplyr::filter(cluster == cl)
  file_name <- here(out_markers, paste0("G000231_Neeland_", cl, ".csv"))
  if (!file.exists(file_name)) {
  write.csv(cluster_data, file = file_name)
  }
}

Overview of l1,l2 and l3

paed_sub$cell_labels_l2 <- as.character(paed_sub$cell_labels_l1)
paed_sub$cell_labels_l3 <- as.character(paed_sub$cell_labels_l2)

for (obj in list(ciliated, non_ciliated)) {
  cells <- intersect(colnames(paed_sub), colnames(obj))
  paed_sub$cell_labels_l2[cells] <- as.character(obj$cell_labels_l2[cells])
  paed_sub$cell_labels_l3[cells] <- as.character(obj$cell_labels_l3[cells])
}

paed_sub$cell_labels_l2 <- factor(paed_sub$cell_labels_l2)
paed_sub$cell_labels_l3 <- factor(paed_sub$cell_labels_l3)

## Seurat top markers
cluster_colors <- paletteer::paletteer_d("pals::glasbey")[factor(top10$cluster)]

DotPlot(paed_sub,    
        features = unique(top10$gene),
        group.by = "cell_labels_l3",
        cols = c("azure1", "blueviolet"),
        dot.scale = 3, assay = "RNA") +
    RotatedAxis() +
    FontSize(y.text = 8, x.text = 12) +
    labs(y = element_blank(), x = element_blank()) +
    coord_flip() +
    theme(axis.text.y = element_text(color = cluster_colors)) +
    ggtitle("Top 10 marker genes per cluster (Seurat)")

Warning: Vectorized input to `element_text()` is not officially supported.
ℹ Results may be unexpected or may change in future versions of ggplot2.

DimPlot(paed_sub, reduction = "umap.sub", raster = F, repel = T, label = T, group.by = "lineage")

DimPlot(paed_sub, reduction = "umap.sub", raster = F, repel = T, label = T, group.by = "cell_labels_l1")

paed_sub$cell_labels_l2 <- factor(
  paed_sub$cell_labels_l2,
  levels = c(
    "Basal_Club_Goblet",
    "Proliferating Epithelial",
    "Ciliated Epithelial",
    "Deuterosomal",
    "Secretory Epithelial",
    "Duct",
    "Squamous",
    "Ionocytes",
    "Erythroid cells"
  )
)
DimPlot(paed_sub, reduction = "umap.sub", raster = F, repel = T, label = T, group.by = "cell_labels_l2")

DimPlot(paed_sub, reduction = "umap.sub", raster = F, repel = T, label = T, group.by = "cell_labels_l2", split.by = "tissue")

paed_sub$cell_labels_l3 <- factor(
  paed_sub$cell_labels_l3,
  levels = c(
    "Basal_Club_Goblet_0","Basal_Club_Goblet_1","Basal_Club_Goblet_2",
    "Basal_Club_Goblet_3","Basal_Club_Goblet_4","Basal_Club_Goblet_5",
    "Proliferating Epithelial",
    "Ciliated Epithelial_0","Ciliated Epithelial_1","Ciliated Epithelial_2",
    "Ciliated Epithelial_3","Ciliated Epithelial_4","Ciliated Epithelial_5",
    "Ciliated Epithelial_6","Ciliated Epithelial_7","Ciliated Epithelial_8",
    "Ciliated Epithelial_9","Ciliated Epithelial_10",
    "Deuterosomal",
    "Secretory Epithelial",
    "Duct",
    "Squamous",
    "Ionocytes",
    "Erythroid cells"
  )
)
DimPlot(paed_sub, reduction = "umap.sub", raster = F, repel = T, label = T, group.by = "cell_labels_l3")

paed_sub@meta.data %>%
    count(cell_labels_l3, donor) %>%
    group_by(cell_labels_l3) %>%
    mutate(prop = n / sum(n)) %>%  # proportion per L3
    ungroup() %>%
    ggplot(aes(x = cell_labels_l3, y = prop, fill = donor)) +
    geom_col(width = 0.9) +
    scale_y_continuous(labels = scales::percent) +
    scale_fill_viridis_d(option = "turbo", guide = guide_legend(ncol = 2)) +
    theme_classic() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    labs(
        x = NULL,
        y = "Proportion",
        fill = "Donor",
        title = "Donor composition per L3 cell type"
    )

Save RDS

#saveRDS(paed_sub, "../EarlyAir_paper/output/RDS/Annotated_lineages/SEU_Epithelial_annotated.rds")

Some notes/observations from Epithelial cell lineage- Total Epithelial cells- 67,493 1. Majority of the samples are from Nasal_brushings 2. Ciliated cell subclustering gives distinct clusters 3. No clear distinction between Basal/Club/Goblet cells

Session Info

sessionInfo()

R version 4.3.3 (2024-02-29)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 15.7.3

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Melbourne
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] future_1.40.0       gtools_3.9.5        presto_1.0.0       
 [4] circlize_0.4.16     BiocParallel_1.36.0 harmony_1.2.3      
 [7] Rcpp_1.0.14         patchwork_1.3.0     cowplot_1.1.3      
[10] data.table_1.17.2   RColorBrewer_1.1-3  kableExtra_1.4.0   
[13] ggridges_0.5.6      ggforce_0.4.2       viridis_0.6.5      
[16] viridisLite_0.4.2   paletteer_1.6.0     clustree_0.5.1     
[19] ggraph_2.2.1        Seurat_5.0.3        SeuratObject_5.1.0 
[22] sp_2.2-0            here_1.0.1          lubridate_1.9.4    
[25] forcats_1.0.0       stringr_1.5.1       dplyr_1.1.4        
[28] purrr_1.0.4         readr_2.1.5         tidyr_1.3.1        
[31] tibble_3.2.1        ggplot2_3.5.2       tidyverse_2.0.0    
[34] BiocStyle_2.30.0    workflowr_1.7.1    

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22       splines_4.3.3          later_1.4.2           
  [4] prismatic_1.1.2        polyclip_1.10-7        fastDummies_1.7.5     
  [7] lifecycle_1.0.4        rprojroot_2.0.4        globals_0.17.0        
 [10] processx_3.8.6         lattice_0.22-5         MASS_7.3-60.0.1       
 [13] backports_1.5.0        magrittr_2.0.3         limma_3.58.1          
 [16] plotly_4.10.4          sass_0.4.10            rmarkdown_2.29        
 [19] jquerylib_0.1.4        yaml_2.3.10            httpuv_1.6.16         
 [22] sctransform_0.4.2      spam_2.11-1            spatstat.sparse_3.1-0 
 [25] reticulate_1.42.0      pbapply_1.7-2          abind_1.4-8           
 [28] Rtsne_0.17             tweenr_2.0.3           git2r_0.36.2          
 [31] ggrepel_0.9.6          irlba_2.3.5.1          listenv_0.9.1         
 [34] spatstat.utils_3.1-4   pheatmap_1.0.12        goftest_1.2-3         
 [37] RSpectra_0.16-2        spatstat.random_3.4-1  fitdistrplus_1.2-2    
 [40] parallelly_1.44.0      svglite_2.2.1          leiden_0.4.3.1        
 [43] codetools_0.2-19       xml2_1.3.8             tidyselect_1.2.1      
 [46] shape_1.4.6.1          farver_2.1.2           matrixStats_1.5.0     
 [49] spatstat.explore_3.4-3 jsonlite_2.0.0         tidygraph_1.3.1       
 [52] progressr_0.15.1       survival_3.5-8         systemfonts_1.2.3     
 [55] tools_4.3.3            ica_1.0-3              glue_1.8.0            
 [58] gridExtra_2.3          xfun_0.52              withr_3.0.2           
 [61] BiocManager_1.30.25    fastmap_1.2.0          callr_3.7.6           
 [64] digest_0.6.37          timechange_0.3.0       R6_2.6.1              
 [67] mime_0.13              textshaping_1.0.1      colorspace_2.1-1      
 [70] scattermore_1.2        tensor_1.5             spatstat.data_3.1-6   
 [73] utf8_1.2.5             generics_0.1.4         graphlayouts_1.2.2    
 [76] httr_1.4.7             htmlwidgets_1.6.4      whisker_0.4.1         
 [79] uwot_0.2.3             pkgconfig_2.0.3        gtable_0.3.6          
 [82] lmtest_0.9-40          htmltools_0.5.8.1      dotCall64_1.2         
 [85] scales_1.4.0           png_0.1-8              spatstat.univar_3.1-3 
 [88] knitr_1.50             rstudioapi_0.17.1      tzdb_0.5.0            
 [91] reshape2_1.4.4         checkmate_2.3.2        nlme_3.1-164          
 [94] cachem_1.1.0           zoo_1.8-14             GlobalOptions_0.1.2   
 [97] KernSmooth_2.23-22     vipor_0.4.7            parallel_4.3.3        
[100] miniUI_0.1.2           ggrastr_1.0.2          pillar_1.10.2         
[103] grid_4.3.3             vctrs_0.6.5            RANN_2.6.2            
[106] promises_1.3.2         xtable_1.8-4           cluster_2.1.6         
[109] beeswarm_0.4.0         evaluate_1.0.3         cli_3.6.5             
[112] compiler_4.3.3         rlang_1.1.6            crayon_1.5.3          
[115] future.apply_1.11.3    labeling_0.4.3         rematch2_2.1.2        
[118] ps_1.9.1               ggbeeswarm_0.7.2       getPass_0.2-4         
[121] plyr_1.8.9             fs_1.6.6               stringi_1.8.7         
[124] deldir_2.0-4           lazyeval_0.2.2         spatstat.geom_3.4-1   
[127] Matrix_1.6-5           RcppHNSW_0.6.0         hms_1.1.3             
[130] statmod_1.5.0          shiny_1.10.0           ROCR_1.0-11           
[133] igraph_2.1.4           memoise_2.0.1          bslib_0.9.0

sessionInfo()

R version 4.3.3 (2024-02-29)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 15.7.3

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Melbourne
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] future_1.40.0       gtools_3.9.5        presto_1.0.0       
 [4] circlize_0.4.16     BiocParallel_1.36.0 harmony_1.2.3      
 [7] Rcpp_1.0.14         patchwork_1.3.0     cowplot_1.1.3      
[10] data.table_1.17.2   RColorBrewer_1.1-3  kableExtra_1.4.0   
[13] ggridges_0.5.6      ggforce_0.4.2       viridis_0.6.5      
[16] viridisLite_0.4.2   paletteer_1.6.0     clustree_0.5.1     
[19] ggraph_2.2.1        Seurat_5.0.3        SeuratObject_5.1.0 
[22] sp_2.2-0            here_1.0.1          lubridate_1.9.4    
[25] forcats_1.0.0       stringr_1.5.1       dplyr_1.1.4        
[28] purrr_1.0.4         readr_2.1.5         tidyr_1.3.1        
[31] tibble_3.2.1        ggplot2_3.5.2       tidyverse_2.0.0    
[34] BiocStyle_2.30.0    workflowr_1.7.1    

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22       splines_4.3.3          later_1.4.2           
  [4] prismatic_1.1.2        polyclip_1.10-7        fastDummies_1.7.5     
  [7] lifecycle_1.0.4        rprojroot_2.0.4        globals_0.17.0        
 [10] processx_3.8.6         lattice_0.22-5         MASS_7.3-60.0.1       
 [13] backports_1.5.0        magrittr_2.0.3         limma_3.58.1          
 [16] plotly_4.10.4          sass_0.4.10            rmarkdown_2.29        
 [19] jquerylib_0.1.4        yaml_2.3.10            httpuv_1.6.16         
 [22] sctransform_0.4.2      spam_2.11-1            spatstat.sparse_3.1-0 
 [25] reticulate_1.42.0      pbapply_1.7-2          abind_1.4-8           
 [28] Rtsne_0.17             tweenr_2.0.3           git2r_0.36.2          
 [31] ggrepel_0.9.6          irlba_2.3.5.1          listenv_0.9.1         
 [34] spatstat.utils_3.1-4   pheatmap_1.0.12        goftest_1.2-3         
 [37] RSpectra_0.16-2        spatstat.random_3.4-1  fitdistrplus_1.2-2    
 [40] parallelly_1.44.0      svglite_2.2.1          leiden_0.4.3.1        
 [43] codetools_0.2-19       xml2_1.3.8             tidyselect_1.2.1      
 [46] shape_1.4.6.1          farver_2.1.2           matrixStats_1.5.0     
 [49] spatstat.explore_3.4-3 jsonlite_2.0.0         tidygraph_1.3.1       
 [52] progressr_0.15.1       survival_3.5-8         systemfonts_1.2.3     
 [55] tools_4.3.3            ica_1.0-3              glue_1.8.0            
 [58] gridExtra_2.3          xfun_0.52              withr_3.0.2           
 [61] BiocManager_1.30.25    fastmap_1.2.0          callr_3.7.6           
 [64] digest_0.6.37          timechange_0.3.0       R6_2.6.1              
 [67] mime_0.13              textshaping_1.0.1      colorspace_2.1-1      
 [70] scattermore_1.2        tensor_1.5             spatstat.data_3.1-6   
 [73] utf8_1.2.5             generics_0.1.4         graphlayouts_1.2.2    
 [76] httr_1.4.7             htmlwidgets_1.6.4      whisker_0.4.1         
 [79] uwot_0.2.3             pkgconfig_2.0.3        gtable_0.3.6          
 [82] lmtest_0.9-40          htmltools_0.5.8.1      dotCall64_1.2         
 [85] scales_1.4.0           png_0.1-8              spatstat.univar_3.1-3 
 [88] knitr_1.50             rstudioapi_0.17.1      tzdb_0.5.0            
 [91] reshape2_1.4.4         checkmate_2.3.2        nlme_3.1-164          
 [94] cachem_1.1.0           zoo_1.8-14             GlobalOptions_0.1.2   
 [97] KernSmooth_2.23-22     vipor_0.4.7            parallel_4.3.3        
[100] miniUI_0.1.2           ggrastr_1.0.2          pillar_1.10.2         
[103] grid_4.3.3             vctrs_0.6.5            RANN_2.6.2            
[106] promises_1.3.2         xtable_1.8-4           cluster_2.1.6         
[109] beeswarm_0.4.0         evaluate_1.0.3         cli_3.6.5             
[112] compiler_4.3.3         rlang_1.1.6            crayon_1.5.3          
[115] future.apply_1.11.3    labeling_0.4.3         rematch2_2.1.2        
[118] ps_1.9.1               ggbeeswarm_0.7.2       getPass_0.2-4         
[121] plyr_1.8.9             fs_1.6.6               stringi_1.8.7         
[124] deldir_2.0-4           lazyeval_0.2.2         spatstat.geom_3.4-1   
[127] Matrix_1.6-5           RcppHNSW_0.6.0         hms_1.1.3             
[130] statmod_1.5.0          shiny_1.10.0           ROCR_1.0-11           
[133] igraph_2.1.4           memoise_2.0.1          bslib_0.9.0

Lineages clustering: Eithelial cells

Epithelial subclustering

Gunjan Dixit

January 27, 2026

Load libraries

Epithelial cells

Previous labels

Level 1 labels

Adding CellTypist info

Ciliated Epithelial

Renaming idents

DE markers

Ciliated Epithelial_8 vs Ciliated Epithelial_3

Ciliated Epithelial_8 vs Ciliated Epithelial_1

Ciliated Epithelial_8 vs Ciliated Epithelial_4

Ciliated Epithelial_8 vs Deuterosomal

Ciliated Epithelial_8 vs Ciliated Epithelial_0

Ciliated Epithelial_8 vs Ciliated Epithelial_2

Ciliated Epithelial_8 vs Ciliated Epithelial_7

Ciliated Epithelial_8 vs Ciliated Epithelial_6

Ciliated Epithelial_8 vs Ciliated Epithelial_5

Ciliated Epithelial_8 vs Ciliated Epithelial_10

Ciliated Epithelial_8 vs Ciliated Epithelial_9

Ciliated Epithelial_3 vs Ciliated Epithelial_1

Ciliated Epithelial_3 vs Ciliated Epithelial_4

Ciliated Epithelial_3 vs Deuterosomal

Ciliated Epithelial_3 vs Ciliated Epithelial_0

Ciliated Epithelial_3 vs Ciliated Epithelial_2

Ciliated Epithelial_3 vs Ciliated Epithelial_7

Ciliated Epithelial_3 vs Ciliated Epithelial_6

Ciliated Epithelial_3 vs Ciliated Epithelial_5

Ciliated Epithelial_3 vs Ciliated Epithelial_10

Ciliated Epithelial_3 vs Ciliated Epithelial_9

Ciliated Epithelial_1 vs Ciliated Epithelial_4

Ciliated Epithelial_1 vs Deuterosomal

Ciliated Epithelial_1 vs Ciliated Epithelial_0

Ciliated Epithelial_1 vs Ciliated Epithelial_2

Ciliated Epithelial_1 vs Ciliated Epithelial_7

Ciliated Epithelial_1 vs Ciliated Epithelial_6

Ciliated Epithelial_1 vs Ciliated Epithelial_5

Ciliated Epithelial_1 vs Ciliated Epithelial_10

Ciliated Epithelial_1 vs Ciliated Epithelial_9

Ciliated Epithelial_4 vs Deuterosomal

Ciliated Epithelial_4 vs Ciliated Epithelial_0

Ciliated Epithelial_4 vs Ciliated Epithelial_2

Ciliated Epithelial_4 vs Ciliated Epithelial_7

Ciliated Epithelial_4 vs Ciliated Epithelial_6

Ciliated Epithelial_4 vs Ciliated Epithelial_5

Ciliated Epithelial_4 vs Ciliated Epithelial_10

Ciliated Epithelial_4 vs Ciliated Epithelial_9

Deuterosomal vs Ciliated Epithelial_0

Deuterosomal vs Ciliated Epithelial_2

Deuterosomal vs Ciliated Epithelial_7

Deuterosomal vs Ciliated Epithelial_6

Deuterosomal vs Ciliated Epithelial_5

Deuterosomal vs Ciliated Epithelial_10

Deuterosomal vs Ciliated Epithelial_9

Ciliated Epithelial_0 vs Ciliated Epithelial_2

Ciliated Epithelial_0 vs Ciliated Epithelial_7

Ciliated Epithelial_0 vs Ciliated Epithelial_6

Ciliated Epithelial_0 vs Ciliated Epithelial_5

Ciliated Epithelial_0 vs Ciliated Epithelial_10

Ciliated Epithelial_0 vs Ciliated Epithelial_9

Ciliated Epithelial_2 vs Ciliated Epithelial_7

Ciliated Epithelial_2 vs Ciliated Epithelial_6

Ciliated Epithelial_2 vs Ciliated Epithelial_5

Ciliated Epithelial_2 vs Ciliated Epithelial_10

Ciliated Epithelial_2 vs Ciliated Epithelial_9

Ciliated Epithelial_7 vs Ciliated Epithelial_6

Ciliated Epithelial_7 vs Ciliated Epithelial_5

Ciliated Epithelial_7 vs Ciliated Epithelial_10

Ciliated Epithelial_7 vs Ciliated Epithelial_9

Ciliated Epithelial_6 vs Ciliated Epithelial_5

Ciliated Epithelial_6 vs Ciliated Epithelial_10

Ciliated Epithelial_6 vs Ciliated Epithelial_9

Ciliated Epithelial_5 vs Ciliated Epithelial_10

Ciliated Epithelial_5 vs Ciliated Epithelial_9

Ciliated Epithelial_10 vs Ciliated Epithelial_9

Non-ciliated subclustering

Renaming idents