Chapter 29 Muraro human pancreas (CEL-seq)

29.1 Introduction

This chapter analyzes the Muraro et al. (2016) CEL-seq dataset, which consists of human pancreas cells from several donors.

29.3 Quality control

This dataset lacks mitochondrial genes, so we will do without them and rely on the remaining QC metrics. For the one batch that seems to have a high proportion of low-quality cells, we compute an appropriate filter threshold using a shared median and MAD from the other batches (Figure 29.1).
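
The idea of borrowing a threshold from the other batches can be sketched in base R. This is a minimal illustration on simulated data, not the code used to produce the results in this chapter: the batch labels `A` to `D`, the simulated log-library sizes and the choice of 3 MADs are all assumptions made for the example. In a scater-based workflow the same effect would presumably be obtained through the `batch=` and `subset=` arguments of `isOutlier()`, whose exact computation may differ from the pooled version shown here.

```r
set.seed(100)

# Hypothetical batches; "A" plays the role of the problematic batch.
batch <- rep(c("A", "B", "C", "D"), each = 200)

# Simulated log10 library sizes. Batch A is shifted downwards to mimic
# a batch in which most cells are of low quality.
log.lib <- rnorm(length(batch), mean = 4, sd = 0.2)
log.lib[batch == "A"] <- log.lib[batch == "A"] - 0.6

nmads <- 3  # assumed number of MADs for the outlier threshold

# Shared median and MAD, computed only from the other batches.
ref <- log.lib[batch != "A"]
shared.lower <- median(ref) - nmads * mad(ref)

# Threshold that batch A would get from its own median and MAD.
own <- log.lib[batch == "A"]
own.lower <- median(own) - nmads * mad(own)

# Number of batch A cells discarded under each threshold.
n.shared <- sum(batch == "A" & log.lib < shared.lower)
n.own <- sum(batch == "A" & log.lib < own.lower)
c(shared = n.shared, own = n.own)
```

Because the median of batch A is itself dragged down by its low-quality cells, its own threshold is too lenient. The shared threshold is higher and removes many more cells from that batch.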

Figure 29.1: Distribution of each QC metric across cells from each donor in the Muraro pancreas dataset. Each point represents a cell and is colored according to whether that cell was discarded.

We examine the number of cells removed for each reason:

##              low_lib_size            low_n_features high_altexps_ERCC_percent 
##                       663                       700                       738 
##                   discard 
##                       773
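
A summary of this kind is usually obtained by taking column sums over a table of per-cell logical flags. The sketch below uses six made-up cells purely for illustration. The column names mirror the output above, but the flag values are hypothetical.

```r
# Toy per-cell QC flags for six hypothetical cells (TRUE = fails that check).
flags <- data.frame(
    low_lib_size = c(TRUE, FALSE, TRUE, FALSE, FALSE, FALSE),
    low_n_features = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE),
    high_altexps_ERCC_percent = c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)
)

# A cell is discarded if it fails at least one check.
flags$discard <- rowSums(flags) > 0

# Number of cells flagged by each check, plus the overall discard count.
counts <- colSums(flags)
counts
```

A cell can fail several checks at once, so the `discard` total is generally smaller than the sum of the individual counts. The same holds in the output above, where 663 + 700 + 738 far exceeds 773.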

29.5 Data integration

We use the proportion of variance lost from each batch at each merge step as a diagnostic measure:

##           D28      D29      D30     D31
## [1,] 0.060847 0.024121 0.000000 0.00000
## [2,] 0.002646 0.003018 0.062421 0.00000
## [3,] 0.003449 0.002641 0.002598 0.08162
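
One way to read this matrix is to check whether any entry exceeds a tolerance. The sketch below copies the values from the output above and compares them against a 10% cutoff. That cutoff is an assumed rule of thumb chosen for this example, not something stated in this chapter.

```r
# Values copied from the output above: one row per merge step,
# one column per batch.
var.lost <- matrix(c(
    0.060847, 0.024121, 0.000000, 0.00000,
    0.002646, 0.003018, 0.062421, 0.00000,
    0.003449, 0.002641, 0.002598, 0.08162
), nrow = 3, byrow = TRUE,
dimnames = list(NULL, c("D28", "D29", "D30", "D31")))

threshold <- 0.10  # assumed rule-of-thumb cutoff, not from the source

# Largest loss and where it occurs (merge step and batch).
worst <- max(var.lost)
where <- which(var.lost == worst, arr.ind = TRUE)
colnames(var.lost)[where[, "col"]]

# TRUE if no batch loses more than the cutoff at any step.
all.ok <- all(var.lost < threshold)
all.ok
```

Under this assumed cutoff, a large value would suggest that the correction is removing genuine biological variation rather than batch effects alone.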

29.7 Clustering

Figure 29.4: Heatmap of the frequency of cells from each cell type label in each cluster.

##        Donor
## Cluster D28 D29 D30 D31
##      1  104   6  57 112
##      2   59  21  77  97
##      3   12  75  64  43
##      4   28 149 126 120
##      5   87 261 277 214
##      6   21   7  54  26
##      7    1   6   6  37
##      8    6   6   5   2
##      9   11  68   5  30
##      10   4   2   5   8
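
A table like this can be turned into per-cluster donor proportions to help spot clusters dominated by a single donor. The sketch below re-enters the counts from the table above by hand. Taking the largest single-donor share as the summary is an illustrative choice, not part of the original analysis.

```r
# Counts copied from the cluster-by-donor table above.
tab <- matrix(c(
    104,   6,  57, 112,
     59,  21,  77,  97,
     12,  75,  64,  43,
     28, 149, 126, 120,
     87, 261, 277, 214,
     21,   7,  54,  26,
      1,   6,   6,  37,
      6,   6,   5,   2,
     11,  68,   5,  30,
      4,   2,   5,   8
), ncol = 4, byrow = TRUE,
dimnames = list(Cluster = as.character(1:10),
                Donor = c("D28", "D29", "D30", "D31")))

# Fraction of each cluster contributed by each donor (rows sum to 1).
props <- prop.table(tab, margin = 1)

# Largest single-donor share per cluster; values close to 1 would
# point to a donor-specific cluster.
max.share <- apply(props, 1, max)
round(max.share, 2)
```

Donors contribute different numbers of cells overall, so the shares are best compared with each donor's share of the whole dataset (`colSums(tab) / sum(tab)`), not with an even split.
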

Figure 29.5: Obligatory \(t\)-SNE plots of the Muraro pancreas dataset. Each point represents a cell that is colored by cluster (left) or batch (right).

Session Info

R Under development (unstable) (2019-12-29 r77627)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.6 LTS

Matrix products: default
BLAS/LAPACK: /app/easybuild/software/OpenBLAS/0.2.18-GCC-5.4.0-2.26-LAPACK-3.6.1/lib/libopenblas_prescottp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=C               LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pheatmap_1.0.12             batchelor_1.3.8             scran_1.15.14              
 [4] scater_1.15.12              ggplot2_3.2.1               ensembldb_2.11.2           
 [7] AnnotationFilter_1.11.0     GenomicFeatures_1.39.2      AnnotationDbi_1.49.0       
[10] AnnotationHub_2.19.3        BiocFileCache_1.11.4        dbplyr_1.4.2               
[13] scRNAseq_2.1.5              SingleCellExperiment_1.9.1  SummarizedExperiment_1.17.1
[16] DelayedArray_0.13.2         BiocParallel_1.21.2         matrixStats_0.55.0         
[19] Biobase_2.47.2              GenomicRanges_1.39.1        GenomeInfoDb_1.23.1        
[22] IRanges_2.21.2              S4Vectors_0.25.8            BiocGenerics_0.33.0        
[25] Cairo_1.5-10                BiocStyle_2.15.3            OSCAUtils_0.0.1            

loaded via a namespace (and not attached):
  [1] Rtsne_0.15                    ggbeeswarm_0.6.0              colorspace_1.4-1             
  [4] XVector_0.27.0                BiocNeighbors_1.5.1           farver_2.0.1                 
  [7] bit64_0.9-7                   interactiveDisplayBase_1.25.0 codetools_0.2-16             
 [10] knitr_1.26                    zeallot_0.1.0                 Rsamtools_2.3.2              
 [13] shiny_1.4.0                   BiocManager_1.30.10           compiler_4.0.0               
 [16] httr_1.4.1                    dqrng_0.2.1                   backports_1.1.5              
 [19] assertthat_0.2.1              Matrix_1.2-18                 fastmap_1.0.1                
 [22] lazyeval_0.2.2                limma_3.43.0                  later_1.0.0                  
 [25] BiocSingular_1.3.1            htmltools_0.4.0               prettyunits_1.0.2            
 [28] tools_4.0.0                   igraph_1.2.4.2                rsvd_1.0.2                   
 [31] gtable_0.3.0                  glue_1.3.1                    GenomeInfoDbData_1.2.2       
 [34] dplyr_0.8.3                   rappdirs_0.3.1                Rcpp_1.0.3                   
 [37] vctrs_0.2.1                   Biostrings_2.55.4             ExperimentHub_1.13.5         
 [40] rtracklayer_1.47.0            DelayedMatrixStats_1.9.0      xfun_0.11                    
 [43] stringr_1.4.0                 ps_1.3.0                      mime_0.8                     
 [46] lifecycle_0.1.0               irlba_2.3.3                   statmod_1.4.32               
 [49] XML_3.98-1.20                 edgeR_3.29.0                  zlibbioc_1.33.0              
 [52] scales_1.1.0                  hms_0.5.2                     promises_1.1.0               
 [55] ProtGenerics_1.19.3           RColorBrewer_1.1-2            yaml_2.2.0                   
 [58] curl_4.3                      memoise_1.1.0                 gridExtra_2.3                
 [61] biomaRt_2.43.0                stringi_1.4.3                 RSQLite_2.2.0                
 [64] highr_0.8                     BiocVersion_3.11.1            rlang_0.4.2                  
 [67] pkgconfig_2.0.3               bitops_1.0-6                  evaluate_0.14                
 [70] lattice_0.20-38               purrr_0.3.3                   labeling_0.3                 
 [73] GenomicAlignments_1.23.1      cowplot_1.0.0                 bit_1.1-14                   
 [76] processx_3.4.1                tidyselect_0.2.5              magrittr_1.5                 
 [79] bookdown_0.16                 R6_2.4.1                      DBI_1.1.0                    
 [82] pillar_1.4.3                  withr_2.1.2                   RCurl_1.95-4.12              
 [85] tibble_2.1.3                  crayon_1.3.4                  rmarkdown_2.0                
 [88] viridis_0.5.1                 progress_1.2.2                locfit_1.5-9.1               
 [91] grid_4.0.0                    blob_1.2.0                    callr_3.4.0                  
 [94] digest_0.6.23                 xtable_1.8-4                  httpuv_1.5.2                 
 [97] openssl_1.4.1                 munsell_0.5.0                 beeswarm_0.2.3               
[100] viridisLite_0.3.0             vipor_0.4.5                   askpass_1.1                  

Bibliography

Muraro, M. J., G. Dharmadhikari, D. Grun, N. Groen, T. Dielen, E. Jansen, L. van Gurp, et al. 2016. “A Single-Cell Transcriptome Atlas of the Human Pancreas.” Cell Syst 3 (4):385–94.