Chapter 6 Quick Start

To help you get started quickly, this chapter provides a single script that walks through a typical, basic scRNA-seq analysis. Explanatory prose appears as comments (#), and all visualization is held until the end of the script. The next chapter, “A Basic Analysis”, provides more commentary on each step, along with the relevant intermediate plots.

Here, we use an example dataset from the Human Cell Atlas immune cell profiling project on bone marrow. This dataset is loaded via the HCAData package, which provides a ready-to-use SingleCellExperiment object.
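As a minimal sketch of the loading step, the snippet below retrieves the bone marrow data as a SingleCellExperiment. It assumes that the HCAData package is installed and that the dataset identifier is "ica_bone_marrow"; the first call downloads the data through ExperimentHub, so it needs network access and may take a while.

```r
library(HCAData)
library(SingleCellExperiment)

## Retrieve the bone marrow dataset (identifier assumed to be "ica_bone_marrow").
## The result is cached locally after the first download.
sce <- HCAData("ica_bone_marrow")

## Printing the object summarizes its dimensions, assays, and metadata.
sce
```

The count matrix is typically backed by an on-disk HDF5 file rather than held in memory, which is one reason later steps can benefit from the runtime-related arguments discussed in this chapter.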

Note that the HCAData bone marrow dataset comprises cells from 8 donors, so we have added an integration step to mitigate batch effects between donors. For use cases where integration is not necessary (e.g. no batch effects are expected), comments in the code indicate which steps to skip and which arguments to change.
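To illustrate what a donor-level integration step can look like, here is a small sketch that uses fastMNN() from the batchelor package on simulated data. The simulated counts, the three donor labels, and the choices of d and k are all assumptions made for illustration; this is not the script's actual integration code, and mutual nearest neighbors correction is only one of several possible integration methods.

```r
library(SingleCellExperiment)
library(batchelor)

set.seed(100)

## Simulated data (hypothetical): 200 genes x 60 cells from 3 donors.
counts <- matrix(rpois(200 * 60, lambda = 5), nrow = 200)
rownames(counts) <- paste0("gene", seq_len(200))
sce.toy <- SingleCellExperiment(assays = list(counts = counts))
sce.toy$Donor <- rep(c("donorA", "donorB", "donorC"), each = 20)

## fastMNN() expects log-expression values; a simple log2(x + 1) is used here
## in place of proper size-factor normalization.
logcounts(sce.toy) <- log2(counts + 1)

## Treat each donor as a batch and correct in a 10-dimensional PCA space.
mnn <- fastMNN(sce.toy, batch = sce.toy$Donor, d = 10, k = 5)

## The batch-corrected low-dimensional coordinates (one row per cell).
corrected <- reducedDim(mnn, "corrected")
dim(corrected)
```

When no batch effects are expected, a step like this can be skipped, and downstream steps such as clustering or visualization would operate on an ordinary PCA of the uncorrected data instead of the corrected coordinates.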

Lastly, note that some arguments are included only to reduce computational runtime and can be modified or removed. These include BPPARAM, which controls parallelization, and BSPARAM and BNPARAM, which select alternative algorithms for SVD and nearest-neighbor search, respectively. See the “Adaptations for Large-scale Data” chapter for more information on these arguments.
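The following sketch shows how these three kinds of parameter objects are constructed and passed to functions that accept them. The random matrix and the specific choices (SerialParam, IrlbaParam, AnnoyParam) are assumptions for illustration; any backend or algorithm supported by BiocParallel, BiocSingular, and BiocNeighbors could be substituted.

```r
library(BiocParallel)
library(BiocSingular)
library(BiocNeighbors)

set.seed(42)

## Hypothetical input: 500 cells (rows) x 20 features (columns).
x <- matrix(rnorm(500 * 20), nrow = 500)

## Parallelization backend. SerialParam() runs on a single core;
## MulticoreParam(workers = 4) is one possible replacement on Linux or macOS.
bpparam <- SerialParam()

## Approximate SVD via IRLBA instead of an exact decomposition.
bsparam <- IrlbaParam()

## Approximate nearest-neighbor search via Annoy instead of an exact search.
bnparam <- AnnoyParam()

## Each object is handed to the matching argument of a compatible function.
roots   <- bplapply(1:4, sqrt, BPPARAM = bpparam)
svd.out <- runSVD(x, k = 5, BSPARAM = bsparam)
knn.out <- findKNN(x, k = 10, BNPARAM = bnparam)
```

Functions that accept these arguments fall back to their own defaults when the arguments are omitted, which is why they can be removed without changing the structure of an analysis. Approximate algorithms trade a small amount of accuracy for speed, so results may differ slightly from those of the exact methods.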

6.1 Code

6.2 Visualizations

6.3 Session Info

R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.6 LTS

Matrix products: default
BLAS/LAPACK: /app/easybuild/software/OpenBLAS/0.2.18-GCC-5.4.0-2.26-LAPACK-3.6.1/lib/libopenblas_prescottp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] RColorBrewer_1.1-2          slingshot_1.3.1            
 [3] princurve_2.1.4             fgsea_1.11.0               
 [5] Rcpp_1.0.2                  dplyr_0.8.3                
 [7] msigdf_5.2                  org.Hs.eg.db_3.8.2         
 [9] AnnotationDbi_1.47.0        igraph_1.2.4.1             
[11] batchelor_1.1.4             scran_1.13.9               
[13] scater_1.13.9               ggplot2_3.2.0              
[15] rhdf5_2.29.0                HCAData_1.1.1              
[17] SingleCellExperiment_1.7.0  SummarizedExperiment_1.15.5
[19] DelayedArray_0.11.4         BiocParallel_1.19.0        
[21] matrixStats_0.54.0          Biobase_2.45.0             
[23] GenomicRanges_1.37.14       GenomeInfoDb_1.21.1        
[25] IRanges_2.19.10             S4Vectors_0.23.17          
[27] BiocGenerics_0.31.5         BiocStyle_2.13.2           
[29] Cairo_1.5-10               

loaded via a namespace (and not attached):
 [1] ggbeeswarm_0.6.0              colorspace_1.4-1             
 [3] dynamicTreeCut_1.63-1         XVector_0.25.0               
 [5] BiocNeighbors_1.3.3           bit64_0.9-7                  
 [7] interactiveDisplayBase_1.23.0 knitr_1.23                   
 [9] zeallot_0.1.0                 jsonlite_1.6                 
[11] dbplyr_1.4.2                  shiny_1.3.2                  
[13] HDF5Array_1.13.4              BiocManager_1.30.4           
[15] compiler_3.6.0                httr_1.4.0                   
[17] dqrng_0.2.1                   backports_1.1.4              
[19] assertthat_0.2.1              Matrix_1.2-17                
[21] lazyeval_0.2.2                limma_3.41.15                
[23] later_0.8.0                   BiocSingular_1.1.5           
[25] htmltools_0.3.6               tools_3.6.0                  
[27] rsvd_1.0.2                    gtable_0.3.0                 
[29] glue_1.3.1                    GenomeInfoDbData_1.2.1       
[31] rappdirs_0.3.1                fastmatch_1.1-0              
[33] vctrs_0.2.0                   ape_5.3                      
[35] nlme_3.1-140                  ExperimentHub_1.11.3         
[37] crosstalk_1.0.0               DelayedMatrixStats_1.7.1     
[39] xfun_0.8                      stringr_1.4.0                
[41] miniUI_0.1.1.1                mime_0.7                     
[43] irlba_2.3.3                   statmod_1.4.32               
[45] AnnotationHub_2.17.6          edgeR_3.27.9                 
[47] zlibbioc_1.31.0               scales_1.0.0                 
[49] promises_1.0.1                yaml_2.2.0                   
[51] curl_4.0                      memoise_1.1.0                
[53] gridExtra_2.3                 stringi_1.4.3                
[55] RSQLite_2.1.2                 manipulateWidget_0.10.0      
[57] rlang_0.4.0                   pkgconfig_2.0.2              
[59] bitops_1.0-6                  rgl_0.100.26                 
[61] evaluate_0.14                 lattice_0.20-38              
[63] purrr_0.3.2                   Rhdf5lib_1.7.3               
[65] htmlwidgets_1.3               bit_1.1-14                   
[67] tidyselect_0.2.5              magrittr_1.5                 
[69] bookdown_0.12                 R6_2.4.0                     
[71] DBI_1.0.0                     pillar_1.4.2                 
[73] withr_2.1.2                   RCurl_1.95-4.12              
[75] tibble_2.1.3                  crayon_1.3.4                 
[77] BiocFileCache_1.9.1           rmarkdown_1.14               
[79] viridis_0.5.1                 locfit_1.5-9.1               
[81] grid_3.6.0                    data.table_1.12.2            
[83] blob_1.2.0                    webshot_0.5.1                
[85] digest_0.6.20                 xtable_1.8-4                 
[87] httpuv_1.5.1                  munsell_0.5.0                
[89] beeswarm_0.2.3                viridisLite_0.3.0            
[91] vipor_0.4.5