Chapter 19 416B Smart-seq2 dataset

19.1 Introduction

This chapter analyzes the 416B Smart-seq2 dataset, which contains two 96-well plates of 416B cells with and without induction of a CBFB-MYH11 oncogene.

19.2 Analysis code

19.2.4 Normalization

No pre-clustering is performed here, as the dataset is small and all cells are expected to be similar, being derived from the same 416B line.
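As a minimal sketch of normalization without pre-clustering, the code below assumes deconvolution size factors from scran's computeSumFactors(), followed by logNormCounts() from scuttle. Neither function is named in the text above, and the simulated dataset from mockSCE() is only a stand-in for the 416B data.

```r
library(scuttle)  # mockSCE(), logNormCounts()
library(scran)    # computeSumFactors()

set.seed(100)
sce <- mockSCE()  # simulated stand-in, NOT the 416B data

# Leaving out the 'clusters' argument pools across all cells,
# i.e. no pre-clustering is done before deconvolution.
sce <- computeSumFactors(sce)

# Compute log-normalized expression values from the size factors.
sce <- logNormCounts(sce)

summary(sizeFactors(sce))
```

For larger or more heterogeneous datasets, a clustering vector would typically be passed via `clusters=` so that pooling only happens among similar cells.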

19.2.5 Variance modelling

We retain all genes with a positive biological component of variance.
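The selection rule can be sketched as follows. This assumes the variance decomposition comes from scran's modelGeneVar(); the text does not say which modelling function is used, and the mockSCE() data are simulated rather than the 416B cells.

```r
library(scuttle)  # mockSCE(), logNormCounts()
library(scran)    # modelGeneVar(), getTopHVGs()

set.seed(100)
sce <- logNormCounts(mockSCE())  # simulated stand-in, NOT the 416B data

# Decompose each gene's variance into technical and biological components.
dec <- modelGeneVar(sce)

# var.threshold=0 keeps every gene whose biological component exceeds zero.
chosen.hvgs <- getTopHVGs(dec, var.threshold = 0)
length(chosen.hvgs)
```

A stricter selection could instead request a fixed number or proportion of genes through the `n=` or `prop=` arguments of getTopHVGs().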

19.2.6 Batch correction

The composition of cells is expected to be the same across the two plates, hence the use of removeBatchEffect() rather than more complex methods.
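A minimal sketch of this kind of linear correction is shown below, using removeBatchEffect() from limma on simulated values. The plate labels, phenotype labels and the choice to protect the phenotype through `design=` are illustrative assumptions, not details taken from the analysis itself.

```r
library(limma)  # removeBatchEffect()

set.seed(100)
n.genes <- 50
batch <- factor(rep(c("plate1", "plate2"), each = 20))       # hypothetical plates
pheno <- factor(rep(c("induced", "wild type"), times = 20))  # balanced within each plate

# Simulated log-expression values with an additive shift on the second plate.
logexprs <- matrix(rnorm(n.genes * 40), nrow = n.genes)
logexprs[, batch == "plate2"] <- logexprs[, batch == "plate2"] + 2

# Remove the plate effect while protecting the biological factor in 'design'.
corrected <- removeBatchEffect(logexprs,
    design = model.matrix(~pheno), batch = batch)

# Per-gene difference in plate means before and after correction.
diff.before <- rowMeans(logexprs[, batch == "plate2"]) -
    rowMeans(logexprs[, batch == "plate1"])
diff.after <- rowMeans(corrected[, batch == "plate2"]) -
    rowMeans(corrected[, batch == "plate1"])
range(diff.after)
```

Because the simulated phenotype is balanced across plates, the plate means coincide after correction. This linear approach relies on the same assumption stated above: each plate has the same composition of cells.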

19.2.7 Dimensionality reduction

denoisePCA() automatically does its own feature selection, so further subsetting is not strictly required unless we wanted to be more stringent.
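The behaviour described above can be sketched on simulated data. The example assumes the technical noise estimates come from modelGeneVar() and again uses mockSCE() as a stand-in for the 416B cells, so the number of retained PCs will differ from the real analysis.

```r
library(scuttle)  # mockSCE(), logNormCounts()
library(scran)    # modelGeneVar(), denoisePCA()

set.seed(100)
sce <- logNormCounts(mockSCE())  # simulated stand-in, NOT the 416B data
dec <- modelGeneVar(sce)

# denoisePCA() restricts itself to genes with positive biological components
# in 'dec', then keeps enough PCs to account for the biological variance.
sce <- denoisePCA(sce, technical = dec)

ncol(reducedDim(sce, "PCA"))
```

A more stringent gene set could be supplied through the `subset.row=` argument if desired.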

19.3 Results

19.3.4 Dimensionality reduction

We check the number of retained principal components.

## [1] 27

19.3.5 Clustering

We compare the clusters to the plate of origin.

##        Plate
## Cluster 20160113 20160325
##       1       41       39
##       2       19       17
##       3       17       15
##       4       11       13
##       5        5        8
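A cross-tabulation of this form can be produced with base R's table(). The sketch below uses a handful of hypothetical labels rather than the real cluster and plate assignments.

```r
# Hypothetical labels for eight cells; not taken from the dataset.
clusters <- factor(c(1, 1, 2, 2, 1, 2, 3, 3))
plate <- factor(c("A", "B", "A", "B", "A", "B", "A", "B"))

# Named arguments become the dimension headers of the printed table.
tab <- table(Cluster = clusters, Plate = plate)
tab

# Row-wise proportions: values near 0.5 mean that both plates
# contribute evenly to a cluster.
prop.table(tab, margin = 1)
```

Clusters with roughly even contributions from each plate, as in the table above, suggest that little batch effect remains after correction.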

We compare the clusters to the oncogene induction status.

##        Oncogene
## Cluster induced CBFB-MYH11 oncogene expression wild type phenotype
##       1                                     80                   0
##       2                                      0                  36
##       3                                      0                  32
##       4                                      0                  24
##       5                                     13                   0