Human immune

Human immune cells dataset from the scIB benchmarks



Reference: Luecken et al. (2021)
Size: 1.18 GiB
Dimensions (n_obs × n_vars): 33506 × 12303

Human immune cells from peripheral blood and bone marrow taken from 5 datasets comprising 10 batches across technologies (10X, Smart-seq2).


The dataset is an AnnData object with n_obs × n_vars = 33506 × 12303 and the following slots:


| Name | Description | Type | Data type | Size |
| --- | --- | --- | --- | --- |
| batch | A batch identifier. This label is very context-dependent and may be a combination of the tissue, assay, donor, etc. | vector | category | 33506 |
| cell_type | Classification of the cell type based on its characteristics and function within the tissue or organism. | vector | category | 33506 |
| size_factors | The size factors created by the normalisation method, if any. | vector | float32 | 33506 |
| tissue | Specific tissue from which the cells were derived, key for context and specificity in cell studies. | vector | category | 33506 |
| feature_name | A human-readable name for the feature, usually a gene symbol. | vector | object | 12303 |
| hvg | Whether or not the feature is considered to be a ‘highly variable gene’ | vector | bool | 12303 |
| hvg_score | A ranking of the features by hvg. | vector | float64 | 12303 |
| knn_connectivities | K nearest neighbors connectivities matrix. | sparsematrix | float32 | 33506 × 33506 |
| knn_distances | K nearest neighbors distance matrix. | sparsematrix | float64 | 33506 × 33506 |
| X_pca | The resulting PCA embedding. | densematrix | float32 | 33506 × 50 |
| pca_loadings | The PCA loadings matrix. | densematrix | float32 | 12303 × 50 |
| counts | Raw counts | sparsematrix | float32 | 33506 × 12303 |
| normalized | Normalised expression values | sparsematrix | float32 | 33506 × 12303 |
| dataset_description | Long description of the dataset. | atomic | str | 1 |
| dataset_id | A unique identifier for the dataset. This is different from the obs.dataset_id field, which is the identifier for the dataset from which the cell data is derived. | atomic | str | 1 |
| dataset_name | A human-readable name for the dataset. | atomic | str | 1 |
| dataset_organism | The organism of the sample in the dataset. | atomic | str | 1 |
| dataset_reference | Bibtex reference of the paper in which the dataset was published. | atomic | str | 1 |
| dataset_summary | Short description of the dataset. | atomic | str | 1 |
| dataset_url | Link to the original source of the dataset. | atomic | str | 1 |
| knn | Supplementary K nearest neighbors data. | dict | | 3 |
| normalization_id | Which normalization was used | atomic | str | 1 |
| pca_variance | The PCA variance objects. | dict | | 2 |
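
The fields listed above can be checked programmatically once the file is loaded. The following is a minimal sketch, not part of the original dataset page. It assumes the usual AnnData layout, with the slot for each field (obs, var, obsp, obsm, varm, layers, uns) inferred from the listed type and size rather than stated by the source. The helper `missing_fields` and the file name `dataset.h5ad` are hypothetical.

```python
from types import SimpleNamespace

# Assumed AnnData slot for each field. The listing above gives only name,
# type and size, so this grouping is inferred, not stated by the source.
EXPECTED_FIELDS = {
    "obs": ["batch", "cell_type", "size_factors", "tissue"],
    "var": ["feature_name", "hvg", "hvg_score"],
    "obsp": ["knn_connectivities", "knn_distances"],
    "obsm": ["X_pca"],
    "varm": ["pca_loadings"],
    "layers": ["counts", "normalized"],
    "uns": [
        "dataset_description", "dataset_id", "dataset_name",
        "dataset_organism", "dataset_reference", "dataset_summary",
        "dataset_url", "knn", "normalization_id", "pca_variance",
    ],
}


def missing_fields(adata):
    """Return 'slot/name' strings for expected fields absent from adata."""
    missing = []
    for slot, names in EXPECTED_FIELDS.items():
        container = getattr(adata, slot, None)
        for name in names:
            # `in` tests column names on obs/var and keys on the other slots.
            if container is None or name not in container:
                missing.append(f"{slot}/{name}")
    return missing


# Stand-in object so the sketch runs without the file. With anndata installed,
# replace it with: adata = anndata.read_h5ad("dataset.h5ad")  # hypothetical path
stub = SimpleNamespace(
    obs={"batch": None, "cell_type": None},
    layers={"counts": None},
)
report = missing_fields(stub)
print(len(report), "fields missing, for example:", report[:3])
```

An empty list from `missing_fields` would indicate that every listed field is present under the assumed slot. The check covers names only; it does not verify data types or shapes.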


Luecken, Malte D., M. Büttner, K. Chaichoompu, A. Danese, M. Interlandi, M. F. Mueller, D. C. Strobl, et al. 2021. “Benchmarking Atlas-Level Data Integration in Single-Cell Genomics.” Nature Methods 19 (1): 41–50.