Speakers
Description
Single-cell RNA-seq (scRNA-seq) produces large volumes of data from which one can derive gene expression levels for individual cells. To classify cells efficiently by the tissue they originated from, it is crucial to identify and select informative genes that preserve the differences between distinct cell types while excluding as much redundant information as possible. Finding such a subset is a computationally challenging combinatorial optimization problem in scRNA-seq data analysis, and several state-of-the-art methods tackle it in different ways. The aim of this study is to evaluate state-of-the-art marker gene selection methods by comparing their classification accuracy, running time, and memory consumption on real-world datasets. Additionally, we will modify one of the methods under consideration, scGeneFit, so that it achieves higher accuracy with significantly lower running times. We will compare the modified version with the original implementation and with the remaining state-of-the-art methods.