Linear regression is one of those things that is easy to use in practice but difficult to develop a good intuition for. At least, I struggle to predict what the regression coefficients will look like in all but the most trivial cases.
[Read More]
Reproducible analysis for the very lazy
These days, pretty much everyone acknowledges that scientific analysis should be reproducible. Of course, this is still very often not the case. For the most part, I don’t think that people start out trying to write code and structure their data so no one else can reproduce their results. But...
[Read More]
Batch effects in single cell RNA sequencing
Part 2 - Handling batch effects without batch correction
In a previous post, I wrote about the perils of over-diagnosing batch effects. In this post I’m going to do something even more controversial and offer advice on how to deal with batch effects without using some black box “batch correction” tool. Why would you want to do this? Are...
[Read More]
Batch effects in single cell RNA sequencing
Part 1 - Diagnosing batch effects from UMAP
With the ever-increasing number of single cell transcriptomics data sets available, people increasingly want to run combined analyses. Of course, the first thing that happens when people do this is that data from different samples, labs, and experiments don’t “mix well”. The identification of poor...
[Read More]
The covid R value
The mean hides an awful lot
I wanted to write a quick post pointing out something about the now-familiar covid19 R value that I don’t think has been widely appreciated. You have probably seen a version of the graphic showing how one infected person leads to an exponentially growing number of cases if...
[Read More]