Posted On: 06/11/2020 12:43:16 AM

Posted By: Rex Eupseiphos

Re: havasu78 #37368

Hoo boy. That question gets us into the weeds very fast. The short answer for why you get a very different p-value than they do is that the test you are using at Wolfram Alpha* begins to break down for small samples and collapses entirely when there are 0% or 100% fatalities. Near the bottom of the page they give a tiny little warning: "assuming two independent large simple random samples." INDEPENDENT, LARGE, and RANDOM are all critical. If you have all three, the statistics are easy: just plug the numbers into an online calculator or dig out your Statistics 101 book from college or AP stats. Getting samples that are independent and random (or otherwise adequately representative) is actually pretty difficult in most cases, and it's one of the things they bend over backward to do in clinical trials. If your samples are bad, your statistics are garbage, no matter how good your models are. And if you have small counts, the modeling gets much more difficult.
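To make the breakdown concrete, here is a minimal Python sketch of the textbook large-sample two-proportion z-test. The counts (0 deaths in 13 treated vs. 7 deaths in 26 controls) are my assumption for illustration; they are not stated in this post. I also don't know exactly which variant Wolfram Alpha runs, so both the pooled and unpooled standard errors are shown.

```python
from math import sqrt, erfc

def two_proportion_z_test(x1, n1, x2, n2, pooled=True):
    """Large-sample z-test of H0: p1 == p2. Returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        # Standard error under H0, using the combined event rate.
        p = (x1 + x2) / (n1 + n2)
        se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        # Each arm contributes its own variance estimate.
        se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    if se == 0:
        raise ValueError("standard error is zero; the normal approximation is undefined")
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    return z, erfc(abs(z) / sqrt(2))

# ASSUMED counts, for illustration only (not given in the post).
treated_deaths, treated_n = 0, 13
control_deaths, control_n = 7, 26

for pooled in (True, False):
    z, p = two_proportion_z_test(treated_deaths, treated_n,
                                 control_deaths, control_n, pooled=pooled)
    print(f"pooled={pooled}: z = {z:.3f}, two-sided p = {p:.4f}")

# With 0 events, the treated arm's variance estimate p1*(1-p1)/n1 is exactly 0,
# i.e. the formula claims we know that arm's rate with no uncertainty at all.
p1 = treated_deaths / treated_n
print("treated-arm variance term:", p1 * (1 - p1) / treated_n)
```

The two variants disagree noticeably with each other on counts this small, which is itself a sign that the normal approximation is not to be trusted here.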

In this case, Wolfram Alpha's test breaks down. Kiniksa uses some alternative, but they don't say what they did. A Fisher exact test gives 0.073 (don't let the name fool you, it ain't all that it's cracked up to be), which is close to what they got. In a different PR, they say p = 0.046 instead of 0.086, which looks suspiciously like a one-sided vs. two-sided test, since 0.046 is roughly half of 0.086. Who knows for sure, but their number looks more reliable than Wolfram Alpha's because it gets (approximate) confirmation from the Fisher exact test.
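For anyone who wants to check the Fisher number, here is a small Python sketch that computes it from the hypergeometric distribution using only the standard library. Again, the counts (0 of 13 vs. 7 of 26) are my assumption, not something stated in this post, and the two-sided rule used (sum every table no more likely than the observed one) is one common convention among several.

```python
from math import comb

def fisher_exact(x1, n1, x2, n2):
    """Fisher exact test for events/size in two arms.

    Returns (one_sided, two_sided), where one_sided is the probability of
    seeing x1 or fewer events in arm 1 given the fixed margins.
    """
    total_events = x1 + x2
    denom = comb(n1 + n2, total_events)
    lo = max(0, total_events - n2)   # fewest events arm 1 can have
    hi = min(n1, total_events)       # most events arm 1 can have
    # Hypergeometric probability of each possible table with these margins.
    probs = {k: comb(n1, k) * comb(n2, total_events - k) / denom
             for k in range(lo, hi + 1)}
    p_obs = probs[x1]
    one_sided = sum(p for k, p in probs.items() if k <= x1)
    # Two-sided: all tables at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return one_sided, two_sided

# ASSUMED counts, for illustration only (not given in the post).
one_sided, two_sided = fisher_exact(0, 13, 7, 26)
print(f"one-sided p = {one_sided:.3f}, two-sided p = {two_sided:.3f}")
```

Note that for an exact test on a lopsided table like this, the two-sided value is not simply double the one-sided value, so the "half" rule of thumb is only approximate.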

I haven't seen any classical solution to the problem of modeling discrete counts with small means, so I usually jump into the Bayesian world for those. It is fairly easy to create robust and accurate Bayesian models for small counts, but it often amounts to writing custom models for a given problem, so you aren't likely to be able to find any online calculators to use or any plug-and-play formulas in Wikipedia or a textbook. However, there is a remarkably reliable approximation to a natural Bayesian solution: if you get all 0% survival (or mortality) for n patients, pretend like you got 1 mortality in 2n patients, and then use logistic regression for your test. I get about 0.02 for that test (one-sided). Or you can plug it into Wolfram Alpha and get 0.007 (which, again, is not great because of the small sample size).
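Here is a rough Python sketch of that "1 in 2n" trick. It leans on a standard fact: with a single yes/no predictor (treatment arm), the logistic regression coefficient is just the sample log odds ratio, and its Wald standard error is sqrt(1/a + 1/b + 1/c + 1/d) for the four cell counts, so no fitting library is needed. The counts (0 of 13 vs. 7 of 26) are my assumption, the helper names are made up for this example, and I am assuming a Wald test; a likelihood-ratio test on the same model would give a somewhat different number.

```python
from math import log, sqrt, erfc

def adjust_zero_arm(events, n):
    """Apply the approximation: 0 events in n patients -> 1 event in 2n."""
    return (1, 2 * n) if events == 0 else (events, n)

def wald_log_odds_test(x1, n1, x2, n2):
    """Wald test for the treatment coefficient of a one-predictor logistic model.

    Returns (log_odds_ratio, z, one_sided_p), where one_sided_p is
    P(Z <= z), i.e. small when arm 1 has lower odds of the event.
    """
    a, b = x1, n1 - x1   # arm 1: events, non-events
    c, d = x2, n2 - x2   # arm 2: events, non-events
    if min(a, b, c, d) == 0:
        raise ValueError("a zero cell makes the log odds ratio infinite")
    log_or = log((a / b) / (c / d))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se
    one_sided_p = 0.5 * erfc(-z / sqrt(2))   # standard normal CDF at z
    return log_or, z, one_sided_p

# ASSUMED counts, for illustration only (not given in the post).
x1, n1 = adjust_zero_arm(0, 13)          # becomes 1 event in 26
log_or, z, p = wald_log_odds_test(x1, n1, 7, 26)
print(f"log OR = {log_or:.3f}, z = {z:.3f}, one-sided p = {p:.3f}")
```

Without the adjustment, the zero cell sends the log odds ratio to minus infinity, which is why the function refuses a table containing a zero.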

A final caveat: ANY test with discrete random variables with a small mean will have difficulties, because the discreteness means there are large probability gaps, and p-values can take only a finite (and most likely choppy) set of possible values, e.g., 0.0134, 0.0742, 0.186, 0.224, and so on, with nothing to fill the gaps. But that being said, let's hope leronlimab has 0 fatalities in the s/c trial and 100% full recoveries in the m/m trials, even if it makes their statisticians' jobs a little harder.
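The choppiness is easy to see by listing every p-value a test can possibly produce. The Python sketch below does that for a two-sided Fisher exact test, which conditions on the margins (arm sizes and total deaths). The margins used (13 treated, 26 controls, 7 deaths overall) are my assumption for illustration and are not stated in this post.

```python
from math import comb

def attainable_fisher_p_values(n1, n2, total_events):
    """Map each possible event count in arm 1 to its two-sided Fisher p-value."""
    denom = comb(n1 + n2, total_events)
    lo = max(0, total_events - n2)
    hi = min(n1, total_events)
    # Hypergeometric probability of each table with these fixed margins.
    probs = {k: comb(n1, k) * comb(n2, total_events - k) / denom
             for k in range(lo, hi + 1)}
    # Two-sided p-value: total probability of tables no more likely than k's.
    return {k: sum(p for p in probs.values() if p <= p_k * (1 + 1e-9))
            for k, p_k in probs.items()}

# ASSUMED margins, for illustration only (not given in the post).
p_values = attainable_fisher_p_values(13, 26, 7)
for k, p in sorted(p_values.items(), key=lambda kv: kv[1]):
    print(f"{k} deaths in treated arm -> two-sided p = {p:.4f}")
```

With these assumed margins there are only eight attainable p-values, and none of them falls between roughly 0.03 and 0.07, so no possible outcome of that experiment could have produced a p-value near 0.05.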

* Search for "two binomial distribution test" and use 0 for the "hypothesized parameter", which translates as "no difference between treatments" and is the natural null hypothesis.