How are p-values calculated, and why are they approximate?

April 26, 2017 | approximate, calculated, p-values

1 Answer

A raw functional association score is a ratio built from four measures of average functional relationship for two gene sets: between, background, within, and baseline. The larger this ratio, the more related the two sets are, but quantifying “how much more” depends on the sizes of the two gene sets and on the current biological context. To analyze these scores in a more principled manner, we convert them to p-values by comparing them to bootstrapped null distributions, generated by calculating scores for thousands of randomly chosen gene sets over a wide range of sizes in every biological context. This yields a distribution of expected scores (per size of the two gene sets, per context) that is approximately normal with mean one, and comparing the score for a “real” gene set to this background distribution yields a p-value. However, the exact variance of the distribution is dependent on the size of the gene sets being analyzed and on the current context, and we can’t randomly generate thousands of gene
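
As a rough illustration of the comparison described above, here is a minimal Python sketch. It is not the actual implementation behind these scores: `make_toy_score` is a hypothetical stand-in for the real between/background/within/baseline ratio (which the answer does not spell out), and the function names, sample count, and gene universe are assumptions made for the example. It draws random gene-set pairs of fixed sizes, fits a normal distribution to their scores, and reads the p-value off the upper tail.

```python
import math
import random
import statistics


def make_toy_score(n_genes, seed=1):
    """Build a HYPOTHETICAL scoring function (not the real association ratio).

    Each gene gets a made-up 'activity' around 1.0, so random pairs of
    gene sets score close to one on average.
    """
    rng = random.Random(seed)
    activity = [rng.uniform(0.5, 1.5) for _ in range(n_genes)]

    def toy_score(set_a, set_b):
        mean_a = statistics.fmean([activity[g] for g in set_a])
        mean_b = statistics.fmean([activity[g] for g in set_b])
        return mean_a * mean_b

    return toy_score


def null_scores(score_fn, genes, size_a, size_b, n_samples=2000, seed=0):
    """Score randomly drawn gene-set pairs of the given sizes (the null)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_samples):
        set_a = rng.sample(genes, size_a)
        set_b = rng.sample(genes, size_b)
        scores.append(score_fn(set_a, set_b))
    return scores


def normal_upper_tail_p(score, null):
    """One-sided p-value from a normal fit to the null scores."""
    mu = statistics.fmean(null)
    sigma = statistics.stdev(null)
    z = (score - mu) / sigma
    # Upper tail of the standard normal: P(Z >= z)
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    genes = list(range(500))              # assumed gene universe
    score_fn = make_toy_score(len(genes))
    null = null_scores(score_fn, genes, size_a=10, size_b=20)
    observed = 1.4                        # made-up "real" score
    print(normal_upper_tail_p(observed, null))
```

Because the mean and variance are estimated from a finite number of random draws and the null is only approximately normal, a p-value obtained this way is itself an approximation. Note also that the null has to be rebuilt (or otherwise estimated) for each pair of set sizes and each context, since the variance changes with both.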