
Why do the sizes of Ensembl and IPI data sets differ so much?

April 26, 2017. Tags: Data, differ, ensembl, IPI


1 Answer


IPI is built to provide maximum coverage of the major publicly available protein (and gene) databases while minimizing the redundancy in this large body of data: more than 200,000 source database entries are reduced to 56,000 entries in IPI human v3.12. This is done by merging different source database entries into a single IPI entry when there is evidence that they represent the same protein (i.e. a particular gene product).
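
To make the effect of merging on data set size concrete, here is a minimal Python sketch of the general idea. It is not IPI's actual pipeline: the identifiers (`SRC_A:1` and so on), the `merge_entries` function, and the list of "same protein" evidence pairs are all hypothetical, and the grouping uses a plain union-find, which is only one way such clustering could be done.

```python
from collections import defaultdict


def merge_entries(entries, same_protein_pairs):
    """Group source entries into merged entries (hypothetical sketch).

    entries: list of source-database identifiers.
    same_protein_pairs: pairs of identifiers for which there is evidence
        that both represent the same protein.
    Returns a sorted list of clusters, each a sorted list of identifiers.
    """
    # Every entry starts out as its own cluster.
    parent = {e: e for e in entries}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each piece of evidence joins two clusters into one.
    for a, b in same_protein_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for e in entries:
        clusters[find(e)].append(e)
    return sorted(sorted(c) for c in clusters.values())


# Made-up identifiers from three made-up source databases.
entries = ["SRC_A:1", "SRC_B:1", "SRC_C:1", "SRC_A:2", "SRC_B:2"]
pairs = [
    ("SRC_A:1", "SRC_B:1"),
    ("SRC_B:1", "SRC_C:1"),
    ("SRC_A:2", "SRC_B:2"),
]

merged = merge_entries(entries, pairs)
print(len(entries), "source entries ->", len(merged), "merged entries")
```

In this toy input, five source entries collapse into two merged entries. The same kind of reduction, at a much larger scale, is why a merged set such as IPI has far fewer entries than the source databases it draws on.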