Why are some entries repeated in the protein-protein interaction table? Do the multiple entries have any significance?
This is the result of another overlooked data cleaning issue. There is no significance to the fact that some pairs have multiple entries.
• Why do some entries in the protein-protein interaction table show a gene’s product interacting with itself (i.e., the gene listed in the second column is the same as the one listed in the first)?
Certain proteins form homodimers, meaning that two copies of the same protein molecule bind to each other to form a complex. The instances of reflexive interaction (e.g., YNL331C, YNL331C) in the data set are putative homodimers.
• If gene A’s protein interacts with gene B’s protein and gene B’s protein interacts with gene C’s protein, can we infer that A’s protein interacts with C’s protein?
No, the interaction relation is not transitive. You cannot conclude that A’s protein physically interacts with C’s protein. However, it may be reasonable to conclude that A and C are related in some way.
• Is the list in interactions.txt exhaustive?
No, there a