About ECDomainMiner

Many entries in the Protein Data Bank (PDB) are annotated to show their component protein domains according to the Pfam, SCOP, and CATH classifications, as well as their biological function through the Enzyme Commission (EC) numbering scheme. However, although the biological activity of many proteins arises from specific domain-domain and domain-ligand interactions, current online resources rarely provide a direct mapping from structure to function at the domain level. There is therefore a need for automatic structure-function annotation tools that operate at the domain level.

ECDomainMiner is a novel recommender-based method to infer associations between EC numbers and Pfam domains. Overall, ECDomainMiner finds a total of 20,728 non-redundant EC-Pfam associations with an F-measure of 0.95 with respect to the InterPro database, which is treated here as a "gold standard".
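
As a rough illustration of what an F-measure against a gold standard means, the sketch below scores a set of predicted EC-Pfam pairs against a reference set. It assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the text above does not say which variant was used, nor how predicted pairs absent from InterPro were treated, so this is not a description of the actual ECDomainMiner evaluation. The function name `f_measure` and all identifiers in the toy data are hypothetical placeholders, not real associations.

```python
def f_measure(predicted, gold):
    """Return (precision, recall, F1) for predicted pairs against a gold set.

    Assumption: F1 is used, i.e. the harmonic mean of precision and recall.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: (EC number, Pfam accession) pairs. These are made-up placeholders
# chosen only to exercise the function, not curated or inferred associations.
gold = {("1.1.1.1", "PF00001"), ("2.7.11.1", "PF00002"), ("3.4.21.4", "PF00003")}
predicted = {("1.1.1.1", "PF00001"), ("2.7.11.1", "PF00002"), ("3.1.3.48", "PF00004")}

precision, recall, f1 = f_measure(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Representing each association as a tuple in a set makes the comparison a plain set intersection, which is enough for a sketch at this scale.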

Compared to the 1,515 manually curated EC-Pfam associations in InterPro, the associations inferred by ECDomainMiner represent a 13-fold increase. These EC-Pfam associations could be used to annotate some 68,152 protein chains in the PDB which currently lack any EC annotation.

People Involved