ProFAB – Open Protein Functional Annotation Benchmark
ProFAB is a benchmarking platform for Gene Ontology (GO) term and Enzyme Commission (EC) number prediction. It provides several datasets, featurization and scaling methods, machine learning algorithms, and evaluation metrics. These are organized into four independent modules, as shown in the figure below.
- Dataset module: Individual datasets are created for each EC number and GO term.
- Preprocessing module: This module consists of three submodules: splitting, featurization, and scaling.
- Training module: This module consists of several machine learning algorithms for binary classification. Hyperparameter optimization is performed automatically to determine the best-performing models.
- Evaluation module: This module provides several evaluation metrics to assess the performance of the trained models.
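
To make the flow through the four modules concrete, here is a minimal sketch in plain Python. It does not use ProFAB's API: every function name is hypothetical, the dataset is synthetic, and a small k-nearest-neighbour classifier stands in for the actual learning algorithms. It only illustrates the order of the steps: load a binary dataset, split it, scale it using training statistics, pick a hyperparameter on the validation set, and score the chosen model on the test set.

```python
import math
import random


def make_toy_dataset(n=200, seed=0):
    """Dataset step (stand-in): synthetic 2-D features with binary labels."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label == 1 else 0.0
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return data


def split(data, ratios=(0.7, 0.15), seed=0):
    """Splitting step: shuffle, then cut into train / validation / test."""
    items = data[:]
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


def fit_scaler(train):
    """Scaling step: per-feature mean and std, computed on training data only."""
    dims = len(train[0][0])
    means = [sum(x[d] for x, _ in train) / len(train) for d in range(dims)]
    stds = [
        math.sqrt(sum((x[d] - means[d]) ** 2 for x, _ in train) / len(train)) or 1.0
        for d in range(dims)
    ]
    return means, stds


def scale(data, scaler):
    """Apply a fitted scaler to any split."""
    means, stds = scaler
    return [([(v - m) / s for v, m, s in zip(x, means, stds)], y) for x, y in data]


def knn_predict(train, x, k):
    """Training step (stand-in model): majority vote among the k nearest points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = sum(y for _, y in nearest)
    return 1 if 2 * votes > k else 0


def evaluate(y_true, y_pred):
    """Evaluation step: precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def run_pipeline(seed=0, candidate_ks=(1, 3, 5, 7)):
    """Run all four steps; k is chosen by validation F1, then scored on test."""
    train, val, test = split(make_toy_dataset(seed=seed), seed=seed)
    scaler = fit_scaler(train)
    train, val, test = (scale(part, scaler) for part in (train, val, test))

    def score(k, part):
        preds = [knn_predict(train, x, k) for x, _ in part]
        return evaluate([y for _, y in part], preds)

    best_k = max(candidate_ks, key=lambda k: score(k, val)["f1"])
    return best_k, score(best_k, test)


if __name__ == "__main__":
    best_k, metrics = run_pipeline()
    print("selected k:", best_k)
    print("test metrics:", metrics)
```

The scaler is fitted on the training split alone and the hyperparameter is selected on the validation split, so the test metrics are not influenced by either choice. In this sketch a separate binary problem like this one would be set up for each GO term or EC number.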