BioAutoML: Automated Feature Engineering and Metalearning for Classification of Biological Sequences
BioAutoML - Automated Feature Engineering and Metalearning - End-to-end Machine Learning Workflow - Protein
To use this model, follow the examples below.
To display the available options, run:
$ python BioAutoML-feature-protein.py -h
Where:
-fasta_train: training files in FASTA format, e.g., fasta/protein_train_pos.fasta fasta/protein_train_neg.fasta
-fasta_label_train: labels for the training FASTA files, e.g., positive negative
-fasta_test: test files in FASTA format, e.g., fasta/protein_test_pos.fasta fasta/protein_test_neg.fasta
-fasta_label_test: labels for the test FASTA files, e.g., positive negative
-estimations: number of estimations used by BioAutoML (default: 50)
-n_cpu: number of CPUs to use (default: 1)
-output: results directory, e.g., result
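The options above can also be assembled programmatically, which makes it harder to pass a different number of labels than files. The following is a minimal sketch and not part of BioAutoML: the helper name `build_command` and its signature are hypothetical, and it assumes that each label is matched by position to the FASTA file listed at the same index, as the examples suggest.

```python
def build_command(train, test, output, estimations=50, n_cpu=1,
                  script="BioAutoML-feature-protein.py"):
    """Build the argument list for the protein workflow script.

    `train` and `test` are lists of (fasta_path, label) pairs.
    Hypothetical helper; positional pairing of files and labels
    is an assumption based on the documented examples.
    """
    if not train or not test:
        raise ValueError("train and test each need at least one (path, label) pair")

    # Split the pairs into parallel sequences so files and labels stay aligned.
    train_files, train_labels = zip(*train)
    test_files, test_labels = zip(*test)

    return [
        "python", script,
        "-fasta_train", *train_files,
        "-fasta_label_train", *train_labels,
        "-fasta_test", *test_files,
        "-fasta_label_test", *test_labels,
        "-estimations", str(estimations),
        "-n_cpu", str(n_cpu),
        "-output", output,
    ]


cmd = build_command(
    train=[("fasta/protein_train_pos.fasta", "positive"),
           ("fasta/protein_train_neg.fasta", "negative")],
    test=[("fasta/protein_test_pos.fasta", "positive"),
          ("fasta/protein_test_neg.fasta", "negative")],
    output="result",
)
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Because the arguments are passed as a list rather than a single string, paths containing spaces need no escaping.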
Running:
$ python BioAutoML-feature-protein.py -fasta_train MathFeature/Case\ Studies/CS-I/train_P.fasta MathFeature/Case\ Studies/CS-I/train_N.fasta -fasta_label_train positive negative -fasta_test MathFeature/Case\ Studies/CS-I/test_P.fasta MathFeature/Case\ Studies/CS-I/test_N.fasta -fasta_label_test positive negative -output experimental/protein
Note: This example uses the data files in the MathFeature directory.
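The backslash in `Case\ Studies` escapes the space so that the shell treats each path as a single argument. As a small illustration, the sketch below uses Python's standard `shlex` module to show how a POSIX shell would split the escaped and unescaped forms of one of these paths, and how to produce a safely quoted version.

```python
import shlex

# Escaped form, as written in the command above: parsed as one argument.
escaped = r"MathFeature/Case\ Studies/CS-I/train_P.fasta"
parts = shlex.split(escaped)
print(parts)

# Without the backslash, the space splits the path into two arguments.
unescaped = "MathFeature/Case Studies/CS-I/train_P.fasta"
print(len(shlex.split(unescaped)))

# shlex.quote wraps the path in quotes, an alternative to the backslash.
quoted = shlex.quote(parts[0])
print(quoted)
```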
Running on unknown sequences:
$ python BioAutoML-feature-protein.py -fasta_train MathFeature/Case\ Studies/CS-I/train_P.fasta MathFeature/Case\ Studies/CS-I/train_N.fasta -fasta_label_train positive negative -fasta_test new_sequences.fasta -fasta_label_test unknown -output experimental/protein
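Before predicting on new sequences, it can help to confirm that the input file parses as FASTA. The sketch below is not part of BioAutoML: `count_fasta_records` is a hypothetical helper that counts header lines (those starting with `>`) and rejects sequence data that appears before the first header. The two example records are made up for illustration.

```python
def count_fasta_records(lines):
    """Count FASTA records in an iterable of text lines.

    Hypothetical sanity check, not a full FASTA validator.
    """
    records = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            records += 1  # each header starts a new record
        elif records == 0:
            raise ValueError("sequence data found before the first '>' header")
    return records


# Illustrative input only; these sequences are invented for the example.
example = """>seq1
MKTAYIAKQR
>seq2
GSHMLEDPVA
""".splitlines()

print(count_fasta_records(example))
```

For a real file, an open file handle can be passed directly, for example `count_fasta_records(open("new_sequences.fasta"))`, since file objects iterate over lines.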