Feature Extraction Package for Biological Sequences Based on Mathematical Descriptors
Shannon Entropy
Fundamentally, we applied entropy because it yields a single value that quantifies the information contained in different observation periods (in our case, k-mers; see our pipeline in this article). To use this model, follow the example below:
To run the tool (Example): $ python3.7 methods/EntropyClass.py -i input -o output -l label -k kmer -e Entropy
Where:
-h = help
-i = Input - Fasta format file, e.g., test.fasta
-o = Output - CSV format file, e.g., test.csv
-l = Label - Dataset Label, e.g., lncRNA, mRNA, sncRNA
-k = Range of k-mer, e.g., 1-mer (1) or 2-mer (1, 2)
-e = Type of Entropy, e.g., Shannon
Running:
$ python3.7 methods/EntropyClass.py -i sequences.fasta -o sequences.csv -l mRNA -k 10 -e Shannon
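To make the k-mer entropy idea concrete, here is a minimal sketch of the calculation. It is not the code in methods/EntropyClass.py. It assumes overlapping k-mers, a base-2 logarithm, and one entropy value per k from 1 up to the value passed with -k. The function names (kmer_probabilities, shannon_entropy, entropy_features) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def kmer_probabilities(sequence, k):
    """Relative frequency of each overlapping k-mer (assumed windowing scheme)."""
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(kmers)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def shannon_entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


def entropy_features(sequence, max_k):
    """One entropy value per k in 1..max_k (assumed meaning of the -k range)."""
    return [
        shannon_entropy(kmer_probabilities(sequence, k))
        for k in range(1, max_k + 1)
    ]
```

For instance, entropy_features("ACGT", 1) gives [2.0], since the four nucleotides are equally frequent and log2(4) = 2.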
Tsallis Entropy
To run the feature extraction tool (Example): $ python3.7 methods/TsallisEntropy.py -i input -o output -l label -k kmer -q entropic_parameter
Where:
-h = help
-i = Input - Fasta format file, e.g., sequences.fasta
-o = Output - CSV format file, e.g., dataset.csv
-l = Label - Dataset Label, e.g., lncRNA, mRNA, sncRNA
-k = kmer - Range of k-mer, e.g., 1-mer (1) or 2-mer (1, 2) ...
-q = Tsallis - Entropic parameter q, e.g., 2.3
Running:
$ python3.7 methods/TsallisEntropy.py -i sequences.fasta -o sequences.csv -l mRNA -k 10 -q 2.3
Note: Input sequences for feature extraction must be in FASTA format.
Note: This example will generate a CSV file with the extracted features.
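As a rough illustration of how the entropic parameter q enters the calculation, the sketch below uses the standard Tsallis form S_q = (1 - sum(p^q)) / (q - 1) over k-mer frequencies. It is not the code in methods/TsallisEntropy.py. It assumes overlapping k-mers and one value per k from 1 up to the value passed with -k. The function names are hypothetical.

```python
from collections import Counter


def kmer_probabilities(sequence, k):
    """Relative frequency of each overlapping k-mer (assumed windowing scheme)."""
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(kmers)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def tsallis_entropy(probabilities, q):
    """Tsallis entropy: S_q = (1 - sum(p ** q)) / (q - 1), defined for q != 1."""
    if q == 1:
        # The formula divides by zero at q = 1; that limit is Shannon entropy.
        raise ValueError("q = 1 is the Shannon limit; use Shannon entropy instead")
    if not probabilities:
        # Sequence shorter than k: no k-mers observed, so report zero entropy.
        return 0.0
    return (1.0 - sum(p ** q for p in probabilities)) / (q - 1.0)


def tsallis_features(sequence, max_k, q):
    """One Tsallis entropy value per k in 1..max_k (assumed meaning of -k)."""
    return [
        tsallis_entropy(kmer_probabilities(sequence, k), q)
        for k in range(1, max_k + 1)
    ]
```

For instance, tsallis_features("ACGT", 1, 2) gives [0.75]: each nucleotide has probability 0.25, the squared probabilities sum to 0.25, and (1 - 0.25) / (2 - 1) = 0.75.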