Prediction of multi-drug resistance transporters dataset
Data file 1
Data File PROSITE_positives_PS000125.fasta.
Sequence file in FASTA format of all positive examples for the ser/thr phosphatase model.
Data file 2
Data File PROSITE_negatives_PS000125.fasta.
Sequence file in FASTA format of all randomly selected negative examples for the ser/thr phosphatase model.
Data file 3
Data File PROSITE_positives_PS00028.fasta.
Sequence file in FASTA format of all positive examples for the zinc finger model.
Data file 4
Data File PROSITE_negatives_PS00028.fasta.
Sequence file in FASTA format of all randomly selected negative examples for the zinc finger model.
Data file 5
Data File PROSITE_PS00125.txt.
PROSITE record used for the ser/thr phosphatase model.
Data file 6
Data File PROSITE_PS00028.txt.
PROSITE record used for the zinc finger model.
Data file 7
Data File MDR_TCDB_positives.fasta.
Sequence file of MDR transporters used for training. FASTA format file of the positive examples used in this study, derived from the TCDB.
Data file 8
Data File MDR_TCDB_negatives.fasta.
Sequence file of non-MDR transporters used for training. FASTA format file of the negative examples used in this study, derived from the TCDB.
Data file 9
Data File PILGram_PATTERNS_PS00125.txt.
Regular expression generated by PILGram for the ser/thr phosphatase model.
Data file 10
Data File PS00125_alignments.out.
Sequence alignments of PILGram model matches to the positive examples in the ser/thr phosphatase model.
Data file 11
Data File PILGram_PATTERNS_PS00028.txt.
Regular expressions generated by PILGram for the zinc finger model.
Data file 12
Data File PS00028_alignments.out.
Sequence alignments of PILGram model matches to the positive examples in the zinc finger model, with a summary score line for each sequence that represents the overlap of the 10 different models.
Data file 13
Data File PILGram_PATTERNS_MDRpred.txt.
The 36 regular expressions and associated physicochemical properties (where applicable) generated by PILGram for the MDR model.
Data file 14
Data File MDRpred_alignments.out.
Alignments of matches from the 36 PILGram models to the MDR positive example sequences.
Data file 15
Data File Pfam_transporters.txt.
A list of Pfam families that were used to identify transporters in the Hot Lake metagenome.
Data file 16
Data File HotLake_MDRpred_predictions.fasta.
A FASTA format file of 63 protein sequences from the Hot Lake metagenome that are matched by 30 or more individual MDRpred models (high confidence predictions) and match Pfam families for transporters (Pfam e-value less than 1e-20), but are not identified by Pfam as multidrug resistance transporters.
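
The selection criteria stated for Data file 16 can be illustrated with a short Python sketch. This is not the pipeline used in the study. It assumes that the PILGram models are available as Python-compatible regular expressions, that the best Pfam transporter e-value for each sequence has already been collected into a dictionary, and that a second dictionary flags sequences Pfam identifies as multidrug resistance transporters. The function names, both dictionaries, and the toy sequences and patterns are hypothetical and serve only as illustration.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            # The identifier is the first whitespace-delimited token.
            header, chunks = (parts[0] if parts else ""), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_model_matches(sequence: str, patterns: List[str]) -> int:
    """Count how many regular expression models match the sequence."""
    return sum(1 for pattern in patterns if re.search(pattern, sequence))


def high_confidence(
    records: Iterable[Tuple[str, str]],
    patterns: List[str],
    pfam_evalue: Dict[str, float],   # hypothetical: best Pfam transporter e-value
    pfam_is_mdr: Dict[str, bool],    # hypothetical: Pfam already calls it MDR
    min_models: int = 30,
    max_evalue: float = 1e-20,
) -> List[str]:
    """Return identifiers that pass all three criteria from the legend."""
    selected = []
    for seq_id, sequence in records:
        if count_model_matches(sequence, patterns) < min_models:
            continue  # too few individual models match
        evalue = pfam_evalue.get(seq_id)
        if evalue is None or evalue >= max_evalue:
            continue  # no sufficiently strong Pfam transporter hit
        if pfam_is_mdr.get(seq_id, False):
            continue  # already identified by Pfam as an MDR transporter
        selected.append(seq_id)
    return selected


# Toy data only: three made-up sequences and three made-up patterns.
toy_fasta = ">seqA toy\nMKCAAH\nGGW\n>seqB\nMKLLLL\n>seqC\nMKCAAHGGW\n"
toy_patterns = ["C..H", "GGW", "^MK"]
toy_evalues = {"seqA": 1e-30, "seqB": 1e-40, "seqC": 1e-5}

print(high_confidence(read_fasta(toy_fasta), toy_patterns,
                      toy_evalues, {}, min_models=2))
# prints ['seqA']
```

With the defaults (min_models=30, max_evalue=1e-20) the function applies the thresholds given in the legend. The toy call lowers min_models to 2 only because it uses three patterns rather than 36.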