2014-10-31

BLAST! No frequency ratios needed for composition-based statistics

While updating the NCBI BLAST+ wrapper for Galaxy to cover the changes in the new BLAST+ 2.2.30 release, I hit a cryptic error message from deltablast:

$ deltablast -query rhodopsin_proteins.fasta -subject four_human_proteins.fasta -evalue 1e-08 -outfmt "6 qseqid sseqid score" -rpsdb /data/blastdb/cdd_delta
BLAST engine error: /data/blastdb/cdd_delta contains no frequency ratios needed for composition-based statistics.
Please disable composition-based statistics when searching against /data/blastdb/ncbi/cdd/cdd_delta.

To cut a long story short, the fix is to download and unpack a newer cdd_delta.tar.gz, which now includes an extra file, cdd_delta.freq, containing the frequency ratio information that the newer deltablast tool requires.
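
If you want to check whether an existing copy of the database is affected before rerunning anything, a quick test for the extra file should do. This is only a minimal sketch: the has_freq_file helper is a made-up name for illustration, and it assumes cdd_delta.freq sits next to the other database files under the same prefix you pass to -rpsdb or -db.

```shell
# Hypothetical helper (not part of BLAST+): succeeds if the .freq file exists
# for the given database path prefix, e.g. /data/blastdb/cdd_delta
has_freq_file() {
    db_prefix="$1"
    # Assumption: the file is named <prefix>.freq, alongside the other db files
    [ -f "${db_prefix}.freq" ]
}

if has_freq_file "/data/blastdb/cdd_delta"; then
    echo "cdd_delta.freq found"
else
    echo "cdd_delta.freq missing; unpack a newer cdd_delta.tar.gz"
fi
```

Adjust the path to wherever your own copy of the database lives.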

The same applies to the rpsblast tool, although here you just get a warning rather than an error:

$ rpsblast -query four_human_proteins.fasta -db /data/blastdb/cdd_delta -evalue 1e-08 -outfmt "6 qseqid sseqid score"
Warning: /data/blastdb/cdd_delta contain(s) no freq ratios needed for composition-based statistics.
RPSBLAST will be run without composition-based statistics.
sp|Q9BS26|ERP44_HUMAN    gnl|CDD|222416    401
...
sp|P06213|INSR_HUMAN    gnl|CDD|238021    137
sp|P08100|OPSD_HUMAN    gnl|CDD|215646    411