2012-03-16

Missing external exons in GenBank "with parts"

I recently stumbled on a problem in NCBI Entrez with the GenBank (with parts) return type. Some GenBank files don't actually contain a sequence at the end - instead they have a CONTIG section telling you how to construct the sequence from other referenced pieces. That's often inconvenient so the NCBI have the handy option of downloading it with all this parts pre-computed, which normally is great.

2012-03-09

BAM versus CRAM v0.7

CRAM 0.7 was released earlier this month, and includes support for storing arbitrary read tags - a key requirement for it to be evaluated in existing pipelines as a BAM alternative. However, it doesn't  preserve read names - which is a compression trick you can also do with plain BAM.