Sunday, July 10, 2011

BAMseek - large file viewer

Okay, time to get down to brass tacks (don't worry, even as a native English speaker, I don't know what that means).  The first thing I will talk about (shamelessly plug) is BAMseek, a piece of software I have been working on as a hobby.  Bioinformatics has a lot of file formats.  Check out here to see a few.  With the large volumes of data produced by next-generation sequencing, efficient and compact storage is required.  That makes the files difficult for most people to view, both because of their size (tens or even hundreds of gigabytes) and because of the binary formats they are in.  I created a file viewer that lets people browse inside those gigantic, cryptic files.  Right now, BAMseek supports SAM/BAM, VCF, FASTQ, and SFF.  It indexes the file in pages of 1000 lines each and lets you jump directly to any page in the file.  Let me know how the tool works for you.
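
For the curious, here is a rough Python sketch of the paging idea.  It is not BAMseek's actual code, just an illustration under some assumptions: the file is plain text (as SAM, VCF, and FASTQ are), and the function names build_page_index and read_page are made up for this example.  Compressed BAM and binary SFF files need format-specific handling that this sketch does not attempt.

```python
PAGE_SIZE = 1000  # lines per page, matching the page size described above


def build_page_index(path, page_size=PAGE_SIZE):
    """Scan the file once and record the byte offset where each page starts."""
    offsets = []
    pos = 0
    # Binary mode keeps byte offsets exact regardless of text encoding.
    with open(path, "rb") as f:
        for line_no, line in enumerate(f):
            if line_no % page_size == 0:
                offsets.append(pos)
            pos += len(line)
    return offsets


def read_page(path, offsets, page, page_size=PAGE_SIZE):
    """Seek straight to the given page and return up to page_size lines."""
    lines = []
    with open(path, "rb") as f:
        f.seek(offsets[page])
        for _ in range(page_size):
            line = f.readline()
            if not line:  # reached end of file on the last, shorter page
                break
            lines.append(line.decode("utf-8").rstrip("\r\n"))
    return lines
```

The index costs one pass over the file, but after that any page can be shown without reading what comes before it, and only one page of lines is held in memory at a time.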

First Post!

So, I decided to start a blog.  As the name suggests, I want to write about interesting things I see in the field of bioinformatics: tools, software, algorithms, papers...  It is mainly a place where I can get down on paper (well, uh, electronic paper) some of the things I am currently thinking about.  I hope others find it useful too.  I am a software developer who has spent the past 3 years working on next-generation sequencing data.  So let me know whether you find this site useful, or just say "hi".  Thanks for visiting and enjoy your day!