Changes to Bio::Blast
Naohisa Goto has announced changes to Bio::Blast.reports to support the default (-m 0) and tabular (-m 8)
formats in addition to the XML (-m 7) format. I think this is really nice and convenient!
Previously, for BioRuby to parse a Blast file, you had to have your Blast results in XML output, the only format Bio::Blast.reports would understand. However, by default Blast produces -m 0 output, and without that prior knowledge you may spend hours wondering what is wrong when parsing default Blast output files. With BioRuby 1.2.1 this will not work
require 'rubygems'
require 'bio'
report_file = "/home/george/esther_blast_files/blast_output2.txt"
Bio::Blast.reports(report_file) do |report|
  puts report.class
end
unless blast_output2.txt is in XML format.
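Since the 1.2.1 parser only understands XML, a quick look at the first non-blank line of the file can save some of that head scratching. The sketch below is plain Ruby and does not use BioRuby at all. `blast_output_format` is a hypothetical helper, and its heuristics (an XML declaration, a Blast program banner, or twelve tab-separated columns) are assumptions about what typical -m 7, -m 0 and -m 8 output looks like.

```ruby
# Hypothetical helper (not part of BioRuby): guess the format of Blast
# output from its first non-blank line.
def blast_output_format(text)
  first = text.each_line.map(&:strip).find { |line| !line.empty? }
  return :unknown if first.nil?

  if first.start_with?('<?xml')
    :xml       # -m 7 output begins with an XML declaration
  elsif first =~ /\AT?BLAST[NPX]\b/
    :default   # -m 0 output begins with a banner such as "BLASTP 2.2.18 ..."
  elsif first.split("\t").size == 12
    :tabular   # -m 8 output has twelve tab-separated columns per hit
  else
    :unknown
  end
end

# Made-up snippets standing in for real output files.
default_text = "BLASTP 2.2.18 [Mar-02-2008]\n\nQuery= example\n"
tabular_text = %w[q1 s1 98.5 200 3 0 1 200 5 204 1e-50 380].join("\t") + "\n"

puts blast_output_format(default_text)
puts blast_output_format(tabular_text)
```

For a real file you would pass in something like `File.read(report_file)` and only hand the file to the XML parser when the result is `:xml`.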
In the upcoming BioRuby 1.3, default Blast output can be parsed directly, for example:
require 'rubygems'
require 'bio'
report_file = "/home/george/esther_blast_files/blast_output2.txt"
Bio::FlatFile.open(Bio::Blast::Default::Report, report_file) do |ff|
  ff.each do |rep|
    puts rep.statistics
    rep.iterations.each do |itr|
      puts itr.hits.size
      itr.hits.each_with_index do |hit, i|
        puts hit.hit_id
        puts hit.len
      end
    end
  end
end
Bio::Blast.remote now supports DDBJ in addition to GenomeNet. It would be nice to support NCBI as well.
Changes to Bio::Sequence
It is now possible to create sequence objects from Bio::GenBank, Bio::EMBL, and Bio::FastaFormat by using the to_biosequence method:
# Bio::GenBank.new takes the text of an entry, so read the file first
gb = Bio::GenBank.new(File.read('genbank_file.gb'))
gb.to_biosequence
Bio::SQL Support
Thanks to Raoul and Naohisa, support for BioSQL has been rewritten by using ActiveRecord.
# Make a connection
connection = Bio::SQL.establish_connection(path_to_database.yaml, 'development')
# List databases
databases = Bio::SQL.list_databases
# Retrieve a sequence
sample_seq = Bio::SQL.fetch_accession('some_accession_number')
# Get number of sequences in the database
puts Bio::SQL.list_entries
# Get references associated with an entry
puts sample_seq.references
# Create EMBL-formatted output
puts sample_seq.to_biosequence.output(:embl)
Changes to Bio::GFF2 and Bio::GFF3
GFF2/GFF3 formatted texts are now supported, but there will be backward compatibility issues with bio 1.2.1 since some incompatible changes have been incorporated. Bio::GFF::Record#comments has been renamed to comment, and comments= is now comment=.
Both Bio::GFF::GFF2::Record.new and Bio::GFF::GFF3::Record.new can now take 9 arguments that correspond to the GFF columns, making it easy to create a Record object directly without the need for formatted text.
Both Bio::GFF::GFF2::Record#attributes and Bio::GFF::GFF3::Record#attributes have been changed to return a nested array of [tag, value] pairs. To obtain a hash, use the to_hash method.
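A likely reason for returning pairs rather than a hash is that a GFF record may repeat a tag, and a plain hash holds only one value per key. The sketch below uses ordinary Ruby arrays shaped like the [tag, value] pairs described above. It is not output from BioRuby, the tags and values are made up, and both helper names are hypothetical. It shows what a naive hash conversion loses and one way to keep repeated tags.

```ruby
# Naive conversion: a repeated tag keeps only its last value.
def last_value_hash(pairs)
  pairs.to_h
end

# Grouping by tag keeps every value for a tag, in the original order.
def grouped_hash(pairs)
  pairs.group_by(&:first).transform_values { |list| list.map(&:last) }
end

# Made-up [tag, value] pairs with a repeated "Note" tag.
pairs = [
  ['ID',   'gene0001'],
  ['Note', 'first note'],
  ['Note', 'second note']
]

puts last_value_hash(pairs)['Note']
puts grouped_hash(pairs)['Note'].inspect
```

If your records never repeat a tag, the simple conversion is enough. Otherwise, working with the pairs directly avoids silently dropping values.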
To support data output for GFF2/GFF3, new methods have been added: Bio::GFF::GFF2#to_s, Bio::GFF::GFF3#to_s, Bio::GFF::GFF2::Record#to_s, and Bio::GFF::GFF3::Record#to_s.
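To give a feel for what such output looks like, here is a minimal sketch that joins nine column values into one tab-separated, GFF3-style line. It is plain Ruby and not BioRuby's implementation: `gff3_line` is a hypothetical helper, the sample values are invented, and it skips the escaping of reserved characters that a real writer has to handle.

```ruby
# Hypothetical helper (not BioRuby's to_s): build one GFF3-style line from
# the nine column values. Missing values are written as ".".
def gff3_line(seqid, source, type, start, stop, score, strand, phase, attributes)
  # Attributes arrive as [tag, value] pairs and become "tag=value;tag=value".
  attr_text = attributes.map { |tag, value| "#{tag}=#{value}" }.join(';')
  columns = [seqid, source, type, start, stop, score, strand, phase, attr_text]
  columns.map { |col| col.nil? || col.to_s.empty? ? '.' : col.to_s }.join("\t")
end

puts gff3_line('chr1', 'example', 'gene', 1000, 2000, nil, '+', nil,
               [['ID', 'gene0001'], ['Name', 'demo']])
```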
Lots of other changes have been incorporated into the GFF classes; you can view the change log on GitHub.
CodeML parser
A wrapper for the PAML codeml program, which is used for estimating evolutionary rates, has been added. The class provides methods for generating the necessary configuration file. The new Bio::PAML::Codeml::Report and Bio::PAML::Codeml::Rates provide simple classes for accessing the codeml report and rates files. The following example is taken from the source code:
require 'bio'
# Reads multi-fasta formatted file and gets a Bio::Alignment object.
alignment = Bio::FlatFile.open(Bio::Alignment::MultiFastaFormat, 'example.fst').alignment
# Reads newick tree from a file
tree = Bio::FlatFile.open(Bio::Newick, 'example.tree').tree
# Creates a Codeml object
codeml = Bio::PAML::Codeml.new
# Sets parameters
codeml.parameters[:runmode] = 0
codeml.parameters[:RateAncestor] = 1
# You can also set many parameters at a time.
codeml.parameters.update({ :alpha => 0.5, :fix_alpha => 0 })
# Executes codeml with the alignment and the tree
report = codeml.query(alignment, tree)
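The configuration file mentioned above is a codeml control file, which is essentially a list of `name = value` lines. As a rough illustration of what generating one involves, the sketch below renders a parameter hash like the one in the example into that layout. `render_control_file` is a hypothetical helper written for this post, not the method the wrapper itself uses, and the file names are placeholders.

```ruby
# Hypothetical helper (not part of Bio::PAML::Codeml): render a parameter
# hash as right-aligned "name = value" lines, one parameter per line.
def render_control_file(parameters)
  width = parameters.keys.map { |key| key.to_s.length }.max || 0
  lines = parameters.map { |key, value| format("%#{width}s = %s", key, value) }
  lines.join("\n") + "\n"
end

parameters = {
  :seqfile      => 'example.fst',    # placeholder file names
  :treefile     => 'example.tree',
  :outfile      => 'example.out',
  :runmode      => 0,
  :RateAncestor => 1
}
# Several parameters can be merged in at once, as in the example above.
parameters.update({ :alpha => 0.5, :fix_alpha => 0 })

puts render_control_file(parameters)
```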
Lots of bugs have been fixed, and support for Ruby 1.9 has been added. Many thanks to the BioRuby developers for their time and the excellent new changes!
