pharmavengers

Sunday, 14 June 2015

The Goals of Bioinformatics Software


Bioinformatics tools are software programs designed to extract meaningful information from the mass of molecular biology data held in biological databases and to carry out sequence or structural analysis. The goals of bioinformatics software are listed below:

  1. To spread knowledge of fundamental biological processes at the organism, physiological, cellular and molecular levels.
  2. To educate people in the basic principles of chemistry and their applications to living systems, and in the properties of biomolecules and their contribution to the structure and function of cells.
  3. To build an understanding of computer programming methodology, including algorithm design and program development, and the capability to design and apply software tools for biological data analysis.
  4. To build proficiency in the use of mathematical tools, including discrete mathematics, calculus, and statistics.
  5. To integrate knowledge and technical skills from the biochemical, mathematical, computational and life sciences, with an understanding of key problems, possible solutions, and the latest advances in bioinformatics.
  6. To build an understanding of the process of scientific inquiry, preparation for rigorous research, quantitative problem-solving skills, data analysis, and interpretation of results.

Friday, 12 June 2015

BLAST :)

BLAST is the Basic Local Alignment Search Tool. It is a set of search programs designed to explore all available sequence databases, whether the sequences are protein or DNA. The software is designed for high speed while keeping a well-defined statistical interpretation.

Setup

No setup is needed to run BLAST.

Usage

BLAST provides a variety of commands including:

    blastall
    • performs protein-protein (blastp) searches,
    • nucleotide-nucleotide (blastn) searches,
    • translated nucleotide query against protein database (blastx) searches,
    • protein query against translated nucleotide database (tblastn) searches,
    • translated nucleotide query against translated nucleotide database (tblastx) searches,
    • or position-specific iterated (psiblastn) searches.
    megablast
    • performs nucleotide-nucleotide searches using an optimized greedy algorithm that concatenates queries to save time spent scanning the database.
    blastpgp
    • performs gapped blastp searches and can be used to perform iterative searches in psi-blast and phi-blast mode.
    bl2seq
    • performs a comparison between two sequences using either the blastn or blastp algorithm. Both sequences must be proteins or both sequences must be nucleotides.
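
As a rough aid for choosing among the blastall programs, the sketch below records which kind of query and database each one compares. The table `PROGRAMS` and the helper `candidate_programs` are names made up for this post; they are not part of BLAST itself.

```python
# Illustrative lookup table (not part of BLAST): for each program, the kind of
# query, the kind of database, and which side is translated from nucleotide
# to protein before the comparison.
PROGRAMS = {
    "blastp":  {"query": "protein",    "db": "protein",    "translated": ()},
    "blastn":  {"query": "nucleotide", "db": "nucleotide", "translated": ()},
    "blastx":  {"query": "nucleotide", "db": "protein",    "translated": ("query",)},
    "tblastn": {"query": "protein",    "db": "nucleotide", "translated": ("db",)},
    "tblastx": {"query": "nucleotide", "db": "nucleotide", "translated": ("query", "db")},
}


def candidate_programs(query_kind, db_kind):
    """Return the programs that accept this query/database combination."""
    return [
        name
        for name, spec in PROGRAMS.items()
        if spec["query"] == query_kind and spec["db"] == db_kind
    ]


print(candidate_programs("nucleotide", "protein"))     # ['blastx']
print(candidate_programs("nucleotide", "nucleotide"))  # ['blastn', 'tblastx']
```

When two programs fit, as with a nucleotide query against a nucleotide database, the choice depends on whether you want to compare the sequences directly or as translated proteins.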

Here is an example of BLAST in use: the HIV BLAST search page of the LANL HIV Database.

HIV BLAST

Purpose: Find the HIV database sequences most similar to your query(s).

The search form takes the following:

  • Input: paste your sequence(s), upload a file, or enter accession number(s).
  • Output style.
  • Number of BLAST matches to display.
  • Run BLAST against: a database background, or a background set of sequences you upload.
  • E-mail: always email results.
  • Show location of match in genome: only matters for nucleotide input; uncheck to speed up the job.

Details: Our DNA database contains most of the same HIV sequences found in GenBank, but a BLAST search here gives more informative output. The results will contain some of the fields we annotate, such as subtype, sampling country and isolation year.

Input: One nucleotide or amino acid sequence, or a bulk set of sequences. A single sequence can be in FastA format or raw sequence.
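
To make the two accepted input shapes concrete, here is a minimal sketch of a reader that handles both FastA text and a single raw sequence. The function name `parse_sequences` and the fallback name "query" are my own choices for illustration; this is not the code the HIV BLAST page uses.

```python
def parse_sequences(text):
    """Parse FastA text, or one raw sequence, into (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header starts a new record; store the previous one first.
            if name is not None or chunks:
                records.append((name or "query", "".join(chunks)))
            header = line[1:].split()
            name = header[0] if header else "query"
            chunks = []
        else:
            # Sequence lines may be wrapped; drop inner spaces and join later.
            chunks.append(line.replace(" ", ""))
    if name is not None or chunks:
        records.append((name or "query", "".join(chunks)))
    return records


print(parse_sequences(">seq1 example\nACGT\nTTGA\n"))  # [('seq1', 'ACGTTTGA')]
print(parse_sequences("acgtacgt"))                     # [('query', 'acgtacgt')]
```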

Run BLAST against: The default BLAST background is all sequences in the LANL HIV Database. You can also search only the sequences with assigned subtypes, or sequences of one pure subtype. If you want to BLAST against your own submitted background set, browse for a file that contains those sequences.

Subsequent analyses: From the BLAST results page, you can: 

  • Download and align all or a selection of your output sequences,
  • Use the Geography search to examine the origin of your BLAST results,
  • Run NCBI BLAST

HIV BLAST Examples

Output

All BLAST results begin with a table of the best matches to your query sequence. Matches are excluded if the %Identity is <50% or if the length of the match is <20% of the length of the query sequence.
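
The exclusion rule above can be written as a small check. This is only a sketch of the rule as stated in the text; the function name `keep_match` and its parameters are hypothetical.

```python
def keep_match(percent_identity, match_length, query_length):
    """Return True if a match passes the two cut-offs described above."""
    if percent_identity < 50:
        return False  # excluded: identity below 50%
    if match_length < 0.20 * query_length:
        return False  # excluded: match shorter than 20% of the query
    return True


print(keep_match(100, 273, 273))  # True
print(keep_match(90, 50, 273))    # False, 50 is less than 20% of 273
```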

sample BLAST output

Output columns:

  • Download. Use these boxes to select sequences for download. Check/uncheck the top box to select all or unselect all.
  • Accession. This will link you to the accession record for each sequence indicated.
  • Name. The common name of the sequence in our database.
  • Subtype. The sequence subtype, if defined.
  • Country. The sampling country.
  • Year. The sampling year.
  • Description. The GenBank sequence description.
  • Score. The score is calculated by the BLAST algorithm. It takes into account both the length of alignment and percent of matching bases. Click score to jump to the alignment of this sequence, below (pairwise output only).

    Note that results are listed by score, and score is not always correlated with percent identity. For example, if you BLAST a full-length sequence, the top scores will be other full-length sequences; shorter sequences of higher identity will be missed.

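  • E value. The number of matches with a score this good or better that would be expected to occur by chance in a database of this size. The smaller the E value, the more significant the match.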
  • Identities. The number and percent identity between your query and this subject, across the longest continuous alignment.
  • Location of match. Shows the location and length of the subject sequence (yellow) and the matched region between query and subject (red).
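
The score and the E value are linked. A standard relation for BLAST bit scores is E = m × n × 2^(-S), where S is the bit score and m and n are the (effective) lengths of the query and of the database. The sketch below only illustrates that relation; the lengths used are made-up numbers, not those of the HIV database, so it will not reproduce the E values shown on the results page.

```python
def expected_hits(bit_score, query_length, database_length):
    """Expected number of chance matches scoring at least bit_score."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical lengths, chosen only for illustration.
query_length = 100
database_length = 1000

print(expected_hits(10, query_length, database_length))  # 97.65625
# Each extra bit of score halves the expected number of chance matches.
print(expected_hits(11, query_length, database_length))  # 48.828125
```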


Pairwise versus Master-Slave output

Following the list of best matches there appears an alignment of your query sequence to its matches. There are two different styles of alignment to choose from in the pop-up menu on the BLAST search submission page.


Pairwise:

In pairwise output, the query is matched against each single subject sequence and the identities are shown by the vertical bar ( | ) character.


Score = 541 bits (273), Expect = e-154
Identities = 273/273 (100%), Positives = 273/273 (100%)
Query: 1 gtaattagatccgccaatttcacagacaatactaaaatcataatagtacagctgaatgaa  60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1 gtaattagatccgccaatttcacagacaatactaaaatcataatagtacagctgaatgaa  60
 
Query: 61 tctgtacaaattaattgtacaagacccaacaacaatacaagaaaaagtataaatatagga 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61 tctgtacaaattaattgtacaagacccaacaacaatacaagaaaaagtataaatatagga 120
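
The line of vertical bars and the Identities figure can be reproduced from two aligned rows. This is a minimal sketch with hypothetical function names; it assumes the two rows are already aligned and of equal length.

```python
def match_line(query, subject):
    """Return '|' under identical positions and a space elsewhere."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if q == s and q != "-" else " " for q, s in zip(query, subject)
    )


def identities(query, subject):
    """Return (matches, alignment length, percent identity)."""
    if not query:
        raise ValueError("empty alignment")
    matches = match_line(query, subject).count("|")
    return matches, len(query), 100.0 * matches / len(query)


query = "gtaattagatcc"
subject = "gtaattagatcc"
print(query)
print(match_line(query, subject))
print(subject)
print(identities(query, subject))  # (12, 12, 100.0)
```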
 

Master-Slave with identities:

Query seq 1 gtaattagatccgccaatttcacagacaatactaaaatcataatagtacagctgaatgaa  60
Z29296    1 ............................................................  60
U95417  433 .............a.........g......g............................. 492
U95414  433 .....c.................g......g............................. 492
U95413  433 .............a.........g......g............................. 492
U95411  433 .............a.........g......g............................. 492
U95410  433 .............a.........g......g............................. 492
L21486   49 .............a.........g......g............................. 108
L21468   49 .............a.........g......g............................. 108
U95419  433 .............a.........g......g............................. 492
U95400  430 ............aat........g.............g............c......... 489
U95392  430 ............aat........g.............g............c......... 489
L21480   49 .............a.........g......g............................. 108
Z67943    4 ....................................g...................g...  63

Here the query is aligned against ALL sequences producing a BLAST match and the identities are shown by the dot character.
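
A row of this dot display can be built by comparing each subject to the query column by column. The sketch below assumes the rows are already aligned to the same length; `dot_row` is a name I made up for it.

```python
def dot_row(query, subject):
    """Show '.' where the subject equals the query, else the subject's base."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join("." if q == s else s for q, s in zip(query, subject))


query = "gtaattagatccgcc"
subjects = {"subjectA": "gtaattagatccgcc", "subjectB": "gtaatcagatccgac"}

print("Query seq", query)
for name, row in subjects.items():
    print(name.ljust(9), dot_row(query, row))
```

Only the differences stand out, which is why this style is easier to scan than the pairwise output when there are many matches.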

Occasionally you may see lines in the alignment that look like those below. 


QUERY    121  ccaggcagagcattttatacaacaggagaaataataggagatataagtcaagcacattgt 180
AF105870 121  ............................................................ 180
                                                   \                       
                                                   |                        
                                                   a                    

This means that an inserted "a" nucleotide occurs in sequence AF105870 at position 157. The sequence of AF105870 in the region of this insertion reads tagAgag, where the "A" marks the inserted "a". If you choose to download a file of all or part of this alignment, the insertions are handled as follows: the insertion is placed into its sequence and gaps are opened in all other sequences at that point. In the example above, the alignment in the region of the "a" insertion would look like:

QUERY    tag-gag 
AF105870 tagagag
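
The same handling can be sketched in a few lines: put the inserted base into its own sequence and open a gap in every other row at that column. The function name `place_insertion` and its arguments are hypothetical, and only the small region from the example is used.

```python
def place_insertion(alignment, name, after_column, inserted):
    """Insert bases into one row and open gaps in all other rows."""
    result = {}
    for row_name, row in alignment.items():
        fill = inserted if row_name == name else "-" * len(inserted)
        result[row_name] = row[:after_column] + fill + row[after_column:]
    return result


region = {"QUERY": "taggag", "AF105870": "taggag"}
for row_name, row in place_insertion(region, "AF105870", 3, "a").items():
    print(row_name.ljust(8), row)
# QUERY    tag-gag
# AF105870 tagagag
```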


Thank you. This is all for this post. Hope you guys understand and enjoy it.




Autodock

AutoDock is a suite of automated docking tools. It is designed to predict how small molecules, such as substrates or drug candidates, bind to a receptor of known 3D structure. Current distributions of AutoDock consist of two generations of software: AutoDock 4 and AutoDock Vina.

AutoDock 4 actually consists of two main programs:

  1.  autodock: performs the docking of the ligand to a set of grids describing the target protein
  2.  autogrid: pre-calculates these grids.
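
To show why the work is split into two programs, here is a toy sketch of the idea: one step pre-calculates an energy value at every grid point around the receptor, and the docking step then scores a ligand pose by looking values up instead of recomputing them. Everything in it is an assumption made for illustration: the energy formula, the spacing, the coordinates and the function names. It is not AutoDock's scoring function or file format, and the real programs are far more elaborate.

```python
import math

SPACING = 0.5  # grid spacing in angstroms (hypothetical value)


def toy_energy(point, receptor_atoms):
    """Toy 12-6 style term summed over receptor atoms (illustration only)."""
    total = 0.0
    for atom in receptor_atoms:
        r = max(math.dist(point, atom), 0.5)  # clamp to avoid dividing by ~0
        ratio = 3.5 / r
        total += ratio ** 12 - 2.0 * ratio ** 6
    return total


def build_grid(receptor_atoms, origin, points_per_axis, spacing=SPACING):
    """The 'autogrid' idea: pre-calculate the energy at every grid point."""
    grid = {}
    for i in range(points_per_axis):
        for j in range(points_per_axis):
            for k in range(points_per_axis):
                point = (
                    origin[0] + i * spacing,
                    origin[1] + j * spacing,
                    origin[2] + k * spacing,
                )
                grid[(i, j, k)] = toy_energy(point, receptor_atoms)
    return grid


def score_ligand(grid, origin, ligand_atoms, spacing=SPACING):
    """The 'autodock' idea: score a pose by nearest-grid-point lookup.

    Raises KeyError if a ligand atom lies outside the grid.
    """
    total = 0.0
    for atom in ligand_atoms:
        index = tuple(round((c - o) / spacing) for c, o in zip(atom, origin))
        total += grid[index]
    return total


receptor = [(0.0, 0.0, 0.0)]  # one made-up receptor atom
origin = (2.0, 0.0, 0.0)      # corner of the grid box
grid = build_grid(receptor, origin, points_per_axis=5)
print(score_ligand(grid, origin, [(3.5, 0.0, 0.0)]))  # -1.0
```

Because the grid is built once, many ligand poses can be scored quickly, which is the point of pre-calculating it.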




Basically, AutoDock helps us prepare a particular molecule and dock it. Along the way we can, for example, rotate the molecule and add hydrogens.

Meanwhile, AutoDock Vina is a new generation of docking software from the Molecular Graphics Lab. It achieves significant improvements in the average accuracy of the binding mode predictions. Here are some example screenshots of AutoDock Vina:





AutoDock has applications in:
  • X-ray crystallography
  • structure-based drug design
  • lead optimization
  • virtual screening (HTS)
  • combinatorial library design
  • protein-protein docking
  • chemical mechanism studies

AutoDock 4 is free and is available under the GNU General Public License. AutoDock Vina is available under the Apache license, allowing commercial and non-commercial use and redistribution.
       

Thursday, 11 June 2015

Arguslab

Many Computer-Aided Drug Design (CADD) programs are available nowadays. One of them is ArgusLab. ArgusLab was written by Mark Thompson, a scientist and scientific software developer in the field of theoretical and computational chemistry. It is a molecular modeling, graphics, and drug design program for computers that run the Windows operating system.

It works by using algorithms that model solvent effects by combining quantum mechanics with classical mechanics. These methods were used to simulate molecules that bind radionuclides from complex waste, and they are also useful for drug design.
This CADD program is currently used by scientists and students worldwide. ArgusLab is popular because of its many advantages. Although it is somewhat dated by now, it surprisingly remains one of the most popular CADD programs among users. There have been more than 20,000 downloads to date.

This software is very convenient since the user can download and use it without a license. However, the user is not allowed to redistribute it from other websites or sources. Specifically, this application helps us calculate docking affinity. Let us see how ArgusLab helps us in designing a drug!
For example, suppose we want to dock a benzamidine molecule (the inhibitor) into beta-trypsin. ArgusLab will help us calculate the docking affinity of the ligand in its binding site. First of all, we need to set up the ligand and the binding site. To do so, we use a structure from an online database, the Brookhaven Protein Data Bank (RCSB PDB). The PDB website looks as follows:

Inline image 1


Then go to the search box at the upper right, search for benzamidine, and open the structure file. Next, open the Protein Data Bank file, making sure the format is PDB.
Make sure the Molecule Tree View is visible and expand it fully. You should see something like this.

Inline image 5

The screen will then show the benzamidine molecule. Now we can add hydrogen atoms to the benzamidine molecule by clicking the ribbon button at the top.

Inline image 2


The silver parts are the hydrogen atoms that have been added to the molecule.
Now we are going to make a copy of this molecule. There will be two groups: the ligand and its binding site.
Inline image 3
This will appear on your screen. Now it is time to dock the ligand into the binding site! A table will appear upon clicking. We must make sure that the binding site bounding box is filled in correctly. Then set the required grid resolution. Once everything is complete, click the Start button and the program will calculate the binding affinity between the ligand and its binding site for you. The value will be shown as in the image below.
Inline image 4
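
To give a feel for what the bounding box and grid resolution mean, the sketch below computes a box around a set of atom coordinates and counts how many grid points a given resolution produces along each axis. The coordinates, the padding and the function names are all made up for illustration; ArgusLab fills in its own values in the docking settings, and this is not its code.

```python
import math


def bounding_box(atoms, padding=0.0):
    """Return the (low, high) corners of a box enclosing all atoms."""
    xs, ys, zs = zip(*atoms)
    low = (min(xs) - padding, min(ys) - padding, min(zs) - padding)
    high = (max(xs) + padding, max(ys) + padding, max(zs) + padding)
    return low, high


def grid_points(low, high, resolution):
    """Number of grid points along each axis, counting both ends."""
    return tuple(
        math.ceil((h - l) / resolution) + 1 for l, h in zip(low, high)
    )


# Made-up coordinates in angstroms, standing in for binding site atoms.
site_atoms = [(0.0, 0.0, 0.0), (2.0, 1.0, 0.5)]
low, high = bounding_box(site_atoms, padding=1.0)
print(low, high)                     # (-1.0, -1.0, -1.0) (3.0, 2.0, 1.5)
print(grid_points(low, high, 0.5))   # (9, 7, 6)
```

A finer resolution means more grid points, so the calculation takes longer but samples the binding site more closely.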

So, that is how ArgusLab helps us in drug design. Scientists and students worldwide use this type of CADD program to develop new drugs. It is one of the ways they choose the most promising lead compound. Clearly, this application is very effective and efficient.

Bioinformatics General Conclusion

CONCLUSION


The overall conclusion from what we have learned in the bioinformatics lectures and lab sessions is how advanced the field is now compared with a few years ago. Some of the topics covered in bioinformatics are:
  • BLAST software.
  • ArgusLab.
  • AutoDock.
What bioinformatics has taught us is how developments in information technology have combined to produce large amounts of information related to molecular biology. Let us take BLAST as the first example. BLAST is an algorithm for comparing primary biological sequence information. It helps researchers compare sequences with far less effort and saves a lot of time.


Second, ArgusLab is a molecular modeling, graphics, and drug design program. It can help us build a model of a simple or complex molecule for drug design, which is something we must know how to do. It can also help us produce molecular graphics, which is awesome.


Last but not least is AutoDock, which is molecular modeling simulation software. It is especially effective for protein-ligand docking, which is interesting because we can open up the whole structure of a protein and reconstruct it into the favourable protein that we want.

So, there are a lot of advantages that we get from the bioinformatics we have learned, and it helps us a lot in understanding drug design more effectively.