
What is Bash Scripting?




      After a DNA sample is sent to a sequencing company, the company returns "short reads": short sequences drawn from the genome. Each short read comes with quality scores that describe the certainty of the nucleotide call at each position, and the reads are then analyzed with bioinformatics tools. Many of the commands used to analyze short reads take a long time to run, and it would be inefficient to wait for one command to finish before typing the next. A bash script solves this by collecting the commands into a single file that runs them in order, so an alignment pipeline can run from start to finish without supervision.

      Another useful feature of bash scripting is that tools written in different languages, such as Perl, Java, and Python, can be used within a single script. A file created by a tool written in one language can be modified by the next tool without the programmer being present. This matters because many tools take anywhere from seconds to days to complete, depending on the size of the input file. For example, to begin analyzing short reads, Trimmomatic is used to determine which reads can be considered “good reads”. The parameters that define a good read, such as a minimum quality score over a length of the read, are passed to the program, and its output is used by downstream tools such as SAMtools (Sequence Alignment/Map tools).
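
      As a rough illustration of how the "good read" parameters might be handed to Trimmomatic from a script, here is a minimal sketch. The jar location, file names, and threshold values are assumptions chosen for illustration and are not taken from the original script. SLIDINGWINDOW and MINLEN are standard Trimmomatic steps; the command is only printed here, so the sketch runs even where Trimmomatic is not installed.

```bash
#!/usr/bin/env bash
# Assumed values: adjust these for your own installation and data.
trimmomatic_jar="trimmomatic-0.39.jar"   # hypothetical path to the Trimmomatic jar
input_reads="ecoli.fastq"                # hypothetical raw single-end reads
trimmed_reads="ecoli_trimmed.fastq"      # hypothetical output file

# "Good read" parameters (illustrative thresholds):
window="4:20"    # SLIDINGWINDOW:<window size>:<required average quality>
min_len="36"     # MINLEN: drop reads shorter than this after trimming

# Store the single-end Trimmomatic call in an array so that each
# argument stays intact, even if a file name contains spaces.
trim_cmd=(java -jar "$trimmomatic_jar" SE -phred33
          "$input_reads" "$trimmed_reads"
          "SLIDINGWINDOW:${window}" "MINLEN:${min_len}")

# Print the command instead of running it.
echo "Would run: ${trim_cmd[*]}"

# To actually run the tool, uncomment the next line:
# "${trim_cmd[@]}"
```

      The file written by this step would then be passed to the next tool in the script, whatever language that tool happens to be written in.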
      Variables make scripts more efficient, since many experiments require reads from numerous individuals in a population. Without variables, the file name would have to be changed on every line of the code for each sample, which is time consuming and unnecessary. With a variable, the user enters a file name once, such as ecoli in Figure 1, and the script can refer to it as many times as needed as ${File1}. The program could be improved further with another script that switches the file name to the next sample automatically once the previous sample has finished, so the user can start the analysis and check periodically for errors while it runs.
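
      One way to sketch the idea of moving on to the next sample automatically is a loop over sample names. The variable name File1 and the sample name ecoli come from the text above; the other sample names and the file-name patterns are hypothetical.

```bash
#!/usr/bin/env bash
# Loop over sample names; "sample2" and "sample3" are hypothetical.
for File1 in ecoli sample2 sample3; do
    # Every file name is derived from the one variable, so nothing
    # else in the script has to change from sample to sample.
    raw_reads="${File1}.fastq"
    trimmed_reads="${File1}_trimmed.fastq"

    echo "Sample ${File1}: ${raw_reads} -> ${trimmed_reads}"

    # The analysis commands for this sample would go here,
    # each one referring to ${File1} rather than a fixed name.
done
```

      Because the loop body only ever refers to ${File1}, adding a sample means adding one name to the list.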
      Another useful feature of this program is its use of the echo command to print each command to the screen as it runs. While it may seem like an extra command, echo makes troubleshooting much more efficient. Without echo, a user would have to check the entire script to find a single error in one command; with echo, the command where the error occurred is printed to the screen, so the user can identify which tool failed.
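
      The echo pattern can be wrapped in a small helper so that every step is announced before it runs. This is a minimal sketch, not the original script: the helper name run is hypothetical, and printf stands in for the real tools.

```bash
#!/usr/bin/env bash
# run: print a command to the screen, then execute it.
# If the command fails, report which one failed and return an error.
run() {
    echo "Running: $*"
    "$@" || { echo "FAILED: $*" >&2; return 1; }
}

# Stand-in steps; in a real pipeline these would be tool invocations.
run printf 'trimming finished\n'
run printf 'alignment finished\n'
```

      Bash also has a built-in option, set -x, that prints each command as it is executed; explicit echo lines give more control over what the messages say.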

      An example of a Bash script that uses the echo command and variables is shown below:


