Dictionaries

Understanding Dictionaries in Python -- A Deep Dive

What are Python dictionaries?

In Python, a dictionary is one of the most versatile data structures. It stores data as key-value pairs and lets us retrieve a value efficiently by its key. This makes it useful for tasks ranging from general data manipulation to biological computations, such as RNA sequence translation. Let’s work through some practical examples to see how dictionaries work and how they can be applied in bioinformatics.

Basic Structure of a Dictionary

A Python dictionary is created using curly braces {} with key-value pairs. The key is a unique identifier, and the value is the data associated with that key. Here’s an example:

info = {"FirstName" : "Ali", "LastName" : "Hassan", "Age" : 25}

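In this dictionary, "FirstName", "LastName", and "Age" are the keys, and "Ali", "Hassan", and 25 are the values associated with them.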

You can access the value associated with a key like this:

print(info["FirstName"])  # Output: Ali
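Looking up a key that is not in the dictionary with square brackets raises a KeyError. The sketch below reuses the info dictionary to show the get() method as a safer lookup, and how assignment adds or updates an entry. The "Email" and "City" keys and the value "Cairo" are hypothetical, added only for illustration.

```python
info = {"FirstName": "Ali", "LastName": "Hassan", "Age": 25}

# get() returns None (or a default you supply) instead of raising KeyError
email = info.get("Email")           # "Email" is a hypothetical, missing key
city = info.get("City", "unknown")  # "City" is also hypothetical
print(email)  # Output: None
print(city)   # Output: unknown

# Assigning to a key overwrites an existing value or adds a new pair
info["Age"] = 26
info["City"] = "Cairo"  # hypothetical value, for illustration only
print(info["Age"])      # Output: 26
print("City" in info)   # Output: True
```

The in operator checks whether a key exists, which is handy before a square-bracket lookup.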

Iterating Over a Dictionary

You can easily iterate over a dictionary to get keys and values:

for key, value in info.items():
    print(f"{key}: {value}")

This will output:

FirstName: Ali
LastName: Hassan
Age: 25
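Besides items(), a dictionary also offers keys() and values(). A small sketch with the same info dictionary follows; since Python 3.7, dictionaries keep their insertion order, so the output order matches the order in which the pairs were written.

```python
info = {"FirstName": "Ali", "LastName": "Hassan", "Age": 25}

# keys() and values() give views; list() turns them into plain lists
keys = list(info.keys())
values = list(info.values())
print(keys)    # Output: ['FirstName', 'LastName', 'Age']
print(values)  # Output: ['Ali', 'Hassan', 25]

# Iterating over the dictionary itself yields its keys
for key in info:
    print(key, "->", info[key])
```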

Using Dictionaries in Bioinformatics

Dictionaries are especially useful in bioinformatics for mapping biological data. For instance, the RNA codon table is a perfect example where dictionaries come into play.

The RNA codon table maps nucleotide triplets (codons) to specific amino acids. Let’s take a look at how a dictionary might represent this table:

RNA_codon_table = {
# Second Base
# U C A G
# U
 'UUU': 'Phe', 'UCU': 'Ser', 'UAU': 'Tyr', 'UGU': 'Cys', # UxU
 'UUC': 'Phe', 'UCC': 'Ser', 'UAC': 'Tyr', 'UGC': 'Cys', # UxC
 'UUA': 'Leu', 'UCA': 'Ser', 'UAA': '---', 'UGA': '---', # UxA
 'UUG': 'Leu', 'UCG': 'Ser', 'UAG': '---', 'UGG': 'Trp', # UxG
# C
 'CUU': 'Leu', 'CCU': 'Pro', 'CAU': 'His', 'CGU': 'Arg', # CxU
 'CUC': 'Leu', 'CCC': 'Pro', 'CAC': 'His', 'CGC': 'Arg', # CxC
 'CUA': 'Leu', 'CCA': 'Pro', 'CAA': 'Gln', 'CGA': 'Arg', # CxA
 'CUG': 'Leu', 'CCG': 'Pro', 'CAG': 'Gln', 'CGG': 'Arg', # CxG
# A
 'AUU': 'Ile', 'ACU': 'Thr', 'AAU': 'Asn', 'AGU': 'Ser', # AxU
 'AUC': 'Ile', 'ACC': 'Thr', 'AAC': 'Asn', 'AGC': 'Ser', # AxC
 'AUA': 'Ile', 'ACA': 'Thr', 'AAA': 'Lys', 'AGA': 'Arg', # AxA
 'AUG': 'Met', 'ACG': 'Thr', 'AAG': 'Lys', 'AGG': 'Arg', # AxG
# G
 'GUU': 'Val', 'GCU': 'Ala', 'GAU': 'Asp', 'GGU': 'Gly', # GxU
 'GUC': 'Val', 'GCC': 'Ala', 'GAC': 'Asp', 'GGC': 'Gly', # GxC
 'GUA': 'Val', 'GCA': 'Ala', 'GAA': 'Glu', 'GGA': 'Gly', # GxA
 'GUG': 'Val', 'GCG': 'Ala', 'GAG': 'Glu', 'GGG': 'Gly' # GxG
}

With this table, we can translate a given RNA codon (a sequence of three nucleotides) into its corresponding amino acid. For example:

print(RNA_codon_table['AAA'])  # Output: Lys
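Because several codons can encode the same amino acid, it is sometimes useful to invert the lookup and ask which codons map to a given amino acid. The following is a minimal sketch, assuming a small subset of the table (the name codon_subset and the result dictionary codons_by_amino_acid are hypothetical, introduced here only to keep the example short).

```python
# A small, hypothetical subset of the codon table, for illustration only
codon_subset = {
    'UUU': 'Phe', 'UUC': 'Phe',
    'AAA': 'Lys', 'AAG': 'Lys',
    'AUG': 'Met',
}

# Build amino acid -> list of codons by grouping the original pairs
codons_by_amino_acid = {}
for codon, amino_acid in codon_subset.items():
    # setdefault() creates an empty list the first time a key is seen
    codons_by_amino_acid.setdefault(amino_acid, []).append(codon)

print(codons_by_amino_acid['Lys'])  # Output: ['AAA', 'AAG']
print(codons_by_amino_acid['Met'])  # Output: ['AUG']
```

The same loop should work unchanged on the full table, since it only relies on items().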

Translating RNA Codons into Amino Acids

We can write a simple function to translate RNA codons into amino acids using our dictionary:

def translate_RNA_codon(codon):
    """Returns the amino acid for the given codon."""
    return RNA_codon_table[codon]

print(translate_RNA_codon("GGG"))  # Output: Gly

Here, the function translate_RNA_codon() takes a codon as input and returns the corresponding amino acid by looking it up in the RNA_codon_table.
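The single-codon lookup extends naturally to a whole RNA sequence. The sketch below is one possible approach, not the only one: it reads the sequence three bases at a time and stops at the first stop codon, which the table marks as '---'. The function name translate_rna_sequence, the small codon_subset table, and the choice to ignore a trailing partial codon and raise ValueError on an unknown codon are all assumptions made for this example.

```python
# A small, hypothetical subset of the codon table, for illustration only
codon_subset = {
    'AUG': 'Met', 'GGG': 'Gly', 'AAA': 'Lys',
    'UUU': 'Phe', 'UAA': '---',
}

def translate_rna_sequence(rna, table):
    """Translate rna codon by codon, stopping at the first stop codon."""
    amino_acids = []
    # Step through in chunks of 3; a trailing partial codon is ignored
    for i in range(0, len(rna) - len(rna) % 3, 3):
        codon = rna[i:i + 3]
        amino_acid = table.get(codon)
        if amino_acid is None:
            raise ValueError(f"Unknown codon: {codon!r}")
        if amino_acid == '---':  # stop codon: end of translation
            break
        amino_acids.append(amino_acid)
    return amino_acids

print(translate_rna_sequence("AUGGGGAAAUUUUAAGGG", codon_subset))
# Output: ['Met', 'Gly', 'Lys', 'Phe']
```

Using get() instead of square brackets lets the function report an unknown codon with a clearer error than a bare KeyError.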

Summary

Python dictionaries are a powerful tool for storing and manipulating key-value pairs. In bioinformatics, they can be used to map complex biological data, like RNA codons to amino acids. Understanding how to use and manipulate dictionaries effectively can enhance your ability to work with various biological datasets.

Whether you’re working with simple personal data or complex biological sequences, Python dictionaries offer an efficient and intuitive way to store and retrieve information.

Join us in the next post, Reading and Writing Files in Python, where we’ll learn how to handle files - from reading DNA sequences from FASTA files to writing analysis results. We’ll explore essential file operations that are crucial for processing biological data in your bioinformatics workflows.
