Theory of Intelligent Design, the best explanation of Origins

This is my personal virtual library, where I collect information that, in my view, points to Intelligent Design as the best explanation of the origin of the physical Universe, life, and biodiversity.



The various codes in the cell


1 The various codes in the cell on Thu Oct 22, 2015 12:07 pm

Admin


The various codes in the cell

http://reasonandscience.heavenforum.org/t2213-the-various-codes-in-the-cell

Another outstanding implication of the existence of organic codes in Nature comes from the fact that any code involves meaning and we need therefore to introduce in biology, with the standard methods of science, not only the concept of biological information but also that of biological meaning. The study on the organic codes, in conclusion, is bringing to light new mechanisms that operated in the history of life and new fundamental concepts. It is an entirely new field of research, the exploration of a vast and still largely unexplored dimension of the living world, the real new frontier of biology.

The Genetic Code
The Splicing Codes
The Metabolic Code
The Signal Transduction Codes
The Signal Integration Codes
The Histone Code
The Tubulin Code
The Sugar Code
The Glycomic Code
The non-ribosomal code
The Calcium Code
The RNA code


CONTROL OF TRANSCRIPTION BY SEQUENCE-SPECIFIC DNA-BINDING PROTEINS
http://www.garlandscience.com/res/pdf/9780815341291_ch08.pdf

The transcription factor code: defining the role of a developmental transcription factor in the adult brain.
For the human brain to develop and function correctly, each of its 100 billion neurons must follow a specific and pre-programmed code of gene expression. This code is driven by key transcription factors that regulate the expression of numerous proteins, moulding the neuron's identity to create its unique shape and electrical behaviour.
https://www.findaphd.com/search/projectdetails.aspx?PJID=41943

Unraveling a novel transcription factor code determining the human arterial-specific endothelial cell signature
Our pioneering profiling study on freshly isolated ECs unveiled a combinatorial transcriptional code that induced an arterial fingerprint more proficiently than the current gold standard, HEY2, and this code conveyed an in vivo arterial-like behavior upon venous ECs.
http://www.bloodjournal.org/content/122/24/3982?sso-checked=true

The transcriptional regulatory code of eukaryotic cells--insights from genome-wide analysis of chromatin organization and transcription factor binding.
The term 'transcriptional regulatory code' has been used to describe the interplay of these events in the complex control of transcription. With the maturation of methods for detecting in vivo protein-DNA interactions on a genome-wide scale, detailed maps of chromatin features and transcription factor localization over entire genomes of eukaryotic cells are enriching our understanding of the properties and nature of this transcriptional regulatory code.
http://www.ncbi.nlm.nih.gov/pubmed/16647254

The Splicing code
Origin and evolution of spliceosomal introns
http://biologydirect.biomedcentral.com/articles/10.1186/1745-6150-7-11


The RNA-binding protein binding code
A compendium of RNA-binding motifs for decoding gene regulation
http://www.nature.com/nature/journal/v499/n7457/full/nature12311.html


microRNA binding code
The code within the code: microRNAs target coding regions
http://www.ncbi.nlm.nih.gov/pubmed/20372064

The Glycan or Sugar Code
Biological information transfer beyond the genetic code: the sugar code
http://www.ncbi.nlm.nih.gov/pubmed/10798195

The non-ribosomal code
A allowed the identification of amino acid residues that play a decisive role in the coordination of the substrate and have led to the concept of the so-called nonribosomal code, which allows the prediction of A-domain selectivity on the basis of its primary sequence
https://www.duo.uio.no/bitstream/handle/10852/11331/DUO_668_ToomingKlunderud_17x24.pdf?sequence=1

CONTROL OF TRANSCRIPTION BY SEQUENCE-SPECIFIC DNA-BINDING PROTEINS 1

Coded information can always be traced back to an intelligence, which has to set up the convention of meaning of the code and the information carrier, which can be a book, the hardware of a computer, or the smoke signals one tribe sends to another. All communication systems have an encoder, which produces a message, and a decoder, which processes it. The cell contains several code systems. DNA is the best known: it stores coded information in the sequence of its four nucleotide bases. But there are several others that are less well known. Recently there was some hype about a second DNA code; in fact, it is essential for the expression of genes. The cell uses several formal communication systems in the sense of Shannon's model, because they encode and decode messages using a system of symbols. As Hubert Yockey wrote:

“Information, transcription, translation, code, redundancy, synonymous, messenger, editing, and proofreading are all appropriate terms in biology. They take their meaning from information theory (Shannon, 1948) and are not synonyms, metaphors, or analogies.” (Hubert P. Yockey,  Information Theory, Evolution, and the Origin of Life,  Cambridge University Press, 2005).

An organism's DNA encodes all of the RNA and protein molecules required to construct its cells. Yet a complete description of the DNA sequence of an organism (be it the few million nucleotides of a bacterium or the few billion nucleotides of a human) no more enables us to reconstruct the organism than a list of English words enables us to reconstruct a play by Shakespeare. In both cases, the problem is to know how the elements in the DNA sequence or the words on the list are used. Under what conditions is each gene product made, and, once made, what does it do? The different cell types in a multicellular organism differ dramatically in both structure and function. If we compare a mammalian neuron with a liver cell, for example, the differences are so extreme that it is difficult to imagine that the two cells contain the same genome. The genome of an organism contains the instructions to make all of its different cells, and gene expression in a neuron or a liver cell can be regulated at many of the steps in the pathway from DNA to RNA to protein.

The most important of these steps, in my view, is the control of transcription by sequence-specific DNA-binding proteins, called transcription factors or transcription regulators. These proteins recognize specific sequences of DNA (typically 5–10 nucleotide pairs in length) that are often called cis-regulatory sequences. Transcription regulators bind to these sequences, which are dispersed throughout genomes, and this binding puts into motion a series of reactions that ultimately specify which genes are to be transcribed and at what rate. Approximately 10% of the protein-coding genes of most organisms are devoted to transcription regulators. Transcription regulators must recognize short, specific cis-regulatory sequences within the DNA double helix.
The outside of the double helix is studded with DNA sequence information that transcription regulators recognize: the edge of each base pair presents a distinctive pattern of hydrogen-bond donors, hydrogen-bond acceptors, and hydrophobic patches in both the major and minor grooves. The 20 or so contacts that are typically formed at the protein–DNA interface add together to ensure that the interaction is both highly specific and very strong.
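As a rough illustration of what it means to recognize a short cis-regulatory sequence, the sketch below scans both strands of a DNA string for matches to a consensus written with IUPAC ambiguity codes. The motif `TGACGY` and the test sequence are hypothetical and chosen only for the example. Real transcription factor binding is usually modelled with position weight matrices and also depends on chromatin context, so this is a simplification, not a description of how the cell does it.

```python
# Map IUPAC nucleotide codes to the set of bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def matches(motif, window):
    """True if every base of the window is allowed by the motif code."""
    return all(base in IUPAC[code] for code, base in zip(motif, window))


def find_sites(motif, seq):
    """Return (position, strand) for each motif match on either strand.

    Positions are 0-based offsets of the window on the forward strand.
    """
    k = len(motif)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if matches(motif, window):
            hits.append((i, "+"))
        # A site on the opposite strand reads as the reverse complement.
        if matches(motif, reverse_complement(window)):
            hits.append((i, "-"))
    return hits


# Hypothetical motif and sequence, for illustration only.
print(find_sites("TGACGY", "AATGACGCTTACGTCAGG"))
```

The scan reports one site on each strand here, which reflects the fact that a regulator can bind a site in either orientation.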


These instructions are written in a language that is often called the 'gene regulatory code'. The preference for a given nucleotide at a specific position is mainly determined by physical interactions between the amino acid side chains of the transcription factor (TF) and the accessible edges of the base pairs that are contacted. It is possible that some complex code, comprising rules from each of the different layers, contributes to TF-DNA binding; however, determining the precise rules of TF binding to the genome will require further research. Genomes thus contain both a genetic code specifying amino acids and a parallel regulatory code specifying TF recognition sequences. The two codes have been assumed to operate independently of one another and to be segregated physically into the coding and non-coding genomic compartments, although the potential for some coding exons to accommodate transcriptional enhancers or splicing signals has long been recognized. Stergachis et al. report that ~15% of human codons are dual-use codons ('duons') that simultaneously specify both amino acids and TF recognition sites.

For communication to happen, two things are required: 1. the sequence of DNA bases located in the regulatory region of the gene, and 2. transcription factors that read the code. If either is missing, communication fails: the gene that has to be expressed cannot be found, and the whole procedure of gene expression fails. This is an irreducibly complex system. The gene regulatory code could not have arisen in a stepwise manner either, since the code has the right significance only when fully developed. That is an example par excellence of intelligent design. The fact that these transcription factor binding sequences overlap protein-coding sequences suggests that both were designed together in order to optimize the efficiency of the DNA code. As we learn more and more about DNA structure and function, it is apparent that the code was not just cobbled together by the trial-and-error method of natural selection, but that it was specifically designed to provide optimal efficiency and function.


Stephen Meyer puts it this way in his book Darwin's Doubt, p. 270:

INTEGRATED CIRCUITRY: DEVELOPMENTAL GENE REGULATORY NETWORKS 

Keep in mind, too, that animal forms have more than just genetic information. They also need tightly integrated networks of genes, proteins, and other molecules to regulate their development; in other words, they require developmental gene regulatory networks (dGRNs). Developing animals face two main challenges. First, they must produce different types of proteins and cells and, second, they must get those proteins and cells to the right place at the right time. Davidson has shown that embryos accomplish this task by relying on networks of regulatory DNA-binding proteins (called transcription factors) and their physical targets. These physical targets are typically sections of DNA (genes) that produce other proteins or RNA molecules, which in turn regulate the expression of still other genes.

These interdependent networks of genes and gene products present a striking appearance of design. Davidson's graphical depictions of these dGRNs look for all the world like wiring diagrams in an electrical engineering blueprint or a schematic of an integrated circuit, an uncanny resemblance Davidson himself has often noted. "What emerges, from the analysis of animal dGRNs," he muses, "is almost astounding: a network of logic interactions programmed into the DNA sequence that amounts  essentially to a hardwired biological computational device." These molecules collectively form a tightly integrated network of signaling molecules that function as an integrated circuit. Integrated circuits in electronics are systems of individually functional components such as transistors, resistors, and capacitors that are connected together to perform an overarching function. Likewise, the functional components of dGRNs—the DNA-binding proteins, their DNA target sequences, and the other molecules that the binding proteins and target molecules produce and regulate—also form an integrated circuit, one that contributes to accomplishing the overall function of producing an adult animal form. 

Davidson himself has made clear that the tight functional constraints under which these systems of molecules (the dGRNs) operate preclude their gradual alteration by the mutation and selection mechanism. For this reason, neo-Darwinism has failed to explain the origin of these systems of molecules and their functional integration. Like advocates of evolutionary developmental biology, Davidson himself favors a model of evolutionary change that envisions mutations generating large-scale developmental effects, thus perhaps bypassing nonfunctional intermediate circuits or systems. Nevertheless, neither proponents of "evo-devo," nor proponents of other recently proposed materialistic theories of evolution, have identified a mutational mechanism capable of generating a dGRN or anything even remotely resembling a complex integrated circuit. Yet, in our experience, complex integrated circuits—and the functional integration of parts in complex systems generally—are known to be produced by intelligent agents—specifically, by engineers. Moreover, intelligence is the only known cause of such effects. Since developing animals employ a form of integrated circuitry, and certainly one manifesting a tightly and functionally integrated system of parts and subsystems, and since intelligence is the only known cause of these features, the necessary presence of these features in developing Cambrian animals would seem to indicate that intelligent agency played a role in their origin 

The Calcium Code
http://stke.sciencemag.org/content/2001/89/tw4
Steady-state stomatal closure could be restored if calcium oscillations similar to wild type were imposed; thus, the cells have an intact downstream signaling pathway, but cannot initiate the proper calcium oscillation code to trigger the pathway.

Defined changes of cytosolic Ca2+ concentration are triggered by cellular second messengers, such as NAADP, IP3, IP6, sphingosine-1-phosphate, and cADPR, and it is evident that the identity and intensity of a specific stimulus impulse result in stimulus-specific and dynamic alterations of cytosolic Ca2+ concentration. This heterogeneity of increases in cytosolic free Ca2+ concentration in terms of duration, amplitude, frequency, and spatial distribution led A.M. Hetherington and coworkers to formulate the concept of “Ca2+ signatures”. Signal information would be encoded by a specific Ca2+ signature that is defined by precise control of spatial, temporal, and concentration parameters of alterations in cytosolic Ca2+ concentration.

The RNA code

http://www.nature.com/nature/journal/v542/n7642/pdf/542503a.pdf

In 2004, oncologist Gideon Rechavi at Tel Aviv University in Israel and his colleagues compared all the human genomic DNA sequences then available with their corresponding messenger RNAs — the molecules that carry the information needed to make a protein from a gene.

They were looking for signs that one of the nucleotide building blocks in the RNA sequence, called adenosine (A), had changed to another building block called inosine (I). This 'A-to-I editing' can alter a protein's coding sequence, and, in humans, is crucial for keeping the innate immune response in check. “It sounds simple, but in real life it was really complicated,” Rechavi recalls. “Several groups had tried it before and failed” because sequencing mistakes and single-nucleotide mutations had made the data noisy. But using a new bioinformatics approach, his team uncovered thousands of sites in the transcriptome (the complete set of mRNAs found in an organism or cell population), and later studies upped the number into the millions.

Inosine is something of a special case: researchers can readily detect this chink in the armour by comparing DNA and RNA sequences. But at least one-quarter of our mRNAs harbour chemical tags — decorations to the A, C, G and U nucleotides — that are invisible to today's sequencing technologies. (Similar chemical tags, called epigenetic markers, are also found on DNA.) Researchers aren't sure what these chemical changes in RNA do, but they're trying to find out.
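The idea of spotting A-to-I editing by comparing DNA with RNA can be sketched in a few lines. Inosine is read as guanosine (G) when RNA is reverse-transcribed and sequenced, so an edited site shows up as an A in the genome paired with a G in the cDNA. The function name and the two short sequences below are hypothetical. As the passage notes, sequencing errors and single-nucleotide variants produce the same signal, so real pipelines add extensive filtering that is not shown here.

```python
def candidate_a_to_i_sites(dna, cdna):
    """Return 0-based positions where the genome has A and the cDNA has G.

    Both sequences are assumed to be already aligned, on the same strand,
    and of equal length. These are only candidates: SNPs and sequencing
    errors look identical at this level.
    """
    if len(dna) != len(cdna):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        i
        for i, (genomic, transcript) in enumerate(zip(dna.upper(), cdna.upper()))
        if genomic == "A" and transcript == "G"
    ]


# Hypothetical aligned genomic and cDNA fragments.
genome_fragment = "ATGAAACCAGT"
cdna_fragment = "ATGAGACCGGT"
print(candidate_a_to_i_sites(genome_fragment, cdna_fragment))
```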

A wave of studies over the past five years — many of which focus on a specific RNA mark called N6-methyladenosine (m6A) — have mapped these alterations across transcriptomes and demonstrated their importance to health and disease. But the problem is vast: these marks coat not only mRNA but other RNA transcripts as well, and they cut across all the domains of life and beyond, marking even viruses with their presence.

The modifications themselves are not new. What has given them meaning and driven epitranscriptomics into the spotlight is the discovery of enzymes that can add, remove and interpret them. In 2010, chemical biologist Chuan He at the University of Chicago, Illinois, proposed that these chemical tags could be reversible and important regulators of gene expression. Not long afterwards, his group demonstrated the first eraser of these marks on mRNA, an enzyme called FTO. That discovery meant that m6A wasn't just a passive mark: cells actively controlled it. And this realization came at about the same time that global approaches, harnessing the power of next-generation sequencing, made it possible to map m6A and other modifications across the transcriptome.

1) http://www.garlandscience.com/res/pdf/9780815341291_ch08.pdf



Last edited by Admin on Mon Apr 10, 2017 12:29 pm; edited 13 times in total


2 The Hidden Codes That Shape Protein Evolution on Tue Nov 24, 2015 12:57 pm

Admin


The Hidden Codes That Shape Protein Evolution 1

Despite redundancy in the genetic code (1), the choice of codons used is highly biased in some proteins, suggesting that additional constraints operate in certain protein-coding regions of the genome. This suggests that the preference for particular codons, and therefore amino acids in specific regions of the protein, is often determined by factors unrelated to protein structure or function (2, 3).  Stergachis et al. (4) reveal that transcription factors bind within protein-coding regions (in addition to nearby noncoding regions) in a large number of human genes. Thus, a transcription factor “binding code” may influence codon choice and, consequently, protein evolution. This “binding” code joins other “regulatory” codes that govern chromatin organization (3), enhancers (5, 6), mRNA structure (7), mRNA splicing (3), microRNA target sites (6, 8), translational efficiency (9), and cotranslational folding (10), all of which have been proposed to constrain codon choice, and thus protein evolution (see the figure).




Figure. Constraining codes. Regulatory elements within protein-coding regions (such as transcription factor binding) can influence codon choice and amino acid preference that are independent of protein structure or function. Redundancy in the genetic code might facilitate the existence of multiple overlapping regulatory codes within protein-coding regions of the genome.

How widespread is the phenomenon of “regulatory” codes that overlap the genetic code, and how do they constrain the evolution of protein sequences? Stergachis et al. address these questions for the transcription factor–binding regulatory code. They use deoxyribonuclease I (DNase I) footprinting to map transcription factor occupancy (a protein bound to DNA can protect that region from enzymatic cleavage) at nucleotide resolution across the human genome in 81 diverse cell types. The authors determined that ~14% of the codons within 86.9% of human genes are occupied by transcription factors. Such regions, called “duons,” therefore encode two types of information: one that is interpreted by the genetic code to make proteins and the other, by the transcription factor–binding regulatory code to influence gene expression. This requirement for transcription factors to bind within protein-coding regions of the genome has led to a considerable bias in codon usage and choice of amino acids, in a manner that is constrained by the binding motif of each transcription factor.
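To make the notion of a "duon" concrete, here is a minimal sketch that counts how many codons of a coding sequence overlap at least one footprint interval. It assumes a single uninterrupted coding exon and half-open `(start, end)` coordinates. The function name and the coordinates are made up for illustration and are not taken from Stergachis et al.

```python
def codons_in_footprints(cds_start, cds_length, footprints):
    """Count codons that overlap at least one footprint.

    cds_start  -- 0-based coordinate of the first coding base
    cds_length -- coding length in bases (trailing partial codons ignored)
    footprints -- iterable of half-open (start, end) intervals
    Returns (occupied_codons, total_codons).
    """
    footprints = list(footprints)
    total = cds_length // 3
    occupied = 0
    for index in range(total):
        codon_start = cds_start + 3 * index
        codon_end = codon_start + 3
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < codon_end and codon_start < end for start, end in footprints):
            occupied += 1
    return occupied, total


# Hypothetical exon of 10 codons with two footprints.
occupied, total = codons_in_footprints(100, 30, [(104, 110), (125, 140)])
print(f"{occupied} of {total} codons overlap a footprint")
```

A codon is counted here even if only one of its three bases lies inside a footprint. Counting bases instead of codons, as the per-base figures in the original study do, would give a somewhat different fraction.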


To investigate whether single-nucleotide variants within duons affect transcription factor binding, Stergachis et al. mapped the known variants that are associated with a disease or a trait onto duons. Of those, 17.4% quantitatively skew the allelic origins of DNA fragments protected from cleavage by DNase I in human cells, suggesting that such single-nucleotide variants affect transcription factor occupancy. They also determined that such variants are not biased toward whether they result in synonymous or nonsynonymous changes in the protein sequence. Intriguingly, a large fraction of the variants that result in a nonsynonymous change are predicted not to alter protein function. This indicates that some variants within duons might primarily affect transcription factor binding instead. This supports the emerging idea that single-nucleotide variants within protein-coding regions can lead to disease without affecting protein structure or function (11, 12). Thus, the whole spectrum of “regulatory” codes within protein-coding regions should be considered when assessing the impact of single-nucleotide variants and interpreting disease mutation data from exome sequencing (only the protein-coding regions of the genome) and cancer genome studies.
Do the regulatory codes harmoniously coexist? Evidence is emerging that there can be conflicts. For example, in the fruit fly Drosophila melanogaster, there is a striking decrease in the use of codons that are optimal for translation, but a rise in codons that enhance RNA splicing, toward the end of exons (13). This may indicate that the requirement for accurate RNA splicing has superseded that for optimal translation. Likewise, Stergachis et al. observed that the binding motifs of transcription factors within protein-coding genomic regions are selectively devoid of sequences that contain a stop codon.
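The observation about stop codons can be illustrated with a small check for stop-codon trinucleotides (TAA, TAG, TGA) inside a candidate binding sequence. This is only a sketch: the function name and example sequences are hypothetical, and whether a given trinucleotide would actually terminate translation depends on the reading frame of the surrounding gene, which is not modelled here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_trinucleotides(seq):
    """Return (offset, trinucleotide) for every stop triplet in seq.

    All three reading frames of the given strand are scanned.
    """
    seq = seq.upper()
    return [
        (i, seq[i:i + 3])
        for i in range(len(seq) - 2)
        if seq[i:i + 3] in STOP_CODONS
    ]


# Hypothetical recognition sequences, for illustration only.
for candidate in ("GGGCGG", "GATAAG", "TGACGT"):
    print(candidate, stop_trinucleotides(candidate))
```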
What features might permit synergistic coexistence of the regulatory and genetic codes? One major constraint of protein-coding genes is the requirement for the encoded polypeptide segment to fold into a defined tertiary structure. It is possible that in regions where folding constraints are not present, such as in intrinsically disordered regions (14), there might be increased tolerance for protein-coding genomic regions to harbor more regulatory elements that can be interpreted by different regulatory codes.
Stergachis et al. make a number of important genome-scale observations, but several mechanistic questions remain to be answered. For instance, although the authors report a weak tendency for transcription factors to preferentially bind to the protein-coding regions of highly expressed genes, it is unclear how the binding of a transcription factor within protein-coding regions mechanistically influences the expression of a gene. Perhaps this type of binding might result in alternative promoters with different transcriptional start sites or affect the expression of neighboring genes (by acting as a distal enhancer element, for example). It is also unclear whether binding of a transcription factor within a protein-coding region may not directly affect gene expression but instead determine the formation and maintenance of higher-order chromatin structure.
Future research will need to determine the number of overlapping codes that can be tolerated by the genetic code. There is also the question of possible trade-offs, in terms of maintaining regulation and functionality, that have been made to accommodate coexistence of codes and whether this can lead to nonoptimal or deleterious consequences. For instance, protein-coding regions that cannot tolerate mutations due to multiple overlapping codes may be exploited by pathogens during host infection. The investigation of overlapping codes opens new vistas on the functional interpretation of variation in coding regions and makes it clear that the story of the genetic code has not yet run its course.

1) http://www.sciencemag.org/content/342/6164/1325

Professor Moran's opinion on this: http://sandwalk.blogspot.com.br/2014/01/press-release-hyperbole-and-duon.html



Last edited by Admin on Tue Nov 24, 2015 1:36 pm; edited 1 time in total


Admin


Exonic transcription factor binding directs codon choice and impacts protein evolution 1

The genetic code, common to all organisms, contains extensive redundancy, wherein most amino acids can be specified by 2–6 synonymous codons. The observed ratios of synonymous codons are highly non-random, and codon usage biases are fixtures of both prokaryotic and eukaryotic genomes (1). In organisms with short life spans and large effective population sizes codon biases have been linked to translation efficiency and mRNA stability (2–7). However, these mechanisms explain only a small fraction of observed codon preferences in mammalian genomes (7–11), which appear to be under selection (12).
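The redundancy described above can be checked directly against the standard genetic code. The sketch below builds the 64-codon table from the conventional TCAG ordering and counts how many synonymous codons specify each amino acid. The variable names are my own, and only the standard code (NCBI translation table 1) is assumed.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code in TCAG order ("*" marks a stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() varies the last position fastest: TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid, ignoring stop codons.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

for amino_acid, count in sorted(degeneracy.items(), key=lambda item: (-item[1], item[0])):
    print(amino_acid, count)
```

Leucine, serine and arginine have six codons each, while methionine and tryptophan have only one, so 18 of the 20 amino acids have at least two synonymous codons.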
Genomes also contain a parallel regulatory code specifying recognition sequences for transcription factors (TFs) (13), and the genetic and regulatory codes have been assumed to operate independently of one another, and to be segregated physically into the coding and non-coding genomic compartments. However, the potential for some coding exons to accommodate transcriptional enhancers or splicing signals has long been recognized (14–18).
To define intersections between the regulatory and genetic codes, we generated nucleotide-resolution maps of transcription factor occupancy in 81 diverse human cell types using genomic DNaseI footprinting (19). Collectively, we defined 11,598,043 distinct 6–40bp footprints genome-wide (~1,018,514 per cell-type), 216,304 of which localized completely within protein-coding exons (~24,842 per cell-type) (Fig. 1A–B, S1A, Table S1). ~14% of all human coding bases contact a TF in at least one cell type (avg. 1.1% per cell type; Figs. 1C, S1B) and 86.9% of genes contained coding TF footprints (avg. 33% per cell type) (Figs. S1C–D).


Figure 1



TFs densely populate and evolutionarily constrain protein-coding exons
(A) Distribution of DNaseI footprints. (B) Per-nucleotide DNaseI cleavage and ChIP-seq signal for coding CTCF (left) and NRSF (right) binding elements. (C) Proportion of coding bases within DNaseI footprints in each of 81 cell types (left), or any cell type (right). (D) Average footprint density within first, internal, or final coding exons (mean +/− SEM; p-value, paired t-test, n.s.: p-value > 0.1). (E) PhyloP conservation at 4FDBs within and outside footprints. (F) Estimated mutational age at all (grey), synonymous (brown) and nonsynonymous (red) coding SNVs (European) within and outside footprints (p-values per (21)). (G) Structure of DNA-bound KLF4 vs. average per-nucleotide DNaseI cleavage and evolutionary constraint at KLF4 footprints. (H) Average per-nucleotide conservation at 4FDBs (brown) and NDBs (red) overlapping KLF4 (left) and NFIC (right) footprints. (r = Pearson correlation, conservation at promoter bases vs. 4FDBs (top) or NDBs (bottom)). (I) Evolutionary constraint imparted by 63 TFs at promoter elements, 4FDBs and NDBs (Pearson correlations).


The exonic TF footprints we observed likely underestimate the true fraction of protein-coding bases that contact TFs since (i) TF footprint detection increases substantially with sequencing depth (13), and (ii) the 81 cell types sampled, though extensive, is far from complete, as we saw little evidence of saturation of coding TF footprint discovery (Fig. S2).


Figure 2

Transcription factors modulate global codon biases
(A) Proportions of all codons (grey), or codons outside of (yellow), or within (purple) footprints, that encode asparagine (top) or leucine (bottom). Note that codons with bias (AAC for asparagine and CTG for leucine) preferentially localize within footprints. (B) Preferential footprinting of biased codons, calculated as in (A) (p-values, Pearson's chi-squared test). (C) Preferential footprinting of each codon trinucleotide in coding vs non-coding regions (C = coding, NC = non-coding). (D) Difference in average evolutionary constraint at 3rd positions of biased codons outside vs. within footprints (p-values, Mann-Whitney test). (E) Proportions of amino acids encoded by CpG-containing codons among all codons (grey), codons outside footprints (yellow), or codons within footprints (purple)


To ascertain coding footprints more completely, we developed an approach for targeted exonic footprinting via solution-phase capture of DNaseI-seq libraries using RNA probes complementary to human exons (19). Targeted capture footprinting of exons from abdominal skin and mammary stromal fibroblasts yielded ~10-fold increases in DNaseI cleavage, equivalent to sequencing >4 billion reads per sample using conventional genomic footprinting (Fig. S3A), quantitatively exposing many additional TF footprints (Fig. S3B–D). Overall, we identified an average of ~175,000 coding footprints per cell type (Fig. S1E), 7-12-fold more than conventional footprinting.


Figure 3

TFs exploit and avoid specific coding features

(A) Percentage of TF motifs occupied in coding vs. non-coding regions (p-values, paired t-test). (B) Density of NFYA (left), AP2 (middle) and SP1 (right) footprints relative to translated region of first coding exons. (C) (top) Density of YY1 footprints across first coding exons. (bottom) YY1 recognition sequence and corresponding amino acid sequence within YY1 footprints overlapping start codons. (D) (top left and bottom) For NRSF as per (C). (right, arrow) Protein domain annotation of first exon third-frame NRSF footprints vs. SP1 footprints. (E) TF preference (avoidance) of stop codon trinucleotides within vs. outside footprints in non-coding regions (p-values, Pearson's chi-squared test).


While coding sequences are densely occupied by TFs in vivo, the density of TF footprints at different genic positions varied widely, with many genes exhibiting sharply increased density in the translated portion of their first coding exon (Figs. 1D, S4A). By contrast, internal coding exons were as likely as flanking intronic sequences to harbor TF footprints (Fig.1D). The total number of coding DNaseI footprints within a gene was related both to the length of the gene, and to its expression level (Fig. S4B–D).


Figure 4

Genetic variation in duons frequently alters TF occupancy

(A) Proportion of coding footprints overlapping a SNV in any of 81 cell-types. (B) Proportion of SNVs in duons that allelically alter TF occupancy. (C) (top) Per-nucleotide DNaseI cleavage at common nonsynonymous G→A SNV (rs8110393) in G/G and A/A homozygous cells. (bottom) Allelic SP1 occupancy in heterozygous (G/A) cells. (D) Proportion of synonymous and nonsynonymous variants in duons that allelically alter TF occupancy. (E–F) Proportion of nonsynonymous variants from (D) grouped by predicted impact of coding variant on protein function using (E) SIFT or (F) Polyphen-2. Note that none of the bins are significantly different (Fisher's exact test; n.s. indicates p-value > 0.1).

Given their abundance, we sought to determine whether exonic TF binding elements were under evolutionary selection. 4-fold degenerate coding bases are frequently used as a model of neutral (or nearly neutral) evolution (20), but may exhibit constraint when a functional signal impinges on coding sequence (11). Across the coding compartment, 4-fold degenerate bases (4FDBs) within TF footprints show significantly greater evolutionary constraint vs. non-footprinted 4FDBs (Figs. 1E, S5A–B), indicating that TF-DNA recognition constrains the third codon position.
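For readers unfamiliar with the term, a 4-fold degenerate base is a third codon position at which any of the four nucleotides yields the same amino acid. The sketch below derives these positions from the standard genetic code. The helper names and the short coding sequence are hypothetical, and the input is assumed to be an in-frame, uppercase A/C/G/T sequence.

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code in TCAG order ("*" marks a stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_fourfold_degenerate(codon):
    """True if every base at the third position gives the same amino acid."""
    prefix = codon[:2]
    return len({CODON_TABLE[prefix + base] for base in BASES}) == 1


def fourfold_sites(cds):
    """Return 0-based positions of 4-fold degenerate bases in an in-frame CDS."""
    sites = []
    for i in range(0, len(cds) - 2, 3):
        if is_fourfold_degenerate(cds[i:i + 3]):
            sites.append(i + 2)  # third position of this codon
    return sites


fourfold_codons = sorted(c for c in CODON_TABLE if is_fourfold_degenerate(c))
print(len(fourfold_codons), "codons have a 4-fold degenerate third position")

# Hypothetical coding sequence: ATG GCT AAC CTG TAA
print(fourfold_sites("ATGGCTAACCTGTAA"))
```

Because a change at such a position never alters the protein, any conservation observed there has to come from something other than the amino acid sequence, which is why the authors use these bases to look for constraint imposed by TF binding.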
To test for evolutionary constraint at coding footprints in modern human populations, we quantified the age of mutations arising within or outside of coding footprints using exome sequencing data from 4,298 individuals of European ancestry (Fig. S5C) and 2,217 individuals of African American ancestry (Fig. S5D) (21). This analysis revealed that mutations within coding footprints were on average 10.2% younger than those outside of footprints (Figs. 1F, S5E), signaling influence of coding TF elements on human fitness.
Strikingly, both synonymous and nonsynonymous mutations within coding footprints were significantly younger than those outside of footprints (Figs. 1F, S5E), indicating that coding TF binding constrains both codon and amino acid evolution. The genome-wide recognition sequence landscape of each TF has evolved to fit the molecular topography of its protein-DNA binding interface (13) (Fig. 1G). To study how specific TFs influence codon and amino acid choice at their recognition sites, we compared the per-nucleotide evolutionary conservation profiles of TF recognition sequences at non-coding, 4FDBs and non-degenerate coding bases (NDBs). For example, the conservation profiles at 4FDBs and NDBs at KLF4 and NFIC recognition sites closely mirror those of recognition sites in non-coding regions (promoter; Fig. 1H). As such, these TFs constrain both codon choice (via constraint on 4FDBs), and amino acid choice (via NDBs) encoded at their recognition sites. Analysis of conservation profiles for 63 TFs with prevalent occupancy within coding regions (19) showed that 73% constrain 4FDBs, and 51% constrain NDBs (Figs. 1I, S6, S7). Thus, individual TFs may influence both codon and amino acid choice.
To examine how TF binding relates to codon usage patterns, we examined TF binding at preferred (biased) vs. non-preferred codons. For example, across all human proteins Asparagine is encoded by the AAC codon 52% of the time (vs. AAT, 48%), indicating a generalized 4% bias in favor of this codon. However, genome-wide, 60.4% of Asn codons within footprints are AAC, vs. only 50.8% outside of footprints (i.e., a 9.6% occupancy bias towards the preferred codon) (Fig. 2A). Strikingly, apart from Arginine (see below), for all amino acids encoded by two or more codons, the codon that is preferentially utilized genome-wide is also preferentially occupied by TFs (Fig. 2B, Table S2).
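The percentages quoted for Asparagine are differences in percentage points, which can be reproduced with a few lines. This is only a sketch of the arithmetic: the helper names are hypothetical, and the codon lists passed in are assumed to come from some upstream annotation of footprinted and non-footprinted codons that is not shown here.

```python
from collections import Counter

ASN_CODONS = ("AAC", "AAT")


def codon_share(codons, preferred, synonyms):
    """Percentage of the synonymous codons in `codons` that are `preferred`."""
    counts = Counter(c for c in codons if c in synonyms)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no synonymous codons observed")
    return 100.0 * counts[preferred] / total


def percentage_point_gap(share_a, share_b):
    """Difference between two percentages, rounded to one decimal place."""
    return round(share_a - share_b, 1)


# Figures quoted in the text for Asparagine.
genome_wide_bias = percentage_point_gap(52.0, 48.0)   # AAC vs. AAT overall
occupancy_bias = percentage_point_gap(60.4, 50.8)     # AAC inside vs. outside footprints
```

With the quoted figures, the first gap is 4.0 percentage points and the second is 9.6, matching the values given in the paragraph.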
To determine whether preferential occupancy of biased codons is inherent to TF recognition sequences, we compared trinucleotide frequencies within coding vs. non-coding footprints. Trinucleotide combinations favored by TFs within coding sequence were equivalent to those favored in non-coding sequence (Fig. 3C), indicating that global TF binding preferences are directly reflected in the frequency of different codons. Notably, baseline trinucleotide frequencies within coding and non-coding sequence are largely independent of one another (Table S2). The fact that the third position of preferred codons overlapping footprints is under excess evolutionary constraint (Fig. 2D, Table S2) supports a general role for TFs in potentiating codon usage biases through the selective preservation of preferred codons.
While nearly all codon biases parallel TF recognition preferences genome-wide, Arginine, one of the 5 amino acids encoded by codons containing CpGs (4 out of 6 codons), was a notable exception. CpGs frequently occur in regulatory DNA (Table S2), yet have an elevated mutational rate (22). Consequently, although TFs may favor CpG-containing codons (Fig. 2E), and impart excess constraint thereto (Table S2), the higher mutational rate at such codons is likely incompatible with preferential utilization.
We note that codons outside footprints still exhibit usage biases (Fig. 2A and Table S2); however, it is likely that these biases also reflect the actions of TFs. Firstly, our conclusions above are drawn from a conservative and incomplete annotation of duons. Secondly, because TF trinucleotide preferences and codon biases have not changed substantially since the divergence of humans and mice (Fig. S8), preferences at any given codon may result from a TF binding element extant in some ancestral species to human. Third, codon usage bias can be exaggerated due to mutual reinforcement with other cellular factors such as tRNA abundances (23, 24). Indeed, such mechanisms could be linked to codon biases created by exonic TF occupancy through a feedback mechanism that potentiates intrinsic TF-imposed biases, resulting in both abundant and rare codons and associated tRNAs, differences in which could in turn affect protein synthesis and stability (25–27).
To analyze positional occupancy patterns of specific TFs within coding sequence, we systematically matched TF recognition sequences with footprints, providing an accurate measure of a TF's in vivo occupancy (13, 28). This analysis revealed that a subset of TFs selectively avoid coding sequences (Fig. 3A). Intriguingly, TFs involved in positioning the transcriptional pre-initiation complex, such as NFYA and SP1 (29), preferentially avoid the translated region of the first coding exon (Fig. 3A), and typically occupy elements immediately upstream of the methionine start codon (Figs. 3B, S9A). Conversely, TFs involved in modulating promoter activity, such as YY1 and NRSF, preferentially occupy the translated region of the first coding exon (30, 31) (Fig. 3A,C). These findings indicate that the translated portion of the first coding exon may serve functionally as an extension of the canonical promoter.
More broadly, the repressor NRSF preferentially occupies and evolutionarily constrains sequences coding for leucine-rich protein domains, such as signal peptide and transmembrane domains (Figs. 3D, S9B,C). Also, TFs such as CTCF and SREBP1 preferentially occupy and constrain splice sites (Fig. S10A–D), which are otherwise generally depleted of DNaseI footprints (Fig. S10E). The above results suggest that specific protein structural and splicing features may undergo exaptation for specific regulatory purposes.
We also found that the occupancy of specific TFs within coding sequence parallels the extent of CpG methylation at their binding site (Fig. S11). This raises the possibility that gene body methylation, which is paradoxically extensive at actively transcribed genes (32, 33), may provide a tunable mechanism for thwarting opportunistic TF occupancy within coding sequence during transcription.
If TFs, through selective recognition sequences, could impose changes in protein sequence, deleterious consequences could arise if such changes resulted in a nonsense substitution. We observed that TFs generally avoid stop codons (Fig. S10E). Surprisingly, this finding extends to non-coding regions, where stop codon trinucleotides (TAA, TAG and TGA) are selectively depleted within footprints. This indicates that the global TF repertoire has been selectively purged of DNA binding domains capable of recognizing, and thus preferentially stabilizing, nonsense codons (Fig. 3E and S10F).
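To make the stop-codon comparison concrete, here is a minimal sketch that measures how often the trinucleotides TAA, TAG and TGA occur in a set of sequences. It assumes plain DNA strings, counts overlapping trinucleotides on the given strand only, and uses a function name invented for this example; comparing the value for footprinted vs. non-footprinted sequences would be one simple way to look for the depletion described above.

```python
STOP_TRINUCLEOTIDES = ("TAA", "TAG", "TGA")


def stop_trinucleotide_fraction(sequences):
    """Fraction of all overlapping trinucleotides that are TAA, TAG or TGA."""
    stops = 0
    total = 0
    for seq in sequences:
        seq = seq.upper()
        # Slide a 3-base window one base at a time (no reading frame assumed).
        for i in range(len(seq) - 2):
            total += 1
            if seq[i:i + 3] in STOP_TRINUCLEOTIDES:
                stops += 1
    return stops / total if total else 0.0
```

A sliding window is used because non-coding sequence has no reading frame; for coding sequence one could instead step through codons in frame.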
The high sequencing coverage provided by genomic footprinting revealed 592,867 heterozygous single nucleotide variants (SNVs) across the 81 cell type samples, and 3% of coding footprints harbored heterozygous SNVs (Fig. 4A). Functional SNVs that disrupt TF occupancy quantitatively skew the allelic origins of DNaseI cleavage fragments (13), and 17.4% of all heterozygous coding SNVs within footprints showed this signature (Figs. 4B, S12), including both synonymous and nonsynonymous variant classes (Fig. 4C). The potential of a coding SNV to disrupt overlying TF occupancy was independent of the class of variant (Fig. 4D), or whether a nonsynonymous variant was predicted to be deleterious to protein function (Fig. 4E–F).
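One common way to ask whether reads at a heterozygous site are skewed towards one allele is an exact two-sided binomial test against a 50/50 expectation. The sketch below is an assumption about how such a skew could be scored, not the statistical procedure of the paper, and the function name is hypothetical. It is written with the standard library only and is suitable for modest read depths.

```python
from math import comb


def allelic_imbalance_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for an allelic read split.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one under the null hypothesis of allele fraction `p`.
    """
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads) * p**ref_reads * (1 - p) ** alt_reads
    total = 0.0
    for k in range(n + 1):
        prob = comb(n, k) * p**k * (1 - p) ** (n - k)
        # Small tolerance so ties with the observed outcome are included.
        if prob <= observed * (1 + 1e-9):
            total += prob
    return min(1.0, total)
```

A perfectly balanced site such as 5 reference and 5 alternate reads gives a p-value of 1, while 10 reads all from one allele gives 2/1024.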
Notably, 13.5% of common disease- and trait-associated SNVs identified by genome-wide association studies (GWAS) (19) fall within duons (Fig. S13A). GWAS SNPs in duons encompass both synonymous (12%) and nonsynonymous (88%) substitutions (Fig. S13A), and may directly affect pathogenetic mechanisms (Fig. S13B–F, Table S3). As such, disease-associated variants within duons may compromise both regulatory and/or protein-structural functions. These findings have substantial practical implications for the interpretation of genetic variation in coding regions.
In summary, our results indicate that simultaneous encoding of amino acid and regulatory information within exons is a major functional feature of complex genomes. The information architecture of the received genetic code is optimized for superimposition of additional information (34, 35), and this intrinsic flexibility has been extensively exploited by natural selection. While TF binding within exons may serve multiple functional roles, we note that our analyses above are agnostic to these roles, which may be complex (36).


1) http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3967546/

Laurence Moran on the paper : http://sandwalk.blogspot.com.br/2014/01/the-duon-delusion-and-why-transcription.html


4 Re: The various codes in the cell on Tue Nov 24, 2015 2:25 pm

Admin
The transcription factor code:

The transcription factor code: defining the role of a developmental transcription factor in the adult brain.
For the human brain to develop and function correctly, each of its 100 billion neurons must follow a specific and pre-programmed code of gene expression. This code is driven by key transcription factors that regulate the expression of numerous proteins, moulding the neuron's identity to create its unique shape and electrical behaviour.

Unraveling a novel transcription factor code determining the human arterial-specific endothelial cell signature
Our pioneering profiling study on freshly isolated ECs unveiled a combinatorial transcriptional code that induced an arterial fingerprint more proficiently than the current gold standard, HEY2, and this code conveyed an in vivo arterial-like behavior upon venous ECs.

The transcriptional regulatory code of eukaryotic cells--insights from genome-wide analysis of chromatin organization and transcription factor binding.
The term 'transcriptional regulatory code' has been used to describe the interplay of these events in the complex control of transcription. With the maturation of methods for detecting in vivo protein-DNA interactions on a genome-wide scale, detailed maps of chromatin features and transcription factor localization over entire genomes of eukaryotic cells are enriching our understanding of the properties and nature of this transcriptional regulatory code.

Human Genes Encoding Transcription Factors and Chromatin-Modifying Proteins Have Low Levels of Promoter Polymorphism: A Study of 1000 Genomes Project Data
Genome-wide analysis of histone modifications revealed that, like transcription factors, each chromatin-remodeling protein can affect transcriptional level of thousands of genes, thereby orchestrating gene activity according to intracellular conditions or external stimuli [30].
Thus, both classes of proteins are involved in the complicated process of transcriptional control, ensuring correct expression of specific genes. Both the so-called “transcription factor-binding regulatory code” and the “histone code” may be effectively used for prediction of gene expression activity. Moreover, these codes are redundant for predicting gene expression.


5 The Splicing code on Tue Nov 24, 2015 2:38 pm

Admin
The Splicing code

Breaking the second genetic code
splicing code’ is indeed breakable. One difficulty with understanding alternative pre-mRNA splicing is that the selection of particular exons in mature mRNAs is determined not only by intron sequences adjacent to the exon boundaries, but also by a multitude of other sequence elements present in both exons and introns. These auxiliary sequences are recognized by regulatory factors that assist or prevent the function of the spliceosome — the molecular machinery in charge of intron removal.

15% to 50% of human disease mutations affect splice site selection. Tissue-dependent splicing is regulated by trans-acting factors, cis-acting RNA sequence motifs, and other RNA features, such as exon length and secondary structure. For nearly two decades, researchers have sought to define a regulatory splicing code in the form of a set of RNA features that can account for abundances of spliced isoforms. Through detailed investigation of a small number of examples of regulated splicing, it is clear that a splicing code must account for various features that act together to control splicing. Furthermore, a code should enable the reliable prediction of the regulatory properties of previously uncharacterized exons and the effects of mutations within regulatory elements. Here we describe a method for inferring a splicing regulatory code that addresses these challenges.

Wang further observes that splicing "is a tightly regulated process, and a great number of human diseases are caused by the 'misregulation' of splicing in which the gene was not cut and pasted correctly." This implies that important protein products are produced by splicing, meaning that the splicing code plays an important functional role in cells.

After the gene is copied the transcript is edited, splicing out the introns and glueing together the exons. Not only is it a fantastically complex process, it also adds tremendous versatility to how genes are used. A given gene may be spliced into alternate sets of exons, resulting in different protein machines. There are three genes, for example, that generate over 3,000 different spliced products to help control the neuron designs of the brain.

And how does the splicing machinery know where to cut and paste? There is an elaborate code that the splicing machinery uses to decide how to do its splicing. This splicing code is extremely complicated, using not only sequence patterns in the DNA transcript, but also the shape of the transcript, as well as other factors.


A few notes on the 'species-specific' alternative splicing code:



Last edited by Admin on Sun Jun 19, 2016 8:14 am; edited 1 time in total


6 The RNA binding protein binding code on Tue Nov 24, 2015 4:28 pm

Admin
The RNA binding protein binding code

A compendium of RNA-binding motifs for decoding gene regulation

The eukaryotic-wide RNA-binding protein specificity code
RNA-binding proteins play substantial and diverse roles in the post-transcriptional regulation (PTR) of gene expression. For example, recent work estimates that 40-60% of the variability in human protein levels is controlled post-transcriptionally, suggesting that regulation by RBPs contributes as much to gene expression levels as transcription factors. In collaboration with the Hughes lab, we have recently biochemically measured RNA binding preferences for more than 200 RBPs and, because RBP RNA-specificity is highly conserved, we were able to infer motifs for nearly 5,000 more RBPs by homology. This work was recently published.


7 Re: The various codes in the cell on Tue Nov 24, 2015 4:44 pm

Admin
microRNA binding code

The code within the code: microRNAs target coding regions
We report here an analysis of published proteomics experiments that further support a functional role for coding region microRNA binding sites.
Among possible genetic codes, the universal code has been shown to be nearly optimal for incorporating embedded information. Evidence thus far supports the conclusion that the coding regions of genes can contain additional information besides the amino acid sequence of the encoded protein, including functional microRNA binding sites.
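The idea that a coding region can carry a second signal without changing the protein rests on synonymous codons. As a toy sketch, the code below lists every DNA sequence that encodes a short peptide and keeps those that also contain a chosen motif. The function names and the motif CGG are purely illustrative (it is not a real microRNA site), and the DNA alphabet is used for simplicity.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_encodings(peptide):
    """All DNA sequences that translate to `peptide` (one-letter amino acid codes)."""
    codons_by_aa = {}
    for codon, aa in CODON_TABLE.items():
        codons_by_aa.setdefault(aa, []).append(codon)
    choices = [codons_by_aa[aa] for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]


def encodings_with_motif(peptide, motif):
    """Subset of the synonymous encodings that also contain `motif`."""
    return [dna for dna in synonymous_encodings(peptide) if motif in dna]
```

For the dipeptide Asn-Gly ("NG") there are eight possible encodings, and only the four that use AAC for asparagine contain the toy motif CGG, so the choice of codon decides whether the embedded signal is present. The number of encodings grows quickly with peptide length, so this brute-force listing is only practical for short peptides.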

microRNAs represent ~4% of the genes in the human genome.
This discovery suggests that the genome is far from being deciphered, and most importantly that miRNAs are likely to represent just the “tip of the iceberg” with many other small non-coding RNAs to be discovered.


Admin
Astonishing DNA complexity demolishes neo-Darwinism

Multiple Codes
A major outcome of the studies so far is that there are multiple information codes operating in living cells. The protein code is the simplest, and has been studied for half a century. But a number of other codes are now known, at least by inference.

Cell memory code.
DNA is a very long, thin molecule. If you unwound the DNA from just one human cell it would be about 2 metres long! To squash this into a tiny cell nucleus, the DNA is wound up in four separate layers of chromatin structure (as described earlier). The first level of this chromatin structure carries a ‘histone code’ that contains information about the cell’s history (i.e. it is a cell memory).8,9 The DNA is coiled twice around a group of 8 histone molecules, and a 9th histone pins this structure into place to form what is called a nucleosome. These nucleosomes can carry various chemical modifications that either allow, or prevent, the expression of the DNA wrapped around them. Every time a cell divides into two new cells, its DNA double-helix splits into two single strands, which then each produce a new double-strand. But nucleosomes are not duplicated like the DNA-strands. Rather, they are distributed between either one or the other of the two new DNA double strands, and the empty spaces are filled by new nucleosomes. Cell division is therefore an opportunity for changes in the nucleosomal composition of a specific DNA region. Changes can also happen during the lifetime of a cell due to chemical reactions allowing inter-conversions between the different nucleosome types. The memory effect of these changes can be that a latent capacity that was dormant comes to life, or, conversely, a previously active capacity shuts down.

Differentiation code.
In humans, there are about 300 different cell types in our bodies that make up the different tissue types (nerves, blood, muscle, liver, spleen, eyes etc). All of these cells contain the same DNA, so how does each cell know how to become a nerve cell rather than a blood cell? The required information is written in code down the side of the DNA double-helix in the form of different molecules attached to the nucleotides that form the ‘rungs’ in the ‘ladder’ of the helix. This code silences developmental genes in embryonic stem cells, but preserves their potential to become activated during embryogenesis. The embryo itself is largely defined by its DNA sequence, but its subsequent development can be altered in response to lineage-specific transcriptional programs and environmental cues, and is epigenetically maintained.

Replication Code.
The replication code was discovered by addressing the question of how cells maintain their normal metabolic activity (which continually uses the DNA as source information) when it comes time for cell division. The key problem is that a large proportion of the whole genome is required for the normal operation of the cell—probably at least 50% in unspecialized body cells and up to 70–80% in complex liver and brain cells—and, of course, the whole genome is required during replication. This creates a huge logistic problem—how to avoid clashes between the transcription machinery (which needs to continually copy information for ongoing use in the cell) and the replication machinery (which needs to unzip the whole of the DNA double-helix and replicate a ‘zipped’ copy back onto each of the separated strands). The cell’s solution to this logistics nightmare is truly astonishing. Replication does not begin at any one point, but at thousands of different points. But of these thousands of potential start points, only a subset are used in any one cell cycle—different subsets are used at different times and places. A full understanding is yet to emerge because the system is so complex; however, some progress has been made:

The large set of potential replication start sites is not essential, but optional. In early embryogenesis, for example, before any transcription begins, the whole genome replicates numerous times without any reference to the special set of potential start sites.

The pattern of replication in the late embryo and adult is tissue-specific. This suggests that cells in a particular tissue cooperate by coordinating replication so that while part of the DNA in one cell is being replicated, the corresponding part in a neighbouring cell is being transcribed. Transcripts can thus be shared so that normal functions can be maintained throughout the tissue while different parts of the DNA are being replicated.

DNA that is transcribed early in the cell division cycle is also replicated in the early stage (but the transcription and replication machines are carefully kept apart). The early transcribed DNA is that which is needed most often in cell function. The correlation between transcription and replication in this early phase allows the cell to minimize the ‘downtime’ in transcription of the most urgent supplies while replication takes place.

There is a ‘pecking order’ of control. Preparation for replication may take place at thousands of different locations, but once replication does begin at a particular site, it suppresses replication at nearby sites so that only one copy of the DNA is made. If transcription happens to occur nearby, replication is suppressed until transcription is completed. This clearly demonstrates that keeping the cell alive and functioning properly takes precedence over cell division.

There is a built-in error correction system called the ‘cell-cycle checkpoints’. If replication proceeds without any problems, correction is not needed. However, if too many replication events occur at once the potential for conflict between transcription and replication increases, and/or it may indicate that some replicators have stalled because of errors. Once the threshold number is exceeded, the checkpoint system is activated, the whole process is slowed down, and errors are corrected. If too much damage occurs, the daughter cells will be mutant, or the cell’s self-destruct mechanism (the apoptosome) will be activated to dismantle the cell and recycle its components.

An obvious benefit of the pattern of replication initiation being never the same from one cell division to the next is that it prevents accumulation of any errors that are not corrected. The exact location of the replication code is yet to be pinpointed, but because it involves transcription factors gaining access to transcription sites, and this is known to be controlled by chromatin structure, then the code itself is probably written into the chromatin structure.



https://creation.com/images/pdfs/tj/j21_3/j21_3_111-117.pdf


9 The tubulin code on Wed Jan 13, 2016 6:34 pm

Admin
The tubulin code

The α- and β-tubulin heterodimer – the building block of microtubules – undergoes multiple post-translational modifications (PTMs) (Table above). The modified tubulin subunits are non-uniformly distributed along microtubules. Analogous to the model of the ‘histone code’ on chromatin, diverse PTMs are proposed to form a biochemical ‘tubulin code’ that can be ‘read’ by factors that interact with microtubules (Verhey and Gaertig, 2007).

Who are the Interpreters of the Tubulin Code?

A major implication of the tubulin code is that PTMs influence the recruitment of protein complexes (microtubule effectors), which in turn contribute to microtubule-based functions. Three major classes of microtubule binding proteins can be considered as interpreters of the tubulin code. First, microtubule associated proteins (MAPs) such as Tau, MAP1 and MAP2 that bind statically along the length of microtubules. Second, plus end tracking proteins (+TIPs) that bind in a transient manner to the plus-ends of growing microtubules. And third, molecular motors that use the energy of ATP hydrolysis to carry cargoes along microtubule tracks.


This is a relevant and amazing fact, and it raises the question of how the "tubulin code", alongside the several other codes in the cell, emerged. Once more this shows that intelligence was involved in creating these amazing biomolecular structures and specified, complex, coded instructional patterns, since coded information has only ever been shown to be produced by intelligent minds. Furthermore, what good would the tubulin code be if no specific goal was in mind? The code acts as an emitter, and if there is no destination for the information, there is no reason for the code to exist in the first place. So both sender and receiver must first exist as hardware: the sender is the microtubule, with its post-translationally modified tubulin units in a specified coded conformation, and the receiver can be kinesin or dynein motor proteins, which are directed to the right destination, or other proteins.



Last edited by Admin on Wed Jan 04, 2017 5:14 am; edited 2 times in total


10 The Glycan or Sugar Code on Wed Jan 13, 2016 6:54 pm

Admin
The Glycan or Sugar Code 1
Carbohydrates are essential for all forms of life, but the largest variety of their functions is now found in higher eukaryotes. The majority of eukaryotic proteins are modified by cotranslational and posttranslational attachment of complex oligosaccharides (glycans) to generate the most complex epiproteomic modification: protein glycosylation.
Most proteins are glycosylated: that is, complex carbohydrates are chemically bonded to them to generate enormous diversity in protein functions. [5] Since carbohydrate molecules are branched, they carry many more orders of magnitude of information than linear molecules such as DNA and RNA. This has been called the “sugar code,” and although it is highly specified it is largely independent of DNA sequence information. A third biochemical alphabet forming code words with an information storage capacity second to no other substance class in rather small units (words, sentences) is established by monosaccharides (letters). As hardware, oligosaccharides surpass peptides by more than seven orders of magnitude in the theoretical ability to build isomers, when the total of conceivable hexamers is calculated. A genetic program is not sufficient for embryogenesis: biological information outside of DNA is needed to specify the body plan of the embryo and much of its subsequent development. Some of that information is in cell membrane patterns, which contain a two-dimensional code mediated by proteins and carbohydrates. 2
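To get a feel for the hexamer comparison, the peptide side of it is simple to compute. The sketch below assumes the 20 standard amino acids and linear peptides only; the oligosaccharide count itself is not given in the text, so the code only derives the lower bound implied by "more than seven orders of magnitude".

```python
STANDARD_AMINO_ACIDS = 20   # assumption: the 20 standard proteinogenic amino acids
HEXAMER_LENGTH = 6

# Each of the six positions in a linear peptide can hold any of the 20 residues.
peptide_hexamers = STANDARD_AMINO_ACIDS ** HEXAMER_LENGTH

# "More than seven orders of magnitude" implies at least this many
# conceivable oligosaccharide hexamers.
implied_oligosaccharide_lower_bound = peptide_hexamers * 10 ** 7

print(f"peptide hexamers: {peptide_hexamers:,}")
print(f"implied oligosaccharide hexamers: more than {implied_oligosaccharide_lower_bound:.1e}")
```

The much larger sugar figure comes from branching, alternative linkage positions and anomeric configurations, none of which exist in a linear peptide chain.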

According to the most widely held modern version of Darwin’s theory, DNA mutations can supply raw materials for morphological evolution because they alter a genetic program that controls embryo development. Yet a genetic program is not sufficient for embryogenesis: biological information outside of DNA is needed to specify the body plan of the embryo and much of its subsequent development. Some of that information is in cell membrane patterns, which contain a two-dimensional code mediated by proteins and carbohydrates. These molecules specify targets for morphogenetic determinants in the cytoplasm, generate endogenous electric fields that provide spatial coordinates for embryo development, regulate intracellular signaling, and participate in cell–cell interactions. Although the individual membrane molecules are at least partly specified by DNA sequences, their two-dimensional patterns are not. Furthermore, membrane patterns can be inherited independently of the DNA. I review some of the evidence for the membrane code and argue that it has important implications for modern evolutionary theory.



1) http://www.ncbi.nlm.nih.gov/pubmed/10798195

2) http://reasonandscience.heavenforum.org/t2071-carbohydrates-and-glycobiology-the-3rd-alphabet-of-life-after-dna-and-proteins?highlight=glycan+code



Last edited by Admin on Sun Feb 28, 2016 4:44 am; edited 1 time in total


11 Re: The various codes in the cell on Sat Jan 16, 2016 2:44 pm

Admin
https://en.wikipedia.org/wiki/Edward_Trifonov...


Trifonov advocates[19]:4 the notion that biological sequences bear many codes contrary to the generally recognized one genetic code (coding amino acids order). He was also the first one to demonstrate[20] that there are multiple codes present in the DNA. He points out that even so called non-coding DNA has a function, i.e. contains codes, although different from the triplet code.


Trifonov recognizes[19]:5–10 specific codes in the DNA, RNA and proteins:

in DNA sequences
chromatin code (Trifonov 1980) is a set of rules responsible for positioning of the nucleosomes.
------------------------------------------------------
in RNA sequences

RNA-to-protein translation code (triplet code)
Every triplet in the RNA sequence corresponds (is translated) to a specific amino acid.

splicing code
is a code responsible for RNA splicing; still poorly identified.

framing code (Trifonov 1987)
The consensus sequence of the mRNA is (GCU)n which is complementary to (xxC)n in the ribosomes. It maintains the correct reading frame during mRNA translation.

translation pausing code (Makhoul & Trifonov 2002)
Clusters of rare codons are placed at a distance of 150 bp from each other. The translation time of these codons is longer than that of their synonymous counterparts, which slows down the translation process and thus provides time for the freshly synthesized segment of a protein to fold properly.

------------------------------------------------------
in protein sequences

protein folding code (Berezovsky, Grosberg & Trifonov 2000)
Proteins are composed of modules. The newly synthesized protein is folded module by module, not as a whole.

fast adaptation codes (Trifonov 1989)
are present in all three types of biological sequences. They are represented by tandem repeats (AB...MN)n. The number of repetitions (n) can change in the cell genome as a response to stress which may (or may not) help the cell to adapt to the environmental pressure. 

------------------------------------------------------
codes of evolutionary past

binary code (Trifonov 2006)
The first ancient codons were GGC and GCC from which the other codons have been derived by series of point mutations. Nowadays, we can see it in modern genes as "mini-genes" containing a purine at the middle position in the codons alternating with segments having a pyrimidine in the middle nucleotides.

genome segmentation code (Kolker & Trifonov 1995)
Methionines tend to occur every 400 bps in the modern DNA sequences as a result of fusion of ancient independent sequences.

THE CODES CAN OVERLAP EACH OTHER SO THAT UP TO 4 DIFFERENT CODES CAN BE IDENTIFIED IN ONE DNA SEQUENCE (specifically a sequence involved in a nucleosome). According to Trifonov, other codes are yet to be discovered.


****THE CODES CAN OVERLAP EACH OTHER****


12 Code Biology on Fri Apr 22, 2016 10:28 pm

Admin
Code Biology

Marcello Barbieri , page 14:

The Signal Transduction Codes
Signal transduction is the process by which cells transform the signals from the environment, called first messengers, into internal signals, called second messengers. First and second messengers belong to two independent worlds because there are literally hundreds of first messengers (hormones, growth factors, neurotransmitters, etc.) but only four great families of second messengers (cyclic AMP, calcium ions, diacylglycerol and inositol trisphosphate) (Alberts et al. 2007). The crucial point is that the molecules that perform signal transduction are true adaptors. They consist of three subunits: a receptor for the first messengers, an amplifier for the second messengers, and a mediator in between (Berridge 1985). This allows the transduction complex to perform two independent recognition processes, one for the first messenger and the other for the second messenger. Laboratory experiments have proved that any first messenger can be associated with any second messenger, which means that there is a potentially unlimited number of arbitrary connections between them. In signal transduction, in short, we find all three essential components of a code:

(1) two independent worlds of molecules (first messengers and second messengers), 
(2) a set of adaptors that create a mapping between them, and 
(3) the proof that the mapping is arbitrary because its rules can be experimentally changed.
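The three components listed above can be pictured with a toy model. This is only a sketch: the class name, the function name and the messenger labels "hormone_X" and "growth_factor_Y" are hypothetical, and a real transduction pathway is far more complex than a lookup table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adaptor:
    """Toy adaptor with two independent recognition parts (hypothetical model)."""
    receptor_for: str   # first messenger recognized outside the cell
    amplifier_for: str  # second messenger produced inside the cell


def build_code(adaptors):
    """Return the mapping first messenger -> second messenger defined by the adaptors."""
    return {a.receptor_for: a.amplifier_for for a in adaptors}


adaptors = [
    Adaptor("hormone_X", "cyclic AMP"),
    Adaptor("growth_factor_Y", "calcium ions"),
]
code = build_code(adaptors)

# Arbitrariness: exchanging the amplifier part of one adaptor changes the rule,
# while the first messenger itself stays the same.
rewired = [Adaptor("hormone_X", "inositol trisphosphate"), adaptors[1]]
rewired_code = build_code(rewired)

print(code["hormone_X"])          # cyclic AMP
print(rewired_code["hormone_X"])  # inositol trisphosphate
```

The point of the sketch is that the mapping lives in the set of adaptors, not in the chemistry of the messengers themselves.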

The cells that evolved new codes, such as splicing codes, cytoskeleton codes, compartment codes, histone code and so on, became eukarya and have generated increasingly complex cellular structures.

Before the origin of the genetic code, the common ancestor was engaged in evolving coding rules and was therefore a code exploring system.


Would the common ancestor not already have needed a sophisticated code and cipher system, along with the molecular machines required for transcription, translation and replication? And why would random chance explore codes and set them up at all?

After the origin of the code, however, no other modification in coding rules was allowed and the cell became a code conservation system.

There are 18 different cell codes known. Why would natural mechanisms stop allowing other, different codes?

Another part of the ancestral cells, however, maintained the potential to evolve the rules of different codes and behaved as new code exploring, or code generating, systems. In the early Eukarya, for example, the cells had a code conservation part for the genetic code, but also a code exploring part for the splicing code, and this tells us something important about life.

Well, the splicing code has nothing to do with the genetic code.

The origin of the first cells was based on the ability of the ancestral systems to generate the rules of the genetic code

What systems were these, and why would such a "system" generate a code and its rules at all?

Another outstanding implication of the existence of organic codes in Nature comes from the fact that any code involves meaning and we need therefore to introduce in biology, with the standard methods of science, not only the concept of biological information but also that of biological meaning. The study on the organic codes, in conclusion, is bringing to light new mechanisms that operated in the history of life and new fundamental concepts. It is an entirely new field of research, the exploration of a vast and still largely unexplored dimension of the living world, the real new frontier of biology.

A Gallery of Organic Codes 
The Apparatus of Protein Synthesis
The Genetic Code 
Stereochemistry and Arbitrariness 
The Splicing Codes 
The Metabolic Code
The Signal Transduction Codes 
The Signal Integration Codes 
The Histone Code 
Is the “Histone Code” an Organic Code?
The Tubulin Code 
The Sugar Code 
The Glycomic Code



Last edited by Admin on Sun Jul 03, 2016 12:37 pm; edited 3 times in total


13 The Metabolic Code on Fri Apr 22, 2016 11:01 pm

Admin


The Splicing Codes

Code Biology

Marcello Barbieri , page 14

The primary transcripts of the genes are often transformed into messenger RNAs by removing some RNA pieces (called introns) and by joining together the remaining pieces (the exons). This cutting-and-sealing operation, known as splicing, is a true assembly because exons are assembled into messengers, and we need therefore to find out if it is a catalyzed assembly (like transcription) or a codified assembly (like translation). In the first case splicing would require only catalysts (comparable to RNA-polymerases), whereas in the second case it would need an assembly machine and a set of adaptors (comparable to ribosome and tRNAs). These parallels immediately suggest that splicing is a codified process because it is implemented by structures that are very much comparable to those of protein synthesis. The splicing bodies, known as spliceosomes, are huge molecular machines like ribosomes, and employ small molecules, known as small nuclear RNAs (snRNAs), which are comparable to tRNAs. The similarity, however, goes much deeper than that because splicing is carried out by molecular structures that are true adaptors. They perform two independent recognition processes, one for the beginning and one for the end of each exon, thus creating a specific correspondence between primary transcripts and messenger RNAs. Splicing, in other words, is a codified process based on adaptors and takes place with sets of rules that have been referred to as splicing codes. It must be underlined, however, that there are two outstanding complications in splicing. One is the fact that the order in which the exons are joined together can be shuffled in various ways, an operation, called alternative splicing, that allows many species to generate a whole family of variant proteins from the same gene. 
The expression of these proteins, furthermore, can change from one tissue to another and in different stages of embryonic development, thus enormously increasing the protein variety that can be associated to a gene. Alternative splicing has in this way a powerful role in the generation of biological complexity, and splicing mistakes often have pathological effects; it has been estimated that they account for about one fifth of all inherited diseases.

The other great complication of splicing is the fact that many introns carry sequences that are similar to exons but translate into nonsense and for this reason are called pseudo exons or pseudo genes. They would create havoc if incorporated into mRNAs and the splicing machinery needs the means to differentiate real exons from pseudo ones. The result is that real exons contain internal identity marks that are known as exonic splicing enhancers (ESEs) and exonic splicing silencers (ESSs).

Question: Would these identity marks not have had to be present in the process right from the beginning? Would their absence, or their presence in a not fully developed form, not make it impossible for the process to run without mistakes?

The presence of these marks, in turn, means that the adaptors of the splicing codes are not single molecules but combinations of molecules because they must be able to recognize not only the beginning and the end of the real exons, but also their internal identity marks.

This makes a stepwise emergence of the whole process even less feasible. Recognition of both the beginning and the end of each exon is required, which means that the genome needs to have the start and stop signals in the right places, and the molecular machines programmed to recognize those signals must be in place, fully developed and fully programmed. The identity marks are required in addition to this hardware. Furthermore, this seems to be one more irreducibly complex system, since both the software and the hardware had to be in place, just right, fully developed and programmed from the beginning.

The actual deciphering of the splicing codes has already started but it is taking considerably longer than that of the genetic code because it is incredibly more complex. Let us keep in mind that the discovery of the genetic code has been facilitated by two particularly favourable features. More precisely, by the fact that 

(1) the adaptors are single molecules (the tRNAs) and 
(2) the coding units form a closed set (64 codons and 20 amino acids).

In the case of splicing, instead, the adaptors are combinations of molecules (combinatorial codes), and the domain (or alphabet) of the codes is open and potentially unlimited. The overall complexity of splicing is such that the most practical way of discovering its codes is by building computational models that are capable of predicting new splicing rules on the basis of existing data. Such models have already started appearing in the literature, and represent our first glimpse of the rules of the splicing codes.
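The "closed set" of the genetic code mentioned above can be enumerated directly: four bases taken three at a time give exactly 64 codons. A minimal sketch follows; the three stop codons of the standard genetic code are added here for context and are not mentioned in the quoted text.

```python
from itertools import product

BASES = "ACGU"  # RNA alphabet

# Every ordered triplet of bases: 4 ** 3 = 64 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Stop codons of the standard genetic code (an addition, see the note above).
STOP_CODONS = {"UAA", "UAG", "UGA"}
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons))        # 64
print(len(sense_codons))  # 61
print(codons[:4])         # ['AAA', 'AAC', 'AAG', 'AAU']
```

With 61 sense codons for 20 amino acids, several codons necessarily specify the same amino acid. No comparable enumeration is possible for the splicing codes, whose alphabet is described above as open.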

The Metabolic Code 

This is the first organic code that came to light after the discovery of the genetic code. It was described in Science, in 1975, by Gordon Tomkins, a professor of biochemistry at the University of California, San Francisco. Tragically, Tomkins died that very year, aged 49, from a brain tumour, and apparently his idea died with him. Recently, however, there has been an attempt to rescue his work from oblivion (Swan and Goldberg 2010) and here we will try to show that such attempt is amply justified. Tomkins investigated the evolution of metabolism and started from the need of the ancestral cells to obtain energy. “Since both nucleic acid and protein synthesis are endergonic reactions, primordial cells were almost certainly endowed with the capacity to capture the necessary energy from the environment and to transform it into usable form, presumably ATP (adenosine triphosphate). The biosynthetic capabilities of primitive cells were, however, probably quite limited ... survival would therefore have required the evolution of regulatory mechanisms that could maintain a relatively constant intracellular environment in the face of changes in external conditions” (Tomkins 1975). Granted this basic need of the cells to evolve regulatory mechanisms, Tomkins distinguished between two types of regulation that he called simple and complex. Simple regulation is a direct relationship (positive or negative) between the components of a metabolic circuit, with the end products affecting their own metabolism.

Complex regulation is characterized by two new entities that Tomkins called symbols and domains. In order to illustrate them, Tomkins made the example of molecules that are accumulated inside a cell as a consequence of a particular environment and become a symbol of that environment. In most microorganisms, for example, cyclic AMP is accumulated as a result of carbon starvation and becomes a symbol of that deficiency. Another example is ppGpp (guanosine 5'-diphosphate 3'-diphosphate) that accumulates as a result of amino acid starvation and represents a symbol of that condition. These molecules are symbols because they bear no structural relationship to the molecules that promote their accumulation (cyclic AMP, for example is accumulated as a result of glucose starvation, but it is not a chemical analog of glucose). This is what suggested to Tomkins the existence of a metabolic code. “Since a particular environmental condition is correlated with a corresponding intracellular symbol, the relationship between the extra- and intracellular events may be considered as a metabolic code in which a specific symbol represents a unique state of the environment.” Tomkins went on to show how metabolic coding in unicellular organisms might have evolved into the endocrine system of the metazoa, and described what happens in the slime mold Dictyostelium discoideum. “Given sufficient nutrients, this organism exists as independent myxamoebas. Upon starvation, they generate cyclic AMP and release it into the surrounding medium. This substance serves as a chemical attractant that causes the aggregation of a large number of myxamoebas to form a multicellular slug. In this case, as in E. coli, cyclic AMP acts as an intracellular symbol of carbon-source starvation. In addition, however, the cyclic nucleotide is released from the Dictyostelium cells in which it is formed and diffuses to other nearby cells, promoting the aggregation response. 
Cyclic AMP thus acts in these organisms both as an intracellular symbol of starvation and as a hormone which carries this metabolic information from one cell to another.”
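Tomkins' idea of a symbol standing for a state of the environment can be written down as a simple lookup table. This is only an illustration built from the two examples given above (carbon starvation and amino acid starvation); the function names are made up.

```python
# Environmental condition -> intracellular symbol, using only the two
# examples named in the text above.
METABOLIC_CODE = {
    "carbon starvation": "cyclic AMP",
    "amino acid starvation": "ppGpp",
}


def symbol_for(condition):
    """Encode: which intracellular symbol represents this environmental condition?"""
    return METABOLIC_CODE[condition]


def condition_for(symbol):
    """Decode: which environmental condition does this symbol stand for?"""
    for condition, known_symbol in METABOLIC_CODE.items():
        if known_symbol == symbol:
            return condition
    raise KeyError(symbol)


print(symbol_for("carbon starvation"))  # cyclic AMP
print(condition_for("ppGpp"))           # amino acid starvation
```

Nothing in the structure of the symbol reveals the condition it stands for; the association exists only in the table, which is what the text means by calling these molecules symbols.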

Hormones, according to Tomkins, evolved in order “to carry information from sensor cells in direct contact with the environment, to more sequestered responder cells. Specifically, the metabolic state of a sensor cell, represented by the levels of its intracellular symbols, is encoded by the synthesis and secretion of corresponding levels of hormones. When hormones reach the responder cells, the metabolic message is decoded into corresponding primary intracellular symbols. In this way, endocrine cells act as both sensors and responders, that is, intermediates in the transmission of metabolic information from primary sensor cells to the tissues in which the final chemical responses take place.”

The Signal Transduction Codes

Living cells react to many physical and chemical stimuli from the environment, and in general their reactions consist in the expression of specific genes. We need therefore to understand how the environment interacts with the genes, and the turning point, in this field, came from the discovery that the external signals (known as first messengers) never reach the genes. They are invariably transformed into a different world of internal signals (called second messengers) and only these, or their derivatives, reach the genes. In most cases, the molecules of the external signals do not even enter the cell and are captured by specific receptors of the cell membrane, but even those that do enter (some hormones) must interact with intracellular receptors in order to influence the genes (Sutherland 1972). The transfer of information from environment to genes takes place therefore in two distinct steps: one from first to second messengers, called signal transduction, and a second path from second messengers to genes which is known as signal integration. The surprising thing about signal transduction is that there are literally hundreds of first messengers (ions, nutrients, hormones, growth factors, neurotransmitters, etc.) whereas the second messengers belong to only four molecular families: cyclic AMP or GMP, calcium ions (Ca2+), inositol trisphosphate (IP3), and diacylglycerol (DAG) (Alberts et al. 2007). First and second messengers, in other words, belong to two very different worlds, and this suggests immediately that signal transduction may be based on organic codes. This is reinforced by the discovery that there is no necessary connection between first and second messengers, because it has been proven that the same first messengers can activate different types of second messengers, and that different first messengers can act on the same type of second messengers (Alberts et al. 2007). The only plausible explanation is that signal transduction is based on organic codes, but of course one would like a direct proof. The signature of an organic code, as we have seen, is the presence of adaptors and the transmembrane receptor proteins of signal transduction do have the defining characteristics of the adaptors. 

The transduction system consists of at least three types of molecules: 

a receptor for the first messengers, 
an amplifier for the second messengers and 
a mediator in between (Berridge 1985). 

This transmembrane system performs two independent recognition processes, one for the first and the other for the second messenger, and the two steps are connected by the bridge of the mediator. This connection, on the other hand, could be implemented in countless different ways since any first messenger can be coupled with any second messenger, and this makes it imperative to have a selection in order to guarantee biological specificity. 

In signal transduction, in short, we find the three defining features of a code: 

(1) two independent worlds of objects (first messengers and second messengers), 
(2) a potentially unlimited number of arbitrary connections produced by adaptors, and 
(3) a set of coding rules (a selection of the adaptors) that ensures the specificity of the correspondence. 

The effects that external signals have on cells, in short, do not depend on the energy or the information that they carry, but on the meaning that cells give them with sets of rules that have been referred to as signal transduction codes (Barbieri 1998, 2003). One may wonder at this point why signal transduction codes are never mentioned in biochemistry books despite the fact that their molecules are true adaptors. The problem here is that the study of signal transduction started when organic codes were not known, and it has always been assumed a priori that in this process there is no need for them. A code, in short, has not been found simply because it has never been looked for. The genetic code, on the contrary, was predicted on theoretical grounds, and it was discovered precisely because experiments were devised with the specific purpose to look for it.

The Signal Integration Codes

We have seen that there are only four families of second messengers in the cell, and yet the reactions that they set in motion can pick up an individual gene among tens of thousands. How this is achieved is still a mystery, but some progress has been made. Perhaps the most illuminating discovery, so far, is that second messengers do not act independently. Calcium ions and cyclic AMP, for example, have effects that on some occasions reinforce each other whereas on others are mutually exclusive. The cell, in short, can combine its internal signals in countless different ways, and it is precisely this combinatorial ability that explains why a small number of second messengers can generate an extraordinarily high number of specific genetic responses. The activation of second messengers, in other words, sets in motion a cascade of reactions that normally ends with the expression of a target gene, and again we need to understand if they are normal catalyzed reactions or if at least some of them are based on the rules of a code. One of the most interesting clues, in this field, is the fact that signalling molecules have in general more than one function. Epidermal growth factor, for example, stimulates the proliferation of fibroblasts and keratinocytes, but it has an antiproliferative effect on hair follicle cells, whereas in the intestine it is a suppressor of gastric acid secretion. Other findings have proved that all growth factors can have three distinct functions, with proliferative, anti-proliferative, and proliferation-independent effects. They are, in short, multifunctional molecules. In addition to growth factors, it has been found that many other molecules have multiple functions. Adrenaline, for example, is a neurotransmitter, but it is also a hormone produced by the adrenal glands to spring the body into action by increasing the blood pressure, speeding up the heart and releasing glucose from the liver. 
Acetylcholine is another common neurotransmitter in the brain, but it also acts on the heart (where it induces relaxation), on skeletal muscles (where the result is contraction), and in the pancreas (which is made to secrete enzymes). Cholecystokinin is a peptide that acts as a hormone in the intestine, where it increases the bile flow during digestion, whereas in the nervous system it is a neurotransmitter. Encephalins are sedatives in the brain, but in the digestive system they are hormones which control the mechanical movements of food. Insulin is universally known for lowering the sugar levels in the blood, but it also controls fat metabolism and in other less known ways it is affecting almost every cell of the body. The discovery of multifunctional molecules suggests that their function is not decided solely by their structure, but also by the context in which they find themselves. What matters, in other words, is not their ability to catalyze a specific reaction, but the fact that they are employed as molecular signs that can be given one meaning in a certain context and a different meaning in another one. A second finding that points to the existence of codes in signal integration is the fact that the regulation processes set in motion by second messengers are strongly conserved in evolution, and yet the actual reactions involved have undergone great changes in the history of life. The regulation of cellular energy homeostasis, for example, has been highly conserved from yeast to man, with the key role being played by a protein kinase that is called AMPK in animals and Snf1 in yeast. Despite this overall conservation, it has been found that an evolutionary divergence of about 150 million years between two species of budding yeasts (Saccharomyces cerevisiae and Kluyveromyces lactis) has produced substantial differences in their Snf1 regulatory networks. 
Again, what seems to matter in these regulation processes is not a specific set of catalysts, but a set of rules that can be implemented in many different ways. The information carried by first messengers, in conclusion, undergoes two great transformations in its journey towards the genes. First, it is transformed into internal messengers with the rules of the signal transduction codes, and then it is channelled along complex three-dimensional circuits that integrate it with other signals according to the rules of one or more signal integration codes.
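The combinatorial argument made in this section can be given rough numbers. The sketch below assumes, purely for illustration, that each of the four second-messenger families is either "on" or "off", or takes one of a few discrete levels; real cells use graded, time-dependent signals, so these counts are only a lower-bound intuition. The function name is made up.

```python
from itertools import product

# The four families of second messengers named in the text.
SECOND_MESSENGERS = [
    "cyclic AMP/GMP",
    "calcium ions",
    "inositol trisphosphate",
    "diacylglycerol",
]

# Assumption: each family is simply on (True) or off (False).
states = list(product([False, True], repeat=len(SECOND_MESSENGERS)))
print(len(states))  # 16


def count_states(n_messengers, n_levels):
    """Number of distinct combinations when each messenger has n_levels discrete levels."""
    return n_levels ** n_messengers


print(count_states(4, 2))  # 16
print(count_states(4, 5))  # 625
```

Even with these crude assumptions the number of distinguishable internal states grows quickly, which is the sense in which a small number of second messengers can address a large number of genetic responses.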

The Histone Code

The classic double helix described by Watson and Crick has a width of 2 nm (two millionths of a millimeter), but in eukaryotes many segments of this filament are folded around groups of eight histone proteins and form blocks, called nucleosomes, that give to the filament a ‘beads-on-a-string’ appearance. This string, called chromatin, is almost six times thicker than the double helix and is further folded into spirals of nucleosome groups, called solenoids, that arrange it in fibers of increasing thickness and ultimately into the 600 nm fiber of the chromosome. These multiple foldings allow the eukaryotic cells to pack their long chromosomes into the tiny space of their nuclei, and for this reason it was initially assumed that the histones have a purely packaging role. The experimental data, however, have shown that the ‘tails’ of the histones (the parts that protrude from the surface of the nucleosomes) are subject to a wide variety of post-translational modifications (in particular acetylation, methylation and phosphorylation) that have highly dynamic roles and are involved in the activation or repression of gene activity. The histone tails represent about 25–30 % of the histone mass, and their posttranslational modifications can alter the chromatin either directly or indirectly. The direct modifications are those that physically open or close the molecular space (in particular the electrostatic barrier) that surrounds the genes and in this way control the transit of DNA-binding proteins. Several discoveries, however, have shown that the most frequent effects are obtained by indirect mechanisms. In these cases, the modified histone tails provide ‘marks’ on the surface of the nucleosomes that are recognized by specialized effector proteins which set in motion chains of biological reactions that eventually end in the activation or the repression of specific genes. 
A crucial breakthrough, in this field, was the discovery that the post-translational modifications of the histones do not act individually. Most of them are involved in both the activation and the repression of genes (the phosphorylation of histone H3, for example, takes part in the condensation as well as in the decondensation of chromatin), which means that the final result is due to a combination of histone marks rather than a single one. This led David Allis and colleagues to propose that the histone marks operate in combinatorial groups, like letters that are put together into the words of a molecular ‘language’ that was referred to as histone code. The same concept was independently proposed by Brian Turner who argued that there is an epigenetic code at the heart of the regulation mechanisms that are initiated by histone tail modifications. Turner pointed out that these modifications are epigenetic because they operate in addition to genetic changes, and underlined that they have both short-term and long-term effects. The short-term modifications change rapidly in response to external signals and represent a mechanism by which the genome quickly responds to the environment. The long-term modifications, instead, are those that are put in place at early stages of embryonic development and allow the transcription or the silencing of specific genes at more advanced stages. The existence of long-term effects was revealed by the discovery that many histone modifications survive the trauma of mitosis and are transmitted to the daughter cells. This is particularly important in embryonic development where the cells must perpetuate their state of differentiation into distinct tissues. The histone modifications, in other words, provide a mechanism of cell memory, in the sense that they enable the cells to ‘remember’ their specific pattern of gene expression for many generations. 
It has been shown, for example, that the expression of Hox genes in embryonic development is regulated by histone modifications. Another example of long-term effects is provided by the histone modifications that allow neural cells to generate faster action potentials the more they are used, making the transmission of action potentials increasingly easier. Today, in conclusion, a large number of data support the idea that the regulation of genetic activity by histone modifications plays a fundamental role in all eukaryotes and is based on the rules of a combinatorial code that has become known as ‘histone code’.

Is the “Histone Code” an Organic Code?

This question is the title of a paper where Stefan Kühn and Jan-Hendrik Hofmeyr described the results of a research project dedicated to find out whether or not the histone code has all the essential characteristics of an organic code. The prototype example of the genetic code shows that an organic code requires three things: 

(1) two independent molecular worlds, 
(2) a set of molecular adaptors that create a mapping between them, and 
(3) the demonstration that the mapping is arbitrary because its rules can be changed. 

Kühn and Hofmeyr tested the histone code in respect to all these points.

1. The Two Independent Worlds of the Histone Code
An organic code is a mapping between organic signs and organic meanings, and in many cases signs and meanings are both organic molecules. The genetic code, for example, is a mapping between codons and amino acids, whereas the signal transduction code is a mapping between first and second messengers. Kühn and Hofmeyr, however, pointed out that the organic meanings can be biological effects rather than molecules. In principle this may not seem an extension of the original definition because biological effects are necessarily implemented by molecules, but in practice it is a very useful generalization because there are cases in which a biological function is an experimental reality even when its molecular components are not fully known. And this is precisely the case in the histone code, where the organic signs are groups of histone modifications and the organic meanings are biological reactions that promote the activation or the repression of specific genes. The histone code, in other words, is a mapping between two independent worlds.

2. The Adaptors of the Histone Code
The effector proteins of the histone code are the molecules that establish a bridge between organic signs and organic meanings, but in order to prove that they are true adaptors it is necessary to show that they operate independently on signs and meanings. Kühn and Hofmeyr underlined that this is precisely what happens because the effector proteins have two distinct domains: one that recognizes histone modifications and a different type that initiates biological reactions. It has been shown, for example, that the acetylated lysines are specifically recognized only by the bromodomains of the effector proteins. The methylated amino acids are recognized by a greater variety of domains but again each recognition step is absolutely specific. The effector proteins, in other words, perform two independent recognition processes on signs and meanings and are therefore true adaptors.

3. The Arbitrariness of the Histone Code
An organic code is arbitrary when its rules are not dictated by physical necessity and in this case it must be possible, at least in principle, to exchange the part of an adaptor that recognizes an organic sign with a different one and show that the modified adaptor associates the old organic meaning to the new sign. Kühn and Hofmeyr noticed that the experimental data support this possibility because there is evidence that the chromodomains of the effector proteins can be interchanged. The histone code, in conclusion, did pass the three tests and Kühn and Hofmeyr ended their paper with these words: “Although we probably do not yet know the complete histone code, we have more than enough information to be able to recognize the histone code as a bona fide organic code.”

Nucleosome Positioning Codes 1

DNA molecules are much longer than the cells that contain them. This requires their compaction, which also introduces an opportunity: the regulation of transcription through differentiated DNA packaging. In eukaryotes DNA molecules can guide their own packaging into nucleosomes by having the desired mechanical properties (stiffness and intrinsic curvature) written into their base-pair (bp) sequence. This has been referred to as the “nucleosome positioning code”. Nucleosomes are the fundamental packaging units of eukaryotic DNA, where 147 bp are wrapped in a 1 3/4 left-handed superhelical turn around an octamer of histone proteins. As the DNA is strongly deformed when wrapped around the histones, sequence-dependent geometrical and mechanical properties could, at least locally, overrule other effects that also influence nucleosome positioning, like the presence of proteins that compete for the same DNA stretch or the action of chromatin remodellers.
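The geometry quoted above implies a simple figure: 147 bp spread over 1 3/4 superhelical turns comes to 84 bp per turn around the histone octamer. A short worked calculation, using only the two numbers given in the text (the variable names are made up):

```python
WRAPPED_BP = 147            # base pairs wrapped around one histone octamer
SUPERHELICAL_TURNS = 1.75   # 1 3/4 left-handed superhelical turns

# Base pairs needed for one full superhelical turn around the octamer.
bp_per_turn = WRAPPED_BP / SUPERHELICAL_TURNS
print(bp_per_turn)  # 84.0
```

This is the superhelical path of the DNA around the histones and should not be confused with the much shorter helical repeat of the double helix itself.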


The Tubulin Code

Tubulin is the major component of the microtubules, the filaments that form an internal scaffolding in all eukaryotic cells and give origin to organelles such as cilia, centrioles, basal bodies and the mitotic spindle. Most microtubules are in a state of rapid turnover by dynamic instability and alternate very quickly between growth and shrinkage. Within the cell, however, there is also a population of microtubules that are relatively stable, in the sense that their turnover is measured in hours rather than minutes. The function of the stable microtubules is still not completely known, but there are clear indications that they are involved in the morphogenesis of the eukaryotic cell. What is certain, is that the stable microtubules undergo a variety of post-translational modifications (PTMs) that have been strongly conserved  because they are found in all eukaryotic taxa. These PTMs consist in processes like acetylation, phosphorylation, polyglutamylation, polyglycylation, detyrosination, and palmitoylation that act preferentially on stable microtubules. They have been studied with various tests on purified tubulin, but the experiments have failed to detect any direct effect of the PTMs on the dynamics of the microtubules. This means that PTMs do not act by changing directly the intrinsic properties of the microtubules, but rather by providing combinatorial signals for the recruitment of proteins that interact with the microtubules. Different combinations of PTMs, in other words, act like signposts that specify the properties that stable microtubules are going to have in different regions of the cell or in different periods of the cell cycle. To this set of signposts that operate on stable microtubules, Kristen Verhey  and Jacek Gaertig  gave the name of Tubulin code. Any organic code, as we have seen, requires molecules that act like adaptors between two different domains. 
Verhey and Gaertig have called these molecules ‘interpreters’, and have identified three major classes of microtubule binding proteins that can be considered interpreters of the tubulin code:

 “First, microtubule associated proteins (MAPs) such as Tau, MAP1 and MAP2 that bind statically along the length of microtubules. 
Second, plus-end tracking proteins (+TIPs) that bind in a transient manner to the plus-ends of growing microtubules. 
And third, molecular motors that use the energy of ATP hydrolysis to carry cargoes along microtubule tracks.” 

Verhey and Gaertig have also called attention to a unique characteristic of the tubulin code. Many epigenetic modifications are transmitted from one generation to the next, but this does not usually happen in the tubulin world: “Some microtubule-based organelles (e.g., centrosomes and basal bodies) are inherited by a template-driven mechanism but there is no evidence that the template organelle directly influences the PTM pattern in the new organelle. Rather, the PTM pattern is recreated in the newly formed organelle in a gradual manner ... Other microtubule-based structures, such as cytoplasmic microtubules, the mitotic spindle and cilia, are formed de novo mostly, if not entirely, from unmodified tubulin heterodimers. Thus, in case of both template-dependent and template-independent microtubular structures, PTM patterns are probably recreated without a direct influence of preexisting PTMs.” The existence of the tubulin code, in conclusion, is based on sound experimental evidence but the actual deciphering of its rules is still at a preliminary stage and requires a detailed understanding of how the PTMs influence the recruitment of proteins and regulate the functions of the stable microtubules.

The Sugar Code


For a long time, sugars have been regarded as molecules that provide energy (mostly in the form of glucose and glycogen) or structural support (like cellulose in plants), but molecular biology has shown that they also have a third outstanding function: by binding to proteins they generate glycoproteins, molecules that take part in countless communication processes in and between cells. The addition of sugars to proteins is a post-translational modification, called glycosylation, that greatly expands the potentialities of many protein families and gives rise to glycoproteins that perform a wide variety of functions. Some operate on the cell membrane and act as antennae for receiving molecular signals or as docking sites for importing compounds. Other glycoproteins take part in cell-to-cell interactions, for example in sperm-oocyte attachment, in bacteria-to-cell relationships and in the aggregation of platelets. A third family operates in the immune system where glycoproteins interact with antigens, recognize white blood cells, and take part in the major histocompatibility complex (MHC). Yet another family is that of the glycoproteins that act as hormones, like human chorionic gonadotropin (HCG), thyroid-stimulating hormone (TSH) and erythropoietin (EPO). Then there are glycoproteins that have protective functions (mucins), some that are involved in transport (transferrin) and others that act as enzymes (alkaline phosphatase). The key point in these interactions is that in most cases it is the sugar component that determines the recognition ability of the glycoproteins. This point has been particularly underlined by Winterburn and Phelps (1972), who convincingly argued that “the significance of the glycosyl residues is to impart a discrete recognitional role on the protein”. Sugars, in other words, are carriers of information because their sequences have specific biological functions, and yet the information they carry is only partially contained in the genome.
In most cases it is due to subtle epigenetic modifications in the terminals of the sugar antennae. It has been found, furthermore, that sugars have a capacity to store information that is many orders of magnitude higher than that of nucleotides and amino acids. This makes us realize that, after nucleotides and amino acids, sugars are a third great family of informational molecules, but how do they transmit their messages to the other components of the cell? The key discovery, on this point, is that the functions that are associated with sugars are not set in motion by the sugars. In most cases, they are set in motion by proteins that interact with the sugars and recognize the specific role that they have in any given set of circumstances. These sugar-binding proteins became popular in the early 1900s mainly because they served to determine the chemical structure of the ABO blood groups and were originally called agglutinins. In 1954, however, Boyd argued that they should be given a new name that reflects the unique function that they actually perform, i.e., the highly specific selection of carbohydrates. To this purpose he proposed to call them lectins, on the ground that this term derives “from the Latin lectus, the past participle of legere meaning to pick, choose or select” (Boyd 1954). The next step in the discovery of the informational properties of the sugars was the recognition, by Hans-Joachim Gabius, that their messages must be decoded in order to have biological effects, and that lectins are the decoding devices in this process. Gabius, in other words, realized that lectins are adaptors, molecules that act as intermediaries between sugars and biological reactions and establish connections between them that are not determined by physical necessity. This is why he proposed that there is a Sugar code at the basis of the communication processes that involve sugars, and that “lectins are the translators of the Sugar code”.
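
The claim that sugars can store far more information than nucleotides or amino acids can be made concrete with a toy count. The sketch below is only an illustration under loudly stated assumptions: it counts linear chains only, assumes a hypothetical alphabet of 20 monosaccharides, and assumes that each glycosidic bond can use one of 4 hydroxyl positions in one of 2 anomeric configurations. These numbers are chosen for illustration and are not taken from the text above; real glycans can also branch, which this model ignores.

```python
def linear_sequences(alphabet_size, length):
    """Number of distinct linear chains when only the order of units matters."""
    return alphabet_size ** length


def toy_oligosaccharides(n_monomers, length, linkage_positions=4, anomers=2):
    """Toy count of linear oligosaccharides (illustrative assumptions only).

    Besides the order of the sugar units, each of the (length - 1) bonds
    between neighbouring units is assumed to have
    linkage_positions * anomers possible forms. Branching is ignored.
    """
    bond_variants = linkage_positions * anomers
    return n_monomers ** length * bond_variants ** (length - 1)


hexanucleotides = linear_sequences(4, 6)     # 4 bases, 6 positions
hexapeptides = linear_sequences(20, 6)       # 20 amino acids, 6 positions
hexasaccharides = toy_oligosaccharides(20, 6)

print(hexanucleotides)   # 4096
print(hexapeptides)      # 64000000
print(hexasaccharides)   # 2097152000000
```

Even in this simplified model, the extra freedom in how each pair of sugar units is linked multiplies the number of possible hexamers by several orders of magnitude compared with hexapeptides built from an alphabet of the same size.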



Last edited by Admin on Fri Apr 21, 2017 4:06 pm; edited 2 times in total


14 The Glycomic Code on Thu Apr 28, 2016 7:23 am

Admin


The Glycomic Code

Not only the universe but biological systems as well are fine-tuned on a razor's edge. Some things are easily overlooked, yet they determine whether advanced life can arise on planet Earth. Who could imagine that the structure of plant cell walls requires complex coded information, and that their assembly into special complex matrix structures is controlled by arbitrary rules, so that microorganisms are prevented from entering plant cells and destroying them? If that were not the case, we would not be here...

http://reasonandscience.heavenforum.org/t2213-the-various-codes-in-the-cell#4851

An extracellular matrix called the cell wall surrounds all plant cells, and one of its most common components is cellulose, a polymer formed by long chains of glucose that bind to each other with such great affinity that most of the water is excluded from their surface. The result is a structure that is very hard to hydrate and to break. Cell walls, however, are not made of cellulose only. There are other polymers that occur in significant amounts and in most cases they are similar to cellulose in structure, but are branched in more complex ways. Because of their similarity to cellulose, these branched polysaccharides have been called hemicelluloses. They surround the cellulose microfibrils and interact with each other with non-covalent bonds in such a complex way that the hemicelluloses are even harder to disassemble. On top of that, there is an even higher level of complexity: the cellulose-hemicellulose domain is embedded in a matrix of pectins (polysaccharides with very complex chemical structure) which forms a jelly-like structure that retains water and at the same time further reduces the pores of the cell wall. The complexity is probably due to the fact that the cell walls, in addition to controlling the expansion and the growth of the plant cells, must also form a barrier that prevents, or makes it extremely difficult for, microorganisms to enter the cell cytoplasm. When microorganisms invade a plant cell, it may seem that all they need to gain access is a few enzymes that degrade pectins, hemicellulose and cellulose, but that is not the case. In fact, if microorganisms could easily enter plant cells, most plants would not survive and life on Earth would not exist in its present form. So, how did plants manage to defend themselves?
The second most abundant polysaccharide on Earth after cellulose is xyloglucan, and by using enzymes such as cellulases, researchers could study whether the oligosaccharides found in xyloglucan were arranged randomly or not. A first answer to this question came from the discovery by Buckeridge et al. of a new xyloglucan polymer that contains two families of oligosaccharides, one with four and the other with five glucoses (tetramers and pentamers).


There are regularities in the tetramers and pentamers of the xyloglucan molecules. This was probably the first proof that the constitutive blocks of xyloglucan are not arranged randomly. After that finding, Marcos Buckeridge and Amanda De Souza performed experiments on a large number of hemicelluloses and found that some enzymes have higher specificity for certain regions in all branched polymers, which implies that their molecules too are non-randomly organized. These regularities in hemicelluloses suggested that their assembly is controlled by rules, and the fact that they are the result of contingent developments indicated that the rules are arbitrary. This is why the authors proposed that there is a glycomic code in plant cell wall hemicelluloses. A consequence of this proposal is the idea that plants have an increasingly complex system of coding rules for the assembly of hemicelluloses in order to keep at bay the invading organisms by forcing them to develop an increasingly high number of specific enzymes. As a result, only a few microorganisms managed to find the key to enter any given plant cell. The constraints imposed by the glycomic code on plant cell walls are so severe that many organisms, including us, have digestive systems that are totally dependent on cell walls (familiarly known as food fibers).

Furthermore, plants themselves hardly degrade their own walls. It is true that in a forest cell walls are eventually degraded, but this is achieved by communities of microorganisms and never (or rarely) by a single species. If the glycomic code of plant cell walls did not exist, in conclusion, we would probably not be here because plants would be utterly different from what they presently are.

Breaking the “Glycomic Code” of Cell Wall Polysaccharides May Improve Second-Generation Bioenergy Production from Biomass 2

Plant cell walls display a highly complex organization that confers resistance (recalcitrance) to enzymatic hydrolysis. This poses a barrier  due to the difficulty of enzymes in accessing wall polymers. Here, we examine the fine structure of some of the main cell wall hemicelluloses and present some evidences that lend support to the idea of a glycomic code, which can be defined as the diversity of encrypted results of the biosynthetic mechanisms of plant cell wall polysaccharides that give rise to fine-structural domains containing information in polysaccharides. These are responsible for the formation of polymer composites with different levels of polymer-polymer interactions and recalcitrance to hydrolysis. Polysaccharide motifs that are recalcitrant to hydrolysis are here called pointrons, and the ones that are available to enzyme attack are named pexons. From the biotechnological viewpoint, the understanding of the glycomic code will require further identification of pointrons and possibly the transformation of them into pexons so that walls would become suitable to hydrolysis. 

Do plant cell walls have a code? 3

A code is a set of rules that establish correspondence between two worlds, signs (consisting of encrypted information) and meaning (of the decrypted message). A third element, the adaptor, connects both worlds, assigning meaning to a code. We propose that a Glycomic Code exists in plant cell walls where signs are represented by monosaccharides and phenylpropanoids and meaning is cell wall architecture with its highly complex association of polymers. Cell wall biosynthetic mechanisms, structure, architecture and properties are addressed according to Code Biology perspective, focusing on how they oppose to cell wall deconstruction. Cell wall hydrolysis is mainly focused as a mechanism of decryption of the Glycomic Code. Evidence for encoded information in cell wall polymers fine structure is highlighted and the implications of the existence of the Glycomic Code are discussed. Aspects related to fine structure are responsible for polysaccharide packing and polymer-polymer interactions, affecting the final cell wall architecture. The question whether polymers assembly within a wall display similar properties as other biological macromolecules (i.e. proteins, DNA, histones) is addressed, i.e. do they display a code?

The Hox Codes

Code biology, Barbieri, page 107

In 1979, David Elder proposed a model that was capable of accounting for the regularities that exist in the bodies of many segmented worms (annelids). The segments of these animals are often subdivided into annuli whose number varies according to a simple rule: if a segment contains n annuli, the following segment contains either the same number n (repetition) or n plus or minus 1 (digital modification). Elder noticed that this type of rule is known to the designers of electronic circuits as a Gray code, a code that is binary (because it employs circuits that have only one of two states), combinatorial (because its outcomes are obtained by combinations of circuits) and progressive (because consecutive outcomes must be coded by combinations that differ in the state of one circuit only). The results obtained with these rules describe with great accuracy what is observed in segmented worms, and Elder proposed therefore that the body plan of these animals is based on a combinatorial code that is a biological equivalent of the Gray code. He underlined in particular that the coding principle cannot be the classical “one gene-one pattern”, but “one combination of genes-one pattern”, and for this reason he called it epigenetic code (Elder 1979). After the discovery of the Hox genes, it became increasingly clear that they are used in many different permutations, according to a combinatorial set of rules that became known as Hox code. The term Hox code was introduced independently by Paul Hunt and colleagues (1991) and by Kessel and Gruss (1991) to account for the finding that the individual characteristics of the vertebrae are determined by different combinations of Hox genes. Later on, it was found that this is true in most other organs and it became standard practice to refer to any combination of Hox genes as a Hox code.
The epigenetic code proposed by Elder, in particular, is a Hox code because it is Hox genes that are responsible for the body plan of the segmented worms. It must be underlined that the Hox genes can be used in different combinations not only in various parts of a body, but also in different stages of embryonic development. At the phylotypic stage, for example, the Hox genes specify characteristics of the phylum, whereas in later stages they determine characteristics at lower levels of organization. There is, in short, a hierarchy of Hox gene expressions, and therefore a hierarchy of Hox codes. At this point, however, we have to face a key definition problem: is it legitimate to say that the Hox codes are true organic codes? More precisely, that they have the basic features that we find, for example, in the genetic code? An organic code is a mapping between two independent worlds and cannot exist without a set of adaptors that physically realize the mapping. The Hox codes have been defined instead as patterns of combinatorial gene expression and do not require adaptors because a molecular pattern in one world is not a mapping between two independent worlds. We have therefore two different definitions of code, one based on mapping and the other on patterns, or sequences, and it is important to keep them separate because they have different biological implications.
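
The Gray code that Elder compared the annelid pattern to has one defining property: consecutive codewords differ in the state of exactly one circuit. A minimal Python sketch of the standard binary reflected Gray code illustrates that "progressive" property. The function names are illustrative, and the sketch shows only the electronic code, not Elder's biological model.

```python
def gray_code(n_bits):
    """Return the 2**n_bits codewords of the binary reflected Gray code."""
    # Standard construction: the i-th codeword is i XOR (i shifted right by 1).
    return [i ^ (i >> 1) for i in range(2 ** n_bits)]


def differing_bits(a, b):
    """Count the bit positions (circuits) in which two codewords differ."""
    return bin(a ^ b).count("1")


codes = gray_code(3)
labels = [format(c, "03b") for c in codes]
print(" ".join(labels))
# 000 001 011 010 110 111 101 100

# Progressive property: each step flips exactly one circuit.
steps = [differing_bits(a, b) for a, b in zip(codes, codes[1:])]
print(steps)
# [1, 1, 1, 1, 1, 1, 1]
```

Each codeword is a combination of three two-state circuits (binary and combinatorial), and moving from one codeword to the next changes a single circuit, which is the feature Elder likened to the step-by-step change in the number of annuli between neighbouring segments.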

2) http://link.springer.com/article/10.1007%2Fs12155-014-9460-6#/page-1
3) http://www.ncbi.nlm.nih.gov/pubmed/26706079
