Showing posts with label sine. Show all posts

Tuesday, March 21, 2023

Tolerating Your Non-self!

Immune cells get comfortable with cancer
Courtesy https://deepai.org

A hallmark of cancer, autoimmunity and other diseases is the aberrant transcription of typically silenced, repetitive genetic elements that mimic Pathogen-Associated Molecular Patterns (PAMPs). These bind Pattern Recognition Receptors (PRRs), triggering the innate immune system and inflammation. Unrestrained, this 'viral mimicry' activates a generally conserved mechanism that, under restraint, supports homeostasis. These repetitive viral DNA sequences normally act as a quality control over genomic dysregulation, responding in ways that preferentially promote immune conditions for stability. If restraint fails and the 'viral mimicry' is transcribed, it may result in undesirable immune reactions that disrupt the homeostasis of cells.

Mitochondrial DNA (mtDNA) is one source of cytosolic double-stranded RNA (dsRNA) that is commonly present in cells. Trp53 mutant mouse embryonic fibroblasts (MEFs) contain innate immune-stimulating endogenous dsRNA from mtDNA that mimics PAMPs. The immune response, via RIG-I-like PRRs, leads to expression of type I interferon (IFN) and proinflammatory cytokine genes. Further, Natural Killer cells also produce a multitude of cytokines that can promote or dampen an immune response. Wild-type p53 suppresses viral repeats and contributes to innate immunity by enhancing IFN-dependent antiviral activity, independent of its proapoptotic and tumor suppressor functions.

Post-translationally modified p53, located in the cytoplasm, enhances the permeability of the mitochondrial outer membrane, thus stimulating apoptosis. However, treating Trp53 mutant MEFs with a DNA demethylating agent caused a large increase in the level of transcripts encoding short interspersed nuclear elements and other species of noncoding RNAs, which generated a strong type I IFN response. This did not occur in p53 wild-type MEFs. Thus it appears that another function of p53 is to silence repeats that can accidentally induce an immune response.

This has several implications for how we understand self versus non-self discrimination. When pathogen-associated features were quantified, specific repeats in the genome not only displayed PAMPs capable of stimulating PRRs but, in some instances, had seemingly maintained such features under selection. For organisms with a high degree of epigenetic regulation and chromosomal organization, immunostimulatory repeats can act as a danger signal, as with repeats released after p53 mutations. Here, immune stimulation may act as a back-up for the failure, due to mutation, of other p53 functions such as apoptosis or senescence. This supports the hypothesis that specific repeats gained favor by maintaining non-self PAMPs to act as sensors for loss of heterochromatin, an epigenetic checkpoint of quality control that avoids genome instability generally.

When p53 is mutated, it begins to fail in its suppression of viral repeats, which enables 'viral mimicry' and aberrant immune reactions. These may promote survival of cells that can leverage immunity, and promote angiogenesis and heightened proliferation of cancers or other diseases under modified conditions for non-self tolerance.



Thursday, May 13, 2021

Non-Coding DNA Key Sequences

DNA Structural Inherency

Wind two strands of elastic together: eventually they will knot, and ultimately the pair will double up on itself. Separate the strands. From the point of unwinding, forces will be directed to different regions and the separation will approximately return to the wound state of the band. Do the same with each of 10 different bands or strings of any type and they will all behave in much the same way. For a given section of DNA being transcribed, the effect of separation will be much the same. For a given gene, there will be sequences that can tolerate force to greater or lesser degrees. For different transcripts of a gene, variation at those sequences may be crucial to the integrity of the transcription machinery that separates DNA strands to initiate transcription to RNA, and to the outcome.

Cellular biology is enormously complex in all regards. The physics of molecular interaction, fluid dynamics, and chemistry combine in a system where cause and effect are nearly impossible to predict. At the most elementary level we hypothesize that some non-coding DNA (ncDNA) possesses structural inherencies that can be deployed to direct gene proteins and cell function for diagnosis or therapy.

Coding DNA and its regulatory, non-coding complement are transcribed together from a gene and then spliced. Transcription to RNA, edited mRNA, spliced non-coding RNA and ultimately mRNA translation to protein can produce wide-ranging, variable outcomes that may not be recaptured experimentally.

A single nucleotide polymorphism (SNP) or a combination of SNPs within a gene may affect the finely tuned balance that results. Under different environmental conditions this could be material to the protein produced. Additionally, other mutations of the gene could add complexity to the environment and/or the resulting protein translation.

At this level of cellular biology, genetic DNA stores instructions for protein assemblies to produce the new protein required for a fully functional cell. However, DNA's stored mutations can lead to different functional or non-functional versions of a protein, depending on many different factors. Relationships between ncDNA, including its mutations, and each transcript's edited, protein-coding mRNA may represent unexplored inherencies that can regulate the gene's mRNA or translated protein.

We built an algorithm to compare, in detail, the ncDNA sequences of multiple protein-coding transcripts of the same gene. For each transcript it steps through every variable-length ncDNA sequence (kmer), specifically within intron 1, computes a signature for each and indexes it to the constant of the transcript's mRNA signature. At each step these signatures order the kmers across the transcripts. The order is represented as a vector of all the transcripts being compared.

Over millions of successive steps (depending on total intron 1 lengths), transcripts mostly retain their vector ordering except, as expected, at a change in kmer length. Mostly the transcript order in the vector does not change; occasionally a few positions change; very rarely do all positions change. Position changes that merely cause another, like a domino effect, are filtered out. For the rarest position changes at a step, we look to the root causes in the kmer (sequence). We call this a Key Sequence because it is identified by the significance of changes to transcript positions in the vector compared to the vector at the next step.

Therefore, Key Sequences cause the most position changes between the transcripts being compared by the algorithm. This relative measure is step dependent, and Key Sequences are discovered by comparing transcript positions in the vector with those at the next step location. Logically, this implies a gene's structural inherency discovered through ncDNA Key Sequence relationships to mRNA and to other transcripts, or else an error in gene alignments, sequenced reads or the algorithm.
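The stepping and vector comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the algorithm itself: the post does not define the signature, so GC fraction stands in for it; "indexing" to the mRNA signature is assumed to be a simple subtraction; a single fixed kmer length k is used; and the domino-effect filter is left out. All function names and the toy sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def signature(seq: str) -> float:
    """Hypothetical signature: GC fraction (the post does not define its signature)."""
    return sum(base in "GC" for base in seq) / len(seq)


def order_vector(introns: Dict[str, str], mrnas: Dict[str, str],
                 start: int, k: int) -> Tuple[str, ...]:
    """Order transcripts by kmer signature relative to each transcript's mRNA signature."""
    scores = {
        name: signature(intron[start:start + k]) - signature(mrnas[name])
        for name, intron in introns.items()
    }
    # Ties are broken by transcript name so the ordering is deterministic.
    return tuple(sorted(scores, key=lambda name: (scores[name], name)))


def candidate_steps(introns: Dict[str, str], mrnas: Dict[str, str],
                    k: int, min_changes: int) -> List[Tuple[int, int]]:
    """Return (step, positions_changed) for steps whose vector differs from the next step's."""
    # Only step over start positions that exist in every transcript's intron 1.
    last_start = min(len(s) for s in introns.values()) - k
    vectors = [order_vector(introns, mrnas, s, k) for s in range(last_start + 1)]
    hits = []
    for step in range(len(vectors) - 1):
        changed = sum(a != b for a, b in zip(vectors[step], vectors[step + 1]))
        if changed >= min_changes:
            hits.append((step, changed))
    return hits


# Toy intron 1 and mRNA sequences for three made-up transcripts.
introns = {"T1": "AAAAGGGG", "T2": "GGGGAAAA", "T3": "ACACACAC"}
mrnas = {"T1": "ACGT", "T2": "ACGT", "T3": "ACGT"}
hits = candidate_steps(introns, mrnas, k=4, min_changes=3)
```

With these toy inputs only one step changes every position in the vector, so the kmer at that step would be the candidate to inspect. Repeating the scan over a range of k values would approximate the variable-length stepping.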

In assay testing we were able to predict and synthesize non-coding RNA Key Sequences that significantly reduced proliferation of HeLa cells. In our pre-clinical work, based on comparisons to transcripts of TP53, we will be predicting the efficacy of cell and tissue selections that educate and activate Natural Killer cells.

If Key Sequences are inherent they could open a new frontier for diagnosis and therapy.








Saturday, February 13, 2021

Cells with an Index like Google?

It's been a while since I last wrote about DNA repeats or their RNA descendants. In that time advanced research has emerged relating repeats to increasing numbers of viral or other diseases. Generally the repeats of interest here can be either long or short sequences of nucleotides that form part of an unspliced gene. Logically, counts of long sequences that repeat would be lower than counts of short sequences, but when normalized to their respective nucleotide lengths the indexed results can shift the relative order of repeating sequences quite dramatically.
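A small sketch of how length normalization can reorder repeats. The normalization used here, count multiplied by motif length and divided by sequence length (the fraction of nucleotides covered), is an assumption chosen for illustration, as are the function names and the toy sequence.

```python
from typing import Dict, Iterable


def count_occurrences(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def length_weighted_index(seq: str, motifs: Iterable[str]) -> Dict[str, float]:
    """Assumed normalization: count * motif length / total sequence length."""
    return {m: count_occurrences(seq, m) * len(m) / len(seq) for m in motifs}


# Toy sequence: a long motif repeated twice and a short motif repeated three times.
seq = "GATTACATT" * 2 + "CGCGCG"
motifs = ["CG", "GATTACA"]
raw = {m: count_occurrences(seq, m) for m in motifs}
weighted = length_weighted_index(seq, motifs)
```

By raw count the short motif ranks first, while by the length-weighted index the long motif ranks first, which is the kind of shift in relative order described above.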

In most knowledge systems, repeats in low-level data present redundancy and an opportunity to improve efficacy in local or global upstream processes acting on that data. We see this in the structure of efficient alphabets, which had a significant impact on whether or not a language survived continuous use. Why use ten words when precise meaning, including abstracts, can be derived from three? Or why alpha when, at least for some period in the language's history, alphanumeric made it more effective?

Search engines reduce their primary index to the least redundant data set, which is used to drive efficient data access by upstream requests and processes to satisfy any query. However, at the storage level, data redundancy is permitted because energy efficiency is gained. Similarly, genetic DNA is massively redundant. Redundant data stores can make highly indexed systems more efficient because frequently accessed data elements are more accessible at multiple locations and parallel processes can more efficiently satisfy upstream requests.

Repetitive sequences constitute 50%–70% of the human genome. Some of these can transpose positions; these transposable elements (TEs) are DNA transposons and retrotransposons. The latter are predominant in most mammals and can be further divided into long terminal repeat (LTR)-containing endogenous retrovirus transposons and non-LTR transposons, including short interspersed nuclear elements (SINEs) and long interspersed nuclear elements (LINEs). The most abundant subclass of SINEs in humans comprises primate-specific Alu elements, which are more abundant in GC-rich DNA. Humans have up to 1.4 million copies of these repeats, which constitute about 10.6% of the genomic DNA. Long interspersed element-1 (LINE1 or L1) repeats are abundant in AT-rich DNA, constitute 19% of the human genome and make up the largest proportion of transposable element-derived sequences.

Most TE classes are primarily involved in reduced gene expression, but Alu elements are associated with upregulated gene expression. Intronic Alu elements are capable of generating alternative splice variants in protein-coding genes, which illustrates how Alu elements can alter protein function or gene expression levels. Non-coding regions were found to have a high density of TEs within regulatory sequences, most notably in repressors. TEs have a global impact on gene regulation, which indicates a significant association between repetitive elements and gene regulation.

In liquid systems, phase separation is one of the most fundamental phase transition phenomena and is ubiquitous in nature. De-mixing of oil and water in salad dressing is a typical example. The discovery of biological phase separation in living cells led to the finding that phase-separation dynamics are controlled by mechanical relaxation of the network-forming dense phase, where the limiting process is permeation flow of the solvent for colloidal suspensions and heat transport for pure fluids. Applying this universal governing law is a step toward understanding and defining the liquid, biological equivalent of indexing in data-processing systems and of inherent genetic redundancy.

Repeats have been widely implicated. In plant immunity, a TE has been domesticated through histone marks and the generation of alternative mRNA isoforms, both of which were directly linked to the immune response to a particular pathogen. p53 transcription sites evolved through epigenetic methylation, deamination and histone regulation, which constituted a universal mechanism found to generate various transcription-factor binding sites in short TEs or Alu repeats. In disease, cytoplasmic synthesis of Alu cDNA was implicated in age-related macular degeneration, and there is a transient increase of nearly 20-fold in the levels of Alu RNA during stress, viral infection and cancer.

In chromosomal DNA, each sequence, relative to its length, may conveniently describe a phase-separated indexed location and method for discovery. Repeats within genetic DNA may present precisely sensitive phase-separated guidance to drive histone, epigenetic and transcription factors to specific genetic locations at the cell's 'end-of-line', from where the genetic response to upstream membrane-bound changes begins.