DNA is a chain of single nucleotides strung along a sugar-phosphate spine that winds around proteins called histones and condenses into a chromosome assembly. At specific 'gene' locations, DNA is often unwound and transcribed into smaller, related RNA strings. Clustered proteins can take up these RNA strings to attract and assemble amino acid combinations that may fold into functional proteins. In aqueous solution, proteins aggregate into complex units and interact with DNA, RNA, amino acids and other proteins to build life on planet Earth.
Entropy can disrupt the order of liquid-liquid phase separation (LLP) and other density-based separations that govern events affecting DNA and are central to cellular biophysics. Since the discovery of DNA in 1869 and of its double helix structure in 1953, research has been directed at deciphering the vast string assemblies of billions of ordered nucleotide combinations that govern the cells of different species. Recent research has described more elegantly how ordered, short, repeating DNA sequences govern cellular mechanics, and it provides insight into the delicate balance of aqueous separations.
In cells that divide and replicate, chromosomes are tethered at the centromere, a region containing concentrated, short, ordered DNA combinations repeated over extended distances along the sugar-phosphate spine. These repeats attract proteins and other epigenetic factors that may direct the cell's centrosome, a protein tube geared to a vast cytoskeletal spindle, to move chromosomes and the cell's skeletal structure in response to activity at the centromere and in distant regions.
Intron regions of genes are considered regulatory because only exons, the coding regions of DNA, are translated into amino acid combinations for protein once they have been transcribed into RNA. The intron regions of yeast centromeres were found to promote formation of centromeric heterochromatin, which is DNA wound around histones and methylated to repress regions and maintain lineage during replication.
A study of centromere heterochromatin surprisingly showed that distant euchromatic regions enriched in repressed, methylated genes also interacted with the hierarchical organization of centromeric DNA. These 3D spatial interactions are likely mediated by LLP (similar to how oil and vinegar separate in salad dressing), resulting in liquid-like fusion events, and can influence the fitness of individuals. The repressed genes were identified as Transposable Elements (TEs), sequences often associated with pathogenic DNA insertions that have been persistently retained.
One study found that 96.3% of TEs enriched in 156 gene bodies overlapped introns, in line with the normally observed distribution of introns and exons in the human genome. Across cells in different tissues, genes that are consistently transcribed are less likely to be associated with TEs. In tissue-specific, active regulatory regions, multiple TEs are enriched in intron enhancer sequences that attract and bind protein transcription factors, which act as master regulators of transcription.
TEs have mostly been analyzed by the frequency of short, identical repeating sequences, but these methods have not revealed the full extent of the TE repeat hierarchy. When any part of a TE is transcribed and released from its sugar-phosphate spine, the hierarchy of repeats may influence dissociation. Codondex built a uniform analytic to tease out the inherent hierarchy of repeating sequences, which may expose separation potential whether or not the DNA is classified as a TE.
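The conventional frequency-based view mentioned above can be illustrated with a minimal sketch that counts short identical repeats of one fixed length. The function name `repeated_kmers`, the window length `k` and the toy sequence are assumptions made for illustration only; this is not the Codondex analytic, which the text does not specify.

```python
from collections import Counter


def repeated_kmers(seq: str, k: int) -> dict[str, int]:
    """Count every overlapping substring of length k and keep those seen more than once.

    Hypothetical helper for illustration; not part of any published tool.
    """
    seq = seq.upper()
    # Slide a window of width k across the sequence, one nucleotide at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


if __name__ == "__main__":
    toy = "ATGATGATGCCC"  # invented toy sequence, not real genomic data
    print(repeated_kmers(toy, 3))
```

A count like this captures repeats at a single length only, which is the limitation the paragraph describes: it says nothing about how repeats of different lengths nest inside one another.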
As outlined, repressed DNA regions with more frequent repeats are less actively transcribed into RNA. Actively transcribed regions therefore yield more RNA for coding proteins, and intron RNA edited out of those transcripts can accumulate and concentrate in the liquid nucleus, be transported to the cytoplasm, or be degraded. A cell's machinery must be finely tuned to process the RNA remnants of transcription, but mutations and aberrant separations can disrupt the order of these finely tuned microscopic systems.
If repeats define a universal separation hierarchy that is heavily weighted toward regulatory introns, then de novo chromosome and gene repeat analysis may identify distant and centromeric influences on the centrosome. The iScore(TM) algorithm repeatedly explodes any DNA or RNA string into its ordered, theoretical hierarchy of repeats, down to the smallest required string length, and may provide a structural basis for liquid separations. A repeat hierarchy for any gene would also have to relate to the repeats of its chromosome to exert an inherent, universal influence over 3D spatial interaction and, potentially, cell function.
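To make the idea of exploding a string into an ordered hierarchy more concrete, here is a minimal sketch under one loud assumption: that the hierarchy consists of every contiguous substring, grouped by length from the full string down to a minimum length. The names `repeat_hierarchy`, `total_strings` and `min_len` are hypothetical, and the actual iScore procedure is not described in enough detail here to claim this matches it.

```python
from collections import Counter


def repeat_hierarchy(seq: str, min_len: int = 2) -> dict[int, Counter]:
    """Map each substring length, longest first, to counts of the substrings of that length.

    Assumption: the hierarchy is every contiguous substring, grouped by length.
    """
    seq = seq.upper()
    hierarchy: dict[int, Counter] = {}
    # Walk from the whole string down to the smallest required length.
    for length in range(len(seq), min_len - 1, -1):
        hierarchy[length] = Counter(
            seq[i:i + length] for i in range(len(seq) - length + 1)
        )
    return hierarchy


def total_strings(hierarchy: dict[int, Counter]) -> int:
    """Total number of substrings recorded across all levels."""
    return sum(sum(level.values()) for level in hierarchy.values())


if __name__ == "__main__":
    h = repeat_hierarchy("ATGATG", min_len=2)  # invented toy sequence
    for length, level in h.items():
        repeats = {s: n for s, n in level.items() if n > 1}
        print(length, sum(level.values()), repeats)
    print("total:", total_strings(h))
```

Holding every substring in memory grows quadratically with sequence length, so this form is only practical for short toy strings; a gene-scale analysis would need a streaming or indexed approach.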
The complete record of repeats for an average-length gene explodes to 100,000,000+ ordered strings, which together represent its iScore signature. If a repeat hierarchy does exist for aqueous aggregations, a gene transcript's intron iScore should be sufficient to measure its inherent repeat potential and compare it to that of other transcripts. Significant consecutive iScore variations in any of the 100,000,000+ strings could be used to expose systemic, structural separation differences for that transcript in the context of other transcripts in their aqueous environments.
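The scale of that figure can be sanity-checked with simple arithmetic, assuming (and this is only an assumption) that the record contains every contiguous substring of the sequence. A string of n nucleotides has n - L + 1 substrings of length L, so counting all lengths from a minimum m up to n gives (n - m + 1)(n - m + 2) / 2. The helper name `substring_count` is hypothetical.

```python
def substring_count(n: int, min_len: int = 1) -> int:
    """Number of contiguous substrings of length >= min_len in a string of length n.

    Assumption: every contiguous substring is counted once per position.
    """
    if min_len > n:
        return 0
    levels = n - min_len + 1  # how many distinct lengths are included
    return levels * (levels + 1) // 2


if __name__ == "__main__":
    for n in (1_000, 10_000, 14_142):
        print(n, substring_count(n))
```

Under this assumption the count passes 100 million at a little over 14,000 nucleotides, so growth is quadratic: doubling the sequence length roughly quadruples the number of strings to record.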