Since then, we have conducted numerous laboratory tests with Professor Noam Shomron at Tel Aviv University to confirm that the sequences identified by our TP53 bioinformatic analysis produce predictable, precisely directed results in cells.
Our initial results suggest an important relationship between intron1 and the sequenced proteins of the same transcripts. They also suggest that these relationships are non-random and that they can be used to identify the highly specific intron1 DNA sequences that drive this non-randomness.
We previously published the chart below, which indicates that men1 k-mers ordered into a 15-variant transcript vector produced a length bias even though our algorithm is length-agnostic. The scatter graph plots k-mers (15 variants) by intra-transcript repeats:length (horizontal axis); charting the k-mer repeats gathers them into colored vector bands.
Repeats are written in the form (length)k-mer(count). For example, the repeats for ATCG would be expressed as (4)ATCG(3), (3)ATC(1), (3)TCG(1), (2)TC(3) and (2)CG(2), so the count for (4)ATCG(3) equals the count for (2)TC(3).
Figure: 15-variant men1 intron1 transcript, k-mer repeats
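The (length)k-mer(count) notation described above can be illustrated with a minimal Python sketch. It assumes a plain count of overlapping occurrences, which is not necessarily the counting rule behind the figures quoted in the text (for instance, a plain count would not give ATC a lower count than ATCG). The toy sequence and the names `kmer_repeats` and `format_repeat` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kmer_repeats(sequence, min_len=2, max_len=4):
    """Count every overlapping k-mer of length min_len..max_len.

    Assumption: a plain overlapping count, not the authors' own rule.
    """
    counts = Counter()
    for k in range(min_len, max_len + 1):
        for i in range(len(sequence) - k + 1):
            counts[sequence[i:i + k]] += 1
    return counts


def format_repeat(kmer, count):
    """Render a k-mer in the (length)k-mer(count) notation."""
    return f"({len(kmer)}){kmer}({count})"


# Hypothetical toy sequence, not real intron1 data.
seq = "ATCGATCGATCG"
counts = kmer_repeats(seq)

print(format_repeat("ATCG", counts["ATCG"]))
print(format_repeat("TC", counts["TC"]))
```

In this toy sequence the 4-mer ATCG and the 2-mer TC occur the same number of times, which is the kind of count equality between k-mers of different lengths that the text points to.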
Relying on the unique ordering of each variant's k-mers in the transcript vector, we selected the TP53 k-mers whose vectors showed the most significant change in variant order compared with the previous vector. We found that the most disrupted vectors were caused by k-mers of very short length. By comparison, almost all vector positions in most vectors remained stable.
Figure: 12 TP53 intron1 transcripts
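The selection step described above, comparing each vector's variant order with that of the previous vector, can be sketched as follows. This is only an illustration under stated assumptions: the disruption measure (the number of vector positions holding a different variant) is an assumed stand-in for whatever measure was actually used, and the k-mers, variant labels and function names are hypothetical.

```python
def positions_changed(prev_order, order):
    """Count vector positions whose variant differs from the previous vector."""
    return sum(a != b for a, b in zip(prev_order, order))


def rank_disruptions(vectors):
    """Rank k-mers by how much their variant order differs from the previous one.

    vectors: list of (kmer, variant_order) tuples in processing order.
    Returns (kmer, changed_positions) pairs, most disrupted first.
    """
    scored = []
    for (_, prev), (kmer, cur) in zip(vectors, vectors[1:]):
        scored.append((kmer, positions_changed(prev, cur)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical data: four variants (v1..v4) and made-up k-mers and orderings.
vectors = [
    ("ATCGGA", ["v1", "v2", "v3", "v4"]),
    ("TCGGAT", ["v1", "v2", "v3", "v4"]),
    ("TC", ["v3", "v1", "v4", "v2"]),
    ("CGGATT", ["v3", "v1", "v4", "v2"]),
]

ranked = rank_disruptions(vectors)
for kmer, changed in ranked:
    print(kmer, changed)
```

A positional count is the simplest choice. A rank-correlation measure such as Kendall's tau would weight small shifts differently and may be preferable when many variants move by only one position.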