This site is currently under reconstruction.
Our e-book, Pseudocolor in Pure and Applied Mathematics, has been moved here.
Our 1990 work Using SVD on the Genetic Code is here.
Please reference anything you need from this website (in genomics) as: "The Structure of the Genetic Code", Version 1, ISBN: 978-0-9849696-1-6, by Douglas C. Youvan / firstname.lastname@example.org / 620-875-0108, which includes what is being written here. This e-book includes SAGC: Search for an Alternative Genetic Code. As our computational research advances, we will archive the current contents of this website and move on to Version 2.
YOUVAN Inc. was formed on 1/6/2015. The company writes software in Mathematica for special uses in genomics. Several algorithms have been developed as part of SAGC to detect DNA sequences that are translated by a genetic code with major alterations. We predict that if an alternative code exists and it preserves the hydropathy features of Complementary Proteins, then the most likely changes will involve the assignment of amino acid residues to the A/T pair at the codon's second position. In our "Universal" code, 2nd position A codes for hydrophilic residues while T codes for hydrophobic residues. The overall hydropathy features of the code would still be preserved if the hydrophilic / hydrophobic assignment moved from second position A/T to T/A, G/C, or C/G.
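The second-position rule for the "Universal" code can be checked directly against the standard codon table. The following is a minimal Python sketch, not the company's Mathematica code; the Kyte-Doolittle hydropathy scale and the helper name `residues_by_second_base` are assumptions introduced here for illustration only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def residues_by_second_base(base):
    """Return {amino acid: hydropathy} for sense codons whose 2nd base is `base`."""
    return {
        aa: KYTE_DOOLITTLE[aa]
        for codon, aa in CODON_TABLE.items()
        if codon[1] == base and aa != "*"
    }

if __name__ == "__main__":
    for base in BASES:
        ranked = sorted(residues_by_second_base(base).items(), key=lambda kv: kv[1])
        print(base, ranked)
```

On this scale, every residue with second position T scores positive and every residue with second position A scores negative, while the G and C columns are mixed.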
While working on these algorithms and code, YOUVAN Inc. found a shortcut solution in linear algebra for the technique of Singular Value Decomposition (SVD). This shortcut method was dubbed "Youvan's Inverse" or YI on Wikipedia, and it was found to equal the SVD solution of a rectangular matrix when the rows of the matrix are a complete set of the elements of a Tuple. The digital representation of the genetic code happens to be a Tuple of A, C, G, T.
YOUVAN Inc. is active in building distributed computers running genomics code in Mathematica. The source code is open to our clients, and it is written in an expansive style. It therefore facilitates computational experiments in genomics at a research level in both molecular biology and mathematics.
Recently, we completed a program that plots all of the hydropathy profiles from an entire bacterial genome, based on nothing more than a knowledge of the sequence of the three stop codons. The genetic code is not used. Using the 3.7M base-pair sequence of Rhodobacter capsulatus, we found that approximately half of the DNA is in runs of 800 base pairs or longer with no stop codons in either strand. Because of highly efficient algorithms and the way we have virtual memory set up on PCs, this entire process required only 3 minutes. If the code changes, such oscillations could be found in a channel other than 2nd position T-A. If stop codons change, we would search all 64 codons in pairs or triplets (or more) that yield the longest and most numerous putative complementary protein pairs. This is the second of two possible SAGC techniques. The primary technique is "Subset Homology", as described in the proposal. That code has yet to be written.
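The search for runs with no stop codons in either strand can be sketched as follows. This is a hedged Python illustration rather than the Mathematica program described above. The text does not say how reading frames on the two strands are paired, so the sketch assumes one forward frame and treats the opposite-strand codons as lying in register with it. The names `stop_free_runs` and `BLOCKING` are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

# A forward codon ends a run if it is a stop itself, or if its reverse
# complement is a stop on the opposite strand (TTA, CTA, TCA).
BLOCKING = STOPS | {reverse_complement(s) for s in STOPS}

def stop_free_runs(seq, frame=0, min_len=800):
    """Yield half-open (start, end) base intervals, read codon by codon in
    `frame`, that contain no stop codon on either strand and span at least
    `min_len` bases. Assumption: both strands are read in register."""
    seq = seq.upper()
    run_start = frame
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in BLOCKING:
            if pos - run_start >= min_len:
                yield (run_start, pos)
            run_start = pos + 3
        pos += 3
    # Close the final run at the last complete codon.
    if pos - run_start >= min_len:
        yield (run_start, pos)

if __name__ == "__main__":
    demo = "ATGGCC" + "TAA" + "GCTGCTGCT" + "TTA" + "GGG"
    print(list(stop_free_runs(demo, frame=0, min_len=6)))
```

If the stop codons themselves were to change, the `STOPS` set is the only input that would need to vary, which is why the sketch derives `BLOCKING` from it instead of hard-coding six codons.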
All three reading frames were analyzed for the 3.7M nucleotide Rhodobacter capsulatus genome using the 2nd codon position T-A method. This video was constructed with over 12,000 images, and when it is fully loaded, it will act like an oscilloscope with waveforms moving from right to left as the waveform progresses through the genome.
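One way to picture the 2nd codon position T-A method is as a sliding-window signal computed separately in each reading frame. The Python sketch below is an illustration under stated assumptions and not the program that produced the video: the +1 / -1 / 0 scoring, the window length, and the function name `second_position_profile` are all hypothetical. The `hydrophobic` and `hydrophilic` parameters allow a different channel (A/T, G/C, or C/G) to be examined.

```python
def second_position_profile(seq, frame=0, window=7,
                            hydrophobic="T", hydrophilic="A"):
    """Windowed mean of per-codon scores taken from the 2nd base only:
    +1 for the `hydrophobic` base, -1 for the `hydrophilic` base, 0 otherwise.
    Returns one value per window position (empty if too few codons)."""
    seq = seq.upper()
    scores = []
    for pos in range(frame, len(seq) - 2, 3):
        second = seq[pos + 1]
        if second == hydrophobic:
            scores.append(1)
        elif second == hydrophilic:
            scores.append(-1)
        else:
            scores.append(0)
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one codon at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile

if __name__ == "__main__":
    demo = "GTT" * 3 + "GAT" * 3
    for frame in range(3):
        print(frame, second_position_profile(demo, frame=frame, window=3))
```

Because A pairs with T, reading the opposite strand in register turns each second-position T into an A and vice versa, so the profile of the complementary strand is the sign-inverted, reversed profile of the forward strand.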
This video shows 500 frames of "Tuple Images", each a 3-tuple, just like the genetic code. In this case, we are running random tuples against themselves and then windowing particular values of RGB. An interface such as this could become the basis of a new type of nucleotide BLAST search, which will be needed in the search for alternative genetic codes.