This research applies large language models to decode and design proteins by treating amino acid sequences as a biological language. By identifying hidden structural and functional patterns across massive protein datasets, the work enables the design of novel proteins that go beyond those produced by natural evolution, with applications in medicine, cancer therapy, carbon capture, and environmental remediation.
About 8% of the human genome originates from ancient viruses. This research uses bioinformatics and evolutionary comparisons to understand why this viral DNA persists and how cells silence it through DNA methylation. Identifying how genomes distinguish useful DNA from non-functional DNA helps clarify which genetic elements matter for human health and disease.