Learning from DNA sequences: from convolutional neural networks to transformer architectures
DNA is often referred to as the instruction book of life. Whether or not one fully accepts that metaphor, DNA does contain the description of the coding genes and a multiplicity of regulatory elements, from transcription factors to miRNAs. In parallel, deep learning techniques have been developed and improved in recent years, from convolutional networks to recurrent and long short-term memory networks. The revolution in Natural Language Processing came from introducing attention mechanisms into these networks, which led to the more powerful transformer architectures.
These network types have proven particularly helpful for many tasks, including DNA-protein binding prediction, methylcytosine prediction, non-coding-function prediction, and enhancer identification, and they underlie models such as DNABERT and AlphaFold.
The talk will focus on gene expression prediction from DNA sequences, oncogenic probability prediction of gene fusion alterations, and SARS-CoV-2 variant identification from the Spike gene sequence.
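
As an illustration of how a DNA sequence can be presented to a convolutional network, the sketch below one-hot encodes a nucleotide string. The abstract does not state which input representation the talk uses, so this is only an assumption based on common practice; the helper name `one_hot_encode`, the A, C, G, T channel order, and the all-zero row for ambiguous bases are hypothetical choices made for the example.

```python
# Hypothetical helper for illustration; not taken from the talk.
ALPHABET = "ACGT"  # assumed channel order: A, C, G, T


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Symbols outside ALPHABET (for example the ambiguity code N)
    are mapped to an all-zero row.
    """
    rows = []
    for base in sequence.upper():
        row = [0.0] * len(ALPHABET)
        index = ALPHABET.find(base)  # -1 when the base is not A, C, G or T
        if index >= 0:
            row[index] = 1.0
        rows.append(row)
    return rows


encoded = one_hot_encode("ACGTN")
for base, row in zip("ACGTN", encoded):
    print(base, row)
```

The result is a matrix with one row per sequence position and four columns, a shape that a one-dimensional convolution can scan along the sequence axis.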