Please note that this conference will take place at the Fondation Victor Lyon, Cité internationale universitaire de Paris, at the “Cité Universitaire” stop (Tram/RER B).
Pangenomes capture population-level diversity as graphs that enumerate core and accessory variation, yet the explosive growth of sequencing makes these structures increasingly difficult to optimize and explore at scale. In parallel, Natural Language Processing (NLP) has transformed knowledge representation by learning from unlabelled corpora using self-supervised foundation models.
Taking a cue from this shift, we advocate learning directly from raw DNA using self-supervised objectives rather than attempting to enumerate all allelic routes. This talk distills how large language models (LLMs) work and shows how they are being adapted to genomics within a growing ecosystem of models.
We illustrate capabilities with Evo/Evo2: zero-shot scoring of variant effects and gene essentiality, and generative design of promoters, operons, and multi-gene constructs.
We then outline the challenges, biases, and next research steps needed to ensure reliability in biological applications.
After the conference, you are warmly invited to join the poster session presented by the students of the DU "Création, analyse et valorisation de données omiques", followed by a cocktail reception.
Guillaume Gautreau – Research Scientist (Chargé de Recherches), INRAE – Jouy-en-Josas