Infectious diseases account for one fourth of human deaths worldwide. With pathogen collections of ever-increasing size, it becomes possible in theory to reconstruct past epidemic events on continental or worldwide scales and to gain actionable insights into the driving forces of pathogen dispersal and evolution. Current population genetics methods excel at this task; however, their computational cost hampers their application to massive (n > 1,000) genomic datasets. Here we introduce a novel approach, ancestral state interpolation (AncSI), to reconstruct epidemics through space and time in a computationally efficient fashion. AncSI infers past information (including location, resistance or transmission success) relative to all isolates in the study population in a given time period. By computing a series of fine-grained time periods, AncSI allows epidemic dispersal to be visualized in the form of video files. We reconstruct the epidemic progression across Eurasia and Africa of two deadly bacterial pathogens, namely the Mycobacterium tuberculosis Beijing family (n = 4,000 isolates, evolving on the millennial scale) and the Salmonella Typhi H58 clone (n = 2,000 isolates, evolving on the decade scale). In both cases, AncSI-inferred epidemic dynamics exhibited a near-perfect match with the conclusions of previous studies based on hypothesis-driven population genetics analyses. Furthermore, AncSI results highlighted previously unreported features of the epidemics, such as a Korean (rather than Chinese) emergence of M. tuberculosis Beijing. Our results indicate that an accurate reconstruction of past epidemics can be obtained efficiently from genomic datasets, potentially leading to novel discoveries by leveraging the fast-growing collections of pathogen genomes.
Thesis defense of Mathieu Fauvernier, 24 September 2019 at 2 pm, thesis room, Rockfeller
Transposable elements (TEs) are now known to occupy a huge proportion of most multicellular eukaryotic genomes, yet the factors influencing their proliferation are only beginning to be investigated. In addition to giving a brief overview of their role in genome evolution, I will discuss recent work on a few important factors (sex, horizontal transfer, and transposition rates) that influence the coevolutionary dynamics between TEs and the host genomes they inhabit.
Basal-like breast cancers are among the breast cancers with the poorest prognoses, and patients do not yet benefit from any targeted therapy. We aim to identify the deregulated signaling pathways using genomic, transcriptomic and proteomic (RPPA) data in order to identify therapeutic targets. In this talk, I will focus on the analysis of SNP and CGH data. More specifically, I will discuss several statistical and algorithmic challenges directly related to their analysis.
1) Normalisation. One important issue when analyzing SNP profiles is their normalisation. Especially with tumour profiles, it cannot be assumed that most of the genome is normal, and it has been shown that not taking these genomic alterations into account while normalising leads to over-correction. We propose a method to estimate the signal (or copy number) and correct technical artefacts simultaneously.
2) Exact and fast segmentation. A CGH profile can be viewed as a succession of segments representing regions in the genome that share the same DNA copy number. Multiple-change-point detection methods constitute a natural framework for their analysis and for the detection of breakpoints. However, recovering the optimal position of the breakpoints is not an easy task, especially for large SNP profiles such as Affymetrix SNP 6.0. We propose an algorithm to quickly recover the best segmentation (the maximum likelihood estimate).
3) Assessing the quality of a segmentation. Assessing the quality of a segmentation, and in particular the confidence we have in a particular breakpoint, is a difficult problem. We propose algorithms and statistical methods to assess and take into account the quality of possible segmentations.
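To make the segmentation problem in point 2 concrete, here is a minimal sketch of the textbook dynamic-programming approach to multiple-change-point detection. It is not the speaker's algorithm: it assumes a Gaussian signal with constant variance (so the maximum likelihood segmentation is the one minimizing the within-segment squared error) and a known number of segments, and it runs in time quadratic in the profile length, which is exactly what becomes a burden on large SNP profiles. The function name `segment` and its parameters are hypothetical.

```python
from itertools import accumulate


def segment(signal, n_segments):
    """Best split of `signal` into `n_segments` contiguous pieces.

    Returns (breakpoints, cost): the start indices of segments 2..K and the
    total within-segment squared error. Illustrative O(K * n^2) dynamic
    program under an assumed constant-variance Gaussian model.
    """
    n = len(signal)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(signal)")

    # Prefix sums of x and x^2 give any segment cost in constant time.
    s1 = [0.0] + list(accumulate(signal))
    s2 = [0.0] + list(accumulate(x * x for x in signal))

    def cost(i, j):
        # Squared error of signal[i:j] around its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal cost of splitting signal[:j] into k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    # last[k][j]: start index of the k-th segment in that optimal split.
    last = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + cost(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    last[k][j] = i

    # Walk back from the end to recover the breakpoints.
    breakpoints = []
    j = n
    for k in range(n_segments, 1, -1):
        j = last[k][j]
        breakpoints.append(j)
    return sorted(breakpoints), best[n_segments][n]


if __name__ == "__main__":
    profile = [0, 0, 0, 5, 5, 5, 1, 1]  # toy copy-number-like signal
    print(segment(profile, 3))
```

On the toy profile the three plateaus are recovered exactly, with segments starting at indices 0, 3 and 6. Choosing the number of segments and judging how much to trust each breakpoint are separate questions, which is what point 3 addresses.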
Abstract. I will use this seminar to present my research to the various teams, along with the projects I intend to work on in the short and medium term. First, I will explain the applied context of my work, namely the search for chromosomal aberrations using CGH microarray technology. This technology was developed mainly for the study of tumour genomes, but its availability has made it possible to investigate small chromosomal defects in human populations. These human genetics studies led to the discovery of new genetic markers, CNVs (Copy Number Variants), which are thought to affect 12% of the human genome. In addition to yielding true molecular portraits of tumours, CGH microarrays have shed new light on all the classical studies of human genetics (association/linkage). From a statistical point of view, the models I work on are segmentation models, that is, models for detecting change-points in a Gaussian signal. The new difficulty is to segment several signals simultaneously so as to take several patients into account. This raises two major questions: a modelling question (how to model joint segmentations) and an algorithmic question (how to estimate the positions of the change-points with reasonable complexity). Finally, I will touch on the growing application of tiling array technology to CGH and discuss the methodological problems raised by this new technology.
The technological revolution that biomedical research has undergone over the past 10 years gives the scientific and medical community very powerful tools for investigating, in an agnostic manner, the full set of genetic and epigenetic mechanisms associated with human diseases, whether common or rare. The first DNA and RNA chip ("micro-array") technologies initially enabled genome-wide association studies ("GWAS") and transcriptomic studies, whose successes are now legion. Over the past 3 years, chip technology has gradually been giving way to high-throughput sequencing tools ("Next Generation Sequencing" or "NGS"), which make it possible not only to characterize the genetic variability of DNA exhaustively but also, from a given cell type or a fluid sample (plasma, serum, urine), to quantify precisely all the isoforms of an expressed gene, to detect and quantify all non-coding RNAs, and to measure the degree of DNA methylation. The aim of this presentation is to outline the main current research strategies based on the various high-throughput technologies in the field of cardiovascular diseases. Particular attention will be paid to the biostatistics and bioinformatics needs that arise when applying these new technologies to genetic and epigenetic research.
Thesis defense of Martin Wannagat - Monday 4 July 2016 - 14:00 - Amphi Lavoisier
The local protein composition of chromatin controls important processes such as transcription, replication and DNA repair, yet the diversity of chromatin and its distribution along chromosomes is still poorly characterized. Using DamID in Drosophila Kc cells, we generated high-resolution genome-wide binding maps of 53 chromatin proteins from a wide range of functional categories. For most of those proteins, no binding data was previously available. By constructing an unsupervised classifier, we find that there are five principal chromatin types defined by unique yet overlapping combinations of proteins. Two types correspond to Polycomb and HP1-bound regions, respectively. The novel 'BLACK' chromatin type covers half of the genome and induces strong transcriptional repression on inserted transgenes. Remarkably, this chromatin type is devoid of the classic 'heterochromatin' proteins Polycomb and HP1. Thus, our data reveal the existence of a prominent repressive chromatin type that has largely been overlooked. Active genes are associated with one of the two remaining combinations of proteins. H3K36 methylation is associated with only one of them, yet it was previously thought to mark every transcribed gene. In addition, active genes involved in growth and cell proliferation and those involved in signal transduction are located in distinct chromatin types. The five chromatin types modulate the interactions of transcription factors with DNA. We observe that most transcription factors bind their cognate motif only if it sits in the favored chromatin context. Our data rule out a simple exclusion mechanism but support a model whereby synergistic interactions target transcription factors to their binding sites. Finally, genomic regions in the five chromatin types follow different evolutionary processes. The vast majority of synteny breaks with Drosophila pseudoobscura occur in only one of the transcriptionally active types. Moreover, genes located in that chromatin type evolve faster than those in the other types. In summary, our integrative approach identifies five major chromatin types, which are defined by unique combinations of proteins and have distinct functional properties.