Probabilistic topic models are algorithms that aim to discover and annotate large collections of documents with thematic information without any prior annotations. They work by analyzing the statistical co-occurrence of words to identify topics, where a topic is a probability distribution over words. Documents are represented as mixtures of topics. For example, a document may have a 60% probability of being about biology, 30% about physics, and 10% about mathematics. Topics emerge from the statistical analysis and provide interpretable groups of correlated terms.