Data Collection: Gather a set of PubMed research articles.

Preprocessing: Clean the articles by removing irrelevant information, then transform them into a format suitable for analysis, much like creating a list of ingredients from a full recipe.

Feature Extraction – Bag of Words: Extract frequently used words and phrases from the articles. This step produces our "Bag of Words": a count of how often each term appears in each article, much like a word cloud that highlights the essential words hinting at a topic.

Statistical Analysis: Analyze the frequency of words and the relationships between them to understand the main topics present in our collection of articles.

Classification & Clustering: Sort the articles into predefined categories (classification) and discover new topic groups within the articles (clustering).

Comparison with Pre-trained Models: Evaluate the effectiveness of our method by comparing it with established models such as BERT, BART, DeBERTa, and GPT-2. It's like comparing a newly trained librarian with veteran librarians who have been sorting books for years.
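The preprocessing, Bag of Words, and frequency-analysis steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the pipeline used in this work: the three sample texts, the short stop-word list, and the function names `preprocess`, `bag_of_words`, `corpus_frequencies`, and `cosine_similarity` are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for PubMed abstracts; not real data.
ARTICLES = [
    "Insulin resistance and glucose levels in type 2 diabetes patients.",
    "Glucose monitoring improves insulin dosing in diabetes care.",
    "Tumor growth and chemotherapy response in breast cancer patients.",
]

# A tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"and", "in", "the", "of", "a", "to", "for", "with"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(text):
    """Map each remaining token to the number of times it occurs."""
    return Counter(preprocess(text))


def corpus_frequencies(texts):
    """Sum the per-article bags to get term counts for the whole collection."""
    total = Counter()
    for text in texts:
        total.update(bag_of_words(text))
    return total


def cosine_similarity(a, b):
    """Cosine similarity between two bags; 0.0 if either bag is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


bags = [bag_of_words(article) for article in ARTICLES]
totals = corpus_frequencies(ARTICLES)

# Most frequent terms across the collection hint at its main topics.
print(totals.most_common(5))

# Articles that share vocabulary score higher, which is the signal
# a clustering step would build on.
print(cosine_similarity(bags[0], bags[1]))
print(cosine_similarity(bags[0], bags[2]))
```

In this toy collection the two diabetes-related texts share more terms with each other than either shares with the oncology text, so their similarity score is higher. A fuller version would likely swap the hand-rolled counting for a library vectorizer and feed the resulting vectors into a classifier or clustering algorithm.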