

<?xml version="1.0" encoding="UTF-8"?>
<record>
  <title>Semantic Similarity, Phrase Analysis, and Expert Evaluation of Human versus LLM-Generated Abstracts</title>
  <journal>Journal of Digital Information Management</journal>
  <author>Pit Pichappan</author>
  <volume>24</volume>
  <issue>1</issue>
  <year>2026</year>
  <doi>https://doi.org/10.6025/jdim/2026/24/1/40-61</doi>
  <url>https://www.dline.info/fpaper/jdim/v24i1/jdimv24i1_3.pdf</url>
  <abstract>This research examines abstracts of scientific papers and how AI generates them. Abstracts are central to
information use because they are the first source researchers consult when deciding whether a paper merits
a full reading. The study analyses abstracts from the December 2025 issues of Antioxidants and PLOS
Computational Biology, written either by the authors themselves, by ChatGPT, or by Qwen. Evaluation
combined semantic similarity (Jaccard index), phrase-occurrence frequency, and expert scoring, covering
quality, detectability, and the implications for scientific writing.
The results showed that the AI-generated abstracts were more similar to one another than to the human-written
abstracts: the mean Jaccard index was roughly 0.66 to 0.68 between the two AI systems but lower against
the author-written texts, indicating that both models write in a similar style regardless of source material.
Domain-specific terms appeared in both human and AI abstracts, but their usage, in frequency and in the
exact terms chosen, differed between ChatGPT and Qwen. Expert scoring assigned higher grades to the AI
abstracts on clarity, structure, scientific soundness, and originality or relevance: Qwen averaged 9.29,
ChatGPT 9.02, while the human-authored abstracts averaged 7.75. An ANOVA indicated that abstract source
(human versus AI) accounted for about 79 per cent of the variation in scores. These findings suggest that AI
can generate more polished and comprehensive summaries than humans. Still, ethical concerns remain, such
as the potential for AI to fabricate references, spread misinformation, or compromise peer review. The
analysis estimates that 10 to 14 per cent of recent biomedical abstracts show signs of AI assistance,
underscoring the need for better detection methods, clearer rules, and greater emphasis on the underlying
research ideas rather than on the polish of the writing.</abstract>
</record>
