Mapping Key Nodes and Global Trends in AI and Large Language Models for Medical Education: A Bibliometric Study

Kaining Lu; Shuben Sun; Wanzhang Liu; Junhui Jiang; Zejun Yan

doi:10.2147/AMEP.S538362

Advances in Medical Education and Practice, Volume 16

Original Research

Received 4 May 2025

Accepted for publication 2 August 2025

Published 14 August 2025 Volume 2025:16 Pages 1421–1438

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 4

Editor who approved publication: Prof. Dr. Balakrishnan Nair

Kaining Lu,¹⁻⁴,* Shuben Sun,¹⁻⁴,* Wanzhang Liu,¹⁻⁴ Junhui Jiang,¹⁻⁴ Zejun Yan¹⁻⁴

¹Department of Urology, The First Affiliated Hospital of Ningbo University, Ningbo, Zhejiang, 315010, People’s Republic of China; ²Ningbo Clinical Research Centre for Urological Disease, The First Affiliated Hospital of Ningbo University, Ningbo, Zhejiang, 315010, People’s Republic of China; ³Translational Research Laboratory for Urology, The Key Laboratory of Ningbo, The First Affiliated Hospital of Ningbo University, Ningbo, Zhejiang, 315010, People’s Republic of China; ⁴Zhejiang Engineering Research Center of Innovative Technologies and Diagnostic and therapeutic Equipment for Urinary System diseases, Ningbo, Zhejiang, 315010, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Zejun Yan, Email [email protected]

Background: Artificial intelligence (AI) and large language models (LLMs) are transforming medical education by enhancing teaching and assessment methods. Research output has surged, but key bibliometric trends remain underexplored.
Methods: We retrieved 547 publications using the Web of Science Core Collection and conducted bibliometric analysis with CiteSpace and other bibliometric tools to examine publication volume, collaboration networks, citations, keywords, and other important bibliometric indicators.
Results: The United States, the United Kingdom and China lead publication output, with institutions like the University of London, the National University of Singapore and Harvard University at the forefront. JMIR Medical Education is a pivotal journal. Research on ChatGPT and LLMs dominates, with growing focus on nursing education, digital health, medical exams, and medical ethics. Clinical reasoning, undergraduate education, and virtual reality have been identified as underexplored areas of research.
Conclusion: AI and LLMs in medical education constitute a fast-evolving field, with journal calls shaping its bibliometric landscape and advancing the discipline. The field remains in a developmental phase, with subfields yet to be clearly defined. Topics such as nursing education, digital health, medical examinations, and conversational agents are gaining traction. Research on ChatGPT and LLMs holds a central and influential role. Emerging areas of focus include medical ethics, training methodologies, and skills development. Clinical reasoning, undergraduate education, and virtual reality in AI and LLMs for medical education are understudied, offering research opportunities.

Keywords: bibliometric analysis, artificial intelligence, large language model, medical education, ChatGPT

Introduction

The application of artificial intelligence (AI) and large language models (LLMs) in medical education is expanding rapidly, revolutionizing teaching methods, assessment tools, and learning experiences. The integration of AI and LLM technologies enhances the efficiency and personalization of education while introducing new research directions and challenges in medical education.¹ AI is reshaping healthcare, from diagnostics to robotic surgery. A survey of Indian medical students revealed that 74.8% supported structured AI training and 72.3% believed AI reduces medical errors, yet only 3.7% felt confident explaining AI to patients.² In recent years, numerous studies have investigated AI and LLM applications in areas such as medical examinations, curriculum design, medical ethics, and even generative illustration in medical education.³⁻⁵ The emergence of ChatGPT has sparked intense discussion in the field, with some scholars emphasizing that clinicians should develop AI and LLM literacy and recognize AI and LLMs as the next frontier of innovation in healthcare.⁶ Despite the growing body of research and literature, the patterns and potential insights within these documents and their extensive references have often been overlooked.

Bibliometrics, the quantitative analysis of scientific output through mathematical and statistical methods, enables researchers to measure connections between studies, research areas, and significant events.⁶ This approach provides insights into the growth of publications and the flow of knowledge within a field over time, using data from web databases such as citations, keywords, countries, or institutions.⁷ It can also analyze thousands of articles to uncover patterns and emerging trends.⁸ In particular, citations and terms are the most commonly used bibliometric data, with co-citation and co-word analyses being the most widely applied methods. Co-citation analysis explores the structure of scientific knowledge by identifying relationships when two articles are cited by a third article.⁹ Co-word analysis, a content analysis technique based on term co-occurrence, reveals current hotspots and emerging trends by analyzing literature terms.⁷

Using bibliometric methods, this study has the following objectives: 1) Mapping the global publication distribution of leading countries and institutions in AI and LLMs for medical education; 2) Investigating the knowledge base, research themes and their evolution by network analysis; 3) Visualizing the distribution of research hotspots and key nodes of knowledge flow, alongside analyzing development trends; and 4) Examining bibliometric patterns, gaps, and journal influences shaping the early development of this emerging field.

Methods

Data Collections

Search Strategy

The search strategy is as follows: (artificial intelligence OR ChatGPT OR machine learning OR Large Language Model OR AI OR Chat Generative Pre-Trained Transformer OR Neural Networks OR GPT) AND ((medical education) OR (medical students) OR (clinical education) OR (medical class) OR (medical curriculum) OR (medical graduate student) OR (graduate student in medicine) OR (rotating physician) OR (medical schools) OR (intern nurse) OR (medical teacher) OR (nursing education) OR (medical training) OR (clinical skills training) OR (clinical operations training) OR (surgical skills training) OR (nursing skills training) OR (medical skills training) OR (intern physician) OR (medical intern) OR (clinical training) OR (surgical training) OR (medical educator) OR (nursing curriculum)).
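A Boolean string of this size is easier to maintain when it is assembled from term lists. The following is a minimal sketch, not the authors' tooling: the helper names are hypothetical, the term lists are a shortened subset of the strategy above, and the `TI=` prefix assumes the query is run as a title search in the Web of Science advanced search interface.

```python
# Hypothetical helpers for assembling a Boolean title query from term lists.
# The lists below are a shortened, illustrative subset of the full strategy.
AI_TERMS = ["artificial intelligence", "ChatGPT", "machine learning", "AI", "GPT"]
EDU_TERMS = ["medical education", "medical students", "nursing education"]


def or_group(terms):
    """Join terms with OR, wrapping multi-word terms in parentheses."""
    parts = [f"({term})" if " " in term else term for term in terms]
    return "(" + " OR ".join(parts) + ")"


def build_title_query(ai_terms, edu_terms):
    """Combine both OR-groups with AND under an assumed TI= (title) field tag."""
    return f"TI=({or_group(ai_terms)} AND {or_group(edu_terms)})"


if __name__ == "__main__":
    print(build_title_query(AI_TERMS, EDU_TERMS))
```

Keeping the AI terms and the education terms in separate lists makes it straightforward to add or drop a synonym without unbalancing the parentheses.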

Data Inclusion, Exclusion and Download

The research data were downloaded from the Web of Science Core Collection database. To obtain more accurate screening results, we compared the results of topic retrieval and title retrieval on a case-by-case basis. Title retrieval yielded the most relevant results, so we adopted it as our primary retrieval strategy. As of September 21, 2024, we had retrieved a total of 652 publications published since 1986. To ensure the accuracy of the analysis sample, the research team developed the following inclusion and exclusion criteria for the literature:

Inclusion Criteria

Thematic Relevance

Literature must focus on AI (including machine learning, ChatGPT, large language models, neural networks, etc.) applications in medical education, covering curriculum design, student training, clinical skills teaching, nursing education, or medical ethics, among other aspects.

Publication Type

Only peer-reviewed original research or review articles were included to ensure academic rigor.

Language

Literature must be published in English for analysis consistency and international comparability.

Exclusion Criteria

Thematic Irrelevance

Literature unrelated to AI or medical education, such as general medical AI or non-educational AI applications.

Non-Peer-Reviewed

Conference abstracts, editorials, commentaries, letters, or non-peer-reviewed publications were excluded.

Non-English

Non-English publications were excluded for language consistency.

Two researchers independently reviewed the retrieved results, with disputes resolved by the corresponding author. Included publications were selected and downloaded for analysis.

After a thorough review, we excluded 105 publications that were not relevant to this field, ultimately including 547 publications in our analysis.
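The screening logic can be expressed as a simple filter. The sketch below is illustrative only: the `Record` fields, the document-type labels, and the toy records are assumptions, and the `on_topic` flag stands in for the reviewers' manual relevance judgment, which cannot be automated this way.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical minimal record; field names are illustrative."""
    title: str
    doc_type: str
    language: str
    on_topic: bool  # relevance judgment made by the human reviewers


# Assumed labels for peer-reviewed original research and review articles.
ALLOWED_TYPES = {"Article", "Review"}


def screen(records):
    """Split records into included and excluded lists using the stated criteria."""
    included, excluded = [], []
    for rec in records:
        keep = (
            rec.on_topic
            and rec.doc_type in ALLOWED_TYPES
            and rec.language == "English"
        )
        (included if keep else excluded).append(rec)
    return included, excluded


if __name__ == "__main__":
    toy = [
        Record("LLMs in clinical teaching", "Article", "English", True),
        Record("AI for tumor grading", "Article", "English", False),
        Record("ChatGPT and exams", "Letter", "English", True),
    ]
    kept, dropped = screen(toy)
    print(len(kept), len(dropped))
    # Counts reported in the text: 652 retrieved, 105 excluded.
    print(652 - 105)  # 547
```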

Analysis by CiteSpace Software

We used CiteSpace 6.3.R1. Developed by Chaomei Chen, CiteSpace is a prominent knowledge visualization tool used for mapping literature across various educational fields.⁸ It employs co-citation analysis of references, authors, and journals, along with co-occurrence analysis of authors, keywords, institutions, and countries, to create scientific knowledge network maps and identify research trends and hot spots.

This study examines the national and regional distributions and collaborations among authors in AI in medical education by constructing a network cooperation map. We identified the knowledge base and core authors through literature and author co-citation networks and pinpointed leading journals via co-citation analysis. Prominent keywords were determined through keyword co-occurrence and clustering analysis, focusing on their frequency and centrality to explore global research trends.

Our methodology included cooperative network analysis, revealing core authors, leading research institutions, and national/regional collaborations in AI and LLMs in medical education. Nodes in the visual representation are depicted as circles, with larger circles indicating a greater number of items, such as papers and authors. Betweenness centrality in CiteSpace serves as a key indicator of node importance, with values above 0.1 indicating significant nodes. The circle size reflects citation frequency, with deeper purple circles denoting higher betweenness centrality, thus suggesting greater importance in the field.⁸
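To make the centrality criterion concrete, the sketch below computes normalized betweenness centrality on a toy collaboration graph and flags nodes above the 0.1 threshold. It uses Brandes' algorithm for unweighted, undirected graphs; the graph and node labels are hypothetical, and CiteSpace's internal implementation may differ in detail.

```python
from collections import deque


def betweenness_centrality(graph):
    """Normalized betweenness for an undirected, unweighted graph (Brandes' algorithm).

    `graph` maps each node to a list of its neighbours.
    """
    scores = {v: 0.0 for v in graph}
    for source in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from source
        sigma[source] = 1
        dist = dict.fromkeys(graph, -1)
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search from the source
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    n = len(graph)
    # Every unordered pair was counted twice, so divide by (n - 1)(n - 2).
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: s * scale for v, s in scores.items()}


if __name__ == "__main__":
    # Hypothetical network: two triangles joined by the edge C-D.
    toy_network = {
        "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
        "D": ["C", "E", "F"], "E": ["D", "F"], "F": ["D", "E"],
    }
    centrality = betweenness_centrality(toy_network)
    key_nodes = sorted(v for v, c in centrality.items() if c > 0.1)
    print(key_nodes)  # ['C', 'D']
```

In the toy graph, only the two bridging nodes exceed the threshold, which matches the intuition that high betweenness marks nodes connecting otherwise separate groups.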

Co-citation analysis explored relationships between cited articles, authors, and journals. A co-citation relationship is established when two articles are cited by a third, indicating thematic connections. This analysis ranks key papers by citation frequency and elucidates their relationships through centrality values, identifying literature clusters and revealing research hot spots. The frequency and relevance of citations characterize significant research trends over time, forming a foundational knowledge base for emerging topics in AI in medical education. We utilized ArcGIS software in conjunction with CiteSpace software to create a world map illustrating the volume of publications.
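The co-citation relationship described here reduces to counting how often two references appear together in the same reference list. A minimal sketch follows, assuming each citing paper is represented by a list of reference identifiers; the identifiers are placeholders, not records from the dataset.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how many citing papers cite each pair of references together."""
    pairs = Counter()
    for refs in reference_lists:
        # De-duplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(refs)), 2):
            pairs[(a, b)] += 1
    return pairs


if __name__ == "__main__":
    # Hypothetical reference lists of four citing papers.
    citing_papers = [
        ["ref_A", "ref_B", "ref_C"],
        ["ref_A", "ref_B"],
        ["ref_B", "ref_C"],
        ["ref_A"],
    ]
    for pair, count in cocitation_counts(citing_papers).most_common():
        print(pair, count)
```

The resulting pair counts serve as edge weights of the co-citation network, on which clustering and centrality can then be computed.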

Analysis by Bibliometrix Tool

Bibliometrix is an analysis tool built as an R package for comprehensive bibliometric analysis, enabling researchers to conduct systematic reviews and quantitative studies of scientific literature.⁹ It allows for data extraction from various bibliographic databases, facilitating the analysis of citation patterns, co-authorship networks, and thematic trends. With its visualization capabilities, Bibliometrix helps scholars clearly depict complex bibliometric relationships and insights, making it a valuable tool for assessing the impact and evolution of research fields over time. In this study, we used the Bibliometrix online analysis toolkit to generate word clouds, conduct Conceptual Structure Map (CSM) analysis, and build strategic coordinates in the field of AI and LLMs in medical education. This approach allowed us to identify current research hotspots and trends.

Results

From a temporal perspective, the number of publications in the field of AI and medical education has shown a significant upward trend (Figure 1A). This trend underscores the increasing importance of artificial intelligence technologies in medical education. After screening, the included publications span from 1986 to 2024, with a consistent overall increase in annual output. During the initial phase from 1986 to 2018, growth was relatively slow, with the global annual publication count remaining in single digits. From 2019 to 2024, however, the number of publications increased explosively. This surge is likely related to rapid advances in artificial intelligence technologies and their widespread adoption in medical education.

From a global perspective, Table 1 and Figure 1B illustrate the geographic distribution of research in this field; the map in Figure 1B shows the number of publications by country and was generated using ArcGIS software. Artificial intelligence is emerging as a key technology driving innovation in global medical education. The United States leads with 178 research studies, followed by the United Kingdom, China, and Canada, highlighting these countries’ advancements in integrating AI and LLMs into medical education. The rapid growth of Asian countries such as China and India indicates the region’s rising prominence in this field. Additionally, developed countries such as Germany and Australia continue to contribute stable research output.

Table 1 Distribution of Publications by Country

Figure 1 (A) Annual Publication Counts. (B) World Map of Publication Volume Generated Using ArcGIS and CiteSpace.

Distribution of Disciplines and Journals

From the perspective of the Web of Science (WOS) discipline distribution, the research literature on artificial intelligence and medical education encompasses a total of 68 categories. Table 2 presents the distribution of these publications across major disciplinary categories. According to the data in Table 2, research on artificial intelligence in medical education not only spans multiple disciplines but is primarily concentrated in four main areas: Science Education, Internal Medicine, Health Care Sciences and Services, and Educational Research. A smaller portion of the research is focused on Nursing, Radiology, Nuclear Medicine and Medical Imaging, Surgery, Medical Informatics, Computer Science (Artificial Intelligence), and Biomedical Engineering.

Table 2 Distribution of the Top Ten Disciplines

In the journals indexed by the Web of Science (WOS), research literature in the field of artificial intelligence and medical education from 1986 to 2024 encompasses a total of 547 publications. Among these, the top 10 journals account for 168 articles, representing 31% of the total literature, indicating their significant influence and central role in this field. These journals are characterized by relatively high academic quality and impact factors, contributing to the advancement and leadership of research in artificial intelligence and medical education.

The three journals with the highest publication volumes in this field are as follows (Table 3): JMIR Medical Education, with 37 articles, focuses on the application of technology, innovation, and openness in medical education, including e-learning and virtual training, with particular attention to developing healthcare professionals’ abilities to use digital tools in the post-COVID world; BMC Medical Education, with 25 articles, covers various aspects of medical education, including teaching methods, assessment strategies, online learning tools, simulation training, curriculum design, continuing medical education, and the application of educational technologies in medical training; and Cureus Journal of Medical Science, with 22 articles, is an open-access journal dedicated to publishing high-quality research in the medical field through a rapid peer-review process.

Table 3 Distribution of the Top Ten Journals

These journals not only exert considerable influence in the realm of artificial intelligence and medical education but also maintain high academic quality and impact factors, making them vital platforms for researchers to publish and access the latest findings. With the ongoing advancement of artificial intelligence technologies, it is anticipated that research activities within these categories will continue to grow, further driving innovation and development in medical education.

Author and Institutional Collaboration

Author Collaboration

From the perspective of collaboration, the research field of artificial intelligence and medical education exhibits a diverse and interdisciplinary cooperative landscape (see Figure 2A). According to the data presented in Table 4, the most prolific collaborator is Viroj Wiwanitkit, who has partnered on 7 publications. Wiwanitkit actively promotes interdisciplinary collaboration by integrating artificial intelligence technologies to innovate educational methods and develop intelligent educational tools to enhance the learning experience, thereby advancing the modernization of medical education. Following him is Amnuay Kleebayoon, who has collaborated 5 times. He has made significant contributions in the realms of intelligent robotics, ChatGPT, and medical education, actively facilitating the deep integration of artificial intelligence into medical training.

Table 4 Top Ten Authors in Collaborative Publications

Figure 2 (A) Authors Collaboration Network. (B) Institutional Collaboration Network.

Warren M. Rozen and Haniye Mastour are tied for third place, each with 3 collaborations. Their research involves interdisciplinary projects that apply artificial intelligence technologies to areas such as medical image analysis, patient data management, and personalized medical education. These data highlight how researchers in the field of artificial intelligence and medical education work together to address complex scientific challenges and how teamwork fosters the development and application of technologies. Such collaboration not only enhances the quality and impact of research but also cultivates the next generation of medical education experts capable of effectively integrating artificial intelligence technologies into medical training and practice.

Institutional Collaboration

From a collaborative perspective, a robust network of cooperation has been established among research institutions in the field of artificial intelligence and medical education. As illustrated in Figure 2B and detailed in Table 5, the University of London leads with a collaboration frequency of 18, highlighting its pioneering role in fostering the integration of artificial intelligence into medical education. Following closely is the National University of Singapore, which ranks second with 16 collaborations, reflecting not only its deep engagement in interdisciplinary partnerships but also the synergistic effects of its various institutions. Harvard University ranks third with 15 collaborations, further confirming its active position in global research partnerships.

Among the top ten institutions (Table 5), six are located in the United States, underscoring the country’s significant influence in the research and education sectors related to artificial intelligence and medical education. This prominence can be attributed to the U.S.’s strong research infrastructure, ample funding support, and policies that encourage innovation and entrepreneurship. Additionally, all top ten institutions are from developed countries, such as the United States, the United Kingdom, and Singapore. This observation reflects the comprehensive advantages of these nations in terms of economic strength, research capabilities, educational quality, and policy support, which collectively lay a solid foundation for research and innovation in the field of artificial intelligence and medical education.

Table 5 Top Ten Institutions in Collaborative Publications

The close collaboration among these research institutions not only demonstrates their active engagement in the field but also suggests that, with the promotion of interdisciplinary cooperation, the sector is poised to witness further innovations and breakthroughs in the future. Through such partnerships, experts from various domains can complement each other and collaboratively explore new research directions and application scenarios, thereby accelerating the progress and development of the entire field.

Highly Cited Authors

Table 6 lists the top ten most frequently cited scholars in the field of artificial intelligence and medical education. Leading the list is Pinto Dos Santos D, who has achieved an impressive citation count of 71. Following closely is Tiffany Kung, with a citation count of 66. Ranking third is Ketan Paranjape, with a citation count of 54. These data underscore a core principle: researchers who achieve significant academic impact in the domain of artificial intelligence and medical education must be committed to fostering deep integration and innovation between artificial intelligence and the medical field to promote the widespread application of AI and LLM technologies in medical practice.

Table 6 Top 10 Highly Cited Authors

Co-Citation Network Analysis

The co-citation analysis of all articles reveals the current research hotspots in the field. As shown in Figure 3A, the most cited article is “Medical Students’ Attitude Towards Artificial Intelligence: A Multicentre Survey”, which has garnered 71 citations.¹⁰ This article primarily investigates medical students’ attitudes toward the application of artificial intelligence (AI) in healthcare, exploring their acceptance of AI technologies, perceptions of potential applications, and the demand for AI integration in medical education. The findings reveal both positive and negative views held by medical students regarding AI, as well as various factors influencing their attitudes. These insights provide medical educators and healthcare policymakers with guidance on how to better integrate AI education and training to prepare future healthcare professionals for effectively utilizing AI technologies.

Figure 3 (A) Co-citation Network of References of AI in Medical Education. (B) Keyword Co-occurrence Network. (C) Keyword Network Cluster Analysis Map. (D) Keyword Timeline Visualization in CiteSpace.

The second most cited article, “Performance of ChatGPT on USMLE: Potential for AI-Assisted Medical Education Using Large Language Models”, has received 66 citations.¹¹ This study examines the performance of the large language model ChatGPT on the United States Medical Licensing Examination (USMLE), reporting that it achieved scores that met or approached passing standards without specialized training. This finding highlights the potential for applying AI in medical education, particularly in terms of providing consistent explanations and insights. The research offers preliminary evidence for the application of AI in both medical education and clinical decision-making support.

Ranking third is the article “Introducing Artificial Intelligence Training in Medical Education”, with 54 citations.¹² This article discusses the necessity of incorporating AI training into medical education. As medical practice transitions into the era of AI, the use of data is expected to increase, thereby driving the need for proficient interactions between medicine and technology. The article emphasizes the importance of comprehensive training for medical professionals regarding AI technologies, including their advantages in enhancing cost-effectiveness and accessibility in healthcare, as well as potential drawbacks such as transparency and accountability issues. It proposes a framework for seamlessly integrating AI education across various aspects of the curriculum, from medical school admissions to clinical internships and continuing medical education. Furthermore, the article discusses how AI can transform medical practice and patient outcomes, as well as the range of skills future physicians will need, including mathematical concepts, foundational knowledge of AI, data science, and related ethical and legal considerations.

Collectively, these articles explore the application of AI and LLMs in medical education and its significance for training medical students. They emphasize that as the medical field enters the AI era, reforms in medical education are essential to incorporate AI technologies, ensuring that future healthcare professionals can understand and effectively utilize AI tools to improve the quality, cost-effectiveness, and accessibility of healthcare. Additionally, the articles highlight the challenges faced in integrating AI into medical education, including issues related to technological understanding, ethics, legal considerations, and the need for reform within existing educational systems.

Keywords Network Analysis

Data from 547 articles obtained from the Web of Science were imported into CiteSpace, with the time slicing set from 1986 to 2024, each year representing one partition. The node type was selected as keywords, and the top 50 keywords from each time partition were extracted. The resulting co-occurrence map of keywords is illustrated in Figure 3B. The keyword co-occurrence analysis indicates that “artificial intelligence” ranked first with a frequency of 249 occurrences, followed by “medical education” with 130 occurrences, and “medical students” with 52 occurrences. Additionally, “machine learning”, “large language models”, and “nursing education” appeared 48, 21, and 18 times, respectively. These high-frequency keywords encompass various aspects of medical education, including educational technology, student training, data processing techniques, and learning outcomes.
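The time-slicing and top-N selection can be illustrated as follows. This is a simplified sketch of the idea rather than CiteSpace's exact procedure: records are assumed to be (year, keyword list) pairs, keywords are lower-cased before counting, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def top_keywords_per_year(records, top_n):
    """For each yearly slice, keep the top_n most frequent keywords."""
    by_year = defaultdict(Counter)
    for year, keywords in records:
        by_year[year].update({k.lower() for k in keywords})
    return {
        year: {k for k, _ in counts.most_common(top_n)}
        for year, counts in by_year.items()
    }


def cooccurrence_edges(records, top_n):
    """Count keyword pairs that co-occur in a record, restricted to retained keywords."""
    keep = top_keywords_per_year(records, top_n)
    edges = Counter()
    for year, keywords in records:
        kept = sorted({k.lower() for k in keywords} & keep[year])
        edges.update(combinations(kept, 2))
    return edges


if __name__ == "__main__":
    # Hypothetical records; the study itself used top_n = 50 per yearly slice.
    toy_records = [
        (2023, ["Artificial Intelligence", "Medical Education", "ChatGPT"]),
        (2023, ["artificial intelligence", "medical education"]),
        (2023, ["artificial intelligence", "nursing education"]),
        (2024, ["large language models", "medical education"]),
    ]
    for pair, weight in cooccurrence_edges(toy_records, top_n=2).items():
        print(pair, weight)
```

Node frequency (how often a keyword occurs) and edge weight (how often two keywords co-occur) are the two quantities a co-occurrence map visualizes.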

These prominent keywords not only reveal the current research landscape at the intersection of artificial intelligence and medical education but also provide insights into future research directions. By analyzing these keywords and their interrelationships, researchers can gain a better understanding of the developmental trends in artificial intelligence and medical education, offering valuable references for future studies and practices. Such research endeavors not only aim to enhance the quality of medical education but also have the potential to positively impact the learning experiences of medical students and the teaching methodologies of educators.

Keyword Clustering Analysis

A cluster analysis was conducted on the high-frequency keyword co-occurrence map, resulting in a keyword clustering diagram. Cluster analysis showed a Q value of 0.5059 and S value of 0.8216, indicating significant community structure (Q > 0.3) and high clustering reliability (S > 0.7). Clustering details are displayed in the top-left corner of the cluster analysis figure.⁸ In the research domain of artificial intelligence and medical education, the high-frequency keywords were classified into 23 categories. For the purpose of analysis, we selected the top seven clusters with over 20 articles each, as illustrated in Figure 3C. The timeline analysis part (Figure 3D) illustrates the evolution of these clusters and their internal nodes over time. In Figure 3D, we observe a significant increase in the aggregation of thematic nodes around 2020, marking the first notable shift in the field’s research direction. From a chronological perspective, the hotspots in this field have transitioned in recent years from deep learning and big data to machine learning, and most recently to the evolution of knowledge and technology in generative large language models.
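For readers unfamiliar with the Q value, the sketch below computes Newman's modularity for a given partition of an unweighted graph, using the per-community form Q = Σ_c [L_c/m − (d_c/2m)²], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The toy graph and partition are hypothetical; the silhouette value S is not computed here.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an unweighted, undirected graph.

    `edges` is a list of (u, v) pairs; `community` maps each node to a label.
    """
    m = len(edges)
    internal = {}    # edges with both endpoints in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        for node in (u, v):
            label = community[node]
            degree_sum[label] = degree_sum.get(label, 0) + 1
        if community[u] == community[v]:
            label = community[u]
            internal[label] = internal.get(label, 0) + 1
    return sum(
        internal.get(label, 0) / m - (d / (2 * m)) ** 2
        for label, d in degree_sum.items()
    )


if __name__ == "__main__":
    # Hypothetical graph: two triangles joined by one bridging edge.
    toy_edges = [
        ("a", "b"), ("a", "c"), ("b", "c"),
        ("d", "e"), ("d", "f"), ("e", "f"),
        ("c", "d"),
    ]
    toy_partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
    q = modularity(toy_edges, toy_partition)
    print(round(q, 4), q > 0.3)  # 0.3571 True
```

Even this small two-community graph clears the Q > 0.3 threshold, which gives a sense of scale for the reported value of 0.5059.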

The seven selected clusters are as follows:

#0 Machine Learning

#1 Artificial Intelligence in Medicine

#2 Artificial Intelligence

#3 Generative AI

#4 Medical Student

#5 Future

#6 Nursing Education

This clustering provides a structured overview of prevalent themes and trends within the field, facilitating a deeper understanding of the interconnections among key concepts related to artificial intelligence and medical education. Below, we describe clusters #0 to #2 in detail; clusters #3 to #6 are interpreted directly from the semantics of their labels.

#0 Machine Learning

This cluster contains 53 keywords and has been established from 1993 to the present. It primarily encompasses three subthemes: medical education (130 articles), machine learning (48 articles), and performance (13 articles). Representative articles in this cluster include “ChatGPT: Reshaping Medical Education and Clinical Management” and “The Rise of ChatGPT: Exploring Its Potential in Medical Education.” These works collectively examine the potential of ChatGPT in medical education and its impact on clinical management. As an advanced natural language processing model, ChatGPT can provide personalized learning support, aiding students in understanding complex concepts, simulating clinical scenarios, and playing a significant role in curriculum development. However, the studies also highlight challenges associated with using ChatGPT, such as content accuracy, ethical issues, and the potential for academic dishonesty. Researchers suggest that, while ChatGPT holds promise for enhancing student engagement and learning outcomes, its application must be approached with caution to ensure effective integration and responsible use in medical education.

#1 Artificial Intelligence in Medicine

This cluster consists of 34 keywords and has emerged from 2018 to the present. It mainly includes three subthemes: health care (13 articles), health (9 articles), and big data (9 articles). Representative articles in this cluster include “Medical Education Trends for Future Physicians in the Era of Advanced Technology and Artificial Intelligence: An Integrative Review” and “A Virtual Counseling Application Using Artificial Intelligence for Communication Skills Training in Nursing Education: Development Study.” These articles explore the positive roles of AI in medical education, such as providing personalized learning experiences, simulating clinical environments, assisting in teaching and assessment, and improving learning efficiency. They also address challenges that need to be considered when implementing AI and LLM technologies, including privacy protection, potential misuse, ethical concerns, and interdisciplinary integration. These studies offer valuable insights into the future of medical education and provide guidance on how to leverage AI technologies to train the next generation of healthcare professionals.

#2 Artificial Intelligence

This cluster comprises 3 keywords and has been established since 1998. It mainly includes three subthemes: artificial intelligence (249 articles), care (5 articles), and AI and LLMs in medical education (3 articles). Representative articles in this cluster include “Noninterpretive Uses of Artificial Intelligence in Radiology” and “Synthesis of Diagnostic Quality Cancer Pathology Images by Generative Adversarial Networks.” These studies collectively emphasize the necessity of integrating AI into medical education and highlight medical students’ positive attitudes toward AI technologies and their expectations for future medical practice. At the same time, they reveal students’ uncertainties and potential concerns regarding the application of AI in the medical field.

Evolution of Research Themes

To examine how research themes evolved, we opened the “Burstness” tab of the CiteSpace Control Panel, set the Minimum Duration to 1, and adjusted the γ value to 0.3. After clicking “Refresh”, we obtained a list of 25 burst keywords, as illustrated in Figure 4.
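Burst detection in CiteSpace is based on Kleinberg's burst model. The sketch below is a simplified two-state version for yearly batches, written for illustration and not taken from CiteSpace: `r` holds the yearly counts of records containing a keyword, `d` the yearly totals, `s` is an assumed ratio between the burst rate and the baseline rate, and `gamma` plays the role of the 0.3 setting mentioned above (our reading of that parameter, stated here as an assumption). The toy counts are hypothetical.

```python
import math


def burst_states(r, d, s=2.0, gamma=0.3):
    """Label each time step 0 (baseline) or 1 (burst) with a two-state Kleinberg-style model.

    r[t] = records containing the keyword in step t, d[t] = all records in step t.
    Assumes the keyword occurs at least once and not in every record.
    """
    n = len(r)
    p0 = sum(r) / sum(d)       # baseline occurrence rate
    p1 = min(s * p0, 0.9999)   # elevated rate in the burst state
    rates = (p0, p1)
    up_cost = gamma * math.log(n)  # cost of moving into the burst state

    def fit_cost(p, rt, dt):
        """Negative log binomial likelihood of seeing rt hits in dt records."""
        return -(
            math.log(math.comb(dt, rt))
            + rt * math.log(p)
            + (dt - rt) * math.log(1 - p)
        )

    cost = [0.0, math.inf]  # the sequence starts in the baseline state
    back = []
    for rt, dt in zip(r, d):
        new_cost, pointers = [], []
        for j in (0, 1):
            # Moving up costs up_cost; staying or moving down is free.
            candidates = [cost[i] + (up_cost if j > i else 0.0) for i in (0, 1)]
            best = min((0, 1), key=lambda i: candidates[i])
            new_cost.append(candidates[best] + fit_cost(rates[j], rt, dt))
            pointers.append(best)
        cost = new_cost
        back.append(pointers)

    # Trace back the cheapest state sequence (Viterbi).
    state = min((0, 1), key=lambda j: cost[j])
    states = []
    for pointers in reversed(back):
        states.append(state)
        state = pointers[state]
    return states[::-1]


if __name__ == "__main__":
    yearly_hits = [1, 1, 1, 10, 12, 1, 1]  # hypothetical keyword counts
    yearly_totals = [100] * 7              # hypothetical records per year
    print(burst_states(yearly_hits, yearly_totals))  # [0, 0, 0, 1, 1, 0, 0]
```

A lower `gamma` makes it cheaper to enter the burst state, so more and shorter bursts are reported. This sketch applies no minimum-duration filter, so a single-year burst would be kept, which is consistent with a Minimum Duration of 1.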

Figure 4 Keywords Burst Activity Detection Timeline, Top 25 Keywords with the Strongest Citation Bursts.
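
Burst detection in CiteSpace builds on Kleinberg's algorithm. As a minimal sketch of the underlying idea, and not CiteSpace's actual implementation, the following Python code fits a two-state (baseline versus burst) model to yearly keyword counts with a Viterbi-style dynamic program. The function names, the toy counts, and the rate ratio `s` are assumptions introduced for illustration. `gamma` plays the role of the transition-cost parameter, which appears to correspond to the value of 0.3 set above, and `min_duration` mirrors the Minimum Duration setting.

```python
from math import comb, log


def detect_bursts(relevant, totals, s=2.0, gamma=0.3):
    """Label each time step 0 (baseline) or 1 (burst).

    relevant[t] is the number of documents containing the keyword in step t,
    totals[t] is the number of all documents in step t (relevant[t] <= totals[t]).
    Assumes at least one relevant document and a baseline rate below 1.
    """
    n = len(relevant)
    p0 = sum(relevant) / sum(totals)      # baseline rate
    p1 = min(s * p0, 0.9999)              # elevated (burst) rate
    rates = (p0, p1)

    def fit_cost(state, r, d):
        # Negative log-likelihood of r hits in d documents under the state's rate.
        p = rates[state]
        return -(log(comb(d, r)) + r * log(p) + (d - r) * log(1 - p))

    up_cost = gamma * log(n)              # cost of moving from baseline to burst
    cost = [0.0, float("inf")]            # the sequence starts in the baseline state
    back = []
    for r, d in zip(relevant, totals):
        new_cost, pointers = [], []
        for state in (0, 1):
            # Moving up costs up_cost; staying or moving down is free.
            via = [cost[prev] + (up_cost if (prev, state) == (0, 1) else 0.0)
                   for prev in (0, 1)]
            best_prev = 0 if via[0] <= via[1] else 1
            new_cost.append(via[best_prev] + fit_cost(state, r, d))
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest state sequence backwards.
    state = 0 if cost[0] <= cost[1] else 1
    states = []
    for pointers in reversed(back):
        states.append(state)
        state = pointers[state]
    states.reverse()
    return states


def burst_intervals(states, min_duration=1):
    """Return (start, end) index pairs of burst runs lasting at least min_duration steps."""
    intervals, start = [], None
    for i, st in enumerate(states + [0]):
        if st == 1 and start is None:
            start = i
        elif st != 1 and start is not None:
            if i - start >= min_duration:
                intervals.append((start, i - 1))
            start = None
    return intervals


if __name__ == "__main__":
    # Toy yearly counts (hypothetical, not taken from the study's dataset).
    hits = [1, 1, 1, 1, 10, 12, 11, 1, 1, 1]
    docs = [100] * 10
    print(burst_intervals(detect_bursts(hits, docs, gamma=0.3), min_duration=1))
```

In this formulation a smaller `gamma` lowers the cost of entering the burst state, so more keywords and shorter episodes qualify as bursts.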

From Figure 4, we can conclude that research in the field of artificial intelligence and medical education can be broadly divided into two periods. The first period spans from 1998 to 2019, with keywords including: care, artificial neural networks, medical training, health, validation, outcome, future, big data, deep learning, students, classification, and augmented reality. This initial phase represents the foundational stage of research in artificial intelligence within medical education, with a focus on basic AI technologies, medical training, and health management.

The second period extends from 2020 to the present, with keywords such as: medical school, deep, system, surgery, cancer, machine learning, medical student, virtual reality, skills, digital health, conversational agent, language model, and USMLE Step 1. The keywords from this period indicate a deepening of research, particularly in the application of emerging technologies such as medical imaging, virtual reality, and machine learning, highlighting the significance of artificial intelligence in clinical skills training and medical education.

This evolution in keywords not only reflects the trends in technological development but also reveals the dynamic changes in medical education as it adapts to new technologies and methodologies. This provides important insights for future research directions.

Conceptual Structure Map (CSM) and Strategic Coordinates Analysis

Factorial Analysis in the Conceptual Structure Module of Bibliometrix

In the Conceptual Structure Module of the Bibliometrix toolkit, we performed a factorial analysis using the default algorithm, Multiple Correspondence Analysis (MCA). The analysis was conducted with a default of 50 terms and a minimum of 5 documents. This process resulted in a coordinate system encompassing Dimension 1 (Dim 1) and Dimension 2 (Dim 2).

Dim 1 typically represents the significance of themes or keywords within a specific field, reflecting their influence, prevalence, or concentration in research. A higher Dim 1 value indicates that a theme is widely studied or frequently cited in the literature. Conversely, Dim 2 generally captures the depth, developmental stage, or historical context of related research. A higher Dim 2 value suggests that a theme has recently gained attention or that the research in that area has reached a certain level of maturity.¹³

In the Conceptual Structure Map (CSM) (Figure 5A), the relative positioning of keywords or themes reveals important insights; for instance, themes located in the upper right quadrant are often considered cutting-edge and trending, as they exhibit both high significance and development. By analyzing Dim 1 and Dim 2 within the CSM, researchers can identify emerging trends and potential future directions in the research landscape.

Figure 5 (A) RStudio – factorial analysis and conceptual structure mapping using multiple correspondence analysis (MCA). (B) Visualization of thematic mapping.

In this Conceptual Structure Map, the themes located in the upper right quadrant, including large language models, ethics, and nursing education, occupy a potential hotspot and frontier position. Within this quadrant, “undergraduate medical education” has recently garnered attention, although the volume of literature may still be relatively small. Conversely, themes such as “medical training”, “virtual reality”, and “surgical education”, situated in the lower right quadrant, typically possess a certain level of scale or impact, yet have not received adequate attention in recent times.

Thematic Map Analysis Using Bibliometrix

In this study, we utilized the thematic map function within the Bibliometrix toolkit to conduct a network analysis focused on author keywords. We constructed a co-word network to analyze the relationships and occurrences of these keywords, which facilitated the visualization of interconnected research themes.

For clustering, the tool automatically selected the Walktrap algorithm, which is effective for identifying community structures within networks. Additionally, the minimum cluster frequency was set to 5 per thousand documents as a default parameter. This setting helped ensure that we focused on significant clusters that reflect substantial themes within the literature.¹⁴

The outcome of this analysis was a thematic map that delineates the strategic coordinates of the identified research themes (Figure 5B). This map not only highlights established areas of research but also reveals emerging trends, providing valuable insights into the current landscape and potential future directions for scholarly inquiry.

Generative language models like ChatGPT have established foundational and driving themes in the field, while topics such as clinical reasoning and undergraduate medical students remain underexplored. In contrast, themes such as medical training and medical ethics may be considered emerging topics.
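
The strategic coordinates of a thematic map are conventionally Callon centrality (strength of a cluster's links to other clusters) and Callon density (strength of links inside the cluster). The following Python sketch illustrates how keyword clusters could be placed into the four quadrants under simplifying assumptions: the measures are computed without scaling constants, the quadrant boundaries are set at the median of each measure, and the keyword labels, cluster names, and co-occurrence weights are hypothetical. Bibliometrix's own implementation may differ in these details.

```python
from statistics import median


def callon_measures(cooccurrence, clusters):
    """Return {cluster: (centrality, density)} from keyword co-occurrence weights.

    cooccurrence maps (keyword_a, keyword_b) pairs to a link weight;
    clusters maps a cluster name to its set of keywords.
    """
    measures = {}
    for name, members in clusters.items():
        internal = external = 0.0
        for (kw_a, kw_b), weight in cooccurrence.items():
            hits = (kw_a in members) + (kw_b in members)
            if hits == 2:        # both keywords inside the cluster
                internal += weight
            elif hits == 1:      # link crossing the cluster boundary
                external += weight
        measures[name] = (external, internal / len(members))
    return measures


def classify_themes(measures):
    """Assign each cluster to a quadrant, splitting both axes at the median (an assumption)."""
    c_cut = median(c for c, _ in measures.values())
    d_cut = median(d for _, d in measures.values())
    labels = {}
    for name, (centrality, density) in measures.items():
        if centrality >= c_cut and density >= d_cut:
            labels[name] = "Motor Themes"                   # upper right
        elif density >= d_cut:
            labels[name] = "Niche Themes"                   # upper left
        elif centrality >= c_cut:
            labels[name] = "Basic Themes"                   # lower right
        else:
            labels[name] = "Emerging or Declining Themes"   # lower left
    return labels


if __name__ == "__main__":
    # Hypothetical clusters and link weights, for illustration only.
    clusters = {"A": {"k1", "k2"}, "B": {"k3", "k4"},
                "C": {"k5", "k6"}, "D": {"k7", "k8"}}
    cooccurrence = {("k1", "k2"): 8, ("k3", "k4"): 6, ("k5", "k6"): 1,
                    ("k7", "k8"): 2, ("k1", "k5"): 5, ("k2", "k6"): 4,
                    ("k1", "k7"): 1, ("k3", "k5"): 1}
    print(classify_themes(callon_measures(cooccurrence, clusters)))
```

Under this scheme, a theme with dense internal links but few external links lands in the Niche Themes quadrant, which is the position reported for “clinical reasoning” in Figure 5B.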

Discussion

The field of AI and LLMs in medical education has experienced remarkable growth in recent years. As early as the 1980s, researchers attempted to leverage natural language interfaces to develop AI programs for medical education, focusing on diagnosis, teaching methods, and gross anatomy.¹⁵ However, these early efforts did not reflect AI in its modern sense. The field remained largely dormant for decades until the advent of generative models like ChatGPT reignited interest. AI in medical education is now emerging as a distinct discipline, with applications in simulating clinical scenarios to enhance students’ practical skills and adaptability, as well as supporting educators in optimizing curriculum design and teaching methodologies. The accumulation of literature in this field has reached a substantial scale, necessitating specialized bibliometric methods to elucidate its characteristics. This study represents the first comprehensive bibliometric analysis of this domain.

In recent years, the integration of AI and LLMs into medical education has attracted significant attention, promising to transform the training of medical professionals. Yet, its developmental trajectory remains uncertain. As we navigate this new frontier, it is essential to weigh both the potential benefits and the challenges posed by evolving AI technologies in the educational landscape. The principal findings of this study are summarized in Figure 6. In co-citation network analysis, betweenness centrality serves as a critical metric for assessing the intermediary role of nodes in information dissemination and knowledge flow.¹⁶ Nodes with a betweenness centrality of 0.1 or higher typically act as pivotal hubs, connecting diverse research groups and facilitating knowledge exchange.¹⁷ Our analysis identified 10 authors and 4 institutions with betweenness centrality values of 0.1 or greater in the author and institutional co-citation networks, underscoring their roles as key connectors fostering collaboration across research teams.

Figure 6 Summary diagram of the main findings of this research.

In contrast, no nodes in the co-citation network exceeded a betweenness centrality of 0.1, indicating that the field is still in its early developmental stage and lacks robust interdisciplinary connections or subfields. However, six citations approached a betweenness centrality of 0.1, suggesting their emerging significance. We anticipate that these citations will evolve into critical knowledge bridges, fostering connections across research directions in the future.
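
To make the betweenness centrality criterion concrete, the following Python sketch computes normalized betweenness centrality for a small undirected, unweighted network using Brandes' algorithm and then lists the nodes at or above a 0.1 threshold. It is an illustration under stated assumptions, not the computation performed inside CiteSpace: the function names and the toy network are hypothetical, and values are assumed to be normalized to a 0 to 1 scale.

```python
from collections import deque


def build_graph(edges):
    """Build a symmetric adjacency mapping from an undirected edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def betweenness_centrality(adjacency):
    """Normalized betweenness centrality (Brandes' algorithm, unweighted, undirected)."""
    nodes = list(adjacency)
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}     # number of shortest paths to each node
        dist = {v: -1 for v in nodes}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:           # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to the source.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(nodes)
    # Each unordered pair is counted twice, so dividing by (n-1)(n-2) normalizes to [0, 1].
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 0.0
    return {v: s * scale for v, s in score.items()}


def pivotal_nodes(centrality, threshold=0.1):
    """Return the nodes whose centrality meets or exceeds the threshold."""
    return [v for v, c in centrality.items() if c >= threshold]


if __name__ == "__main__":
    # Toy chain of five nodes (hypothetical): interior nodes bridge the ends.
    graph = build_graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")])
    centrality = betweenness_centrality(graph)
    print(sorted(pivotal_nodes(centrality, threshold=0.1)))
```

A node scores highly when many shortest paths between other nodes pass through it, which is why a high value is read as a bridging role between otherwise separate groups.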

Notably, the article with PMID 37115527, which has a betweenness centrality of 0.09 (Supplementary Table 1, No. 34), holds potential as a knowledge hub linking various subfields.¹⁸ Although it received only 18 local citations and does not focus directly on medical education, this cross-sectional study found that AI chatbot responses to patient questions were rated higher in quality and empathy than physician responses. By providing objective evidence of AI’s capabilities, this work lays a foundational knowledge base for AI and LLM applications in medical education and may bridge diverse subtopics in future research.

Another study, published on June 9, 2023, also with a betweenness centrality of 0.09, is a high-impact review.¹⁹ This article synthesizes the role of generative language models in medical education, highlighting their benefits alongside challenges, including the quality of AI-generated content, biases in AI systems, ethical and legal concerns, and risks of academic dishonesty. These issues are likely to be pervasive across AI research, reinforcing the study’s potential as a bridging node in citation networks. Similarly, another review with a betweenness centrality of 0.07¹⁹ advocates for longitudinal studies in medical schools to equip students with AI knowledge. Its high betweenness centrality, typical of reviews in emerging fields, underscores its dominant position in the literature landscape.

Beyond the influence of generative models like ChatGPT, an often-overlooked factor is the role of journal calls for papers, which act as catalysts in the early stages of a discipline’s development. In nascent fields, academic dynamics are more readily discernible over short periods. For instance, JMIR Medical Education issued calls for papers on this topic in 2023, and articles resulting from these calls exhibit significant bibliometric characteristics within the citation network. In contrast, in mature disciplines, the impact of journal calls may be diluted by the sheer volume of literature.²⁰ Compared to established fields, bibliometric analyses of emerging disciplines are relatively scarce. Our findings suggest that in the formative stages of a new field, journal calls for papers can profoundly influence key bibliometric characteristics, shaping the foundational knowledge of the discipline.

In the conceptual structure and thematic mapping sections, we categorized trending topics into hot and cold themes, with results summarized in the article’s conclusion figure (Figure 6). Medical ethics emerged as a potential area for future research. As early as 2021, scholars proposed integrating AI ethics education into medical curricula to prepare students for informed decision-making in practice.²¹ However, ethical challenges in medicine remain a critical hurdle for AI applications in medical education. Clinical reasoning, another under-researched area, is currently overshadowed by studies focusing on AI’s role in examinations, with limited attention to its impact on medical students.²² In the strategic coordinates of the thematic map, “clinical reasoning” falls in the Niche Themes quadrant (Figure 5B) with a low keyword frequency, which supports the view that it is under-studied. The rise of nursing education after 2020 may stem from the use of AI to simulate clinical scenarios, heightened by pandemic-driven remote learning needs. The COVID-19 pandemic highlighted the nurse shortage crisis, accelerating the adoption of artificial intelligence in nursing to optimize care processes, manage pandemic-related data, and enhance service quality, significantly driving the field’s rapid development.²³

The Josiah Macy Jr. Foundation’s funding and conferences²⁴ foster academic-industry partnerships with companies like OpenAI and Google, promoting AI literacy curricula and ethical frameworks to support precision education. Industry collaborations, as highlighted in Macy’s reports,²⁵,²⁶ drive innovation through data-sharing networks and open-source initiatives, mitigating resource inequities and biases in AI applications. Policymakers, supported by funders like the NIH and Macy Foundation, are encouraged to establish governance for safe AI integration, ensuring equitable access and robust research agendas for medical education.²⁵,²⁶ The rapid progress of AI in medical education is driven by cross-disciplinary funding and collaboration. Industry partnerships, via data sharing and funding, such as AIMI’s algorithm development, advance AI research for educational applications.²⁷ Policymakers, through programs like AIMI’s online courses, foster ethical AI integration in medical education.²⁷ Compared to AI research in computer science and engineering, educational AI research receives less funding and started later, yet medical robotics is favored in NSF and NIH funding, suggesting potential for interdisciplinary collaboration and broader funding advantages in medical education.²⁸

Limitations and Shortcomings

This study employs a bibliometric analysis based on a targeted title search within the Web of Science Core Collection (WoSCC). While this approach ensured high precision in selecting relevant studies, it limited the scope of topic coverage and sample size compared to broader search strategies or the inclusion of additional databases. Nonetheless, our findings remain robust and valid for the curated dataset. As the literature on AI and LLMs in medical education continues to expand, future research should conduct more comprehensive bibliometric analyses and publish updates regularly to capture the field’s evolving landscape.

Conclusions

AI and LLMs in medical education have rapidly emerged as a significant field in recent years. The journal JMIR Medical Education stands out as a key node due to its prominent bibliometric characteristics. During the early development of this discipline, the journal’s submission calls significantly shaped the bibliometric features of the literature, thereby fostering the field’s growth. Currently, the field of AI and LLMs in medical education continues to evolve without clearly defined subfields. Growing interest has been observed in topics such as nursing education, digital health, medical exams, and conversational agents. Research on ChatGPT and large language models holds a central and influential position within this landscape. Moreover, emerging areas of focus include medical ethics, medical training, and skills training. However, this analysis highlights a notable gap, namely insufficient attention to clinical reasoning, undergraduate education, and virtual reality in the context of AI and LLMs in medical education, which limits advances in diagnostic and immersive training.

Abbreviations

AI, Artificial Intelligence; LLM, Large Language Model; AI and LLMs, Artificial Intelligence and Large Language Models.

Data Sharing Statement

The data analyzed in this study are publicly accessible and downloadable from the Web of Science Core Collection.

Funding

This project was funded by the Ningbo Public Welfare Science and Technology Project (2023Y11), the Ningbo Top Medical and Health Research Program (No. 2022020203), and the First Affiliated Hospital of Ningbo University Teaching and Research Project (Grant No. 2025-JXK-006).

Disclosure

The authors report no conflicts of interest in this research.

References

1. Hswen Y, Abbasi J. AI will-and should-change medical school, says Harvard’s Dean for medical education. JAMA. 2023;330(19):1820–1823. doi:10.1001/jama.2023.19295

2. Jackson P, Ponath Sukumaran G, Babu C, et al. Artificial intelligence in medical education - perception among medical students. BMC Med Educ. 2024;24(1):804. doi:10.1186/s12909-024-05760-0

3. Tolentino R, Baradaran A, Gore G, et al. Curriculum frameworks and educational programs in AI for medical students, residents, and practicing physicians: scoping review. JMIR Med Educ. 2024;10:e54793. doi:10.2196/54793

4. Kumar A, Burr P, Young TM. Using AI text-to-image generation to create novel illustrations for medical education: current limitations as illustrated by hypothyroidism and Horner syndrome. JMIR Med Educ. 2024;10:e52155. doi:10.2196/52155

5. Franco DR, Mathew M, Mishra V, et al. Twelve tips for addressing ethical concerns in the implementation of artificial intelligence in medical education. Med Educ Online. 2024;29(1):2330250. doi:10.1080/10872981.2024.2330250

6. Ng F, Thirunavukarasu AJ, Cheng H, et al. Artificial intelligence education: an evidence-based medicine approach for consumers, translators, and developers. Cell Rep Med. 2023;4(10):101230. doi:10.1016/j.xcrm.2023.101230

7. He Q. Knowledge discovery through co-word analysis. Library Trends. 1999;48(1):133–159.

8. Chen C. CiteSpace II: detecting and visualizing emerging trends and transient patterns in scientific literature. J Am Soc Inf Sci Technol. 2006;57(3):359–377. doi:10.1002/asi.20317

9. Aria M, Cuccurullo C. bibliometrix: an R-Tool for comprehensive science mapping analysis. J Informetrics. 2017;11(4):959–975. doi:10.1016/j.joi.2017.08.007

10. Pinto Dos Santos D, Giese D, Brodehl S, et al. Medical students’ attitude towards artificial intelligence: a multicentre survey. Eur Radiol. 2019;29(4):1640–1646. doi:10.1007/s00330-018-5601-1

11. Kung TH, Cheatham M, Medenilla A, et al. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198. doi:10.1371/journal.pdig.0000198

12. Paranjape K, Schinkel M, Nannan Panday R, et al. Introducing artificial intelligence training in medical education. JMIR Med Educ. 2019;5(2):e16048. doi:10.2196/16048

13. Huh S. Document network and conceptual and social structures of clinical endoscopy from 2015 to July 2021 based on the web of science core collection: a bibliometric study. Clin Endoscopy. 2021;54(5):641–650. doi:10.5946/ce.2021.207

14. Alkhammash R. Bibliometric, network, and thematic mapping analyses of metaphor and discourse in COVID-19 publications from 2020 to 2022. Front Psychol. 2022;13:1062943. doi:10.3389/fpsyg.2022.1062943

15. Hagamen WD, Gardy M. The numeric representation of knowledge and logic—two artificial intelligence applications in medical education. IBM Syst J. 1986;25(2):207–235. doi:10.1147/sj.252.0207

16. Chen C, Hu Z, Liu S, et al. Emerging trends in regenerative medicine: a scientometric analysis in CiteSpace. Expert Opin Biol Ther. 2012;12(5):593–608. doi:10.1517/14712598.2012.674507

17. Chen C. Searching for intellectual turning points: progressive knowledge domain visualization. Proc Natl Acad Sci U S A. 2004;101(Suppl 1):5303–5310. doi:10.1073/pnas.0307513100

18. Ayers JW, Poliak A, Dredze M, et al. Comparing physician and artificial intelligence Chatbot responses to patient questions posted to a public social media forum. JAMA Intern Med. 2023;183(6):589–596. doi:10.1001/jamainternmed.2023.1838

19. Karabacak M, Ozkara BB, Margetis K, et al. The advent of generative language models in medical education. JMIR Med Educ. 2023;9:e48163. doi:10.2196/48163

20. Kaushik V. JMIR Medical Education. Call for papers: ChatGPT, generative language models and generative AI in medical education. Available from: https://mededu.jmir.org/themes/1302-theme-issue-chatgpt-and-generative-language-models-in-medical-education. Accessed October 3, 2024.

21. Katznelson G, Gerke S. The need for health AI ethics in medical school education. Adv Health Sci Educ Theory Pract. 2021;26(4):1447–1458. doi:10.1007/s10459-021-10040-3

22. Strong E, DiGiammarino A, Weng Y, et al. Performance of ChatGPT on free-response, clinical reasoning exams. medRxiv. 2023. doi:10.1101/2023.03.24.23287731

23. Chang CY, Jen HJ, Su WS. Trends in artificial intelligence in nursing: impacts on nursing management. J Nurs Manag. 2022;30(8):3644–3653. doi:10.1111/jonm.13770

24. Macy J. Foundation conference on artificial intelligence in medical education: proceedings and recommendations. Acad Med. 2025.

25. Boscardin CK, Abdulnour RE, Gin BC. Macy Foundation Innovation Report Part I: current landscape of artificial intelligence in medical education. Acad Med. 2025. doi:10.1097/ACM.0000000000006107

26. Gin BC, LaForge K, Burk-Rafel J, Boscardin CK. Macy Foundation Innovation Report Part II: from hype to reality: innovators’ visions for navigating AI integration challenges in medical education. Acad Med. 2025. doi:10.1097/ACM.0000000000006117

27. Langlotz CP, Kim J, Shah N, et al. Developing a research center for artificial intelligence in medicine. Mayo Clin Proc Digit Health. 2024;2(4):677–686. doi:10.1016/j.mcpdig.2024.07.005

28. Taylor ZW, Stan K. Exploring the stratified nature of artificial intelligence research funding in United States Educational Systems: a bibliometric and network analysis. Educ Sci. 2024;14(11):1248. doi:10.3390/educsci14111248

Creative Commons License © 2025 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
