Recent Advances on Graph Analytics and Its Applications in Healthcare

In recent years, because of the rapid development of data mining and knowledge discovery, many novel graph analytics algorithms have been proposed and successfully applied in a variety of areas. The goal of this tutorial is to summarize the graph analytics algorithms developed recently and how they have been applied in healthcare.

This tutorial will be held at KDD 2020 on August 23rd, 8:00 AM PDT (11:00 AM EDT), with live streaming at:

Tutorial information

A graph is a natural representation that encodes both the features of data samples and the relationships among them. Graph analysis is a classic topic in data mining, and many techniques have been proposed in the past. In recent years, because of the rapid development of data mining and knowledge discovery, many novel graph analytics algorithms have been proposed and successfully applied in a variety of areas. The goal of this tutorial is to summarize the graph analytics algorithms developed recently and how they have been applied in healthcare.

In particular, our tutorial will cover both the technical advances and their applications in healthcare. On the technical side, we will introduce deep network embedding techniques, graph neural networks, knowledge graph construction and inference, graph generative models, and graph neural ordinary differential equation models.

On the healthcare side, we will introduce how these methods can be applied to predictive modeling of clinical risks (e.g., chronic disease onset, in-hospital mortality, and condition exacerbation) and disease subtyping with multi-modal patient data (e.g., electronic health records, medical images, and multi-omics), to knowledge discovery from biomedical literature and its integration with data-driven models, and to pharmaceutical research and development (e.g., de novo chemical compound design and optimization, patient similarity for clinical trial recruitment, and pharmacovigilance).

We will conclude the tutorial with a set of potential issues and challenges, such as interpretability, fairness, and security. In particular, considering the global COVID-19 pandemic, we will also summarize existing research that has leveraged graph analytics to help understand the mechanism, transmission, treatment, and prevention of COVID-19, and point out available resources and potential opportunities for future research.

Tutorial materials and outline

  • Recent Advances in Graph Analytics and Its Applications in Healthcare (30 min) [PDF] [YouTube]
    • Introduction
    • Outline
    • Healthcare & Graphs
  • Network Embedding & Graph Neural Networks (50 min) [PDF] [YouTube]
    • Healthcare and Graphs
    • Learning from Networks
    • Network Embedding
    • Graph Neural Networks
    • Rethinking, Summary, and Conclusion
  • Knowledge Graph Mining (50 min) [PDF] [YouTube]
    • General Knowledge Graphs
    • Knowledge Graphs in Healthcare
    • Knowledge Graph Construction
    • Knowledge Graph Inference
  • Graph Generative Models & Drug Discovery (50 min) [PDF] [YouTube]
    • Drug discovery
    • Molecular Graph Generation and Normalizing Flow Models
    • Our MoFlow for Molecular Graph Generation
    • Experiments on Molecule Generation, Reconstruction, Visualization and Optimization
  • Conclusion: Discussions and Future Directions (30 min) [PDF] [YouTube]
    • More on Data Quality, Privacy, Interpretation, Bias, and Security, etc.

Virtual Location

Live streaming on August 23rd, 2020, 8:00 AM PDT (11:00 AM EDT):

Video on Demand:


3 hours and 30 minutes, plus a 30-minute break. This tutorial will be held at KDD 2020 on August 23rd, 8:00 AM PDT (11:00 AM EDT).


This tutorial will be highly accessible to all data mining researchers, students and practitioners who are interested in graph analytics. The tutorial will be self-contained. No special prerequisite knowledge is required to attend this tutorial. The estimated number of participants is 100.


Fei Wang is currently an Associate Professor of Health Informatics in the Department of Population Health Sciences, Weill Cornell Medicine, Cornell University. His major research interest is data mining and its applications in health data science. He has published more than 200 papers in top venues of related areas, such as ICML and KDD. His papers have received over 11,000 citations so far, with an H-index of 52. His papers have won 7 best paper awards at top international conferences on data mining and medical informatics. His team won the NIPS/Kaggle Challenge on Classification of Clinically Actionable Genetic Mutations in 2017 and the Parkinson's Progression Markers Initiative data challenge organized by the Michael J. Fox Foundation in 2016. Dr. Wang is the recipient of the NSF CAREER Award in 2018, the inaugural research leadership award at the IEEE International Conference on Health Informatics (ICHI) 2019, the Amazon AWS Machine Learning for Research Award in 2017 and 2019, and a Google Faculty Research Award. Dr. Wang’s research has been supported by NSF, NIH, ONR, PCORI, MJFF, AHA, and others. Dr. Wang is the chair of the Knowledge Discovery and Data Mining working group of the American Medical Informatics Association (AMIA).

Peng Cui is an Associate Professor with tenure at Tsinghua University. He received his PhD degree from Tsinghua University in 2010. His research interests include network representation learning, causally-regularized machine learning, and social-sensed multimedia computing. He has published more than 100 papers in prestigious conferences and journals in data mining and multimedia. His recent research won the IEEE Multimedia Best Department Paper Award, SIGKDD 2016 Best Paper Finalist, ICDM 2015 Best Student Paper Award, SIGKDD 2014 Best Paper Finalist, IEEE ICME 2014 Best Paper Award, ACM MM12 Grand Challenge Multimodal Award, and MMM13 Best Paper Award. He is an Associate Editor of IEEE TKDE, IEEE TBD, ACM TIST, and ACM TOMM, among others.

Jian Pei is currently a professor in the School of Computing Science at Simon Fraser University, Canada. His expertise is in developing business-driven, technology-enabled data analytics for critical applications. His publications have been cited more than 94,000 times in the literature, including more than 36,000 times since 2015. He has an h-index of 87. He is also active in providing consulting services to industry and transferring his research outcomes to industry and applications. He has received several prestigious awards, including the 2017 ACM SIGKDD Innovation Award, the 2015 ACM SIGKDD Service Award, and the 2014 IEEE ICDM Research Contributions Award. He is a fellow of the Royal Society of Canada, the ACM, and the IEEE.

Yangqiu Song is an assistant professor in the Department of CSE with a joint appointment in the Math Department at HKUST, and associate director of the WeChat-HKUST Joint Lab on Artificial Intelligence Technology (WHAT Lab) and the HKUST-WeBank Joint Lab. His current research focuses on using machine learning and data mining to extract and infer insightful knowledge from big data. This knowledge helps users better enjoy their daily lives and social activities, or helps data scientists perform better data analytics. He is particularly interested in large-scale learning algorithms, natural language understanding, text mining, and information networks, including knowledge graphs. He received the Yelp 2018 Data Challenge Best Paper Award, the KDD 2017 Data Science Best Paper Award, an IUI 2015 Best Paper Honorable Mention, a KDD 2014 paper selected for TKDD publication, and a PAKDD 2007 Best Paper Award Honorable Mention. He is now on the JAIR editorial board and served as IJCAI 2019 Local Co-Chair and ACL 2020 Area Chair.

Chengxi Zang is currently a postdoctoral research associate at Weill Cornell Medicine, Cornell University. He received his Ph.D. from Tsinghua University in 2019 with an Excellent Ph.D. Award at Tsinghua University (top 3%). He has worked extensively on mining, modeling, and learning the structures and dynamics of complex social and biological systems. For example, he was the first to study the human and social dynamics of WeChat at a billion scale, and he also investigated millions of information cascades in Tencent Weibo. Currently, his focus is drug discovery driven by large-scale chemical data and AI. His research was an ICDM'18 Best Paper Candidate and won the Best Paper Award at the AAAI'20 Workshop on Deep Learning on Graphs.