Directory of Experts

Research project title

GAI-ORKG : Generative AI for Oncology Research with Knowledge Graphs / Intelligence Artificielle Générative pour la Recherche en Oncologie avec Graphe de Connaissances (doctorate)

Education level



Director: Amal Zouaq

End of display

July 19, 2025

Areas of expertise

Artificial intelligence

Medical sciences

Unit(s) and department(s)

Department of Computer Engineering and Software Engineering

LAMA-WeST (Web, Semantics and Text)


The LAMA-WeST laboratory is recruiting several doctoral students starting in the fall 2023 semester.

The doctoral candidate must have a master's degree in natural language processing / machine learning and/or Semantic Web (creation of ontologies, knowledge bases, etc.). He or she must be passionate about research and have knowledge of Python programming. Experience in the field of AI and health is a plus.

Please send to  a CV, a transcript, and a letter explaining how your past experience can contribute to this project. Please indicate in the subject line of the message: Doctorate (D1 or D2) - GAI-ORKG: Generative AI for Oncology Research with Knowledge Graph.

Detailed description

Over the last two decades, healthcare has moved from a paper-based reality to a digital one, and a trove of digital health data now exists. Simultaneously, an era of AI has dawned, with benefits to many areas of society. But the unstructured and siloed nature of much health data means these parallel developments have barely converged, and the benefits of AI in healthcare remain, as yet, unrealized. This is particularly true in cancer care. For many cancer patients, important information is buried in clinical notes in disparate parts of their electronic health record. Likewise, useful information that could otherwise contribute to AI-powered cancer research lies trapped and inaccessible to researchers. A solution to combine, consolidate, and exploit unstructured health data is needed.

To achieve this objective, the research team will leverage modern standards for health data to build/learn a cancer patient knowledge base (i.e., a fully structured record for each patient) from both structured and unstructured data in electronic health records. We will investigate how neural architectures, pretrained language models, and knowledge graphs can be used to extract such a knowledge base and provide relevant information to specialists through natural language generation approaches.

Two PhDs are planned in this project:

The objective of the first PhD (D1) will be to design and implement a methodology for extracting information from texts and populating a knowledge base in oncology. This will therefore involve knowledge representation and knowledge extraction challenges, alignment challenges, as well as neural (including generative) NLP models.

The objective of the second PhD (D2) will be to design and implement a methodology to generate adapted summaries from clinical notes and the oncology knowledge base. This will therefore involve challenges in natural language synthesis and generation, the integration of knowledge graph embeddings, and methods to avoid and detect model hallucinations.

Professors involved in the project:

Polytechnique Montréal: Amal Zouaq
McGill: John Kildea

Financing possibility

Funding available