This course is suitable for MSD students and research staff interested in developing a basic understanding of “real-world” or observational health data science. The sessions are interactive. No prior knowledge is required.

course aim

This course will provide an opportunity to learn about the foundations of a series of key topics in observational health data science. It includes an introduction to “real-world” data sources, epidemiology principles, and applied machine learning for clinical risk prediction and cluster analysis.

The course was designed in response to needs identified by students, and it covers topics that are not taught together in any other course across the Division.

course content

SESSION 1 – Introduction to real-world data sources in the UK: CPRD GOLD & Aurum, ONS, HES, UK Biobank

Students will learn about the most influential data sources available in the UK: why they are collected, how they are structured and linked, and how to gain permission to use them. The challenges of working with real-world data will be discussed, together with approaches to data harmonisation and standardised analytics.

SESSION 2 – Introduction to Epidemiology Basics

Students will learn the principles and scope of epidemiology and examine the benefits and limitations of epidemiological studies. Concepts such as ‘PICO’, ‘confounding’ and ‘bias’ will be introduced, and the differences between study designs such as cohort and case-control studies will be examined.

SESSION 3 – Introduction to Machine Learning for Prediction Modelling

This session will offer a brief overview of machine learning methods for healthcare applications, including supervised and unsupervised learning, followed by real-world examples of data analysis using routinely-collected data.

SESSION 4 – Introduction to Unsupervised Learning Approaches

This session will cover unsupervised learning methods and their application to cluster analysis and sub-group detection, using routinely-collected data and actual clinical case studies.

course objectives

The aim is to help participants become familiar with some of the key observational health data sources and data science approaches, along with their strengths and limitations when applied in a variety of clinical scenarios.

This course is suitable for those interested in developing a basic understanding of “real-world” or observational health data science as a foundation for more advanced studies aimed at improving healthcare practice and policy, with a focus on health access, interventions, and outcomes.

participant numbers

20

ATTENDANCE CERTIFICATE ON SURVEY COMPLETION

You are now required to complete the three short questions in the survey you receive after attending the course. Once you have submitted the survey, you will be sent an email with a link to your attendance certificate. This ensures we receive the feedback we need to evaluate and improve our courses. Survey results are downloaded and stored anonymously.