Rapid advances in Artificial Intelligence (AI) have opened the way for the creation of a huge range of new healthcare tools, but to ensure that these tools do not exacerbate pre-existing health inequities, researchers urge the use of more representative data in their development.

Researchers from Oxford University’s Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences (NDORMS), University College London and the Centre for Ethnic Health Research, supported by Health Data Research UK, have for the first time studied the full detail of ethnicity data in the NHS. They outline the importance of using representative data in healthcare provision and have compiled this information into a research-ready database.

The new study, published in Nature Scientific Data, is the first part of a three-phase project that aims to reduce bias in AI health prediction models which are trained on real-world patient data. The project, which addresses ethnicity disparities that were highlighted during the pandemic, is part of the UK Government’s COVID-19 Data and Connectivity National Core Study led by Health Data Research UK.

The researchers used de-identified data on ethnicity and other characteristics from general practice and hospital health records, accessed safely within NHS England’s Secure Data Environment (SDE) service, via the British Heart Foundation Data Science Centre’s CVD-COVID-UK/COVID-IMPACT Consortium. This is the first time that patient ethnicity data has been studied at this depth and breadth for the whole population of England. The researchers were able to combine records to analyse patient self-identified ethnicity, recorded using more than 489 potential codes.

Researchers analysed how more than 61 million people in England identified their ethnicity across more than 250 different groups. They also looked at the characteristics of those with no record of their ethnicity, and at how conflicts in patient ethnicity data can arise. The data, now available for other researchers to use, shows that one in ten patients lack ethnicity records, and that around 12% of patients have conflicting ethnicity codes in their records.

Read the full story on the University of Oxford website