Machine learning and artificial intelligence (AI) in RWE studies
Large volumes of health care data have been collected in Finland through national, regional and private biobanks since the Biobank Act took effect in 2013. Additional data sources include national registries that have been running for decades and the hospital “data lakes” that are emerging in the largest university hospitals. More importantly, these sources can be linked with each other through the unique national identity code, which allows comprehensive health care data to be collected on a large number of individuals.
However, the raw data cannot be interpreted as such; it requires substantial processing before it turns into knowledge. With the right tools and people, there are great treasures hidden in the data, waiting to be found.
Current RWE studies
The real-world evidence (RWE) studies completed so far by Medaffcon have yielded vital knowledge on different patient populations and their characteristics. The general goals have been to describe the incidence and prevalence of the disease, the patients’ health care resource utilization, and their mortality and survival. In addition, the studies have always included more detailed, disease-specific objectives. The results describe the current state of the patients in a real-world setting, which usually differs from the setting of clinical trials. They have deepened knowledge of the patients’ current treatment and well-being, and this knowledge has been used to improve patient care.
Even in its current state, RWE data holds great potential for something more, for example when tools from the field of machine learning are applied.
Unsupervised Machine Learning
Unsupervised machine learning is a general set of algorithms and approaches that describe and group hidden structures in “data without labels”. It includes, for example, various clustering methods and special types of neural networks such as self-organizing maps.
One possibility offered by these tools is the identification of new, more precisely stratified patient groups. Instead of categorizing a disease as “severe” or “non-severe” based on a limited set of clinical parameters (usually just one or two), unsupervised machine learning tools can learn complex patterns in the RWE data using all parameters simultaneously and create new patient classifications. In general, one could find more precisely defined subclasses of patients, or entirely new ways to classify them. With these new patient subgroups, it would be possible to study further which patients benefit the most from new drugs or treatments, moving treatment patterns and options one step towards personalized medicine.
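As a rough illustration of the clustering idea above, the sketch below groups a handful of made-up patients with plain k-means. Everything in it is an assumption for illustration only: the two rescaled features (hypothetically age and one lab value), the toy data, the `kmeans` helper and its simple farthest-point initialization. A real RWE analysis would involve many more variables, careful preprocessing and an established library.

```python
import math

def kmeans(points, k, iterations=50):
    """Cluster points (tuples of floats) into k groups with plain k-means."""
    # Deterministic start (an assumption of this sketch): take the first point,
    # then repeatedly add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each patient to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster where it is
        if updated == centroids:
            break  # nothing moved, so the grouping is stable
        centroids = updated
    return labels, centroids

# Hypothetical patients: (age, lab value), both rescaled to the 0..1 range.
patients = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
    (0.90, 0.80), (0.85, 0.95), (0.80, 0.90),
]
labels, centroids = kmeans(patients, k=2)
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

No “severe” or “non-severe” label is given to the algorithm; the two groups emerge from the data alone, which is the sense in which the method is unsupervised.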
The idea of generating new patient groups can be turned the other way around when moving from the field of unsupervised machine learning to supervised machine learning. In these approaches, the “labels” of the data are known, and generally the goal is to find the most important rules or features that divide patients into these groups.
For example, a given drug might be very effective in some patients, whereas it seems to have close to no impact in others, and the reason for this is still not fully understood. Using supervised machine learning approaches, such as decision trees, it would be possible to find the most important features that seem to divide patients into responders and non-responders. Once again, all the clinical parameters can be considered simultaneously, and the selection can be done without prior assumptions. The already collected RWE data could be used to find these “new” biomarkers that seem to affect the treatment outcome in a real-world setting. Moreover, the biological samples stored in the biobanks enable studies to take additional measurements and derive more data if, for example, crucial lab measurements are missing.
In addition, supervised machine learning also includes other approaches, such as support vector machines and various neural networks, which can create more complex classification rules and can be used in similar tasks.
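The core step of a decision tree, choosing the single feature and cut-off that best separates the labelled groups, can be sketched as follows. This is a minimal illustration and not a full tree-building algorithm: the feature names (`age`, `baseline_crp`, `bmi`), the toy values, the responder labels and the `best_split` helper are all hypothetical, and Gini impurity is just one common splitting criterion.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means a pure group)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, feature_names):
    """Return (feature, threshold, weighted impurity) of the best single split."""
    best = None
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (name, threshold, score)
    return best

# Hypothetical data: one row per patient, 1 = responder, 0 = non-responder.
feature_names = ["age", "baseline_crp", "bmi"]
rows = [
    (54, 2.0, 31.0),
    (61, 3.0, 24.0),
    (47, 2.5, 28.0),
    (66, 8.0, 30.0),
    (58, 9.5, 25.0),
    (50, 7.0, 27.0),
]
responder = [1, 1, 1, 0, 0, 0]

feature, threshold, impurity = best_split(rows, responder, feature_names)
print(feature, threshold, impurity)  # → baseline_crp 5.0 0.0
```

In this made-up data set, every feature is evaluated and the lab value is the one that separates responders from non-responders cleanly. A full decision tree repeats the same search within each resulting subgroup.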
Neural Network – Deep Learning
Neural networks and deep learning are among the recent rising stars of machine learning. They have been part of the latest advancements in artificial intelligence (AI) in various fields and could also be used in health sciences.
In neural networks, the input parameters (in this case the clinical profile of the patient) get abstracted, and it is no longer meaningful to track which of the variables affect the prediction and how. However, these approaches can yield surprisingly good classifications and predictions, and they have proven their usefulness in, for example, image and speech recognition. In principle, it would be possible to train a neural network on the available real-world data and create an AI that predicts the most effective treatment option for future patients, based on all previous patients and their selected treatments and outcomes. In fact, machine learning approaches are already used in several health care applications, for example in various imaging procedures and in the automatic detection of anomalies in long-term ECG recordings.
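To give a feel for why individual variables become hard to track, the toy sketch below runs the forward pass of a very small network with two inputs, two hidden units and one output. The weights are set by hand purely for illustration (nothing here is trained on data), and the inputs `a` and `b` stand for two hypothetical yes/no patient variables.

```python
def relu(x):
    """Rectified linear unit, a common activation function."""
    return max(0.0, x)

def tiny_network(a, b):
    """Forward pass: 2 inputs -> 2 hidden units -> 1 output score."""
    # Hidden layer: each unit mixes both inputs (hand-set weights and biases).
    h1 = relu(1.0 * a + 1.0 * b)
    h2 = relu(1.0 * a + 1.0 * b - 1.0)
    # Output layer: combines the hidden units, not the raw inputs.
    return 1.0 * h1 - 2.0 * h2

for a in (0, 1):
    for b in (0, 1):
        print(a, b, tiny_network(a, b))

# Effect of switching `a` on, with `b` absent and with `b` present.
effect_without_b = tiny_network(1, 0) - tiny_network(0, 0)
effect_with_b = tiny_network(1, 1) - tiny_network(0, 1)
print(effect_without_b, effect_with_b)  # → 1.0 -1.0
```

Even in this tiny case, switching `a` on raises the score when `b` is absent and lowers it when `b` is present, so there is no single “effect of `a`” to report. Real networks have thousands or millions of such weights, which is why their predictions are hard to trace back to individual clinical variables.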
Machine learning and AI in the future…
Looking further ahead, within several years artificial intelligence could be helping doctors in their everyday clinical work, both in diagnosis and in selecting treatment. One day, AI could automatically recommend the most effective and appropriate treatment based on the patient records, considering a wider panel of relevant parameters than the human mind could ever comprehend at once. As the health care data being collected is increasing rapidly in both variety and quantity, AI could later also consider, for example, genetic information and all the behavioural and health-related data collected via your smartphone and other devices. It would then be possible to detect an emerging disease before the symptoms appear and to start interventions as soon as possible. In this way, instead of only providing the best possible care, we could move towards predictive and preventive care. Today this is still closer to science fiction than reality, but we might reach our dreams earlier than expected.
Watch a video about machine learning and AI
Iiro Toppila, Medaffcon’s data analysis lead, talked about machine learning and AI at our customer evening. Watch a recording of the talk below.
“RWE data possesses huge potential to improve patient care, and machine learning could be the tool to utilize this potential now and in the future.” – Iiro Toppila, Biostatistician, Medaffcon Oy