Exploring Innovations: Machine Learning in Healthcare

Vusi Kubheka
Jun 3, 2024

Introducing the Problem

The problem identified in this essay is the high burden of disease in South Africa, which is straining the fiscal and physical capacity of our public health system. This urges us to rethink the way we approach health. The term “P4 Medicine” was coined by Leroy Hood and refers to a systems approach to health and disease that aims to make healthcare more Predictive, Preventative, Personalized and Participatory (Chivot, 2013). By identifying and treating healthy individuals before symptoms appear (which optimises individuals’ wellness and prevents disease), this approach shifts healthcare from being reactive to being proactive. Not only would this innovation improve health outcomes, it would also substantially reduce the cost and improve the quality of healthcare and enable consumers to take more accountability for monitoring their health (Flores et al., 2013).

The recent advances in Artificial Intelligence and machine learning provide us with the means to treat the causes instead of the symptoms of disease. In healthcare, machine learning depends on gathering patient data. Through specialized systems and tools for organizing data, machine learning algorithms can detect patterns within datasets. These patterns enable medical practitioners to uncover emerging illnesses and forecast treatment results. Machine Learning can be considered a disruptive, technological and process innovation. Other innovations that were considered to address this problem included Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) gene-editing technology (World Economic Forum, 2023). This disruptive and technological innovation could enable the development of targeted treatments that address the underlying genetic or molecular mechanisms of diseases (World Economic Forum, 2023). Social innovations such as community-based response programmes were also considered. This form of innovation empowers communities to find the best methods to address their own health challenges and enables interlinked causes of disease, structural vulnerabilities and health system inefficiencies to be addressed (Dako-Gyeke et al., 2020).

Definition and Description of Innovation

Machine Learning in healthcare involves the use of various forms of statistical models and computer programs to analyse historical and process healthcare databases and identify patterns in them, in order to determine different health outcomes, risks and opportunities (Kumar & Suthar, 2024; Mishra & Silakari, 2012). These patterns and predictive capabilities are improved through the iterative incorporation of more data over time (Ranapurwala et al., 2019). Machine Learning enables us to capture relationships between various health data points and events to assess the risk or potential associated with a specific set of conditions. The information extracted in this process is subsequently used to guide decision making (Ranapurwala et al., 2019).

How Machine Learning Works and its Principles

Machine Learning is used to support healthcare professionals and analysts in performing their roles by identifying health patterns, trends, and risk factors to inform decision-making and interventions (Kumar & Suthar, 2024). Machine Learning is a process that can be outlined as follows: health-related data collection from various sources; data preprocessing, which cleans and prepares the data for analysis and involves handling missing values, removing duplicates, and standardizing formats; Exploratory Data Analysis (EDA), which involves exploring the data to understand its characteristics, relationships, and patterns; feature selection/extraction, where the attributes or variables that are most predictive of the target outcome (e.g., disease diagnosis, treatment response) are selected or extracted from the dataset; model selection, where various machine learning algorithms are evaluated and selected based on how well they fit the available data and predict the target outcome; model training, where the selected models are trained on a portion of the data (the training set) to learn patterns and relationships between the input variables and the target outcome, with the model’s parameters adjusted to minimize prediction errors and improve performance; and finally model evaluation and optimization (Jothi, Rashid, & Husain, 2015). A simplified illustration of this process can be seen in Figure 1.
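The steps outlined above can be sketched in a few lines of code. The example below is a minimal illustration only, not a method taken from the cited studies: the patient records are synthetic, the variable names (age, systolic_bp, event) are hypothetical, and a small logistic regression written with the Python standard library stands in for whichever model would actually be selected.

```python
import math
import random


def make_records(n=200, seed=0):
    """Generate hypothetical (age, systolic_bp, event) records with flaws."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        age = round(rng.uniform(30, 80), 1)
        bp = round(rng.uniform(100, 180), 1)
        # Made-up rule plus noise: older age and higher pressure raise risk.
        risk = 0.06 * (age - 55) + 0.04 * (bp - 140) + rng.gauss(0, 0.5)
        if rng.random() < 0.05:
            bp = None  # about 5% of readings are missing
        records.append((age, bp, 1 if risk > 0 else 0))
    return records + records[:10]  # repeat some rows to mimic duplicates


def fit_scaler(rows):
    """Per-feature mean and standard deviation of the observed training values."""
    stats = []
    for j in range(len(rows[0]) - 1):
        observed = [r[j] for r in rows if r[j] is not None]
        mean = sum(observed) / len(observed)
        std = math.sqrt(sum((v - mean) ** 2 for v in observed) / len(observed))
        stats.append((mean, std or 1.0))
    return stats


def transform(rows, stats):
    """Impute missing values with the training mean, then standardize."""
    X = [[((m if r[j] is None else r[j]) - m) / s
          for j, (m, s) in enumerate(stats)] for r in rows]
    y = [r[-1] for r in rows]
    return X, y


def predict_proba(w, x):
    """Logistic model: w[0] is the intercept, w[1:] the feature weights."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w, X, y):
    """Average prediction error that training tries to minimize."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_proba(w, xi), 1e-12), 1 - 1e-12)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)


def train(X, y, lr=0.1, epochs=300):
    """Adjust the parameters by gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def accuracy(w, X, y):
    hits = sum((predict_proba(w, xi) >= 0.5) == (yi == 1) for xi, yi in zip(X, y))
    return hits / len(X)


# 1. Collect data, 2. remove duplicates, 3. split into training and test sets.
records = list(dict.fromkeys(make_records()))
random.Random(1).shuffle(records)
split = int(0.8 * len(records))
train_rows, test_rows = records[:split], records[split:]

# 4. Preprocess using statistics learned from the training set only.
stats = fit_scaler(train_rows)
X_train, y_train = transform(train_rows, stats)
X_test, y_test = transform(test_rows, stats)

# 5. Train, then 6. evaluate on data the model has not seen.
weights = train(X_train, y_train)
test_accuracy = accuracy(weights, X_test, y_test)
print(f"training loss: {log_loss(weights, X_train, y_train):.3f}")
print(f"test accuracy: {test_accuracy:.2f}")
```

The imputation and scaling statistics are fitted on the training portion only, so the held-out records give a fairer picture of how the model might behave on new patients. Exploratory analysis, feature selection and the comparison of several candidate models are left out to keep the sketch short.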

Important principles for predictive models developed through this process include the ability to handle missing information; infrequent or sporadic healthcare events and datasets; variation in the types of input elements and dimensions; variation in the periods of different health events; and multiple time sequences of data. The models should also be interpretable by the healthcare professionals who might use them (Hsu, Warren, & Riddle, 2022).

Applications in Other Contexts

In their study, Ranapurwala et al. (2019) applied Machine Learning algorithms to a dataset documenting farm vehicle crashes to forecast the risk of injury or death in a farm vehicle crash for a specific individual or scenario. The dataset consisted of 7094 farm crashes that occurred between 2005 and 2010 in the United States (Ranapurwala et al., 2019). The researchers tested and evaluated different predictive models that could adequately fit the type of data in the dataset (Ranapurwala et al., 2019).

Applicability to the Problem

In their study, de Carvalho et al. (2020) demonstrated that machine learning could be used to identify patients who have a higher probability of clinical events and who are more likely to incur higher costs in the long term. Not only did the machine learning model improve the prediction of cardiovascular outcomes, it also forecasted the expected health expenditure associated with the treatment of cardiovascular diseases (de Carvalho et al., 2020). This study adds to the evidence showing that controlling risk factors in individuals with a higher risk of adverse health events is more economically feasible than efforts directed at therapeutic adherence at a population level (de Carvalho et al., 2020).

A vital element of healthcare revolves around timely decision-making. With access to additional information, physicians and healthcare practitioners can mitigate risks by making decisions ahead of time regarding treatments (Javaid et al., 2022). Machine learning’s capability to analyze extensive datasets and offer valuable forecasts aids timely decisions and consequently allows a focus on preventative interventions that are more cost effective (Javaid et al., 2022). For example, being able to detect health problems such as the onset of a stroke early allows cheaper preventative pathways compared to the resources needed for treatment procedures.

Machine Learning also has the ability to improve clinical trial research by helping medical researchers to evaluate a broader range of data, while reducing the cost and time needed for clinical tests (Javaid et al., 2022). This technology can also be used to estimate optimal sample sizes for enhanced efficacy and a reduced risk of data errors (Javaid et al., 2022).

Limitations in the Healthcare Context

Proponents of deep learning-based models in healthcare champion their ability to extract meaningful patterns from health-related data. However, these methods lack the trustworthy and interpretable results needed in healthcare settings, which has stalled their transition from academic research to clinical implementation (Ravì et al., 2016). Zhang et al. (2020) highlight that these models are unable to tell healthcare professionals which variables are most meaningful to an outcome. Additionally, because these models lack a predefined structure, it is not possible to account for their validity or reliability in making predictions, ultimately making them untrustworthy (Henriques & Antunes, 2014). In the healthcare context, model interpretability requires predictive models to identify and explain the degree to which each factor contributes to the prediction made (Ravì et al., 2016; Zhang et al., 2020). Several models that have sought interpretability have overlooked model trustworthiness by using the model’s predicted probability as a confidence score (Zhang et al., 2020). However, crucial to model trustworthiness is providing information about the level of uncertainty or confidence in a model’s predictions, rather than relying solely on predicted probabilities as confidence scores (Zhang et al., 2020). This is a key vulnerability of deep-learning or unsupervised machine learning algorithms: they are ultimately deterministic in nature. For any arbitrary data input, the models will always produce a deterministic result without giving any measure of uncertainty (Zhang et al., 2020).
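The difference between a predicted probability and a measure of uncertainty can be made concrete with a small sketch. The data, the single "risk score" feature and the use of bootstrap resampling are all assumptions made for illustration; bootstrapping is used here only as a simple stand-in and is not the approach taken in the cited papers.

```python
import math
import random
import statistics


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=200):
    """Fit p(y=1|x) = sigmoid(a + b*x) by gradient descent on the log loss."""
    a = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(a + b * x) - y
            grad_a += err
            grad_b += err * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical data: one standardized risk score per patient, binary outcome.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(60)]
ys = [1 if x + rng.gauss(0, 1) > 0 else 0 for x in xs]


def bootstrap_predictions(x_new, n_models=50):
    """Refit the model on resampled data and collect its predictions for x_new."""
    preds = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]
        a_i, b_i = fit_logistic([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(sigmoid(a_i + b_i * x_new))
    return preds


a, b = fit_logistic(xs, ys)
x_new = 0.5

# A single fitted model is deterministic: same input, same number, every time.
point = sigmoid(a + b * x_new)
repeat = sigmoid(a + b * x_new)

# The spread across refitted models hints at how stable that number is.
preds = bootstrap_predictions(x_new)
spread = statistics.stdev(preds)
print(f"single-model probability: {point:.2f}")
print(f"bootstrap range: {min(preds):.2f} to {max(preds):.2f} (std {spread:.3f})")
```

The single probability says nothing about how much it would change had the model been trained on slightly different patients. The bootstrap range gives a rough answer to that question, which is the kind of extra information a clinician would need before relying on the prediction.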

Possible Modifications for Successful Implementation

Therefore, it is vital to develop machine learning algorithms that can explain the predictions they produce. These algorithms will be better equipped to provide the interpretable predictions required in the healthcare setting by demonstrating how input values are processed into the observed output values. Recognising the importance of this, Sarker (2024) suggests modifications to machine learning algorithms that enhance their interpretability and give clearer insights into their decision-making capabilities. By implementing techniques such as “feature importance analysis, SHAP (SHapley Additive exPlanations) values, and partial dependence plots”, health professionals and stakeholders could gain a better understanding of the factors influencing predictions (Sarker, 2024).
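To show what such an explanation looks like, the sketch below computes exact Shapley values, the quantity that SHAP approximates, for one prediction of a tiny model. Everything in it is hypothetical: the risk score formula, the feature names, the patient and the baseline ("average") patient are invented for illustration, and the SHAP library itself is not used. Features that are left out of a coalition are set to their baseline value, a common simplification that treats features as independent.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley value of each feature for the prediction at x."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take the patient's value, others the baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


# Hypothetical risk score with an interaction between age and smoking.
FEATURES = ["age_decades", "systolic_bp_10mmHg", "smoker"]


def risk_score(v):
    age, bp, smoker = v
    return 0.5 * age + 0.3 * bp + 1.0 * smoker + 0.2 * age * smoker


patient = [6.5, 15.0, 1.0]
baseline = [5.0, 12.0, 0.0]

phi = shapley_values(risk_score, patient, baseline)
for name, contribution in zip(FEATURES, phi):
    print(f"{name}: {contribution:+.2f}")
print(f"total: {sum(phi):+.2f}")
```

For this made-up patient the contributions are +0.90 for age, +0.90 for blood pressure and +2.15 for smoking, which add up to the 3.95 difference between the patient's score and the baseline score. That additive breakdown is what allows a health professional to see how much each factor pushed the prediction up or down. The exact calculation grows exponentially with the number of features, which is why practical tools rely on approximations.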

Conclusion

Addressing the high burden of disease in South Africa necessitates a paradigm shift towards proactive healthcare approaches like P4 Medicine, which emphasizes predictive, preventative, personalized, and participatory strategies. Machine learning emerges as a pivotal form of innovation in this approach, enabling the development of predictive models that identify health risks and facilitate timely interventions. Machine learning's ability to analyze large datasets and forecast outcomes offers promising paths for improving healthcare decision-making and resource allocation. However, challenges such as interpretability and trustworthiness must be addressed for successful implementation in clinical settings. Modifications to algorithms, focusing on enhancing interpretability and providing clearer insights into decision-making processes, can enhance acceptance and utilization of machine learning innovations in healthcare. Embracing these advancements holds the potential to revolutionize healthcare delivery, improving outcomes while optimizing costs and empowering individuals to take proactive control of their health.

References

Chivot, E. (2013). Innovation for Prevention and Health. http://www.jstor.org/stable/resrep24006

Dako-Gyeke, P., Amazigo, U. V., Halpaap, B., & Manderson, L. (2020). Social innovation for health: engaging communities to address infectious diseases. Infectious diseases of poverty, 9(1), 98. https://doi.org/10.1186/s40249-020-00721-3

de Carvalho, L. S. F., Gioppato, S., Fernandez, M. D., Trindade, B. C., Silva, J. C. Q. e., Miranda, R. G. S., de Souza, J. R. M., Nadruz, W., Avila, S. E. F., & Sposito, A. C. (2020). Machine Learning Improves the Identification of Individuals With Higher Morbidity and Avoidable Health Costs After Acute Coronary Syndromes. Value in Health, 23(12), 1570-1579. https://doi.org/10.1016/j.jval.2020.08.2091

Flores, M., Glusman, G., Brogaard, K., Price, N. D., & Hood, L. (2013). P4 medicine: how systems medicine will transform the healthcare sector and society. Per Med, 10(6), 565-576. https://doi.org/10.2217/pme.13.57

World Economic Forum. (2023). 5 innovations that are revolutionizing global healthcare. WEF. Retrieved 19 April 2024 from https://www.weforum.org/agenda/2023/02/health-future-innovation-technology/

Henriques, R., & Antunes, C. (2014). Learning predictive models from integrated healthcare data: Extending pattern-based and generative models to capture temporal and cross-attribute dependencies. 2014 47th Hawaii International Conference on System Sciences.

Hsu, W., Warren, J. R., & Riddle, P. J. (2022). Medication adherence prediction through temporal modelling in cardiovascular disease management. BMC medical informatics and decision making, 22(1), 313.

Javaid, M., Haleem, A., Pratap Singh, R., Suman, R., & Rab, S. (2022). Significance of machine learning in healthcare: Features, pillars and applications. International Journal of Intelligent Networks, 3, 58-73. https://doi.org/10.1016/j.ijin.2022.05.002

Jothi, N., Rashid, N. A. A., & Husain, W. (2015). Data Mining in Healthcare – A Review. Procedia Computer Science, 72, 306-313. https://doi.org/10.1016/j.procs.2015.12.145

Kumar, D., & Suthar, N. (2024). Predictive analytics and early intervention in healthcare social work: a scoping review. Social Work in Health Care, 1-22.

Mishra, N., & Silakari, D. S. (2012). Predictive Analytics: A Survey, Trends, Applications, Oppurtunities & Challenges.

Ranapurwala, S. I., Cavanaugh, J. E., Young, T., Wu, H., Peek-Asa, C., & Ramirez, M. R. (2019). Public health application of predictive modeling: an example from farm vehicle crashes. Injury epidemiology, 6, 1-11.

Ravì, D., Wong, C., Deligianni, F., Berthelot, M., Andreu-Perez, J., Lo, B., & Yang, G.-Z. (2016). Deep learning for health informatics. IEEE journal of biomedical and health informatics, 21(1), 4-21.

Sarker, M. (2024). Revolutionizing healthcare: the role of machine learning in the health sector. Journal of Artificial Intelligence General science (JAIGS) ISSN: 3006-4023, 2(1), 35-48.

Zhang, X., Qian, B., Cao, S., Li, Y., Chen, H., Zheng, Y., & Davidson, I. (2020). INPREM: An Interpretable and Trustworthy Predictive Model for Healthcare. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, CA, USA. https://doi.org/10.1145/3394486.3403087

BHSc (Honours) Health Systems Sciences, Witwatersrand University