Session 1
Introduction to Health Analytics
This session provides an overview of health analytics, the use of data analysis and statistical methods to improve healthcare outcomes. It covers how healthcare data are collected, managed, and analysed, and how applying health analytics in clinical settings, public health, and healthcare management can enhance health systems.
Competencies that Health Analytics Requires
Data Modelling
This involves coding representations of real-world processes, such as patient admissions, to relate diverse data points meaningfully and mirror actual workflows.
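As a minimal sketch of what such a model might look like, the Python below defines two hypothetical record types, Patient and Admission, and links them through a patient identifier. The class and field names are assumptions made for illustration and are not taken from any particular hospital system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    patient_id: str
    birth_date: date


@dataclass
class Admission:
    admission_id: str
    patient_id: str                       # links the admission back to a Patient
    ward: str
    admitted_on: date
    discharged_on: Optional[date] = None  # None while the patient is still admitted

    def length_of_stay(self, as_of: date) -> int:
        """Days in hospital, counted up to discharge, or up to `as_of` if still admitted."""
        end = self.discharged_on or as_of
        return (end - self.admitted_on).days


# A completed stay and an open one for the same (hypothetical) patient.
stay = Admission("A1", "P1", "Cardiology", date(2024, 3, 1), date(2024, 3, 5))
open_stay = Admission("A2", "P1", "Emergency", date(2024, 3, 8))
```

Modelling the discharge date as optional mirrors the real workflow, in which an admission exists in the system before its outcome is known.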
Extract, Transform & Load
Analysts should extract, transform, and load (ETL) data from disparate systems into a unified data warehouse, enabling integrated analysis across multiple sources like EMR, patient satisfaction, and costing systems.
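The three ETL steps can be sketched roughly as follows. This is an illustrative example only: the source rows, column names, and the use of an in-memory SQLite database as the "warehouse" are all assumptions, standing in for real EMR and patient satisfaction systems.

```python
import sqlite3

# Extract: rows as they might arrive from two separate (hypothetical) source systems.
emr_rows = [
    {"patient_id": "p1", "admit_date": "01/03/2024", "ward": "cardiology "},
    {"patient_id": "P2", "admit_date": "15/03/2024", "ward": "Oncology"},
]
survey_rows = [
    {"patient": "P1", "score": "4"},
    {"patient": "P2", "score": "5"},
]


def to_iso(day_month_year: str) -> str:
    """Convert a DD/MM/YYYY string into ISO YYYY-MM-DD."""
    day, month, year = day_month_year.split("/")
    return f"{year}-{month}-{day}"


# Transform: standardise identifiers, dates, and text so the sources can be joined.
admissions = [
    (r["patient_id"].upper(), to_iso(r["admit_date"]), r["ward"].strip().title())
    for r in emr_rows
]
satisfaction = [(r["patient"].upper(), int(r["score"])) for r in survey_rows]

# Load: write both cleaned tables into one warehouse (in-memory SQLite here).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE admissions (patient_id TEXT, admit_date TEXT, ward TEXT)")
warehouse.execute("CREATE TABLE satisfaction (patient_id TEXT, score INTEGER)")
warehouse.executemany("INSERT INTO admissions VALUES (?, ?, ?)", admissions)
warehouse.executemany("INSERT INTO satisfaction VALUES (?, ?)", satisfaction)

# Once loaded, the two sources can be analysed together.
joined = warehouse.execute(
    "SELECT a.patient_id, a.admit_date, a.ward, s.score "
    "FROM admissions a JOIN satisfaction s ON a.patient_id = s.patient_id "
    "ORDER BY a.patient_id"
).fetchall()
```

The transform step carries most of the work here: without consistent identifiers and date formats, the final join would silently drop or mismatch records.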
Data Analysis
Analysts must extract actionable insights from complex datasets, often blending statistical methods and SQL. For instance, they identify trends and correlations within stratified patient populations to guide clinical improvements.
Proficiency in SQL
Analysts must directly query and manipulate databases using SQL rather than relying on intermediary tools. This ensures precise understanding and control over data logic, avoiding errors introduced by auto-generated queries.
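As a small illustration of a hand-written query over a stratified population, the sketch below groups encounters into two age bands and computes a readmission rate for each. The table, columns, age cut-off, and data are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE encounters (patient_id TEXT, age INTEGER, readmitted INTEGER)")
conn.executemany(
    "INSERT INTO encounters VALUES (?, ?, ?)",
    [
        ("P1", 34, 0), ("P2", 71, 1), ("P3", 68, 0),
        ("P4", 45, 1), ("P5", 80, 1), ("P6", 52, 0),
    ],
)

# Stratify by age band, then summarise each stratum.
# readmitted is stored as 0 or 1, so its average is the readmission rate.
query = """
SELECT CASE WHEN age >= 65 THEN '65+' ELSE 'under 65' END AS age_band,
       COUNT(*)        AS encounters,
       AVG(readmitted) AS readmission_rate
FROM encounters
GROUP BY age_band
ORDER BY age_band
"""
rows = conn.execute(query).fetchall()

for age_band, encounters, rate in rows:
    print(f"{age_band}: {encounters} encounters, readmission rate {rate:.2f}")
```

Writing the CASE expression by hand makes the stratification rule explicit, so the analyst knows exactly which patients fall into each group.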
Business Intelligence Reporting
Presenting data visually in an accessible, intuitive manner is critical. Analysts act as interpreters, translating complex data into clear, actionable insights for non-technical stakeholders.
Narrative Communication
Beyond presenting isolated metrics, analysts must weave them into a cohesive story, integrating clinical and financial outcomes to provide a comprehensive picture and suggest strategic actions.
Contextual Knowledge
Data collection & cleaning
In their study exploring how physician organisations (POs) perceived the utility of claims-based risk-scoring algorithms, Nong & Adler-Milstein (2021) found that these algorithms had low utility because they were misaligned with the POs' expectations. The algorithms failed to identify patients who would be well suited to supplemental care management and often included patients who, based on PO and provider assessments of need, were not perceived to require it. The algorithms were also perceived to be driven by outdated data, as risk scores relied on lagged quarterly claims data. For example, a high-cost hospitalisation for an acute problem that had since been resolved could produce a risk score that qualified a patient for care management months later (Nong & Adler-Milstein, 2021).
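The lag problem can be illustrated with a toy scoring rule. The sketch below is not the algorithm examined in the study; the threshold, dates, costs, and the qualifies function are all hypothetical and serve only to show how a score built on an older claims window can flag a patient whose acute problem has already resolved.

```python
from datetime import date

# Hypothetical claims for one patient: (service date, cost).
claims = [
    (date(2024, 1, 20), 18_000.0),  # acute hospitalisation, since resolved
    (date(2024, 2, 10), 150.0),     # follow-up visit
]

THRESHOLD = 10_000.0  # assumed cut-off for care-management eligibility


def qualifies(claims, window_start: date, window_end: date) -> bool:
    """Flag the patient if total cost inside the claims window exceeds the threshold."""
    total = sum(cost for day, cost in claims if window_start <= day <= window_end)
    return total > THRESHOLD


# Scoring happens in June, but the newest complete claims quarter is January to March.
lagged_flag = qualifies(claims, date(2024, 1, 1), date(2024, 3, 31))

# If claims from the quarter of scoring were available, nothing would trigger the flag.
current_flag = qualifies(claims, date(2024, 4, 1), date(2024, 6, 30))

print(f"Flag from lagged window: {lagged_flag}; flag from current window: {current_flag}")
```

The same patient is flagged or not depending only on which window the score reads, which is the kind of mismatch with clinical judgement that the POs described.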
References: Nong, P., & Adler-Milstein, J. (2021). Socially situated risk: challenges and strategies for implementing algorithmic risk scoring for care management. JAMIA Open, 4(3). https://doi.org/10.1093/jamiaopen/ooab076
Activity 1.1.1: Finding cases where Health Analytics made a difference
Beck et al. (2011) developed a tool called C-Path to better identify high-risk breast cancer cases using pathology samples. Many of the tumor features used today, such as tubules and atypical nuclei, were discovered decades ago. Rather than just applying new algorithms to these existing features, C-Path went a step further and used automated image processing to find new ones.
C-Path started by creating a classifier to reliably distinguish between the epithelial and stromal regions of tumors. From these regions, the tool extracted 6,642 quantitative features, focusing on properties like the size, location, and spacing of nuclei, as well as relationships between nuclei and the cytoplasm. These features were then used to build a model to predict patient survival, which performed significantly better than predictions made by community pathologists on two independent test datasets. Additionally, C-Path's scores were significantly associated with 5-year survival above and beyond all established clinical and molecular factors.
The C-Path project provided several valuable insights. The most important lesson was that simply using new algorithms on old features isn’t enough to achieve better results. Instead, the discovery of new features is critical for improved performance. Interestingly, many of the features identified by C-Path were completely new, despite decades of pathologists examining breast cancer slides. This shows that machine learning can take a fresh, unbiased approach to reveal unexpected but important predictive variables.
References: Deo, R. C. (2015). Machine Learning in Medicine. Circulation, 132(20), 1920-1930. https://doi.org/10.1161/CIRCULATIONAHA.115.001593
Activity 1.1.2: Healthcare Challenges
How can data science be used to solve some of these challenges?
Prescriptive analytics offers a proactive approach to healthcare by recommending the best course of action based on individual patient data, such as genetics, environment, and lifestyle factors.
Prescriptive analytics goes beyond descriptive, diagnostic, and predictive analytics by using the results of these analyses to generate actionable insights. While predictive analytics might forecast potential health risks for a patient, prescriptive analytics suggests specific interventions or treatment pathways to mitigate those risks. For example, if predictive analytics identifies a patient as being at high risk for heart disease, prescriptive analytics might recommend specific lifestyle changes, medications, or even genetic tests to optimize the patient’s health outcomes. This approach ensures that treatments are not only targeted but are also applied in the right way, at the right time, and for the right patient.
Another example may be a health insurer identifying a pattern in its claims data from last year showing a significant portion of its diabetic patient population also suffers from retinopathy. Using predictive analytics, the insurer estimates the probability of an increase in ophthalmology claims during the next year. Prescriptive analytics can then be used to "model out the cost impact if average ophthalmology reimbursement rates increase, decrease or remain the same for the next plan year, then recommend a course of action".
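The insurer's what-if exercise could be sketched along these lines. Every figure below is hypothetical (member count, predicted claim probability, reimbursement rate, budget, and the size of the rate changes), and the final rule is only one simple way to turn the projections into a recommendation.

```python
# All figures are hypothetical and chosen only to illustrate the calculation.
diabetic_members = 10_000
predicted_claim_probability = 0.12   # assumed output of a predictive model
current_rate = 200.0                 # average reimbursement per ophthalmology claim
budget = 260_000.0                   # amount set aside for these claims

# Three scenarios for next plan year's average reimbursement rate.
scenarios = {"decrease": -0.10, "unchanged": 0.0, "increase": 0.10}

expected_claims = diabetic_members * predicted_claim_probability

# Projected cost under each scenario: expected claims times the adjusted rate.
projected_cost = {
    name: expected_claims * current_rate * (1 + change)
    for name, change in scenarios.items()
}

# A simple prescriptive rule: flag scenarios in which projected cost exceeds the budget.
over_budget = [name for name, cost in projected_cost.items() if cost > budget]

for name, cost in projected_cost.items():
    print(f"{name}: projected cost {cost:,.0f}")
print("Scenarios needing action:", over_budget)
```

The predictive step supplies the claim probability; the prescriptive step is the comparison against a goal (here, the budget) that tells the insurer which scenarios call for action.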
In precision medicine, prescriptive analytics enables the integration of diverse datasets, including clinical records, genetic information, patient histories and insurance claims to recommend tailored treatment plans. The use of what-if scenarios and simulation techniques allows clinicians to explore different treatment options and predict the outcomes of various interventions. This capacity to simulate and evaluate multiple scenarios helps healthcare providers choose the most effective and least invasive treatment for each patient. Ultimately, this approach results in a more efficient use of resources, reducing the likelihood of overtreatment and minimizing healthcare costs.
The application of prescriptive analytics in healthcare is particularly valuable in tackling complex decision-making challenges where numerous variables must be considered simultaneously. Using advanced techniques such as optimization and multi-criteria decision-making, prescriptive analytics evaluates potential interventions against predefined goals, such as improving patient outcomes while minimizing costs. This decision-making model empowers healthcare systems to deliver higher quality care, increase patient satisfaction, and support the shift towards precision medicine and public health.
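One of the simplest multi-criteria techniques is a weighted sum, sketched below. The interventions, their scores, and the weights are invented for illustration; in practice the scores would come from clinical evidence and the weights from the priorities of decision-makers, and more sophisticated optimisation methods may be used instead.

```python
# Hypothetical interventions scored on two criteria, each scaled from 0 to 1.
# "outcome" is expected benefit (higher is better); "cost" is relative cost (lower is better).
interventions = {
    "lifestyle programme": {"outcome": 0.60, "cost": 0.20},
    "medication":          {"outcome": 0.75, "cost": 0.50},
    "surgery":             {"outcome": 0.90, "cost": 0.95},
}

# Assumed weights expressing how much each goal matters; they sum to 1.
weights = {"outcome": 0.7, "cost": 0.3}


def weighted_score(criteria: dict) -> float:
    """Weighted sum in which cost is inverted so that cheaper options score higher."""
    return (
        weights["outcome"] * criteria["outcome"]
        + weights["cost"] * (1 - criteria["cost"])
    )


scores = {name: weighted_score(c) for name, c in interventions.items()}
recommended = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
print("Recommended:", recommended)
```

With these weights the middle option scores highest, because surgery's better outcome does not offset its much higher cost. Changing the weights changes the recommendation, which is why the predefined goals need to be agreed before the analysis is run.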
References: Mosavi, N. S., & Santos, M. F. (2020). How prescriptive analytics influences decision making in precision medicine. Procedia Computer Science, 177, 528-533.