Health Catalyst President of Technology, Dale Sanders, gives straightforward answers to tough questions about the future of AI in healthcare.
He starts by debunking a common belief: that EHR adoption has left healthcare awash in valuable data. The truth is that we need far deeper and broader data about each patient.
Editor’s Note: Health Catalyst CEO Dan Burton served as a panelist on a session entitled “Smart EHRs: AI for All” at the April 2018 World Medical Innovation Forum. To prepare, he requested assistance from Health Catalyst President of Technology Dale Sanders. What follows are Sanders’s thoughts on questions generated by the forum organizers.
Q: The first wave of EHR adoption has focused primarily on digitizing the patient record—with a more recent focus on building interactive clinical decision support capabilities. Development and implementation of CDS applications currently requires clinical staff to observe trends in data, develop protocols to act on these trends, and work with technical staff to codify the logic into executable form. As NLP and computer vision capabilities become more advanced, algorithms will identify and propose actions reflecting patterns in the data. [Will] AI technology…ultimately support an unsupervised learning approach in the EHR to identify trends and possible responses at both the patient and population level?
A: One thing to keep in mind, contrary to popular belief: we are not awash in valuable data in healthcare as a consequence of EHR adoption. AI requires breadth and depth of data to be effective: lots of rows of patients and lots of facts, aka features, about those patients. Google’s autonomous car algorithms drove for 80 million miles in a simulator before they touched a real paved road, and they still require a human to supervise the car. How many times per year does your healthcare provider collect data about you? On average in the US, you visit the doctor or hospital three times per year. The other 362 days of the year, your healthcare provider collects no data about you. During those three visits, providers collect only a few pieces of data: height, weight, smoking status, blood pressure, age, gender, name, and address; lab tests with somewhere between one and ten measurements (these measurements are called “features” in the AI world); an ICD diagnosis code or two; maybe a CPT code or two; maybe a digital image such as an X-ray, CT, or MRI; a physician’s clinical note; maybe a microbiology or pathology diagnostic test along with a text report about the findings; and a pharmacy order for a medication.
In the world of AI data, where breadth and depth of data are critical to the accuracy of the AI model, that’s not very much data. Also, we like to think that the key to AI success resides in the analysis of clinical notes. But the quality, accuracy, quantity, and objective, computable information in a clinical note is questionable, at best. Clinical notes are rarely more than half a page long, which is not very much information in the AI world, and because they are authored almost entirely by the clinician, their content is subjective unless it references an objective measure such as a lab test or blood pressure reading.
On very rare occasions, a patient might have a full or partial genome sequence, so that’s an important but largely missing data set in today’s healthcare.
Thus, in a traditional clinical encounter, we collect somewhere around 50 data points, or features, about a patient, only three times per year. That’s less than 100 MB of data per year. Tesla collects 25 GB of data per hour about each of its cars.
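To put those magnitudes in perspective, here is a back-of-envelope comparison in Python. The per-feature byte count and imaging-study size are assumptions for illustration only; the visit, feature, and Tesla figures come from the discussion above.

```python
# Rough comparison: annual EHR data per patient vs. one hour of Tesla
# vehicle telemetry. Byte sizes below are illustrative assumptions.

VISITS_PER_YEAR = 3            # average US encounters per year (from the text)
FEATURES_PER_VISIT = 50        # rough feature count per encounter (from the text)
BYTES_PER_FEATURE = 200        # assumed: value + units + timestamps + metadata
IMAGING_BYTES = 50_000_000     # assumed: one ~50 MB imaging study per year

structured_bytes = VISITS_PER_YEAR * FEATURES_PER_VISIT * BYTES_PER_FEATURE
annual_patient_bytes = structured_bytes + IMAGING_BYTES

TESLA_BYTES_PER_HOUR = 25_000_000_000  # 25 GB/hour (from the text)

print(f"Structured EHR data per patient-year: {structured_bytes / 1e3:.0f} KB")
print(f"Including one imaging study:          {annual_patient_bytes / 1e6:.0f} MB")
print(f"One hour of Tesla telemetry:          {TESLA_BYTES_PER_HOUR / 1e9:.0f} GB")
print(f"Tesla-hour / patient-year ratio:      "
      f"{TESLA_BYTES_PER_HOUR / annual_patient_bytes:,.0f}x")
```

Even with an imaging study included, under these assumptions a single hour of vehicle telemetry dwarfs a full year of one patient’s EHR data by roughly five hundred to one.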
We facilitated a study in Alberta that concluded that EHRs represent only 8 percent of the features and facts that we need for precision medicine and population health, and that 8 percent was generous. We think the number is even lower. By the way, we should all be struck down by lightning if we continue to add mouse clicks to the backs of clinicians in our attempts to collect more data about patients and their care. #nomoreclicks
The bottom line: EHRs are not the holy grail of data for AI that we like to think they are. They barely scratch the surface of the data we need to fully leverage the potential of AI for healthcare. EHR data is better than nothing, but it’s not nearly enough. We need to bathe the patient and the health ecosystem in passive sensors that stream data into a technology platform designed from the ground up to support analytics, decision support, and AI. EHRs were not designed for that purpose, technically or functionally.
AI algorithms, for the most part, boil down to some form of pattern recognition followed by either a suggestion to a human about how to react to the pattern or, in the case of autonomous AI, an intervention by the computer in response to the pattern. Autopilot on aircraft is the classic example of pattern recognition followed by autonomous intervention by the computer: a constant feed of data describing the pitch, roll, yaw, location, destination, and velocity of the aircraft feeds an algorithm that maintains the flight path. The autopilot software on aircraft samples these data streams over 100 times per second. The now-retired F-117 Stealth Fighter was aerodynamically unstable to the degree that a human pilot could not fly the plane without the aid of AI algorithms that monitored its flight telemetry, continuously adjusted the flight control systems, and kept the plane stable. AI algorithms need data, lots of data, for their full potential to be realized.
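As a toy illustration of that sense-detect-intervene loop, here is a minimal closed-loop controller in Python. The dynamics, gain, and disturbance model are all invented for the sketch; real avionics are vastly more sophisticated.

```python
import random

# Toy closed-loop control in the spirit of the autopilot example:
# sample telemetry, recognize the deviation from the desired pattern,
# and intervene autonomously. All constants are illustrative.

SETPOINT_PITCH = 0.0   # desired pitch, degrees
GAIN = 0.5             # proportional gain (assumed)
DT = 0.01              # 100 samples per second, as in the text

pitch = 5.0            # start with a 5-degree disturbance
for step in range(500):                     # ~5 seconds of simulated flight
    error = SETPOINT_PITCH - pitch          # pattern recognition: deviation
    correction = GAIN * error               # autonomous intervention
    turbulence = random.uniform(-0.05, 0.05)
    pitch += correction * DT * 10 + turbulence  # toy airframe response
    if step % 100 == 0:
        print(f"t={step * DT:4.2f}s  pitch={pitch:6.2f} deg")
```

Run it and the disturbance decays toward the setpoint despite the injected turbulence, which is the essence of what the autopilot's far richer data streams make possible.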
The term “unsupervised learning” can be misleading. Just as in human learning, unsupervised learning can only realistically be applied in situations in which the consequences of mistakes, and of the unsupervised adaptation to those mistakes, are insignificant. Could we place an autonomous car on the road and let its AI algorithms learn in an unsupervised fashion? Yes, technically that’s possible, but imagine the operational consequences of doing so. To some people, the term implies that AI algorithms can be unleashed on data and somehow magically learn something. But learning requires a distinct understanding of “correct” versus “incorrect,” which comes from observing, acting, monitoring the outcome of the act, determining whether the act was correct or incorrect in the context of the desired outcome, then adapting so that the incorrect action is not repeated in future, similar scenarios. Generally speaking, the history of unsupervised learning in AI suggests that the best AI can do without supervision is identify patterns in data that might otherwise escape human recognition. It does not mean that those patterns will be useful. Unsupervised learning can generate hypotheses from data, but it is up to humans and other downstream AI algorithms to test those hypotheses. At this time in healthcare, given our current data environment, unsupervised learning is best used to generate hypotheses that humans would otherwise not identify.
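As a sketch of what that hypothesis generation might look like, the snippet below clusters synthetic patient features with k-means. The features and cohort are invented, and the clusters it finds are only hypotheses for humans (or downstream, supervised models) to test.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Unsupervised learning as hypothesis generation: group patients by a
# few features and surface the groupings for humans to investigate.
# The data is synthetic; the feature names are illustrative only.

rng = np.random.default_rng(42)
n = 300
X = np.column_stack([
    rng.normal(120, 15, n),   # systolic blood pressure
    rng.normal(100, 25, n),   # fasting glucose
    rng.normal(27, 5, n),     # body mass index
])

X_scaled = StandardScaler().fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

# Each cluster is a hypothesis, not a conclusion: someone still has to
# test whether the grouping is clinically meaningful.
for k in range(3):
    members = X[labels == k]
    sbp, glucose, bmi = members.mean(axis=0)
    print(f"cluster {k}: n={len(members):3d}  SBP={sbp:.0f}  "
          f"glucose={glucose:.0f}  BMI={bmi:.1f}")
```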
Q: Is AI just a passing trend, with potential to make only incremental changes over the status quo? Or will AI unlock new possibilities for humankind?
A: It will, and it already is unlocking new possibilities for humankind. The progress of AI is exceeding Moore’s Law; the capabilities of AI are doubling every six months. It’s unlike anything I’ve seen in my 34-year career in computer and information science. That said, as I mentioned earlier, healthcare will be left behind if we don’t increase the digitization of the patient, dramatically increasing the breadth and depth of health-related data about patients. AI needs large volumes of high-quality data, and for the most part, we don’t have that in healthcare, yet.
Q: How significant a challenge is posed by interoperability (or lack thereof) between different EHR platforms and the widespread application of AI to these systems?
A: Surprisingly, lack of interoperability is not really the problem. We can easily peel the data out of EHRs and expose that data to AI algorithms. At Health Catalyst, we’ve commoditized that; we can do it in our sleep. The problem lies in the inherent limitations of EHRs and the way they are designed. They were designed with the clinical encounter (the billable event) as the center of their data model. So, if you want to apply AI to understand billable events, that design is fine. But if you want to apply AI to understand the patient, you need to completely change the fundamental design of the data model behind EHRs so that the patient is at the center of the model, not the billable encounter. EHRs were not engineered to support dynamic, intelligent, context-based user interfaces such as we see in modern web and mobile applications. The software code and architecture behind the scenes of EHRs is based on technology that is at least 20 years old, which puts inherent limits on the ability to drive better, AI-enabled decision support into the user interface of EHRs. Also, healthcare is the only data environment that consistently believes we have to push copies of data from one EHR system to another in order to achieve interoperability, that is, to view and access data. But modern information systems like Amazon, Google, and Facebook don’t ship and store data locally. They index and then reference data; they “point to data” in its native location. That’s what an HTTP URL is all about: it’s an address to the location of data, as sketched below. If the Internet followed EHRs’ approach to interoperability, we would all have a full copy of the world wide web on our laptop computers. So, is interoperability a problem? Yes. But the bigger problem is the fundamental design and engineering of EHRs.
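As a minimal sketch of that reference-not-copy pattern, the snippet below resolves a patient record by URL at request time rather than storing a local copy. It assumes the public HAPI FHIR test server and a hypothetical resource ID, so the request may well return nothing; the point is the mechanism, not the endpoint.

```python
import requests

# Interoperability by reference: a URL is an address to the data's
# native location, so we resolve it on demand instead of shipping a copy.
BASE_URL = "https://hapi.fhir.org/baseR4"      # public FHIR sandbox (assumed)
patient_ref = f"{BASE_URL}/Patient/example"    # hypothetical resource ID

resp = requests.get(patient_ref, headers={"Accept": "application/fhir+json"})
if resp.ok:
    patient = resp.json()
    print("Resolved:", patient.get("resourceType"), patient.get("id"))
else:
    print("Reference did not resolve:", resp.status_code)
```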
Q: Will these tools live up to the hype and also help reduce the significant computer-based workload physicians now face?
A: Yes, definitely, but only if we increase the breadth and depth of data about patients. Just as autopilot relieves a pilot of the commodity tasks of flying an airplane, AI applied to healthcare can lift the commodity tasks of simple diagnosis, treatment, and administration from the backs of clinicians, so that they can monitor a patient and react when the situation requires their higher, non-commodity expertise. But none of that will be possible if we don’t better digitize the patient and the processes of care.
Q: What about some of the inherent biases in EHR data? What can be done to ensure that machine learning algorithms don’t perpetuate (or even exacerbate) these biases?
A: This is the core problem we face. EHRs were designed with the billable clinical encounter at the center of the data model, not the patient. Clinical notes are notoriously random in terms of their content and quality for the same patients and patient types; they are just as random in their content as the humans who write them. Human-assigned ICD diagnosis codes are subjective and heavily influenced by the clinicians and coders who assign them. The same is true of CPT codes. So, we have a very limited data set in EHRs, which were designed for billing, and the quality of that data set is very questionable. Think about the objective and quantitative nature of the data collected in the telemetry from a rocket or satellite. That data originates in a sensor that was designed and manufactured to capture computable information; it’s not a human subjectively estimating and entering that telemetry data. So, can we derive AI value from the data in an EHR? Yes, and we are doing it now. Do we face the probability of drawing false conclusions from the output of that AI? Absolutely, positively. We have to apply AI in healthcare with the rigor of formal experimental design until the data we have about patients is less subjective, deeper, and broader than what’s contained in today’s EHRs. That’s an important topic for another time: data scientists in healthcare need formal training in experimental design to ensure that the results of their AI models are valid.
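One concrete habit from experimental design is auditing model performance by subgroup, so that a bias inherited from the training data shows up as a measurable gap instead of being silently perpetuated. Below is a minimal sketch on synthetic data; the two “groups” and the under-coding mechanism are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic cohort with a deliberate labeling bias: outcomes in group 1
# are systematically under-coded, mimicking subjective coding practices.
rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n)           # e.g., two coder or site populations
X = rng.normal(size=(n, 5))
true_risk = 1 / (1 + np.exp(-X[:, 0]))
capture_rate = np.where(group == 1, 0.6, 1.0)   # group 1 labels under-captured
y = (rng.random(n) < true_risk * capture_rate).astype(int)

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)
pred = model.predict(X_te)

# A per-group gap in recall exposes the inherited bias.
for g in (0, 1):
    mask = g_te == g
    print(f"group {g}: recall = {recall_score(y_te[mask], pred[mask]):.2f}")
```

The same audit applied to a real clinical model, stratified by site, coder, or patient population, is one way to catch an EHR-borne bias before it reaches the bedside.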
Q: How will these technologies impact patients’ expectations of data privacy and confidentiality?
A: I’m a patient, and I can’t wait for the benefits of AI to be unleashed for the betterment of my treatment and the reduction of my healthcare costs. I have no concerns about privacy and confidentiality in this context, at all, and, as an industry, we need to make sure that we don’t create a bogeyman fear of privacy in patients and thus delay or inhibit the progress AI can bring. I look forward to a future in which I’m bathed in 24×7 sensors that collect data about and monitor my health. I look forward to a future in which I own and control this data and its privacy settings: I decide who gets to see my data and who doesn’t. I look forward to exposing my health data to commercially available AI algorithms, offered by companies that compete for my patronage and subscriptions, that can assess my health and suggest the best treatment plans and therapeutics. I look forward to a future in which I go into a clinical encounter armed with more data about my health than the clinician has, along with the output of the AI algorithms I subscribe to, and together that clinician and I decide what’s best for me as a human being, not as a subordinate patient. Healthcare moves at glacial speed, but I think this future is emerging now. For example, look at the bio-integrated sensor technology being developed by John Rogers and his team at Northwestern. Their products are already being used by sports teams and in clinical trials. We are facing a tipping point of cultural dissatisfaction with healthcare’s glacial pace. The winds of change are blowing harder than ever. The future I look forward to is no more than five years away.