Using COVID-19 Value Sets for Patient Identification

Summary

The U.S. healthcare system was not prepared for a health crisis of the magnitude of the COVID-19 pandemic. Hospitals are working to facilitate widespread distribution of information within their organization and to local, state, and federal authorities to successfully manage this novel infection. EHRs and Lab Information Systems (LISs) have become public health tools for disease surveillance and management.

Because EHR data vary significantly, informatics tools are needed to identify patients with suspected SARS-CoV-2 infection and confirmed COVID-19 infection. With the aim of building an extensible model for a COVID-19 database, Health Catalyst has built a detailed approach that leverages a heuristic methodology for capturing both confirmed and suspected cases.

Health Catalyst has proposed value sets that define two patient cohorts for the registry, confirmed and suspected COVID-19 patients, with the suspected cohort stratified further into three levels of confidence: high confidence suspected, moderate confidence suspected, and low confidence suspected.

Since the World Health Organization (WHO) declared coronavirus disease 2019 (COVID-19) a pandemic on March 11, 2020, the United States has become the epicenter of the disease.¹ While the numbers continue to increase, there are 4,710,282 confirmed cases¹, surpassing the national case numbers of Spain, Italy, Germany, France, and China. This pandemic has highlighted the fact that the U.S. healthcare system is not prepared for a health crisis of this magnitude.² EHRs and Lab Information Systems (LISs) have become public health tools for disease surveillance and management. Hospitals are working to facilitate widespread distribution of information within their organization and to local, state, and federal authorities to successfully manage this novel infection.

Additionally, reports of scarce testing kits, ventilators, and personal protective equipment (PPE) drive an urgent need for a means to calculate surge capacity requirements.³ EHR data vary significantly, and informatics tools are needed to identify patients with suspected SARS-CoV-2 infection and confirmed COVID-19 infection. With the aim of building an extensible model for a COVID-19 database, we have built a detailed approach that leverages a heuristic methodology for capturing both confirmed and suspected cases.

Based on the guidance provided by the CDC,⁴,⁵ WHO,⁶,⁷ AMA,⁸ NLM VSAC,⁹ AAPC,¹⁰ and published literature, we propose value sets that define two patient cohorts for the registry, confirmed and suspected COVID-19 patients, with the suspected cohort stratified further into three levels of confidence: high confidence suspected, moderate confidence suspected, and low confidence suspected. These value sets are based on ICD-10-CM codes (Supplemental Table 2) and will continue to evolve as we understand more about COVID-19 and as clinical practices change over time. Patient care outcomes can then be studied across the resulting four patient cohorts (confirmed plus the three suspected confidence levels).

Methods

We developed modified indicators that classify diagnosis codes into one of six value-sets (Diagnosis-Grain) and layered additional logic on top to classify each patient into one of four diagnosis categories (Patient-Grain) intended to represent varying levels of diagnostic certainty. This was developed using Health Catalyst Touchstone® data with the intent to capture suspected COVID-19 (C-19) patients who are not readily identifiable by lab tests or who were seen before widespread adoption of the ICD-10 code U07.1 for confirmed COVID-19, released for use on April 1, 2020.

Diagnosis-Grain:

1) Confirmed COVID-19
2) Viral exposure
3) Coronavirus-related
4) Associated COVID-19 Diagnoses
5) Severe Associated COVID-19 Diagnoses
6) COVID-19 Symptoms

Patient-Grain: [Figure 1]

1) Confirmed COVID-19
2) Suspected: High Confidence
3) Suspected: Moderate Confidence
4) Suspected: Low Confidence

Fig. 1. Proposed Use of Value Sets for Generating COVID-19 Patient Cohorts Using COVID-19 Categorization Logic

[Figure 1 chart] *Text string refers to the diagnosis description.

Key Differences Between CDC and Health Catalyst Value Sets:

Value Set: Confirmed COVID-19 Patients

The Health Catalyst confirmed value-set excludes ICD-10 code B97.29 “Other coronavirus as the cause of diseases classified elsewhere” because this code can capture any coronavirus, including but not limited to the virus that causes COVID-19. People around the world are commonly infected with human coronaviruses 229E, NL63, OC43, and HKU1. There are additional human coronaviruses that are captured through the use of ICD-10 code B97.2. In our diagnosis logic, patients with this code are classified as “Suspected: High Confidence” if they do not have a positive non-COVID-19 coronavirus lab test.

	CDC Value-set: COVID-19 Confirmed		Health Catalyst Value-set: COVID-19 Confirmed
U07.1	COVID-19 virus infection	U07.1	COVID-19 virus infection
B97.29	Other coronavirus as the cause of diseases classified elsewhere		*B97.29 is removed; inclusion of U07.2 is under discussion at Health Catalyst
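
The two rules stated in this section can be sketched in Python as follows. This is a minimal illustration, not Health Catalyst's implementation: the function name, the `has_positive_other_coronavirus_lab` flag, the precedence of U07.1 over B97.29, and the `None` fallback are all assumptions, and "this code" in the text is read here as B97.29. The full patient-grain logic is defined in Figure 1 and is not reproduced.

```python
from typing import Iterable, Optional

CONFIRMED_CODES = {"U07.1"}        # Health Catalyst confirmed value-set (from the table above)
OTHER_CORONAVIRUS_CODE = "B97.29"  # excluded from the confirmed value-set


def classify_patient(
    dx_codes: Iterable[str],
    has_positive_other_coronavirus_lab: bool,
) -> Optional[str]:
    """Apply only the two rules stated in the text; return None otherwise.

    The lab flag is a hypothetical input meaning the patient has a positive
    lab test for a coronavirus other than the one that causes COVID-19.
    """
    codes = {code.strip().upper() for code in dx_codes}
    # Assumption: a confirmed code takes precedence over any other code.
    if codes & CONFIRMED_CODES:
        return "Confirmed COVID-19"
    if OTHER_CORONAVIRUS_CODE in codes and not has_positive_other_coronavirus_lab:
        return "Suspected: High Confidence"
    # Remaining categories depend on logic in Figure 1, not encoded here.
    return None
```

For example, a patient coded only with B97.29 and no positive lab result for another coronavirus would fall into "Suspected: High Confidence" under this sketch, while the same patient with such a lab result would be left unclassified by these two rules.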

Value Set: Associated Diagnosis

The Associated Diagnosis value-set adds ICD-10 codes that capture the same or similar diagnoses as those documented in the CDC guidance but extend beyond the specific codes it lists. This accounts for varying usage of ICD-10 codes by clinicians across health systems. This value-set is intended to capture diagnoses characteristic of COVID-19 that are moderate in severity, because mild symptoms and severe complications are captured in separate value-sets. Two codes have been moved to the Health Catalyst Severe Diagnosis value-set for use in the patient-grain diagnosis logic.

	CDC Value-set: COVID-19 Associated Diagnosis		Health Catalyst Value-set: COVID-19 Associated Diagnosis
A41.89	Other specified sepsis		*A41.89 is moved to ‘severe’ value-set
		J06.9	Acute upper respiratory infection, unspecified
J12.89	Other viral pneumonia	J12.89	Other viral pneumonia
		J12.9	Viral pneumonia, unspecified
		J18.9	Pneumonia, unspecified organism
J20.8	Acute bronchitis due to other specified organisms	J20.8	Acute bronchitis due to other specified organisms
		J20.9	Acute bronchitis, unspecified
J22	Unspecified acute lower respiratory infection	J22	Unspecified acute lower respiratory infection
J40	Bronchitis, not specified as acute or chronic	J40	Bronchitis, not specified as acute or chronic
J80	Acute respiratory distress syndrome		*J80 is moved to ‘severe’ value-set
J98.8	Other specified respiratory disorders	J98.8	Other specified respiratory disorders
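
The differences in the table above can be checked with simple set operations. The sketch below transcribes the two columns into Python sets; the variable names are illustrative and not part of either value-set definition.

```python
# Codes transcribed from the CDC and Health Catalyst columns of the table above.
CDC_ASSOCIATED = {"A41.89", "J12.89", "J20.8", "J22", "J40", "J80", "J98.8"}
HC_ASSOCIATED = {
    "J06.9", "J12.89", "J12.9", "J18.9", "J20.8",
    "J20.9", "J22", "J40", "J98.8",
}

# Codes Health Catalyst added beyond the CDC guidance.
added = sorted(HC_ASSOCIATED - CDC_ASSOCIATED)
# Codes in the CDC list that Health Catalyst moved to the 'severe' value-set.
moved_to_severe = sorted(CDC_ASSOCIATED - HC_ASSOCIATED)
# Codes present in both value-sets.
shared = sorted(CDC_ASSOCIATED & HC_ASSOCIATED)

print("Added:", added)
print("Moved to severe:", moved_to_severe)
print("Shared:", shared)
```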

Value Set: Suspected COVID-19 Patients

Health Catalyst places the codes below in separate value-sets so that the patient-grain diagnosis logic can require additional elements, increasing confidence that these codes do not represent non-C-19 coronaviruses. Clinically, these codes are intended to be used in conjunction with other diagnosis codes. Our value-sets do not include ICD-10 code Z03.818 “Encounter for observation for suspected exposure to other biological agents ruled out” because guidance directs that this code be used to document cases that have been ruled out.

People around the world are commonly infected with human coronaviruses 229E, NL63, OC43, and HKU1. Please see https://www.cdc.gov/coronavirus/types.html for additional human coronaviruses that are captured through use of ICD-10 codes B34.2 and B97.2.

	CDC Value-set: COVID-19 Suspected	Health Catalyst Value-set: COVID-19 Suspected
B34.2	Coronavirus infection, unspecified	*moved to ‘coronavirus related’ value-set
B97.2	SARS-associated coronavirus as the cause of diseases classified elsewhere	*moved to ‘coronavirus related’ value-set
Z03.818	Encounter for observation for suspected exposure to other biological agents ruled out	Removed – indicates cases ruled out*
Z20.828	Contact with and (suspected) exposure to other viral communicable diseases*	*moved to ‘viral exposure’ value-set

Value Sets: Coronavirus-Related and Viral Exposure

Most codes in these two value-sets come from the CDC ‘Suspected’ value-set (B97.29 comes from the CDC ‘Confirmed’ value-set); they are placed in their own value-sets for the purposes of the Patient-Grain logic.

	Health Catalyst Value-set: Viral Exposure
Z20.828	Contact with and (suspected) exposure to other viral communicable diseases (Not specific to COVID-19)
	Health Catalyst Value-set: Coronavirus-Related
B34.2	Coronavirus infection, unspecified (Not specific to C-19)
B97.21	SARS-associated coronavirus as the cause of diseases classified elsewhere (Not specific to C-19)
B97.29	Other coronavirus as the cause of diseases classified elsewhere (Not specific to C-19)
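
As a rough illustration of the Diagnosis-Grain step, the sketch below maps a diagnosis code to one of the three value-sets whose full code lists are short: Confirmed COVID-19, Viral Exposure, and Coronavirus-Related. The function names and the normalization step are assumptions added for the example (source systems often store ICD-10 codes without the decimal point); the remaining value-sets would be added to the same dictionary.

```python
from typing import Optional

# Code lists transcribed from the Health Catalyst tables in this document.
VALUE_SETS = {
    "Confirmed COVID-19": {"U07.1"},
    "Viral exposure": {"Z20.828"},
    "Coronavirus-related": {"B34.2", "B97.21", "B97.29"},
}

# Invert to a code -> value-set lookup for these three (non-overlapping) sets.
CODE_TO_VALUE_SET = {
    code: name for name, codes in VALUE_SETS.items() for code in codes
}


def normalize_icd10(code: str) -> str:
    """Uppercase, trim, and re-insert the decimal point after the third character."""
    cleaned = code.strip().upper().replace(".", "")
    if len(cleaned) <= 3:
        return cleaned
    return f"{cleaned[:3]}.{cleaned[3:]}"


def diagnosis_value_set(code: str) -> Optional[str]:
    """Return the value-set name for a code, or None if it is in none of the three."""
    return CODE_TO_VALUE_SET.get(normalize_icd10(code))
```

A dictionary lookup keeps the mapping easy to audit against the published tables. If a code can belong to more than one value-set, as R06.03 does in the Severe and Symptoms lists, the lookup would need to return a set of names instead of a single name.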

Value Set: Severe Associated COVID-19 Diagnoses

This value-set is intended to be used in conjunction with other value-sets in the Patient-Grain Diagnosis Logic to help identify patients who, in addition to other known C-19 symptoms and/or associated diagnoses, have developed severe symptoms or complications associated with C-19 that likely require hospital-level care. It compiles some of the severe complications of COVID-19 most commonly reported in the literature.

	Health Catalyst Value-set: Severe Associated COVID-19 Diagnoses
A41.89	Other specified sepsis
A41.9	Sepsis, unspecified organism
I50	Heart failure
I50.1	Left ventricular failure, unspecified
I50.20	Unspecified systolic (congestive) heart failure
I50.21	Acute systolic (congestive) heart failure
I50.23	Acute on chronic systolic (congestive) heart failure
I50.3	Diastolic (congestive) heart failure
I50.30	Unspecified diastolic (congestive) heart failure
I50.31	Acute diastolic (congestive) heart failure
I50.33	Acute on chronic diastolic (congestive) heart failure
I50.40	Unspecified combined systolic (congestive) and diastolic (congestive) heart failure
I50.41	Acute combined systolic (congestive) and diastolic (congestive) heart failure
I50.42	Chronic combined systolic (congestive) and diastolic (congestive) heart failure
I50.43	Acute on chronic combined systolic (congestive) and diastolic (congestive) heart failure
I50.810	Right heart failure, unspecified
I50.811	Acute right heart failure
I50.813	Acute on chronic right heart failure
I50.814	Right heart failure due to left heart failure
I50.82	Biventricular heart failure
I50.83	High output heart failure
I50.89	Other heart failure
I50.9	Heart failure, unspecified
I51.3	Intracardiac thrombosis, not elsewhere classified
I51.4	Myocarditis, unspecified
I51.5	Myocardial degeneration
I51.9	Heart disease, unspecified
J80	Acute respiratory distress syndrome
R06.03	Acute respiratory distress
R65.11	Systemic inflammatory response syndrome (SIRS) of non-infectious origin with acute organ dysfunction
R65.20	Severe sepsis without septic shock
R65.21	Severe sepsis with septic shock

Value Set: COVID-19 Symptoms

This value-set is intended to be used in conjunction with other value-sets in the Patient-Grain Diagnosis Logic to help identify patients who have symptoms of C-19 in addition to other known C-19 severe complications and/or associated diagnoses. It compiles some of the common and emerging symptoms of COVID-19 most frequently reported in the literature.

	Health Catalyst Value-set: COVID-19 Symptoms
H10	Conjunctivitis
H10.011	Acute follicular conjunctivitis, right eye
H10.012	Acute follicular conjunctivitis, left eye
H10.013	Acute follicular conjunctivitis, bilateral
H10.019	Acute follicular conjunctivitis, unspecified eye
H10.021	Other mucopurulent conjunctivitis, right eye
H10.022	Other mucopurulent conjunctivitis, left eye
H10.023	Other mucopurulent conjunctivitis, bilateral
H10.029	Other mucopurulent conjunctivitis, unspecified eye
H10.231	Serous conjunctivitis, except viral, right eye
H10.232	Serous conjunctivitis, except viral, left eye
H10.233	Serous conjunctivitis, except viral, bilateral
H10.239	Serous conjunctivitis, except viral, unspecified eye
H10.30	Unspecified acute conjunctivitis, unspecified eye
H10.31	Unspecified acute conjunctivitis, right eye
H10.32	Unspecified acute conjunctivitis, left eye
H10.33	Unspecified acute conjunctivitis, bilateral
H10.89	Other conjunctivitis
H10.9	Unspecified conjunctivitis
M79.10	Myalgia, unspecified site
R05	Cough
R06.0	Dyspnea
R06.00	Dyspnea, unspecified
R06.01	Orthopnea
R06.02	Shortness of breath
R06.03	Acute respiratory distress
R06.09	Other forms of dyspnea
R07.0	Pain in throat
R07.1	Chest pain on breathing
R07.2	Precordial pain
R07.8	Other chest pain
R07.81	Pleurodynia
R07.82	Intercostal pain
R07.89	Other chest pain
R07.9	Chest pain, unspecified
R43.0	Anosmia
R43.1	Parosmia
R43.2	Parageusia
R43.9	Unspecified disturbances of smell and taste
R50	Fever of other and unknown origin
R50.81	Fever presenting with conditions classified elsewhere
R50.9	Fever, unspecified
R51	Headache
R53	Malaise and fatigue

Lab Test Knowledge Curation

Due in part to the critical diagnostic importance of lab testing, lab test result data are key to understanding the clinical state of patients as well as surveillance of patient populations. However, in EHR systems lab test result data are often stored with local codes or strings for lab test types and test result values rather than codes from widely standardized terminologies such as LOINC (Logical Observation Identifiers Names and Codes).¹⁹ In such cases, there is a need to ascertain the types of the lab tests and the meanings of the result values automatically over a large volume of lab result data without the benefit of a uniform standard lab terminology across the various EHR systems from which the lab result data are sourced.

In this work, a lab test knowledge curation workflow (Figure 2) was established to provide a knowledge base for recognizing lab test types and interpreting lab result values as expressed in lab result records from multiple EHR systems. This knowledge base, accumulated from many lab result records across multiple EHR systems, allows for automated categorization of lab results (positive, negative, pending, ambiguous, test problem, unmapped) and lab test types (detection of SARS-CoV-2 material, detection of antibody to SARS-CoV-2).

This classification supports analytics that look for patterns of interest in, for example, the data of patients who are confirmed or suspected COVID-19 cases. The harmonized set of lab result values allows for rapid automated use of readily comprehensible data (i.e., positive and negative result values), identification of results that may be usable with additional data curation (i.e., unmapped), and the ability to select for patients whose tests may still be in process or unavailable (i.e., pending).
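
To make the result harmonization concrete, the sketch below maps free-text local result strings to the six result categories named above. The raw strings in the lookup table are hypothetical examples, not entries from the Health Catalyst knowledge base, and a curated knowledge base would be far larger and keyed by source system and test type.

```python
import re

# Hypothetical local result strings grouped by harmonized category.
RESULT_CATEGORIES = {
    "positive": {"positive", "pos", "detected"},
    "negative": {"negative", "neg", "not detected"},
    "pending": {"pending", "in process"},
    "ambiguous": {"indeterminate", "inconclusive", "equivocal"},
    "test problem": {"specimen rejected", "test not performed"},
}

# Invert to a raw-string -> category lookup.
RESULT_LOOKUP = {
    raw: category
    for category, raw_values in RESULT_CATEGORIES.items()
    for raw in raw_values
}


def harmonize_result(raw_value: str) -> str:
    """Lowercase, trim, and collapse whitespace, then look up the category.

    Anything not in the lookup is labeled 'unmapped' so it can be routed
    to additional data curation.
    """
    key = re.sub(r"\s+", " ", raw_value.strip().lower())
    return RESULT_LOOKUP.get(key, "unmapped")
```

Returning "unmapped" instead of guessing keeps unknown strings visible, which matches the role the text gives that category: results that may become usable after further curation.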

Figure 2. Laboratory Test Knowledge Curation Workflow

References

1. Johns Hopkins Coronavirus Resource Center. COVID-19 Map [Internet]. [cited 2020 May 18].
2. Lipsitch M, Swerdlow DL, Finelli L. Defining the Epidemiology of Covid-19 – Studies Needed. N Engl J Med. 2020 Mar 26;382(13):1194-1196. doi:10.1056/NEJMp2002125.
3. U.S. Department of Health and Human Services Office of Inspector General. Hospital Experiences Responding to the COVID-19 Pandemic: Results of a National Pulse Survey March 23–27, 2020. [cited 2020 Apr 10].
4. Consortium for Clinical Characterization of COVID-19 by EHR. [cited 2020 May 21].
5. The Centers for Disease Control and Prevention. Human Infection with 2019 Novel Coronavirus Person Under Investigation (PUI) and Case Report Form. [cited 2020 Apr 10].
6. World Health Organization. Emergency Use ICD Codes for COVID-19 disease outbreak. [cited 2020 May 21].
7. World Health Organization. COVID-19 coding in ICD-10. [cited 2020 May 21].
8. American Medical Association. COVID-19 coding and guidance. [cited 2020 May 21].
9. National Library of Medicine. COVID-19 Value Sets in VSAC. [cited 2020 May 21].
10. American Academy of Professional Coders. COVID-19: Your Medical Coding and Compliance Headquarters. [cited 2020 May 21].
