Data Collection and Reporting for Healthcare Disparities

by Jennifer Hornung Garvin, PhD, MBA, RHIA; Theresa D. Jones, MHA, RHIA; Lydia Washington, MS, RHIA, CPHIMS; and Christine Weeks, BA

Collecting accurate equity data supports efforts to reduce healthcare disparities and create equal care for all.

Complete and accurate data capture has far-reaching benefits, including improving the quality of care by addressing disparities in care.

In the 2002 landmark study Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care, the Institute of Medicine (IOM) documented evidence that race and ethnicity are significant predictors of the quality of care, observing that minorities who had the same insurance status and income as nonminorities received a lower quality of care.1

In that study, the IOM described racial and ethnic healthcare disparities as racial or ethnic differences in the quality of healthcare that are not due to access-related factors or clinical needs, preferences, and appropriateness of intervention. Other studies and reports have demonstrated a similar relationship between healthcare disparities and the quality of healthcare.

Addressing such disparities requires that providers capture better data about race, ethnicity, and socioeconomic status, an effort complicated by the sensitive nature of the data and the challenges of categorizing them appropriately.

Addressing Health Disparities

The IOM study provides recommendations for research and addresses the importance of data collection that affects care disparities. Federal and state organizations have turned their attention to the issue. The Agency for Healthcare Research and Quality, the Centers for Medicare and Medicaid Services, and state public health entities have ongoing initiatives to address healthcare disparities.

Accrediting agencies are also focusing on aspects of care that could be associated with disparities. For example, the Joint Commission’s Hospitals, Language and Culture Project has identified the challenges associated with cultural and language barriers in hospital settings and offers a framework and organizational self-assessment tool for addressing these barriers and meeting the needs of diverse patient populations.2,3

At the heart of these and other efforts to develop effective strategies to address healthcare disparities is the need for accurate and complete data. However, data describing racial, ethnic, language, cultural, and socioeconomic characteristics are frequently inaccurate, incomplete, and lacking in detail in the healthcare setting. Sometimes they are not collected at all.

This may be due to the nature of the information itself. Information about an individual’s race, ethnicity, and socioeconomic status—sometimes referred to as “equity data”—may be considered to be of a sensitive nature both by those collecting it and the individuals to whom it pertains.

Even so, equity data are essential for research, analysis, planning, measurement, and implementation of initiatives that could reduce healthcare disparities, and healthcare organizations are increasingly being called on to ensure that the data they capture can meet these needs. Like diagnosis and procedure codes, equity data developed during a healthcare encounter have many uses, and these multiple uses should be considered when racial and ethnic categories are assigned.

The HRET Toolkit

The Health Research and Educational Trust (HRET), an affiliate of the American Hospital Association, offers providers a toolkit for the systematic collection of data used to assess healthcare disparities. The toolkit is an excellent resource for organizations seeking to standardize and improve their data collection processes.

At a minimum, HRET recommends the following data be collected in settings that measure and manage quality: race and ethnicity, language, and socioeconomic status.

A central issue in collecting race and ethnicity data is determining the number of categories sufficient to differentiate between groups. Too many categories could produce unmanageable results and may not be supported by patient registration or other information systems that collect the data.

As noted, other data sets are available. The Office of Management and Budget issues the Standards for the Classification of Federal Data on Race and Ethnicity. The Centers for Disease Control and Prevention issues the UHDDS. Healthcare organizations should assess which code set best meets their needs. Regardless of the categories used, it is highly recommended that individuals be allowed to self-select the category or categories they feel best describe their race and ethnicity.

Capturing data about language includes a person’s preferred language, ability to speak and understand English, need for translation services, and speech, hearing, and literacy impairments. This information is necessary for determining the need for language assistance services and ensuring that patients understand and can participate in their care. Several states and the Joint Commission have specific requirements for capturing and reporting data relating to language.

Socioeconomic status gauges a person’s relative economic and social position based on factors such as their education, income, and occupation. Common indicators include zip code, because where a person resides is associated with income level.

In healthcare settings, insurance coverage (both the presence and the type of insurance) can be a socioeconomic indicator. The highest level of education attained by the patient (or by a parent, if the patient is a child) is a predictor of both behavior and income.
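The minimum data set described above could be held in a single registration record. The following is a minimal Python sketch rather than a prescribed schema: the class name EquityRecord, its field names, and the missing_domains helper are all hypothetical, chosen only to mirror the three domains HRET recommends.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EquityRecord:
    """Hypothetical registration record; all field names are illustrative."""

    # Race and ethnicity (a patient may select more than one race category)
    races: frozenset = frozenset()
    ethnicity: Optional[str] = None
    # Language
    preferred_language: Optional[str] = None
    needs_interpreter: Optional[bool] = None
    # Socioeconomic indicators mentioned in the text
    zip_code: Optional[str] = None
    insurance_type: Optional[str] = None
    highest_education: Optional[str] = None

    def missing_domains(self):
        """Return the names of the domains for which nothing was captured."""
        missing = []
        if not self.races and self.ethnicity is None:
            missing.append("race_ethnicity")
        if self.preferred_language is None:
            missing.append("language")
        if (self.zip_code is None and self.insurance_type is None
                and self.highest_education is None):
            missing.append("socioeconomic_status")
        return missing


# Example: only language information was captured at registration.
record = EquityRecord(preferred_language="Spanish", needs_interpreter=True)
print(record.missing_domains())
```

A completeness check of this kind is one way an organization might spot records where an entire domain was skipped at registration.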

Performance Improvement Process for Data Collection

Although equity data are frequently collected during the registration or initial assessment or intake process, HIM departments play an important role in ensuring the quality of the data. This can include conducting a performance improvement assessment similar to the one shown here.

Race Indicators:

  • American Indian Eskimo/Aleut
  • Asian or Pacific Islander
  • Black
  • White
  • Other Race
  • Unknown

Ethnicity Indicators:

  • Spanish origin/Hispanic
  • Non-Spanish origin/Non-Hispanic
  • Unknown

3. Analyze and compare internal and external data

Analyze race and ethnicity indicator data to determine whether the categories are being used properly; for example, look for overutilization of the racial category “unknown.” Analyze a sample of each minority racial category to determine if patients are being properly interviewed.

4. Identify improvement opportunity

Identify procedures to improve the assignment of data.

5. Perform ongoing monitoring

Review the assignment of race and ethnicity indicators on a quarterly basis.

Source: Shaw, Patricia, Chris Elliott, Polly Isaacson, and Elizabeth Murphy. Quality and Performance Improvement in Healthcare. Chicago, IL: AHIMA, 2007.
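The analysis and ongoing monitoring steps in the sidebar can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the cited process: the function names, the (quarter, race) record layout, and the 5 percent threshold are all hypothetical, and an organization would set its own threshold.

```python
from collections import Counter


def unknown_rate_by_quarter(records):
    """Share of registrations coded "Unknown" in each quarter.

    `records` is an iterable of (quarter, race) pairs, an assumed layout.
    """
    totals = Counter()
    unknowns = Counter()
    for quarter, race in records:
        totals[quarter] += 1
        if race == "Unknown":
            unknowns[quarter] += 1
    return {q: unknowns[q] / totals[q] for q in sorted(totals)}


def quarters_over_threshold(rates, threshold=0.05):
    """Quarters whose "Unknown" rate exceeds the (hypothetical) threshold."""
    return [q for q, rate in rates.items() if rate > threshold]


# Made-up sample records, for illustration only.
sample = [
    ("2007Q1", "White"), ("2007Q1", "Unknown"),
    ("2007Q2", "Black"), ("2007Q2", "White"),
]
rates = unknown_rate_by_quarter(sample)
print(quarters_over_threshold(rates))
```

Quarters returned by the second function would be candidates for a closer look at registration practices.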

Uniform Data Collection Process

The Uniform Hospital Discharge Data Set (UHDDS), issued by the Centers for Disease Control and Prevention, has been considered the de facto standard for collecting data on inpatients related to race and ethnicity.4 The UHDDS currently describes race using the following categories: American Indian/Eskimo/Aleut, Asian or Pacific Islander, Black, White, Other Race, and Unknown.5 The data set defines ethnicity as Spanish origin/Hispanic, Non-Spanish origin/Non-Hispanic, and Unknown. The Uniform Ambulatory Care data set uses the same definitions for race and ethnicity, making it easier to compare data for inpatients and ambulatory patients in the same facility.6

The limits imposed by the UHDDS categories may need to be addressed in order to facilitate the use of racial and ethnic categories for performance measurement, administrative planning, and regulatory purposes. The Office of Management and Budget and the US Census Bureau both use more extensive descriptors for race and ethnicity, and these descriptors may well need to be evaluated and harmonized for use in revised UHDDS racial and ethnic categories.

The following data are actual racial summary data reported from one hospital using the UHDDS. In this example, the number of patients in the “unknown” category represents the third largest racial designation, suggesting an overuse of the category. One possible contributor may be that there are too few categories.

Race       Patients (2007)   Percentage
White      28,950            78.3%
Black      4,535             12.3%
Unknown    2,286             6.2%
Asian      733               2.0%
Other      427               1.2%
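The shares and the ranking of the “unknown” category can be recomputed from the reported counts. The short Python sketch below uses the counts from the hospital summary above; the function names are hypothetical.

```python
# Counts taken from the hospital's 2007 race summary.
counts = {
    "White": 28950,
    "Black": 4535,
    "Unknown": 2286,
    "Asian": 733,
    "Other": 427,
}


def category_shares(counts):
    """Return each category's share of all patients as a fraction."""
    total = sum(counts.values())
    return {race: n / total for race, n in counts.items()}


def rank_of(counts, category):
    """Return the 1-based rank of a category by patient count."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return ordered.index(category) + 1


shares = category_shares(counts)
for race, share in shares.items():
    print(f"{race:<8}{counts[race]:>7,}{share:>8.1%}")
print("Rank of Unknown:", rank_of(counts, "Unknown"))
```

Computed shares may differ from the published percentages in the last digit, since the published figures appear to have been rounded so that they total 100 percent.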

In this case the UHDDS race and ethnicity classifications may result in data that lack the specificity required for use in quality assessment and improvement of health disparities. The industry would benefit from an analysis of the data set categories to ensure that the categorizations are accurate and adequate.

The optimal number of categories is that which sufficiently differentiates between groups with unique needs and issues while affording individuals the opportunity to self-identify their group or groups. For example, studies have shown that Latinos frequently do not make a distinction between race and ethnicity, sometimes necessitating one or more categories that capture both race and ethnicity (e.g., Hispanic/White; Hispanic/Black; Hispanic/Declined).7 Broad categories such as “Asian” may not capture important ethnic information when such a category could pertain to individuals from countries as culturally diverse as India, Japan, and Vietnam.

In addition to the US Office of Management and Budget’s Standards for the Classification of Federal Data on Race and Ethnicity, the UHDDS, and the Census Bureau classifications, the HRET Toolkit recommends code sets and guidelines for systematically collecting data on race, ethnicity, and primary language (see sidebar above). Healthcare organizations should assess which code set best meets their needs and make procedural modifications as necessary to capture the information they need to address inequities in the populations they serve.8

Regardless of how many categories an organization uses, the process that staff use to collect the data is important to the quality of the data. Typically, accuracy increases dramatically when individuals are allowed to self-identify their race or ethnicity, rather than admission staff recording the information by observation or assumption.9 Therefore it is highly recommended that individuals be allowed to self-select as few or as many categories as they feel are necessary to describe themselves.
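One way a registration system might support self-selection of multiple categories is sketched below in Python. The category list is the UHDDS race list quoted earlier; the function names, the returned dictionary layout, and the "self_identified"/"observed" labels are assumptions made for illustration.

```python
# UHDDS race categories as listed in this article.
UHDDS_RACE_CATEGORIES = frozenset({
    "American Indian/Eskimo/Aleut",
    "Asian or Pacific Islander",
    "Black",
    "White",
    "Other Race",
    "Unknown",
})


def capture_race(selections, self_identified, allowed=UHDDS_RACE_CATEGORIES):
    """Record one or more race categories and how they were obtained."""
    chosen = set(selections)
    unrecognized = chosen - allowed
    if unrecognized:
        raise ValueError(f"Unrecognized categories: {sorted(unrecognized)}")
    return {
        "races": sorted(chosen),
        # Hypothetical labels distinguishing patient report from staff observation
        "source": "self_identified" if self_identified else "observed",
    }


def self_identified_share(entries):
    """Fraction of entries that were reported by the patient."""
    if not entries:
        return 0.0
    reported = sum(1 for e in entries if e["source"] == "self_identified")
    return reported / len(entries)


entries = [
    capture_race(["White", "Black"], self_identified=True),
    capture_race(["Unknown"], self_identified=False),
]
print(entries[0]["races"], self_identified_share(entries))
```

Storing the source alongside the categories lets an audit measure how often staff record race by observation rather than by asking the patient.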

Although equity data are frequently collected during the registration or initial assessment or intake process, HIM departments play an important role in ensuring the quality of the data. This may include conducting a performance improvement assessment (as illustrated in the sidebar above), developing policies and procedures, and conducting training for those involved in direct collection and follow-up auditing. All are important initiatives that help ensure high-quality data.

In the earlier example of summary race data, the hospital used a performance improvement process to determine that the “unknown” category should not be used by registrars unless the patient is not coherent and no one accompanying the patient can provide the information. The hospital also determined the need for training for registrars on how to talk with patients to obtain accurate information. As a result, the use of the category decreased.

Important questions for an assessment include:

  • Is the patient admission or registration process in which the data are collected centralized or decentralized? This may affect the consistency of data collection.
  • Do information systems adequately support the required levels of detail and granularity? This may affect the categories of collected data.
  • What type and frequency of training are provided to staff responsible for capturing the data?
  • Is information self-identified by the patient or gleaned from observation?
  • Is the reason for the data collection explained to the patient?

Accurate and valid data about race and ethnicity, language, and socioeconomic status are essential in identifying and addressing disparities in healthcare, which in turn can mean significant improvements in the quality of care.


Barbara Odom-Wesley, PhD, RHIA, FAHIMA
Rachelle Stewart, DrPH, RHIA, FAHIMA
Mattie Wilson, MA, RHIA


  1. Institute of Medicine. Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Edited by Brian D. Smedley, Adrienne Y. Stith, and Alan R. Nelson. Washington, DC: National Academy Press, 2002.
  2. Wilson-Stronks, Amy, et al. One Size Does Not Fit All: Meeting the Health Care Needs of Diverse Populations. The Joint Commission, 2008. Available online at
  3. The Joint Commission. “Developing Culturally Competent Patient-Centered Care Standards.” Available online at
  4. NCVHS. “The National Committee on Vital and Health Statistics, 1949–1999: A History.” Available online at
  5. Johns, Merida. Health Information Management Technology: An Applied Approach, 2nd ed. Chicago, IL: AHIMA, 2007.
  6. Ibid.
  7. Weinick, Robin M., Katherine Flaherty, and Steffanie J. Bristol. “Creating Equity Reports: A Guide for Hospitals.” 2008. Available online at
  8. Ibid.
  9. Ibid.


Agency for Healthcare Research and Quality. “National Healthcare Disparities Report.” 2007. Available online at

Health Research and Educational Trust. “Collecting Race, Ethnicity, and Primary Language Data: Tools to Improve Quality of Care and Reduce Health Care Disparities.” 2005. Available online at

Jennifer Hornung Garvin is a research health science specialist at IDEAS Center SLCVA and assistant professor in the Division of Clinical Epidemiology, University of Utah. Theresa D. Jones is director of clinical information services at Abington Memorial Hospital, Abington, PA. Lydia Washington is a practice director at AHIMA. Christine Weeks is communications coordinator at the Center for Health Equity Research and Promotion, Philadelphia VA Medical Center.

Article citation:
Garvin, Jennifer Hornung; Jones, Theresa D.; Washington, Lydia; Weeks, Christine. "Data Collection and Reporting for Healthcare Disparities." Journal of AHIMA 80, no. 4 (April 2009): 40-43.