A Guide to Untangling the Complexities of State and CMS Databases

Understand How this Data is Collected and How to Use it for Growth

By Haley Attridge, Mike McDonald, and Megan Reeves

Healthcare patient data may hold the potential to help transform the U.S. healthcare system. By providing greater insight into the need for care, the level of quality provided, and the costs associated with care, this data offers the opportunity to improve delivery and accessibility. This data can also deliver great insight to strategic planners looking to deliver the best care offerings possible to expand and better support their service areas.

The clear challenge planners face when trying to extract these insights, particularly from state and CMS (Centers for Medicare & Medicaid Services) databases, lies in how much data collection, organization, and categorization vary from state to state. Additionally, the availability of this data and how it can be accessed (publicly, privately, or a blend of both) varies greatly across the nation.

To make sense of all the available state and CMS data—and therefore collect insights from it—let’s start by first defining the types of data available and how the data is classified. We’ll then discuss what’s included in this data, how it’s submitted, and how we can use it.

Hospital Patients Type Classifications

State and CMS data can be defined as the billable interactions (insurance claims) between insured patients and the healthcare delivery system. Below, we’ll classify this data into four general categories: inpatient, outpatient, observation, and emergency.


Inpatient

  • Covered by Medicare Part A (insurance that covers hospital expenses and includes hospital stays, skilled nursing care, hospice, and home health-care services)
  • Provider determines a patient will most likely be in the hospital for at least two midnights


Outpatient

  • Covered by Medicare Part B (part of “Original Medicare”; covers medical services and supplies medically necessary to treat a health condition)
  • Can include outpatient care, preventive services (clinic visits/urgent care/physician office visits), ambulance services/surgery, and durable medical equipment
  • Outpatient visits: imaging (X-ray, MRI, CT), lab tests, physical therapy, radiation therapy visits


Observation

  • Covered by Medicare Part B
  • Provider determines a patient will most likely be in the hospital for less than two midnights
  • Allows hospitals to avoid the Medicare penalties from readmissions
  • Seeks to treat patients whose condition doesn’t justify a hospital admission, but may still need follow-up, testing, or a little bit of “wait and see”


Emergency

  • Covered by Medicare Part B
  • Care provided in a hospital ER or a free-standing ER
  • 8% of Emergency Room visits result in an inpatient admission*
  • Half of all hospital admissions come from the Emergency Department**

*National Hospital Ambulatory Medical Care Survey: 2015 Emergency Department Summary Tables, tables 1, 4, 15, 25, 26

**The Evolving Role of Emergency Departments in the United States Table 4.1

What Data is Available: Private vs Public Data

Public data is data made available by a state or hospital association to users outside their state or to users who are not members of their hospital association. For example, a license for Texas's THCIC (Texas Health Care Information Collection) data can be purchased directly by a vendor, like Stratasan, with restrictions and costs associated with how the data can be used.

Private data is data that is available only to hospital association contributors or to hospitals within a specific state. While Texas has its public data sets (THCIC), it also has other private data available for members of the Texas Hospital Association (THA).

While public data often is very useful, private data provides a few perks. Continuing to use Texas as an example, the private data provided by THA has more timely data releases and strong user support for analyzing the data.

Pros and cons of each are listed below.

Private data:

  • Costs for data are commonly (but not always) included in membership fees for contributing to hospital associations
  • Access to data commonly includes access to analytic tools
  • Data is available only to contributing hospitals, encouraging more area hospitals to participate
  • Data is typically released more regularly and much earlier than public data sets

Public data:

  • More flexibility with uses of the data
  • Costs are usually much less than association membership or private data fees
  • Consultants (like Stratasan!) will often have this data readily available for clients' use


The map above is a quick guide to the types of data available state by state, updated regularly at Stratasan. It’s used as a reference for our data services team as we work with clients throughout the nation. 

What’s in State Data? Every State is Different.

A constant theme with state data is that every state has a different system for reporting patient data. This can be a critical point of concern for multi-state health systems looking to improve organizational alignment, identify opportunities for growth, or simply standardize reporting across their hospitals. Variances from state to state can include:

  • Different release schedules
  • Varying code formats: decimal vs. non-decimal, or FIPS codes vs. state-specific codes
  • Different data formats: flat files, .csv, etc.

These variances can often lead to errors in the data and inconsistent, unreliable reporting for hospital systems dealing with multi-state data.
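
As a minimal sketch of what reconciling code formats can involve, the Python below puts ICD-10-CM diagnosis codes into one decimal form and rebuilds five-digit county FIPS codes. The function names and sample values are hypothetical, and it assumes the inputs really are ICD-10-CM diagnosis codes and numeric FIPS components; state-specific county codes would need a crosswalk table that is not shown here.

```python
def normalize_icd10_diagnosis(code: str) -> str:
    """Return an ICD-10-CM diagnosis code in decimal form, e.g. 'E119' -> 'E11.9'."""
    cleaned = code.strip().upper().replace(".", "")
    # ICD-10-CM diagnosis codes carry the decimal after the third character.
    if len(cleaned) <= 3:
        return cleaned
    return f"{cleaned[:3]}.{cleaned[3:]}"


def county_fips(state_fips: str, county_code: str) -> str:
    """Build a five-digit county FIPS code (two-digit state + three-digit county)."""
    # Leading zeros are often lost when a file has been read as numbers.
    return f"{int(state_fips):02d}{int(county_code):03d}"


print(normalize_icd10_diagnosis("e119"))   # E11.9
print(normalize_icd10_diagnosis("E11.9"))  # E11.9
print(county_fips("1", "73"))              # 01073
```

Normalizing to a single representation before combining states means the same diagnosis or county is counted once rather than split across two spellings.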

There are important state data intricacies to consider even for hospitals and health systems that reside within one state. State data is very comprehensive for inpatient care and emergency care. Outpatient care, though, is typically limited to ambulatory surgeries within a specific set of procedure codes. In truth, only a small fraction of cases overall must be reported to the state. However, there are usually* very strict, enforced reporting requirements for those cases. This means the available state data is predictable and can be used as a source of truth.

Because state legislatures and hospital associations write the rules governing reported data, there are usually* strong quality controls and clear data documentation. This means that while state data may not be as comprehensive as some would like, it often makes up for that in quality and documentation.

With quality comes a lot of process. The detailed processing required by quality controls and documentation means state data is released several months or quarters after it is collected. Typically, state data set releases have a two- or three-quarter lag, and the most current state data sets have a 4.5-month lag.

*Reporting requirements vary by state.

How Data is Submitted

Often, hospitals and health systems gather and clean state and CMS data themselves. This process usually comes together in the following way:

  • Hospital planners will establish data submission requirements and monitor for when changes are needed.
  • The hospital IT team must then become familiar with the requirements set, so they can write programs that will pull the data.
  • Hospitals will purchase data sets from their respective state hospital associations or other data vendors.
  • IT professionals then pull data from the appropriate association or vendor, share it with planners, and keep them up-to-date as the new data is released.

Once data is in hand, hospitals are often confronted with the difficult task of refining the data themselves. This is a challenging and time-consuming task due to the amount of technical work it takes to make the data actionable. Each time data is refreshed, individual hospitals can spend anywhere from 6 days to 2 weeks processing it. Depending on the amount of dedicated resources available and the number of releases per year, refining data can be an overwhelming task for facilities. It can be beneficial to have a partner like Stratasan who has a streamlined method for transforming data into actionable intelligence.

Refining Data for Use

Stratasan’s Data Processing Service collects, aggregates, cleans, and updates data, so hospitals and health systems can use it to make informed strategic decisions. Stratasan provides state and CMS data to clients in standardized formats, making data retrieval from our software a painless process.

We ensure quality data for our clients by addressing the following:

  • Outliers
    • Facilities may have a lower/higher number of records than expected for a particular quarter and/or year
    • Certain record counts for particular patient ZIP codes could be lower/higher than expected
    • States that experience natural or man-made disasters that result in changes or delays in reporting
    • Facility closures, acquisitions, or expansions
  • Caveats
    • An increase or decrease in patient volume of 11% or more
    • No patient volume reported
    • Facility openings/closures/acquisitions/changes in ownership
    • Renovations/Expansions
    • Changes in reporting standards of codes 
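
To make the volume caveats concrete, here is a hedged Python sketch that flags a facility's record count when it changes by 11% or more or when nothing is reported. The function name is hypothetical, and comparing against the prior period is an assumption; the list above does not say which baseline is used.

```python
from typing import Optional

CAVEAT_THRESHOLD_PCT = 11  # the 11% figure from the caveats list


def volume_caveat(previous: Optional[int], current: Optional[int]) -> Optional[str]:
    """Return a caveat note for one facility and period, or None if nothing stands out."""
    if not current:
        return "no patient volume reported"
    if not previous:
        return None  # no baseline to compare against (assumed behavior)
    # Integer comparison avoids floating-point surprises right at the threshold.
    if abs(current - previous) * 100 >= CAVEAT_THRESHOLD_PCT * previous:
        direction = "increase" if current > previous else "decrease"
        pct = abs(current - previous) / previous
        return f"{direction} of {pct:.0%} in patient volume"
    return None


print(volume_caveat(1000, 1200))  # increase of 20% in patient volume
print(volume_caveat(1000, 1050))  # None
print(volume_caveat(1000, 0))     # no patient volume reported
```

A flag like this only marks a record for review; whether the change reflects a real event such as an opening, closure, or renovation still has to be checked by a person.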

How We Do It

Below are the steps Stratasan takes to make sure our clients’ data intelligence needs are met:

  1. Data Receipt: There are a few different ways Stratasan receives data from our clients. The facility/hospital system may order the data from their respective states. The facility may choose to upload data directly to us through our HIPAA compliant website. Clients can also send the data directly in an encrypted format via mail, though this is not our preferred method of transaction. The final way we receive data is by acquiring the credentials from the state to download the data on the client’s behalf. Check with your state, as some of them have strict rules against this method of data delivery.
  2. Verify Trends: Next, we verify quarterly and yearly trends multiple times prior to the data being uploaded into our cloud-based analytics platforms.
  3. Connect Codes: We then connect specific codes (APR/MS-DRG, ICD-9/10, HCPCS) to descriptions for easier usability of the data within the Stratasan platforms. These fields are important for analysis. By identifying these fields with Stratasan or client product line definitions, analysis can be easily expanded using fields like “Admit Source,” “Age Range,” “Payer,” and “Race.”
  4. Validate Formatting: We validate the formatting of specific fields to ensure the correct connections between the codes and descriptions.
  5. Quality Assurance: Once the data has been extracted and transformed, it is sent through our quality assurance process. During quality assurance, we examine overall volume, facility discharges or visits, and product line comparisons. Any issues found in this process that cannot be explained by notes provided by the state are flagged and investigated.
  6. Go Live: The data is then approved and made “live” for users once the quality assurance process has been completed.
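
The "connect codes" and "validate formatting" steps can be pictured with a small Python sketch. This is an illustration under stated assumptions, not Stratasan's actual pipeline: the `ms_drg` field name, the function name, and the lookup entries are hypothetical placeholders, and a real lookup would come from the published code set definitions.

```python
import re
from typing import Dict, List, Tuple

MS_DRG_PATTERN = re.compile(r"\d{3}")  # MS-DRG codes are three digits


def attach_descriptions(
    records: List[dict], lookup: Dict[str, str]
) -> Tuple[List[dict], List[dict]]:
    """Split records into those matched to a description and those flagged for review."""
    matched, flagged = [], []
    for record in records:
        # Restore leading zeros lost when the code was stored as a number.
        code = str(record.get("ms_drg", "")).strip().zfill(3)
        if MS_DRG_PATTERN.fullmatch(code) and code in lookup:
            matched.append({**record, "ms_drg": code, "ms_drg_description": lookup[code]})
        else:
            flagged.append(record)
    return matched, flagged


# Placeholder descriptions, not real MS-DRG titles.
lookup = {"001": "placeholder description A", "470": "placeholder description B"}
records = [{"id": 1, "ms_drg": "470"}, {"id": 2, "ms_drg": 1}, {"id": 3, "ms_drg": "47A"}]

matched, flagged = attach_descriptions(records, lookup)
print([r["ms_drg"] for r in matched])  # ['470', '001']
print([r["id"] for r in flagged])      # [3]
```

Keeping unmatched records in a separate list, rather than dropping them, leaves something concrete to investigate during quality assurance.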

The Takeaway

Clear value can be found in acquiring state and CMS data for strategic planners looking to improve and expand their care offerings. However, the monumental complexity that comes with refining and cleaning this data means too many healthcare professionals find themselves caught in the weeds, spending the majority of their time digging through data.

This leaves less time for higher-level, more strategic thinking. By taking this arduous and time-consuming task off the plates of our partners, before the data is even loaded into our software platform, Stratasan gives them the opportunity to focus on the more important job of making informed growth decisions.

Stratasan uses state and CMS data in a number of ways through several of our software offerings:

  • Blackbird: This queryable database allows for the flexibility to analyze raw data or a curated view to identify the most commonly analyzed fields.
  • Canvas: This platform provides templated, easy to understand reports covering market share, patient origin, cost reporting, and demographics.
  • Launch Pathway: This tool generates aggregated data reports in a visual and interactive format. Users can clearly see where market volume is increasing or decreasing, a service area's demographic makeup, and what service lines are being utilized.

For more information on how our Data Processing Service and tools can free up your time for more focus on strategy and growth, contact Sean Conway and schedule a discovery call today.

Article by Haley Attridge, Manager, HR and Administration, Mike McDonald, Data Wrangler, and Megan Reeves, Customer Success Manager for Stratasan
