
Sematrix: Reducing the Research Time for CMS Measure Development from Months to Moments


Over the past several years, the Centers for Medicare & Medicaid Services (CMS) has been moving toward a “value-based purchasing” model in which healthcare providers are evaluated, and ultimately paid, by the outcomes they achieve rather than the services delivered. The ultimate goal is a system in which the value of a treatment or service, and the amount that CMS will reimburse for it, is directly tied to the benefit it brings to the patient or the efficiencies it brings to the healthcare system.

In order to make this vision a reality, CMS needs to have quality measures that can be used to determine the value of a treatment or service. These quality measures define the patient population and characteristics, the treatment or service provided and the expected outcomes for the treatment or service. Once fully defined, the measures provide a standard benchmark for evaluating the treatments and services delivered by individual healthcare providers or facilities. Are their patients’ outcomes aligned with what would be expected based on current research, or are the outcomes significantly better or worse than expected?

To develop each measure, CMS must find and evaluate the relevant research that has been published on each topic. Currently, that means searching by keyword through the National Library of Medicine (PubMed) as well as policy documents, study results and summary information available through other libraries—a corpus containing millions of existing documents and growing daily. Human searchers then must evaluate all of the articles that are returned to find the ones that are most relevant, reliable and useful for measure development. This process can take hundreds of human hours per measure. With hundreds of measures in need of development, CMS and Battelle partnered to find a better way.


CMS is using Battelle Sematrix™ to reduce the time it takes to locate and evaluate the most relevant and usable articles for measure development from weeks or months to moments. Sematrix is a powerful cognitive analytical system that automatically extracts the knowledge contained in scientific, technical or general text and represents it in a form that enables complex queries, knowledge discovery, visualization and advanced analytics.

Sematrix is able to read technical text at a very granular level and extract formal knowledge that can then be analyzed in a variety of ways. Unlike keyword search programs, Sematrix uses advanced machine learning and natural language processing to understand what it is reading and to place it in context with the other articles in its database. This allows it not only to find the most relevant research for a particular problem but also to draw inferential conclusions by combining knowledge extracted from multiple documents. In other words, if one article says A=B, and another says B=C, then by the property of transitivity Sematrix can infer that A must equal C, even though no single document in the database makes that explicit claim.
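
The A=B, B=C example can be made concrete with a small sketch. The Python below is only an illustration of transitive inference across documents, not a description of how Sematrix is built: the fact tuples, the document labels and the function names are all hypothetical. Each fact records which document asserted it, so an inferred conclusion can be traced back to the articles that support it.

```python
from collections import defaultdict, deque

# Hypothetical extracted facts: (term_a, term_b, source_document).
# Each tuple says one document treats the two terms as equivalent.
facts = [
    ("A", "B", "doc1"),
    ("B", "C", "doc2"),
    ("D", "E", "doc3"),
]

def build_graph(facts):
    """Build an undirected graph whose edges carry their source document."""
    graph = defaultdict(list)
    for a, b, source in facts:
        graph[a].append((b, source))
        graph[b].append((a, source))
    return graph

def infer_equivalence(graph, start, goal):
    """Breadth-first search over the facts.

    Returns the list of source documents that together support
    start == goal, or None if no chain of facts connects them.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, sources = queue.popleft()
        if node == goal:
            return sources
        for neighbor, source in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, sources + [source]))
    return None

graph = build_graph(facts)
print(infer_equivalence(graph, "A", "C"))  # ['doc1', 'doc2']
print(infer_equivalence(graph, "A", "E"))  # None
```

No single document states that A equals C, yet the search reaches that conclusion by chaining doc1 and doc2. Real scientific claims are rarely strict equalities, so a production system would need richer relation types and some measure of confidence.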

CMS uses Sematrix to tag and analyze millions of documents from PubMed and other relevant libraries. Using a training set of 137 CMS clinical quality measures, researchers are developing a system that will be able to quickly find the most useful and relevant articles related to each measure and extract the information required for measure development into a usable form for CMS. First, the description of each measure is analyzed to extract the facts it contains. For example, a measure may describe the percentage of diabetes patients between 18 and 75 who have been able to control their blood pressure. Sematrix then automatically converts the proposed measure into a query containing those facts, uses the query to find the relevant studies and extracts the pertinent details from them. Finally, the system retrieves and ranks these articles.
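
The measure-to-query step can be sketched in a few lines of Python. This is a deliberately simple stand-in that assumes a fixed concept list and a regular expression in place of real natural language processing. The concept vocabulary, the scoring rule and the three sample article records are invented for illustration and do not come from PubMed or from Sematrix.

```python
import re

measure = ("Percentage of diabetes patients between 18 and 75 "
           "who have been able to control their blood pressure")

# Hypothetical concept vocabulary; a real system would draw on an ontology.
CONCEPTS = ["diabetes", "blood pressure", "cholesterol", "asthma"]

def measure_to_query(text):
    """Pull known concepts and an age range out of a measure description."""
    lowered = text.lower()
    concepts = {c for c in CONCEPTS if c in lowered}
    match = re.search(r"between (\d+) and (\d+)", lowered)
    age_range = (int(match.group(1)), int(match.group(2))) if match else None
    return {"concepts": concepts, "age_range": age_range}

def score(article, query):
    """One point per shared concept, plus one if the age ranges overlap."""
    points = len(query["concepts"] & article["concepts"])
    q_age, a_age = query["age_range"], article["age_range"]
    if q_age and a_age and a_age[0] <= q_age[1] and q_age[0] <= a_age[1]:
        points += 1
    return points

def rank(articles, query, top_n=10):
    """Return (score, title) pairs, best first, dropping zero scores."""
    scored = [(score(a, query), a["title"]) for a in articles]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return scored[:top_n]

# Made-up article records, standing in for already-tagged documents.
articles = [
    {"title": "Hypertension control in adults with diabetes",
     "concepts": {"diabetes", "blood pressure"}, "age_range": (40, 70)},
    {"title": "Pediatric asthma outcomes",
     "concepts": {"asthma"}, "age_range": (5, 12)},
    {"title": "Statins and cholesterol in diabetes",
     "concepts": {"diabetes", "cholesterol"}, "age_range": (80, 95)},
]

query = measure_to_query(measure)
for points, title in rank(articles, query):
    print(points, title)
```

With these sample records, the article that matches both concepts and the age range ranks first, the partial match ranks second, and the unrelated article is dropped. The point of the sketch is that the query is derived from the measure text itself rather than typed in as keywords.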


Using Sematrix, CMS will be able to vastly decrease the amount of time it takes to research and develop each quality measure. While human subject matter experts will still need to read and evaluate the most relevant articles and the knowledge extracted by Sematrix, the system significantly reduces the number of hours currently spent searching for articles that contain the needed information. Instead of combing through hundreds or thousands of articles returned by a keyword search, measure developers can now focus on the top 10 or 20 articles identified by Sematrix as most relevant for the problem at hand.

For each quality measure query, Sematrix will use the articles it finds to populate a CMS-developed Measure Information Form. The form compares the information most relevant to measure development, such as results and population characteristics, so that measure developers can quickly see what information they have and what is missing from the current body of knowledge. Sematrix’s unique ontological system for representing knowledge will allow it to find connections between articles in order to draw inferential conclusions and to summarize knowledge for non-technical users in plain language.
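
As a rough illustration of how a form like this can expose gaps, the Python sketch below groups extracted details by form field and reports the fields that no article filled. The field names, the article labels and the sample values are assumptions made for the example. The article does not list the actual fields of the CMS Measure Information Form beyond results and population characteristics.

```python
# Hypothetical form fields, chosen only for this example.
FORM_FIELDS = ["population", "intervention", "results"]

def populate_form(extractions):
    """Group extracted values by form field, keeping the source article."""
    form = {field: [] for field in FORM_FIELDS}
    for item in extractions:
        if item["field"] in form:
            form[item["field"]].append((item["article"], item["value"]))
    return form

def missing_fields(form):
    """List the fields for which no article supplied a value."""
    return [field for field, values in form.items() if not values]

# Placeholder extractions, standing in for details pulled from ranked articles.
extractions = [
    {"field": "population", "article": "article-1",
     "value": "adults 18 to 75 with diabetes"},
    {"field": "results", "article": "article-1",
     "value": "placeholder result text"},
    {"field": "population", "article": "article-2",
     "value": "adults 40 to 70 with diabetes"},
]

form = populate_form(extractions)
for field, values in form.items():
    print(field, values)
print("missing:", missing_fields(form))  # missing: ['intervention']
```

Placing values from different articles side by side under the same field is what lets a measure developer compare them, and an empty field is a direct signal of what the retrieved literature does not cover.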

Ultimately, Sematrix drastically reduces the time it takes to locate and evaluate the most relevant and usable articles for measure development.