Skip to contents

The Network Data Quality (NDQ) package contains several data quality modules intended to evaluate the overall condition of the data in a clinical research network. These modules, which cover a broad range of data quality domains from conformance to plausibility, are flexible and can be configured to execute checks specific to desired use cases in both the OMOP and PCORnet common data models (CDMs).

Installation

You can install the development version of this package from GitHub:

remotes::install_github('PEDSnet/ndq')

Current Functionality

The package currently (as of 09/2025) contains 10 distinct analysis types that can be configured to run innumerable data quality checks. See the table below for a list of the current offerings.

Analysis Type Description Functions
Data Cycle Changes Computes row & patient counts in the specified tables for both the current data model version and a previous data model version in order to assess changes across data extractions. check_dc
process_dc
Domain Concordance Given the details of a pair of clinical events, this function will determine the count of patients OR visits that meet criteria for the first event, the second event, and both events. check_dcon
process_dcon
Date Plausibility Identifies the proportion of rows in each fact type that have an implausible date, where implausibility is defined as a date that falls before the associated visit start date, after the associated visit end date, or before the patient’s birth date. check_dp
process_dp
Best Mapped Concepts Identifies the existing concepts within the specified field so the user can assess which of these concepts are acceptable (“best”) or should not be used in that field (“not best”). check_bmc
process_bmc
Expected Concepts Present Identifies the count of patients who have at least one occurrence of the concept defined in the associated concept set and the proportion of patients who have the concept based on the user-provided denominator cohort. check_ecp
process_ecp
Clinical Fact Documentation Identifies visits that do and do not link to at least one of each user-specified clinical fact type. Will also compute the counts of patients who have at least one visit that does / does not link to the specified fact type. check_cfd
process_cfd
Missing Field: Visit ID Checks to see if the visit_occurrence_id/encounterid in a given fact table also exists in the visit_occurrence/encounter table and identify cases where the ID is missing entirely (NULL). check_mf_visitid
process_mf_visitid
Unmapped Concepts Evaluates the count and proportion of unmapped concepts associated with the fact type of interest. Can also be executed longitudinally by year. check_uc
process_uc
Valueset Conformance Intakes a limited valueset that is expected to make up the entire contents of a field (minus the specified null_values) and identifies if any non-permitted values exist in the field (and how often). check_vs
process_vs
Vocabulary Conformance Use a provided vocabulary definition table to identify the vocabulary of each concept and determine how many rows comply with the standard vocabularies expected for that field. check_vc
process_vc
Facts Over Time Computes the number of rows, patients, and (optionally) visits associated with the fact of interest within a specified time period. check_fot
process_fot

Example Usage

To see a sample repository where these functions have been executed, see PEDSnet NDQ.