1
Linking Recipients of Public Services
  • Daniel McCandless
  • Michigan Medicaid Program
  • MI Department of Community Health
2
What am I going to talk about?
  • Michigan has a very extensive data warehouse with more than 20 separate databases
  • We have developed a method of linking beneficiaries across many of those databases
  • What are the benefits to state users?
  • What are the privacy and security issues raised by this technology?
3
State of Michigan’s Data Warehouse
4
Medicaid Data Sources
5
More Efficient Access
  • Data is loaded once for many users and is stored on disk for fast access
  • Response times for ad hoc and standard queries are minutes instead of days
  • Data Warehouse data can be cleaned and made consistent
  • A single data source to provide consistent reporting
6
Security Issues for Data Warehouse
  • Data users (analysts and operational users) now have access to detailed data
  • Security on data warehouse can be set for each user at the table, row, column or data element level
  • Technical capabilities exceed the staffing resources available to administer them
  • Each user is now a possible conduit of confidential information to inappropriate destinations
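
A minimal sketch of what per-user security at the table, row, and column level could look like. The `SecurityProfile` class, the `query` helper, and the sample rows are all hypothetical illustrations, not the warehouse's actual mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityProfile:
    """Hypothetical per-user profile: visible tables, columns, and rows."""
    # table name -> set of column names the user may see
    columns: dict = field(default_factory=dict)
    # table name -> predicate that decides whether a row is visible
    row_filters: dict = field(default_factory=dict)


def query(profile, table_name, rows):
    """Return only the rows and columns this profile allows for the table."""
    if table_name not in profile.columns:
        # Table-level restriction: the user has no grant for this table.
        raise PermissionError(f"no access to table {table_name!r}")
    allowed = profile.columns[table_name]
    keep = profile.row_filters.get(table_name, lambda row: True)
    return [
        {k: v for k, v in row.items() if k in allowed}  # column-level filter
        for row in rows
        if keep(row)                                    # row-level filter
    ]


# Made-up sample data for illustration only.
lead_rows = [
    {"name": "A. Smith", "county": "Ingham", "result": 4},
    {"name": "B. Jones", "county": "Wayne", "result": 12},
]
analyst = SecurityProfile(
    columns={"lead_screening": {"county", "result"}},
    row_filters={"lead_screening": lambda row: row["county"] == "Wayne"},
)
visible = query(analyst, "lead_screening", lead_rows)
```

In this sketch the analyst sees only the county and result columns, and only rows for one county; a table with no entry in the profile is refused outright.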
7
MDCH Data Warehouse
  • Beneficiary & Provider Contact Tracking System
  • Community Mental Health
  • Children’s Special Health Care Services
  • Epidemiology
  • Hearing Screening
  • Home Help Agency Program
  • Lead Screening
  • Michigan Child Immunization Registry
  • Medicaid Fee-for-Service
  • Medicaid Beneficiary Eligibility
  • Medicaid Provider Eligibility
  • Medicaid Managed Care
  • Medicaid MI Choice Minimum Data Set
  • Newborn Metabolic & Hearing Screening
  • Nursing Home Minimum Data Set
  • Substance Abuse
  • Vital Record - Death/Birth
  • Women Infants & Children
8
Data Warehouse Security
  • Security Profiles are used by the Data Warehouse to protect the confidentiality of MDCH data.
  • Access is restricted for user groups (i.e., certain data may not be available to a particular program area).
  • Access is granted only to the data a user needs.
  • To obtain access, a Request for Access to Secure Program Data form must be completed.
  • Program area directors approve access to their data.
9
Unique Client Identifier
Overview
  • The primary goal of the UCI is the development of procedures for accurately linking person-level records from multiple programs, as well as linking individuals with multiple program identifiers (e.g., Beneficiary ID, WIC ID).
  • The linking should be accomplished with a minimum of human intervention.
10
Unique Client Identifier
Challenges
  • The agencies use different system identifiers.
  • The name fields do not have uniform formats.
  • The common fields are not present or not populated in all programs.
  • Errors are introduced during the interview, transcription, and entry processes.
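
The non-uniform name formats listed above are usually handled by normalizing names before any comparison. The sketch below is one hypothetical way to do that; the `normalize_name` function, the suffix list, and the rules it applies are assumptions for illustration, not the UCI project's actual procedure.

```python
import re
import unicodedata

# Assumed list of generational suffixes to ignore when comparing names.
SUFFIXES = {"JR", "SR", "II", "III", "IV"}


def normalize_name(raw):
    """Return (last, first) in upper case from 'Last, First' or 'First Last'."""
    # Strip accents so that accented and unaccented spellings compare equal.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch)).upper()
    # Keep only letters, spaces, and commas: "O'BRIEN" becomes "OBRIEN".
    text = re.sub(r"[^A-Z, ]", "", text)
    if "," in text:
        # "LAST, FIRST MIDDLE" layout.
        last_part, _, first_part = text.partition(",")
        last_tokens = [t for t in last_part.split() if t not in SUFFIXES]
        first_tokens = [
            t for t in first_part.replace(",", " ").split() if t not in SUFFIXES
        ]
    else:
        # "FIRST MIDDLE LAST" layout: the final token is taken as the surname.
        tokens = [t for t in text.split() if t not in SUFFIXES]
        first_tokens, last_tokens = tokens[:-1], tokens[-1:]
    last = " ".join(last_tokens)
    first = first_tokens[0] if first_tokens else ""
    return last, first


same_person = normalize_name("O'Brien, Mary Ann") == normalize_name("mary ann obrien")
```

Both spellings reduce to the same (last, first) pair, so a later matching step compares like with like. Multi-word surnames written without a comma are a known weakness of this simple rule.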
11
Unique Client Identifier
Types of Linking
  • Merge Operation
  • Deterministic Matching
    • Scaling Factors, Measures, Set Weights
    • Pre-defined Thresholds
  • Probabilistic Matching
    • Scaling Factors, Measures, Deterministic Match, Calculated Weights, Computed Thresholds, Multiple Runs
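
The difference between set weights and calculated weights can be sketched as follows. The slides do not say how weights are calculated, so this example assumes the common Fellegi-Sunter log-ratio form for the probabilistic pass. All field names, weights, thresholds, and m/u probabilities are hypothetical values chosen for illustration.

```python
import math

# Deterministic pass: weights and thresholds are fixed in advance (assumed values).
SET_WEIGHTS = {"ssn": 10, "last_name": 4, "first_name": 2, "dob": 5}
MATCH_THRESHOLD = 15     # at or above: link automatically
NONMATCH_THRESHOLD = 8   # below: treat as different people


def deterministic_score(a, b, weights=SET_WEIGHTS):
    """Sum the set weight of each field populated in both records that agrees."""
    return sum(w for f, w in weights.items() if a.get(f) and a.get(f) == b.get(f))


def classify(score, upper=MATCH_THRESHOLD, lower=NONMATCH_THRESHOLD):
    """Map a score onto the three zones separated by the two thresholds."""
    if score >= upper:
        return "match"
    if score < lower:
        return "non-match"
    return "review"


# Probabilistic pass: per field, m = P(agree | same person) and
# u = P(agree | different people). These numbers are made up.
M_U = {
    "ssn": (0.95, 0.001),
    "last_name": (0.90, 0.02),
    "first_name": (0.85, 0.05),
    "dob": (0.92, 0.003),
}


def probabilistic_score(a, b, m_u=M_U):
    """Sum log2 likelihood ratios; fields missing on either side add nothing."""
    total = 0.0
    for f, (m, u) in m_u.items():
        if not a.get(f) or not b.get(f):
            continue
        if a[f] == b[f]:
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


rec_a = {"ssn": "000-00-0001", "last_name": "SMITH", "first_name": "JOSE", "dob": "1990-01-02"}
rec_b = {"ssn": None, "last_name": "SMITH", "first_name": "JOSE", "dob": "1990-01-02"}
rec_c = {"ssn": "000-00-0002", "last_name": "JONES", "first_name": "JOSE", "dob": "1985-06-30"}

zone_ab = classify(deterministic_score(rec_a, rec_b))
zone_ac = classify(deterministic_score(rec_a, rec_c))
```

With these assumed values, the pair with a missing SSN falls between the two thresholds and would go to review, which is the kind of case where a probabilistic score computed from the data can help reduce human intervention.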
12
Unique Client Identifier
Threshold Diagram
13
Benefits of Unique Client Identifier
  • Links data sets together
    • Transparent record linkage process
    • Process uses a non-accessible linkage identifier
    • At a person level, identifies and organizes data on all services that a person receives across all programs
    • Increases the ability to do pattern analysis
    • Identifies gaps and duplication of services
    • Provides a common source for data rather than multiple sources thus promoting consistent reporting across programs
14
Unique Client Identifier
Security and Privacy
  • The owner agency controls access to its data sources
  • Access to data sources is controlled on a user-by-user basis
  • Users cannot determine program participation for programs the user does not have access to
  • Compliance with HIPAA and Centers for Disease Control and Prevention (CDC) privacy requirements
  • Concealment of the UCI during record linkage
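
The slides do not describe how the UCI is concealed. One possible approach, shown here purely as an assumption, is to give users an opaque key derived from the UCI with a keyed hash, so records can still be joined without the UCI itself ever being exposed. The `concealed_key` function, the secret, and the context labels are hypothetical.

```python
import hashlib
import hmac


def concealed_key(uci, secret, context):
    """Derive an opaque linkage key from the UCI; the UCI is never returned.

    `context` (for example a data sharing agreement name) makes keys issued
    under different agreements unlinkable with each other.
    """
    message = f"{context}:{uci}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]


# Hypothetical secret held only by the warehouse, never by end users.
secret = b"warehouse-only-secret"

key_1 = concealed_key("UCI-0000123", secret, "agreement-A")
key_2 = concealed_key("UCI-0000123", secret, "agreement-A")
key_3 = concealed_key("UCI-0000123", secret, "agreement-B")
```

The same person always receives the same key within one context, so linkage works, while a key from another context reveals nothing about whether it refers to the same person.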
15
Controlled User Views
  • Allows users to continue to see individual agency programs
  • Allows users to view clients across programs as needed
  • Data sharing agreements between agencies define data access down to the data element level
16
Identifiable and Linked Data
  • Identifiable data contains information such as name, address, Social Security Number, date of birth, or gender that could be used to identify an individual. Access to an identifiable model implies access to all information in the model.
  • De-identified data has had all possible identifying elements removed or grouped so that individuals cannot be identified.
  • Linked databases are joined using the Unique Client Identifier. A “foreign” database linked to your program database will show information from the “foreign” database for only your program’s clients.
  • Each program area is working to define the specifics of identifiable and de-identified data.
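
The rule that a linked “foreign” database shows information only for your own program’s clients behaves like a left join from the program’s client list. The sketch below assumes a hypothetical `linked_view` helper, a hidden `uci` join column, and made-up WIC and lead screening rows.

```python
def linked_view(own_rows, foreign_rows, foreign_fields):
    """Attach foreign data to own clients only; the join key is not exposed."""
    # Index the foreign database by the linkage key.
    by_uci = {}
    for row in foreign_rows:
        by_uci.setdefault(row["uci"], []).append(row)

    view = []
    for own in own_rows:
        # Drop the linkage key from what the user sees.
        visible = {k: v for k, v in own.items() if k != "uci"}
        # Only the agreed data elements from the foreign database are shown.
        visible["linked"] = [
            {f: match[f] for f in foreign_fields}
            for match in by_uci.get(own["uci"], [])
        ]
        view.append(visible)
    return view


# Made-up rows for illustration only.
wic_clients = [
    {"uci": "k1", "wic_id": "W-01"},
    {"uci": "k2", "wic_id": "W-02"},
]
lead_screening = [
    {"uci": "k1", "result": 12, "test_date": "2003-04-01"},
    {"uci": "k9", "result": 3, "test_date": "2003-05-17"},  # not a WIC client
]
view = linked_view(wic_clients, lead_screening, ["result"])
```

The lead screening record for the person who is not a WIC client never appears in the view, so a WIC user cannot learn that this person participates in the other program.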
17
Unique Client Identifier
Availability
18
Blood Lead Data Flow
19
Results of Blood Lead Level Study
20
Ongoing Security Issues
  • Continuing to define acceptable links between databases – requires extensive discussions between program areas
  • Educating users on security
  • Establishing clear guidelines for access by researchers.