Exciting Holiday Reading: How to De-Identify Health Information!
Health Law Update 12/03/12 Sarah E. Coyne
Fellow HIPAA-philes: Dry your tears and get ready for a cozy winter reading session! You may not have the HITECH rules yet, for which you have pined and clamored for months on end, but you now have guidance on how to de-identify. Why is this exciting? Because once the information is de-identified, you no longer care about HIPAA! Well, perhaps that is a dramatic overstatement. But you certainly care less than you did prior to de-identification.
On November 26, 2012, the Department of Health and Human Services Office for Civil Rights ("OCR") released guidance on the de-identification of health information pursuant to the HIPAA Privacy Rule. As you all know well, the Privacy Rule permits a covered entity or business associate to use and disclose de-identified information, i.e., information that neither identifies an individual nor provides a reasonable basis to identify an individual. (Translation: Even if you are burningly curious and completely immoral, you will not be able to figure out the identity of the patient described.)
We never had much in-depth guidance on exactly how to be sure a given record was de-identified. We knew we had to remove the many identifiers listed in the rule but, when push came to shove, how could you really ensure you had found them all? We were given the famous two options: the Safe Harbor Method and the Expert Determination Method. These sounded good on the surface but may have given you some sleepless nights in practice. The guidance may help with the sleeplessness problem. No, not because it is boring! Would we say that??? (Don't answer that.) You can rest peacefully now that OCR has provided us with practical information on how to implement each of these methods.
Safe Harbor Method
The Safe Harbor Method is also known as "How to Make Employees Quit Their Jobs." This method means removing all 18 identifiers relating to the individual or to relatives, employers or household members of the individual. For example, under the Safe Harbor Method you can't use:
- Derivatives of patient identifiers (such as patient initials or the last four digits of patient social security numbers);
- Dates of test measures (such as lab report dates or dates of procedures during a week-long admission);
- Barcodes; or
- Unique characteristics (such as listing an occupation that could identify the patient, e.g., "current President of the United States"; the authors think that this listing could potentially identify the patient, at least for very sophisticated snoopers).
The guidance provides a bit more detail on which occurrences really need to be removed from records, but if OCR had really wanted to make us happy, they would have put out a music album with the guidance to entertain us while we are tediously redacting.
In addition to removing the 18 identifiers, the covered entity must not have actual knowledge that the information could be used alone or in combination with other information to identify an individual who is a subject of the information. For example, a well-publicized event (think "Octomom") might make some seemingly de-identified data identifiable (e.g., number of children born to a patient in 2009). Information is also not de-identified if you know the recipient has an algorithm that can be used to identify the information. We are not sure but we think spelling out a patient's name in "Pig Latin" would not pass muster.
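For the technically inclined, the mechanical half of Safe Harbor (the removal part, not the "actual knowledge" part) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the field names (`name`, `ssn`, `zip`, `age`, and so on) and the `populous_prefixes` set are hypothetical, the sketch covers only a handful of the 18 identifiers, and it does nothing about identifiers buried in free-text notes. The "000" replacement for sparsely populated zip code areas comes from the text of 45 C.F.R. § 164.514(b)(2).

```python
from datetime import date

# Hypothetical field names for a few of the 18 identifiers; a real record
# layout will differ and will need the full list.
DIRECT_IDENTIFIER_FIELDS = {"name", "ssn", "phone", "email", "mrn"}


def generalize_zip(zip_code, populous_prefixes):
    """Keep the first three digits only when that three-digit area is known
    to contain more than 20,000 people; otherwise use "000"."""
    prefix = zip_code[:3]
    return prefix if prefix in populous_prefixes else "000"


def generalize_age(age):
    """Aggregate all ages over 89 into a single '90 or older' category."""
    return "90+" if age >= 90 else age


def safe_harbor_sketch(record, populous_prefixes):
    """Return a copy of the record with the listed identifiers removed or
    generalized. Assumption: identifiers appear only in structured fields."""
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop the identifier entirely
        if field == "zip":
            cleaned[field] = generalize_zip(value, populous_prefixes)
        elif field == "age":
            cleaned[field] = generalize_age(value)
        elif isinstance(value, date):
            # Keep only the year. Not handled here: years that would reveal
            # an age over 89 must be removed as well.
            cleaned[field] = value.year
        else:
            cleaned[field] = value
    return cleaned


example = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "53202",
    "age": 93,
    "admission_date": date(2012, 11, 26),
    "diagnosis": "fracture",
}
print(safe_harbor_sketch(example, populous_prefixes={"532"}))
```

Note what the sketch cannot do: it has no idea whether "diagnosis" or some other leftover field is a unique characteristic, and it cannot tell you whether you have actual knowledge that the remaining data could identify someone. Those parts still require a human.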
Expert Determination Method
The Expert Determination Method requires that an "expert" determine that the risk is "very small" that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information. The Privacy Rule requires that the expert have "appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable." Who are these people? Where do they live? How much do they charge?
The new guidance gives some helpful tips but leaves room for wild speculation:
- Expert Qualifications: An expert need not have a specific professional degree or attend a special certification program. Instead, relevant expertise may be gained through various routes of education and experience. The authors wonder if we are actually such experts? We don't think so because we have trouble with math.
- "Very Small" Remains a Mystery: The authors note that "very small" is one of our favorite parts of the rule. What is a very small risk? We don't know. One of the authors is pretty sure that the risk of her daughter getting an "A" in pre-calculus is very small. Perhaps this could be a benchmark? The guidance does not preclude it!
- Replicability and Distinguishability: Are these not cool words? We are proud of OCR for using these difficult-to-use words, which describe two principles experts may apply to assess the risk that a patient is identifiable from the records at issue. Replicability refers to the chance that the information will consistently occur with regard to the patient (e.g., birth date is a highly replicable piece of information because a person has only one birth date, although one of the authors has considered changing hers). Distinguishability refers to the extent to which the patient's data differ from the data of other patients. OCR also notes that it is relevant whether an available data source could connect the patient's replicable health information to the patient's name.
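For readers who do not have trouble with math, here is one rough way to put a number on distinguishability: count how many records share each combination of potentially identifying values and see what share of the records stand alone. This is a sketch under stated assumptions, not the method the rule requires or the one an expert would necessarily use. The field names (`birth_year`, `zip3`, `sex`) and the function names are hypothetical, and the sketch says nothing about what counts as "very small."

```python
from collections import Counter


def group_sizes(records, fields):
    """Count how many records share each combination of values for the
    given fields."""
    return Counter(tuple(r[f] for f in fields) for r in records)


def unique_fraction(records, fields):
    """Fraction of records whose combination of values appears exactly once,
    i.e., records that are fully distinguishable on these fields."""
    sizes = group_sizes(records, fields)
    unique = sum(1 for r in records if sizes[tuple(r[f] for f in fields)] == 1)
    return unique / len(records)


# Hypothetical data set: two patients share a combination, two stand alone.
patients = [
    {"birth_year": 1950, "zip3": "532", "sex": "F"},
    {"birth_year": 1950, "zip3": "532", "sex": "F"},
    {"birth_year": 1971, "zip3": "532", "sex": "M"},
    {"birth_year": 1988, "zip3": "537", "sex": "F"},
]
print(unique_fraction(patients, ["birth_year", "zip3", "sex"]))
```

A high fraction of stand-alone records suggests the data are highly distinguishable. Whether that matters also depends on replicability and on whether an outside data source could link those values to a name, which this sketch does not measure.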
So what's the takeaway message? Understand that you have to rebut the presumption, which is that the records are identifiable. You rebut the presumption by meeting the requirements of one of these two de-identification methods, and neither one is easy. Thus you had better get busy rebutting.
See 45 C.F.R. § 164.514(a)-(c).
The 18 Safe Harbor identifiers are: names; all geographic subdivisions smaller than a state, including street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code for geographic units containing more than 20,000 individuals; all elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death, and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older; telephone numbers; fax numbers; email addresses; Social Security numbers; medical record numbers; health plan beneficiary numbers; account numbers; certificate/license numbers; vehicle identifiers and serial numbers, including license plate numbers; device identifiers and serial numbers; web Universal Resource Locators (URLs); Internet Protocol (IP) address numbers; biometric identifiers, including finger and voice prints; full face photographic images and any comparable images; and any other unique identifying number, characteristic, or code, except as permitted for re-identification purposes.