Healthcare Data Masking: Tokenization, HIPAA and More (2024)

When you are trying to protect your data from people who would like unauthorized access to it, several options are available, each suited to specific use cases. Before comparing the solutions, it helps to define the terms:

  • PII - Personally Identifiable Information - any data that could potentially identify a specific individual. Any information that can be used to distinguish one person from another and can be used for de-anonymizing anonymous data can be considered PII
  • GSA's Rules of Behavior for Handling Personally Identifiable Information - This directive provides GSA’s policy on how to properly handle PII and the consequences and corrective actions that will be taken if a breach occurs
  • PHI - Protected Health Information - any information about health status, provision of health care, or payment for health care that can be linked to a specific individual
  • HIPAA Privacy Rule - The HIPAA Privacy Rule establishes national standards to protect individuals’ medical records and other personal health information and applies to health plans, health care clearinghouses, and those health care providers that conduct certain health care transactions electronically. The Rule requires appropriate safeguards to protect the privacy of personal health information, and sets limits and conditions on the uses and disclosures that may be made of such information without patient authorization. The Rule also gives patients rights over their health information, including rights to examine and obtain a copy of their health records, and to request corrections.
  • Encryption - a method of protecting data by scrambling it into an unreadable form. It is a systematic encoding process which is only reversible with the right key.
  • Tokenization - a method of replacing sensitive data with non-sensitive placeholder tokens. These tokens are swapped with data stored in relational databases and files.
  • Data masking - a process that scrambles data, either an entire database or a subset. Unlike encryption, masking is not reversible; unlike tokenization, masked data is useful for limited purposes. There are several types of data masking:
    • Static data masking (SDM) masks data in advance of use, typically in non-production databases, and not in real time
    • Dynamic data masking (DDM) masks production data in real time
    • Data Redaction - masks unstructured content (PDF, Word, Excel)
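
The difference between static and dynamic masking can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `mask_ssn` rule, the record layout, and the `billing_admin` role are all assumptions made up for the example.

```python
import copy

def mask_ssn(ssn: str) -> str:
    """Hide all but the last four digits, keeping the NNN-NN-NNNN layout."""
    return "XXX-XX-" + ssn[-4:]

def static_mask(records: list[dict]) -> list[dict]:
    """Static masking: build a masked copy ahead of time (e.g. for a test database)."""
    masked = copy.deepcopy(records)
    for row in masked:
        row["ssn"] = mask_ssn(row["ssn"])
    return masked

def dynamic_read(records: list[dict], role: str) -> list[dict]:
    """Dynamic masking: stored rows stay intact; values are masked per request."""
    if role == "billing_admin":  # hypothetical privileged role
        return [dict(row) for row in records]
    return [{**row, "ssn": mask_ssn(row["ssn"])} for row in records]

production = [{"last_name": "johnson", "ssn": "585-88-9874"}]

test_copy = static_mask(production)                   # masked once, in advance
analyst_view = dynamic_read(production, "analyst")    # masked at read time
admin_view = dynamic_read(production, "billing_admin")

print(test_copy[0]["ssn"], analyst_view[0]["ssn"], admin_view[0]["ssn"])
```

In both cases the production rows themselves are left untouched; what differs is when the masking happens and who receives the masked values.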

Each of the three methods for protecting data (encryption, tokenization, and data masking) has different benefits and solves different security issues. We'll address them in a bit. For a visual representation of the three methods, please see the table below:

Field      | Original Value | Encrypted     | Tokenized | Masked
Last Name  | johnson        | 8UY%45S       | jwjehneo  | simpson
First Name | margaret       | 3%ERT22##$    | owhksoes  | marge
SSN        | 585-88-9874    | Mh9&o03ms))93 | nmvhf93na | 345-79-4444

Encryption

For protecting PHI, encryption is superior to tokenization. Different portions of personal healthcare data are encrypted under different keys, and only those holding the requisite keys can see the corresponding data. This approach requires advanced application support to manage which data sets each audience can view or update, and the key management service must scale well to handle even a modest community of users. Record management is particularly complicated. So although encryption works better than tokenization for PHI, it does not scale well. For PII, properly deployed encryption is a perfectly suitable tool: it can protect archived data or data residing on file systems without modification to business processes.

  • Installing encryption and key management services protects the data only from access that circumvents applications
  • You can add application layer encryption to protect data in use
    • This requires changing applications and databases to support the additional protection
    • You will pay the cost of modification and the performance of the application will be impacted

Tokenization

Tokenizing PHI is difficult because many pieces of data must be bundled in different ways for many different audiences. A person's medical history is a combination of medical attributes, doctor visits, and outsourced visits: an entangled set of personal, financial, and medical data. Different groups need access to different subsets, and each audience must see its own slice of the data but not the rest. That means issuing a different token for each audience, which requires a very sophisticated token management and tracking system to divide up the data and to issue and track the tokens. Using tokenized data also requires de-tokenizing it (which usually includes a decryption process), and this adds overhead.
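
To make the per-audience bookkeeping concrete, here is a minimal sketch of a token vault in Python. The `TokenVault` class, its method names, and the audience labels are hypothetical, and the in-memory dictionary stands in for what would realistically be a hardened, audited service.

```python
import secrets

class TokenVault:
    """In-memory stand-in for a token vault (illustration only)."""

    def __init__(self) -> None:
        # token -> (audience the token was issued to, original value)
        self._store: dict[str, tuple[str, str]] = {}

    def tokenize(self, value: str, audience: str) -> str:
        """Issue a fresh random token for this value and audience."""
        token = secrets.token_hex(8)  # random; carries no information about the value
        self._store[token] = (audience, value)
        return token

    def detokenize(self, token: str, audience: str) -> str:
        """Return the original value only to the audience the token was issued to."""
        owner, value = self._store[token]
        if owner != audience:
            raise PermissionError(f"token was not issued to {audience!r}")
        return value

vault = TokenVault()
billing_token = vault.tokenize("585-88-9874", audience="billing")
clinic_token = vault.tokenize("585-88-9874", audience="clinic")

print(vault.detokenize(billing_token, audience="billing"))
```

Even in this toy version, every value and audience pair needs its own vault entry and every read goes through a lookup, which hints at the tracking and overhead costs of doing this for a full medical record.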

Data Masking

Masking can scramble individual data columns in different ways so that the masked data looks like the original (retaining its format and data type) but is no longer sensitive. Masking is effective for maintaining aggregate values across an entire database, preserving the sum and average values within a data set while changing all the individual data elements. Together, masking and encryption provide a powerful combination for distributing and sharing medical information.
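
One simple way to keep aggregates intact while changing individual records is to shuffle a column's values across rows. The sketch below assumes a small table of invented claim amounts; `shuffle_column` and the field names are hypothetical, and shuffling is only one of several masking techniques.

```python
import random
from typing import Optional

def shuffle_column(rows: list[dict], column: str, seed: Optional[int] = None) -> list[dict]:
    """Return copies of the rows with one column's values shuffled across rows."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]

# Invented sample data for illustration only.
claims = [
    {"patient_id": 1, "claim_amount": 120},
    {"patient_id": 2, "claim_amount": 4500},
    {"patient_id": 3, "claim_amount": 75},
    {"patient_id": 4, "claim_amount": 980},
]
masked = shuffle_column(claims, "claim_amount", seed=7)

original_total = sum(row["claim_amount"] for row in claims)
masked_total = sum(row["claim_amount"] for row in masked)
print(original_total == masked_total)  # the column total is unchanged
```

Because the same values are only reassigned to different rows, the sum and average of the column cannot change. On a very small table a shuffle may leave some values in place, so in practice it would likely be combined with other techniques.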

Traditionally, data masking has been viewed as a technique for solving a test data problem. The December 2014 Gartner Magic Quadrant Report on Data Masking Technology extends the scope of data masking to more broadly include data de-identification in production, non-production, and analytic use cases. The challenge is to do this while retaining business value in the information for consumption and use.

Masked data should be realistic and should satisfy the same business rules as real data. Masked data is very commonly used in test and development environments because it looks like "real" data but doesn't contain any sensitive information.

FAQs

What is the difference between data masking and tokenization?

Data masking is used to protect sensitive data while allowing the use of realistic test or demo data, while tokenization is used to protect sensitive data while allowing authorized users to access and process the tokenized data, for example, for use in analytics.

What is the difference between redaction and masking?

Data masking is often used in non-production environments, for example, during software testing or development, where data structure needs to be maintained without exposing sensitive information. Data redaction is commonly used before sharing documents outside the organization, where specific details must be hidden.
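
As a rough illustration of redaction, the following Python sketch removes anything shaped like a U.S. Social Security number from free text. It assumes the text has already been extracted from the document (PDF, Word, and so on); the pattern, the `redact_ssns` helper, and the sample note are invented for the example, and a real redaction tool would cover many more identifier types.

```python
import re

# Matches the NNN-NN-NNNN layout only; other SSN spellings are not covered.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact_ssns(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every SSN-shaped substring with a fixed placeholder."""
    return SSN_PATTERN.sub(placeholder, text)

# Invented sample text for illustration only.
note = "Patient margaret johnson, SSN 585-88-9874, was seen on 2024-01-15."
redacted = redact_ssns(note)
print(redacted)
```

Unlike masking, the redacted value makes no attempt to look realistic: the detail is simply hidden before the document is shared.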

Is tokenization a form of masking?

Masking is essentially permanent tokenization. Sensitive information is replaced by random characters in the same format as the original data, but without a mechanism for retrieving the original values.

What are the 5 HIPAA rules?

HHS initiated 5 rules to enforce Administrative Simplification: (1) Privacy Rule, (2) Transactions and Code Sets Rule, (3) Security Rule, (4) Unique Identifiers Rule, and (5) Enforcement Rule.

What is a simple example of data masking?

Data values are substituted with fake, but realistic, alternative values. For example, real customer names are replaced by a random selection of names from a phonebook.

What is an example of data masking?

Substitution masking involves replacing sensitive data with similar but fictitious data. For example, you can replace actual names with names from a predefined list. You can also use algorithms to generate similar but fake credit card numbers.
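
Substitution masking can be sketched in Python as follows. The replacement name list and helper names are hypothetical. The fake card numbers pass the Luhn checksum used by payment card numbers so that they look valid to format checks, but nothing here guarantees that a generated number is not coincidentally a real one.

```python
import random

REPLACEMENT_NAMES = ["simpson", "rivera", "okafor", "lindqvist"]  # hypothetical list

def substitute_name(rng: random.Random) -> str:
    """Pick a replacement name from the predefined list."""
    return rng.choice(REPLACEMENT_NAMES)

def luhn_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:       # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def fake_card_number(rng: random.Random, length: int = 16) -> str:
    """Generate random digits, then append the check digit that satisfies Luhn."""
    body = "".join(str(rng.randint(0, 9)) for _ in range(length - 1))
    for check_digit in "0123456789":
        if luhn_valid(body + check_digit):
            return body + check_digit
    raise AssertionError("unreachable: exactly one check digit always fits")

rng = random.Random(42)  # seeded so the sketch is repeatable
masked_name = substitute_name(rng)
masked_card = fake_card_number(rng)
print(masked_name, masked_card, luhn_valid(masked_card))
```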

What is the difference between data masking and data obfuscation?

Data masking changes the value of data while using the same format for the masked data. Two major differences distinguish data masking from other types of data obfuscation. First, masked data is still usable in its obfuscated form. Second, once data is masked, the original values cannot be recovered.

What is the difference between data masking and anonymization?

Data anonymization removes classified, personal, or sensitive information from datasets, while data masking obscures confidential data with altered values.

What is the purpose of data masking?

Data masking is essential in many regulated industries where personally identifiable information must be protected from overexposure. By masking data, the organization can expose the data as needed to test teams or database administrators without compromising the data or getting out of compliance.

What is an example of data tokenization?

Tokenization replaces a sensitive data element, for example, a bank account number, with a non-sensitive substitute, known as a token. The token is a randomized data string that has no essential or exploitable value or meaning.

What are examples of tokenization?

Tokenization is used to secure many different types of sensitive data, including:
  • payment card data.
  • U.S. Social Security numbers and other national identification numbers.
  • telephone numbers.
  • passport numbers.
  • driver's license numbers.
  • email addresses.
  • bank account numbers.
  • names, addresses, birth dates.

What is tokenization in simple words?

Tokenization refers to a process by which a piece of sensitive data, such as a credit card number, is replaced by a surrogate value known as a token.

What is the golden rule of HIPAA?

The Health Insurance Portability and Accountability Act (HIPAA) is the golden rule when it comes to health data confidentiality. Any company that deals with Protected Health Information (PHI) must have the proper network, physical storage, and security measures in place to ensure they are in compliance with HIPAA.

What is the HIPAA data rule?

The HIPAA Privacy Rule provides federal standards to safeguard the privacy of personal health information and gives patients an array of rights with respect to that information, including rights to examine and obtain a copy of their health records and to request corrections.

What are 3 key elements of HIPAA?

The Health Insurance Portability and Accountability Act (HIPAA) lays out three rules for protecting patient health information, namely:
  • The Privacy Rule.
  • The Security Rule.
  • The Breach Notification Rule.

What is the difference between dynamic masking and tokenization?

Static masking is irreversible and is therefore applied to a copy of the data, while dynamic masking leaves the stored data unchanged and masks values as they are read. In neither case is the original data de-identified in place. In contrast, tokenization de-identifies the original data in place. Only actors with access to the tokenization (symmetric) key can decrypt and read the original value.

What is the difference between dynamic data masking and external tokenization?

In Snowflake, Dynamic Data Masking is a Column-level Security feature that uses masking policies to selectively mask plain-text data in table and view columns at query time. External Tokenization enables accounts to tokenize data before loading it into Snowflake and detokenize the data at query runtime.

What do you mean by data masking?

Data masking is a data security technique in which a dataset is copied but with sensitive data obfuscated. This benign replica is then used instead of the authentic data for testing or training purposes.

What is meant by tokenization of data?

Tokenization replaces a sensitive data element, for example, a bank account number, with a non-sensitive substitute, known as a token. The token is a randomized data string that has no essential or exploitable value or meaning.

Article information

Author: Margart Wisoky