Anonymization Challenges

In a properly managed DevOps environment, only anonymized data should be available in Dev/Test databases.

  • The Dev/Test database environment should be open by nature, with few constraints, so as not to impair the work of designers, developers and testers
  • The data in those environments should be as close as possible to Prod data, yet it must be protected. The best way to handle this “dilemma” is to anonymize sensitive data, for two main purposes:

– Complying with legal and regulatory concerns about sensitive personal data protection

– Protecting business data from possible breaches

Which data should be anonymized?

  • Sensitive personal data must not be attributable to an identifiable person
  • Sensitive business data must not be retrievable by competitors

What is “sensitive” is decided by legal, cultural and business concerns depending on the specific organization and country.

Anonymization technical constraints

When data is anonymized, it is imperative not to break existing application constraints and validations. This means:

  • Preserving the inner coherence of composite data (e.g. an address with street, city, zip…)
  • Preserving the structure rules of the data (e.g. credit card or IBAN structure)
  • Preserving dependency rules (e.g. a Social Security number might contain a birth date that is also stored in a different column of the record)
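The structure-preservation constraint can be illustrated with a credit card number, whose last digit is a Luhn check digit that applications commonly validate. The following is a minimal sketch, not Anonymize-DB's actual method: the function names, the choice to keep the 6-digit issuer prefix, and the use of a keyed hash (HMAC-SHA256) to derive replacement digits are all assumptions made for illustration.

```python
import hashlib
import hmac


def luhn_check_digit(payload: str) -> str:
    """Return the Luhn check digit for a digit string that has no check digit yet."""
    total = 0
    # Walk right to left; double every second digit, starting with the
    # rightmost payload digit, and subtract 9 from two-digit results.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_is_valid(number: str) -> bool:
    """Check that the last digit matches the Luhn check digit of the rest."""
    return number.isdigit() and luhn_check_digit(number[:-1]) == number[-1]


def anonymize_card(number: str, secret: bytes) -> str:
    """Hypothetical helper: replace the account digits of a card number.

    Keeps the length and the 6-digit issuer prefix (an assumption), derives
    the middle digits from a keyed hash so the same input always gives the
    same output, and recomputes the check digit so validation still passes.
    """
    if not number.isdigit() or not 8 <= len(number) <= 19:
        raise ValueError("expected 8 to 19 digits")
    prefix = number[:6]
    body_len = len(number) - len(prefix) - 1
    digest = hmac.new(secret, number.encode(), hashlib.sha256).hexdigest()
    # Map hex characters to decimal digits (slightly biased; fine for a sketch).
    body = "".join(str(int(c, 16) % 10) for c in digest[:body_len])
    payload = prefix + body
    return payload + luhn_check_digit(payload)
```

For example, `anonymize_card("4111111111111111", b"demo-key")` returns a 16-digit number that still starts with `411111` and still passes `luhn_is_valid`. A dependency rule, such as a birth date embedded in an identifier, would need the same kind of recomputation from the anonymized source column.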

Anonymize-DB: Anonymize data in Dev/Test environments.

Once the data is anonymized, developers have a high level of freedom in their tests, and development teams can work without constraints on high-quality data without endangering compliance with legal regulations or internal business rules.

Anonymize-DB provides the anonymization manager with a user-friendly and effective tool to anonymize sensitive business or personal data in your Dev/Test databases.

Your security and legal compliance are enforced.

  • This module can be used as a complete stand-alone solution (independent of other Xcase for i solutions) to comply with your regulatory obligations.
  • Anonymize-DB supports multiple RDBMSs concurrently (DB2, SQL Server, MySQL…)

Anonymize-DB main benefits:

High-quality anonymized data

  • Usable (no hieroglyphs, pronounceable…)
  • Coherent (repeated data is anonymized in the same way)
  • Meaningful (female and male surnames are anonymized separately when appropriate)
  • Culturally coherent (addresses and names match the organization's geographical location)
  • Consistent (the same anonymization is applied when data is re-loaded, so the Test/Dev environment remains familiar)
  • Non-reversible (an unauthorized party cannot deduce the original value from the anonymized one)

Facilitates discovery of redundant sensitive data

  • The anonymization consistency is ensured by the use of conversion dictionaries stored in a secured location
  • Eight different anonymization methods are offered, so you can use the most appropriate one for each type of data
  • Anonymize-DB identifies groups of columns where the same data appears in multiple locations across the database, allowing you to recognize them as belonging to the same “domain”, and to anonymize them consistently
  • Anonymize-DB creates unified consistent dictionaries even when data is stored in multiple and heterogeneous databases

Facilitates the work and performance of your team

  • Anonymize-DB produces an SQL script that anonymizes the data and can be run by the infrastructure operator. This script can be used for multiple data sets and environments, yielding a repeatable and automatable process
  • With the quality anonymized data provided, testers feel as if they were in production and data remains consistent even when re-loaded
  • Test-DB, when combined with Anonymize-DB, generates only a small subset of the Prod database for Test/Dev, so the anonymization process has minimal impact on data delivery time

The whole process is documented

  • Anonymize-DB provides textual and graphical documentation that is important both for auditing reports and for follow-up and system evolution

What is the Anonymize-DB methodology?

Identifying Column Groups: Anonymize-DB identifies groups of columns where the same data appears in multiple locations across the database, allowing you to recognize them as belonging to the same “domain”, and to anonymize them consistently.
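The source does not describe how column groups are detected. As one plausible illustration only, columns can be grouped when their sets of distinct values overlap strongly. The sketch below uses Jaccard similarity with an arbitrary threshold; the function names, the threshold and the sample column names are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Share of distinct values that two columns have in common."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_columns(columns: dict, threshold: float = 0.5) -> list:
    """Group columns whose value sets overlap at or above the threshold.

    `columns` maps a column name to the set of its distinct values.
    Returns a list of groups (sorted lists of column names); columns
    that match nothing are left out.
    """
    parent = {name: name for name in columns}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(columns, 2):
        if jaccard(columns[a], columns[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in columns:
        groups.setdefault(find(name), []).append(name)
    return [sorted(g) for g in groups.values() if len(g) > 1]


sample = {
    "CUSTOMER.LAST_NAME": {"Smith", "Jones", "Brown"},
    "ORDERS.CUST_NAME": {"Smith", "Jones", "Brown", "Lee"},
    "PRODUCT.LABEL": {"Bolt", "Nut"},
}
candidate_domains = group_columns(sample)
```

Here the two name columns share three of four distinct values (similarity 0.75), so they form one candidate domain, while `PRODUCT.LABEL` stays ungrouped. In practice such candidates would still need review by the anonymization manager.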

Dictionaries: The anonymization consistency is ensured by the use of conversion dictionaries, stored in a secured location. The conversion method in these dictionaries is configured by the anonymization manager.
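The idea of a conversion dictionary can be sketched as a stored mapping from each original value to its replacement: a value seen before always receives the same replacement, and without access to the stored mapping the original cannot be looked up. This is a simplified illustration under stated assumptions, not the product's implementation; the class name, the JSON persistence and the replacement pool are hypothetical.

```python
import json
import secrets


class ConversionDictionary:
    """Hypothetical conversion dictionary for one data domain."""

    def __init__(self, replacement_pool, mapping=None):
        self._pool = list(replacement_pool)
        self._mapping = dict(mapping or {})

    def convert(self, original: str) -> str:
        # Reuse the stored replacement so repeated values stay consistent;
        # otherwise draw a new one at random from the pool.
        if original not in self._mapping:
            self._mapping[original] = secrets.choice(self._pool)
        return self._mapping[original]

    def to_json(self) -> str:
        """Serialize the mapping; this output is what must be kept secured."""
        return json.dumps(self._mapping, sort_keys=True)

    @classmethod
    def from_json(cls, replacement_pool, text: str):
        """Rebuild the dictionary so a later re-load anonymizes identically."""
        return cls(replacement_pool, json.loads(text))


pool = ["Alice", "Bruno", "Chloe", "Denis"]
names = ConversionDictionary(pool)
first = names.convert("Martin")
again = names.convert("Martin")  # same replacement as `first`
restored = ConversionDictionary.from_json(pool, names.to_json())
```

Because replacements are drawn at random, two different originals may receive the same replacement in this sketch; a real tool would choose, per domain, whether that is acceptable.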

Methods of Anonymization: Eight different anonymization methods are offered, so you can use the most appropriate one for each type of data, and mix methods to further strengthen security.

Data in multiple and heterogeneous databases: Anonymize-DB creates unified consistent dictionaries even when data are stored in multiple and heterogeneous databases.

Activation: Anonymize-DB produces an SQL script that anonymizes the data and can be handed to the infrastructure operator to run. This script can be used for multiple data sets and environments, yielding a repeatable and automatable process.
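The format of the generated script is not shown in the source. Purely as an illustration of the approach, a script generator could turn a value mapping into a single UPDATE statement per column. The sketch below assumes trusted table and column names, string-typed data, and standard SQL quoting; the function names are hypothetical. A single CASE expression is used so that every row is converted from its original value in one pass, which avoids one replacement being replaced again by a later statement.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a standard SQL literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_update(table: str, column: str, mapping: dict) -> str:
    """Build one UPDATE statement that applies `mapping` to a column.

    `table` and `column` are assumed to be trusted identifiers; they are
    not escaped here. Returns an empty string when there is nothing to do.
    """
    if not mapping:
        return ""
    items = sorted(mapping.items())
    whens = "\n".join(
        f"    WHEN {sql_literal(old)} THEN {sql_literal(new)}" for old, new in items
    )
    in_list = ", ".join(sql_literal(old) for old, _ in items)
    return (
        f"UPDATE {table}\n"
        f"SET {column} = CASE {column}\n"
        f"{whens}\n"
        f"    ELSE {column}\n"
        f"END\n"
        f"WHERE {column} IN ({in_list});"
    )


script = build_update("CUSTOMER", "LAST_NAME", {"O'Neil": "Durand", "Martin": "Petit"})
```

Note that a script built this way contains the original values, so it would need the same protection as the conversion dictionaries themselves.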

Performance: When the process is properly configured, the amount of data to be anonymized is generally small. Consequently, the time needed to anonymize data adds little to the test database delivery process.