Faker-driven, configuration-based, platform-agnostic, locale-compatible data faker tool

Masquerade

Faker-driven, platform-agnostic, locale-compatible data faker tool

Point Masquerade to a database, give it a rule-set defined in YAML and Masquerade will anonymize the data for you automatically!

Out-of-the-box supported frameworks

  • Magento 2
  • Shopware 6

Customization

You can add your own configuration files in a directory named config, located in the directory from which you run masquerade. They are merged with the configuration files already present for that platform, and your values override the out-of-the-box ones.

See the Magento 2 YAML files as examples for notation.

For example, to override the admin.yaml for Magento 2, place a file at config/magento2/admin.yaml. To disable (skip) a group completely, give the file just this content:

admin:
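
Because custom files are merged with the stock ones, a file does not have to disable a whole group; it can also override a single value. The following is a minimal sketch of config/magento2/admin.yaml, assuming the stock admin group covers an admin_user table with a username column. The names are illustrative, so check them against the stock YAML files for your platform.

```yaml
# config/magento2/admin.yaml
# Illustrative override: only the listed column is redefined; the rest of the
# stock admin group is expected to stay as shipped, per the merge behaviour
# described above. Table and column names are assumptions.
admin:
  admin_user:
    columns:
      username:
        formatter: userName   # Faker's userName formatter
```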

You can also add your own config files for custom tables or tables from third-party vendors.

To generate such files, you can run the masquerade identify command. It looks for columns whose names hint at personally identifiable data, such as name or address, and interactively asks whether to add each one to a config file for the chosen platform.
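
A hand-written config file for a custom table might look like the sketch below. The file name, group, table and column names are all made up for illustration; only the group / table / columns / formatter layout is taken from the examples in this README.

```yaml
# config/magento2/vendor_newsletter.yaml (hypothetical file name)
# Group, table and column names are invented for this example.
vendor_newsletter:
  vendor_newsletter_signup:
    columns:
      name:
        formatter: name            # Faker full name
      email:
        formatter: email           # Faker email address
      address:
        formatter: streetAddress   # Faker street address
```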

Partial anonymization

You can affect only certain records by including a 'where' clause - for example to avoid anonymising certain admin accounts, or to preserve data used in unit tests, like this:

customers:
  customer_entity:
    provider: # this sets options specific to the type of table
      where: "`email` not like '%@mycompany.com'" # leave mycompany.com emails alone
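
The where value above reads like a plain SQL condition. Assuming it is handed to the database as written (an assumption; this README does not spell it out), several conditions could be combined with AND, as in this sketch with a second, placeholder domain:

```yaml
customers:
  customer_entity:
    provider:
      # Assumption: the string is used as a raw SQL condition, so AND works.
      # example.org is a placeholder domain.
      where: "`email` not like '%@mycompany.com' and `email` not like '%@example.org'"
```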

Delete Data

You might want to fully or partially delete data, e.g. if your developers don't need sales orders, or if you want to keep the database much smaller than the production database. To do so, specify the 'delete' option.

When deleting some Magento data, e.g. sales orders, add the command line option --with-integrity. It enforces foreign key checks so that, for example, sales_invoice records are deleted automatically when their parent sales_order is deleted:

orders:
  sales_order:
    provider:
      delete: true
      where: "customer_id != 3" # delete all except customer 3's orders because we use that for testing
    # no need to specify columns if you're using 'delete'      

If you use 'delete' without a 'where' and without '--with-integrity', Masquerade uses 'truncate' to empty the entire table. It does not use truncate when --with-integrity is specified, because truncate bypasses key checks.
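
As a sketch of the case just described, the configuration below empties a whole table. The group name is hypothetical and the quote table is only chosen as an example. Run without --with-integrity, this would be a truncate; run with it, the rows would be deleted with foreign key checks enabled.

```yaml
# Hypothetical group name; the table is chosen for illustration only.
quotes:
  quote:
    provider:
      delete: true   # no 'where': the whole table is emptied
    # no columns needed when using 'delete'
```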

Magento EAV Attributes

You can use the Magento2Eav table type to treat EAV attributes just like normal columns, eg.

products:
  catalog_product_entity: # specify the base table of the entity
    eav: true
    provider:
      where: "sku != 'TESTPRODUCT'" # you can still use 'where' and 'delete'
    columns:
      my_custom_attribute:
        formatter: sentence
      my_other_attribute:
        formatter: email

  catalog_category_entity:
    eav: true
    columns:
      description: # refer to EAV attributes like normal columns
        formatter: paragraph

Formatter Options

For formatters, you can use all default Faker formatters.
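
As an illustration, the sketch below maps a few standard Faker formatters (firstName, lastName, streetAddress, city, postcode, phoneNumber, company) onto address columns. The group name is hypothetical, and the column names are assumed to match your schema.

```yaml
addresses:   # hypothetical group name
  customer_address_entity:
    columns:
      firstname:
        formatter: firstName
      lastname:
        formatter: lastName
      street:
        formatter: streetAddress
      city:
        formatter: city
      postcode:
        formatter: postcode
      telephone:
        formatter: phoneNumber
      company:
        formatter:
          name: company   # long 'name:' notation, also used elsewhere in this README
```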

Custom Data Providers / Formatters

You can also create your own custom providers with formatters. They need to extend Faker\Provider\Base and they need to live in either ~/.masquerade or .masquerade relative to the directory where you run masquerade.

An example file .masquerade/Custom/WoopFormatter.php:

<?php

namespace Custom;

use Faker\Provider\Base;

class WoopFormatter extends Base {

    public function woopwoop() {
        $woops = ['woop', 'wop', 'wopwop', 'woopwoop'];
        return $woops[array_rand($woops)];
    }
}

And then use it in your YAML file. A provider needs to be set on the column name level, not on the formatter level.

customer:
  customer_entity:
    columns:
      firstname:
        provider: \Custom\WoopFormatter
        formatter:
          name: woopwoop

Custom Table Type Providers

Some systems have linked tables containing related data, e.g. Magento's EAV system, Drupal's entity fields and WordPress's post metadata tables. You can provide custom table types for these. To do so, you need to implement 2 interfaces:

  • Elgentos\Masquerade\DataProcessorFactory instantiates your custom processor. It receives the output object, the table service factory and the whole array of YAML configuration specified for your table.

  • Elgentos\Masquerade\DataProcessor performs the operations required by the run command:

    • truncate should truncate the table specified in the configuration
    • delete should delete rows from the table specified in the configuration
    • updateTable should update the table with values provided by the generator, based on the column definitions in the configuration. See Elgentos\Masquerade\DataProcessor\RegularTableProcessor::updateTable for a reference.

Start with a factory that instantiates the actual processor.

An example file .masquerade/Custom/WoopTableFactory.php:

<?php

namespace Custom;

use Elgentos\Masquerade\DataProcessor;
use Elgentos\Masquerade\DataProcessor\TableServiceFactory;
use Elgentos\Masquerade\DataProcessorFactory;
use Elgentos\Masquerade\Output;
 
class WoopTableFactory implements DataProcessorFactory 
{

    public function create(
        Output $output, 
        TableServiceFactory $tableServiceFactory,
        array $tableConfiguration
    ): DataProcessor {
        $tableService = $tableServiceFactory->create($tableConfiguration['name']);

        return new WoopTable($output, $tableService, $tableConfiguration);
    }
}

An example file .masquerade/Custom/WoopTable.php:

<?php

namespace Custom;

use Elgentos\Masquerade\DataProcessor;
use Elgentos\Masquerade\DataProcessor\TableService;
use Elgentos\Masquerade\Output;

class WoopTable implements DataProcessor
{
    /** @var Output */
    private $output;

    /** @var array */
    private $configuration;

    /** @var TableService */
    private $tableService;

    public function __construct(Output $output, TableService $tableService, array $configuration)
    {
        $this->output = $output;
        $this->tableService = $tableService;
        $this->configuration = $configuration;
    }

    public function truncate(): void
    {
        $this->tableService->truncate();
    }
    
    public function delete(): void
    {
        $this->tableService->delete($this->configuration['provider']['where'] ?? '');
    }
    
    public function updateTable(int $batchSize, callable $generator): void
    {
        $columns = $this->tableService->filterColumns($this->configuration['columns'] ?? []);
        $primaryKey = $this->configuration['pk'] ?? $this->tableService->getPrimaryKey();
        
        $this->tableService->updateTable(
            $columns, 
            $this->configuration['provider']['where'] ?? '', 
            $primaryKey,
            $this->output,
            $generator,
            $batchSize
        );
    }
}

And then use it in your YAML file. A processor factory needs to be set on the table level, and can be a simple class name, or a set of options which are available to your class.

customer:
  customer_entity:
    processor_factory: \Custom\WoopTableFactory
    some_custom_config:
      option1: "test"
      option2: false
    columns:
      firstname:
        formatter:
          name: firstName

Installation

Download the phar file:

curl -L -o masquerade.phar https://github.com/elgentos/masquerade/releases/latest/download/masquerade.phar

Usage

$ php masquerade.phar run --help

Description:
  List of tables (and columns) to be faked

Usage:
  run [options]

Options:
      --platform[=PLATFORM]
      --driver[=DRIVER]      Database driver [mysql]
      --database[=DATABASE]
      --username[=USERNAME]
      --password[=PASSWORD]
      --host[=HOST]          Database host [localhost]
      --port[=PORT]          Database port [3306]
      --prefix[=PREFIX]      Database prefix [empty]
      --locale[=LOCALE]      Locale for Faker data [en_US]
      --group[=GROUP]        Comma-separated groups to run masquerade on [all]
      --with-integrity       Run with foreign key checks enabled
      --batch-size=BATCH-SIZE  Batch size to use for anonymization [default: 500]

You can also set these variables in a config.yaml file in the same location as where you run masquerade from, for example:

platform: magento2
database: dbnamehere
username: userhere
password: passhere
host: localhost
port: porthere
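
A fuller config.yaml might look like the sketch below. It assumes that the remaining keys mirror the command-line option names listed above (driver, locale); this README only confirms the six keys shown in the previous example.

```yaml
# config.yaml
# Assumption: keys mirror the command-line option names.
platform: magento2
driver: mysql        # default shown in the option list
database: dbnamehere
username: userhere
password: passhere
host: localhost
port: 3306           # default shown in the option list
locale: nl_NL        # any Faker locale; the default is en_US
```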

Running it nightly

Check out the wiki for how to run Masquerade nightly in CI/CD.

Building from source

To build the phar from source you can use the build.sh script. Note that it depends on Box which is included in this repository.

# git clone https://github.com/elgentos/masquerade
# cd masquerade
# composer install
# chmod +x build.sh
# ./build.sh
# bin/masquerade

Debian Packaging

To build a deb for this project run:

# apt-get install debhelper cowbuilder git-buildpackage
# export ARCH=amd64
# export DIST=buster
# cowbuilder --create --distribution buster --architecture amd64 --basepath /var/cache/pbuilder/base-$DIST-amd64.cow --mirror http://ftp.debian.org/debian/ --components=main
# echo "USENETWORK=yes" > ~/.pbuilderrc
# git clone https://github.com/elgentos/masquerade
# cd masquerade
# gbp buildpackage --git-pbuilder --git-dist=$DIST --git-arch=$ARCH --git-ignore-branch -us -uc -sa --git-ignore-new

To generate a new debian/changelog for a new release:

export BRANCH=master
export VERSION=$(date "+%Y%m%d.%H%M%S")
gbp dch --debian-tag="%(version)s" --new-version=$VERSION --debian-branch $BRANCH --release --commit

Credits

Comments
  • Masquerade (out of the box) doesn't anonymise Paypal payment detail

    I wasn't sure whether this is in scope or not as I'm sure most merchants run third-party payment methods. However, considering that Paypal is shipped with Magento, I probably would have expected this to happen.

    opened by erfanimani 12
  • Error after install

    Hi I have followed the installation instructions and "Download the phar file:

    curl -L -o masquerade.phar https://github.com/elgentos/masquerade/releases/latest/download/masquerade.phar "

    When I then run

    php masquerade.phar run --help

    I get the error

    PHP Warning: require(phar:///home/ubuntu/masquerade.phar/vendor/composer/../symfony/translation/Resources/functions.php): failed to open stream: phar error: "vendor/symfony/translation/Resources/functions.php" is not a file in phar "/home/ubuntu/masquerade.phar" in phar:///home/ubuntu/masquerade.phar/vendor/composer/autoload_real.php on line 69
    PHP Fatal error: require(): Failed opening required 'phar:///home/ubuntu/masquerade.phar/vendor/composer/../symfony/translation/Resources/functions.php' (include_path='.:/usr/share/php') in phar:///home/ubuntu/masquerade.phar/vendor/composer/autoload_real.php on line 69

    Am I missing a step?

    I am trying to run it on Ubuntu 20.04 with php 7.3.26

    Thanks Rob

    opened by deedy-tech 8
  • doubt about making new build

    Hi! First of all, thank you for this tool!

    I'm trying to build a new .phar file following your process but it doesn't seem to work. So I decided just to clone your repo and try to build the same .phar, and I found differences between the phar uploaded here and the phar that is created when I use the build process.

    • clone project && composer install
    • download box
    • run build.sh

    To be more specific, I found a difference in one of the dependencies: elgentos/parser. composer.lock points to 1.4.2 (5ef1c392c83d928bdb58778618c7811e24f82416). If you take a look at the FileAbstract class inside your actual phar you will see the following code (I'm not able to find this code inside the elgentos/parser repository):

    private function safePath(string $path): string
    {
        while (($newPath = str_replace('..', '', $path)) !== $path) {
            $path = $newPath;
        }
        return $path;
    }
    

    But if you take a look at the referenced code in composer.lock FileAbstract you will notice that the code differs:

    private function safePath(string $path): string
    {
        while (($newPath = str_replace(['..', '//'], ['', '/'], $path)) !== $path) {
             $path = $newPath;
        }
         return str_replace(['..', '//'], ['', '/'], $path);
    }
    

    So the new builds can't find the config folder correctly (this is where it fails).

    Error message:

    In Glob.php line 55:
     RecursiveDirectoryIterator::__construct(phar:/blablabla/dist/masquerade.phar/src/config/magento2): failed to open dir: No such file or directory
    

    I think that maybe a rebase broke this dependency. Any idea why this is happening?

    Thank you in advance.

    opened by danielozano 8
  • Configuration ignored or wrong usage on my side

    Hi,

    I'm trying to anonymize the customer entity except for several domains.

    So I created a YAML file under

    src/config/magento2/customer.yaml

    In my YAML file I first tried with only one domain, to be sure everything works as expected:

    customers:
      customer_entity:
        provider: # this sets options specific to the type of table
          where: "`email` not LIKE '%@domaineA.com' "
    

    I run my command like this:

    ./masquerade.phar run --group=customer --platform=magento2 --config=src/config/magento2/

    but all customer entities are rewritten to fake ones, including the ones with domaineA.com in the email field.

    What have I done wrong? And is it possible to add multiple filters, like we can do in SQL?

    customers:
      customer_entity:
        provider: # this sets options specific to the type of table
          where: "`email` not LIKE '%@domaineA.com' AND `email` not LIKE '%@domaineB.com'"
    

    Thanks for your help and thumb up to @peterjaap for the slack help also :)

    opened by julienanquetil 6
  • Allow EAV and other extended table types

    Working with Magento2 EAV.

    This adds 'table providers' - so new types of complex table can be added either by the end user or within the package.

    See src/config/sample/magento_eav.yaml for example config.

    It also adds new 'where' and 'delete' options at table level, so you can truncate tables you don't need or only affect certain data - for example, to leave data alone which is used by unit tests.

    This opens the door for adding table types for Wordpress and Drupal, which have their own linked-table systems - eg. in a Wordpress table provider, we'd be able to specify 'postmeta' fields to be anonymised, simply by adding a new class.

    See the updates in README.md for more info and see examples in src/config/sample.

    opened by johnorourke 6
  • Custom config files do not seem to be working

    Issue: I followed the documentation surrounding how to add custom config values, but masquerade does not seem to find my custom configuration.

    My ultimate goal is to completely override the admin.yaml file, as I don't want to masquerade any values in the admin_user table. But I'm first trying to get masquerade to pick up my custom config files.

    Steps to reproduce:

    1. Create a config/admin.yaml file with these contents:

      admin:
        admin_user:
          pk: user_id
          columns:
            firstname:
              formatter: firstName
            lastname2:
              formatter: lastName
      
    2. Run masquerade groups

    Expected results:

    +----------+------------------+---------------------------------------+--------------------+---------------------+
    | Platform | Group            | Table                                 | Column             | Formatter           |
    +----------+------------------+---------------------------------------+--------------------+---------------------+
    | magento2 | admin            | admin_user                            | firstname          | email           |
    | magento2 | admin            | admin_user                            | lastname           | lastName            |
    | magento2 | admin            | admin_user                            | lastname2          | lastName            |
    | magento2 | admin            | admin_user                            | email              | email               |
    | magento2 | admin            | admin_user                            | username           | firstName           |
    | magento2 | admin            | admin_user                            | password           | password            |
    …
    

    Actual results:

    +----------+------------------+---------------------------------------+--------------------+---------------------+
    | Platform | Group            | Table                                 | Column             | Formatter           |
    +----------+------------------+---------------------------------------+--------------------+---------------------+
    | magento2 | admin            | admin_user                            | firstname          | firstName           |
    | magento2 | admin            | admin_user                            | lastname           | lastName            |
    | magento2 | admin            | admin_user                            | email              | email               |
    | magento2 | admin            | admin_user                            | username           | firstName           |
    | magento2 | admin            | admin_user                            | password           | password            |
    …
    

    Screenshot, verifying my setup:

    masquerade

    opened by erikhansen 6
  • Skip if table does not exist

    Hello,

    Is it possible to skip tables that don't exist? I can see #11 resolved the issue with non-existing columns, but it would be great not to fail when a specific table isn't present (e.g. with a command-line flag?).

    opened by uznog 5
  • Remove table alias

    Fixes #70 - the alias breaks the delete option.

    I did a test without the table alias and it didn't break anything. I also searched the source code for the alias and couldn't find anything.

    opened by Tjitse-E 4
  • Redesign of database components to process anonymization extremely fast

    Key features of the current redesign:

    • Significant improvement in the performance of anonymization
    • More feedback during process execution
    • Removed inheritance as an extension mechanism

    Prior to these changes, a database with 2 million quotes, 300k customers, and tons of orders was taking around 2 days to complete; after this PR it takes only 17 minutes in total.

    opened by IvanChepurnyi 4
  • I think the latest release might be broken?

    0.1.13 wget https://github.com/elgentos/masquerade/releases/download/0.1.13/masquerade.phar

    returns 647bytes

    0.1.12 wget https://github.com/elgentos/masquerade/releases/download/0.1.12/masquerade.phar

    returns 3.2MB

    This is a terrific tool BTW 🥇

    opened by mikemountjoy99 4
  • Bugfix/explode deprecated error

    On PHP81 the script will throw the following error:

    PHP Deprecated: explode(): Passing null to parameter #2 ($string) of type string is deprecated

    Casting the option to a string will make sure that null is never passed to the explode function.

    opened by rickdegraaf-dq 3
  • Unable to process a line with a primary key set to zero

    First of all, thanks for this wonderful project.

    I am having an issue with a primary key that has the value zero
    (it's an existing database, don't ask me how it happened).

    How to reproduce

    Given the following schema:

    CREATE TABLE `my_table` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `label` varchar(255) COLLATE utf8mb3_unicode_ci NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=3745 DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci;
    

    I insert a record with a primary key set to zero:

    TRUNCATE my_table;
    ALTER TABLE my_table CHANGE id id INT(10) UNSIGNED NOT NULL;
    INSERT INTO my_table (id, label) VALUES (0, 'test 0');
    INSERT INTO my_table (id, label) VALUES (1, 'test 1');
    SELECT * FROM my_table;
    

    If I run the tool with the config below, the entry 0 is not altered:

    entry:
      my_table:
        columns:
          label:
            formatter: firstName
    

    Other entries are changed.

    opened by ragusa87 1
  • Run masquerade against a .sql file instead of database

    Is it possible to run this against a mysqldump? It would be great to masquerade an existing MySQL backup instead of changing data in the database itself.

    hacktoberfest 
    opened by alexhert 4
  • PostgreSQL compatibility

    Hello,

    Is Masquerade compatible with PostgreSQL? I am getting many errors:

    SQLSTATE[42704]: Undefined object: 7 ERROR: unrecognized configuration parameter "foreign_key_checks"
    SQLSTATE[42704]: Undefined object: 7 ERROR: unrecognized configuration parameter "sql_mode"
    SQLSTATE[42703]: Undefined column: 7 ERROR: column "column_key" does not exist

    opened by GautierDig 1
  • Export db

    The GDPR prevents us from downloading any customer data, so to get a faked database without faking the live one on the server we need an export command that creates a temporary db, clones the original db into it, fakes that db, exports it and deletes the temp db.

    • Added another attribute to pick a specific place for the export to go to; if it is not set, the default is where the command is run.
    • In the identify command, pressing enter now skips the current table instead of creating it, which is faster than pushing "n" and enter.
    • I had cases where email was not identified correctly because it was too low in the list.
    • When creating the config file it's a good idea to let everybody know which package those config files belong to.
    • Added auto-detection for m2 and c5 to avoid writing a long parameter list; this is also good for non-technical people.
    • Added a small boilerplate for the c5 db.

    opened by adrianalin89 4
Releases(1.2.2)
Owner
elgentos ecommerce solutions
Magento 2 & Laravel boutique e-commerce agency