Fixed WooCommerce data privacy eraser query deletes all orders. #2975

Merged


sun
Contributor

@sun sun commented Aug 31, 2022

Description of the Change

Scope

  • Major loss of customer data, only recoverable from backups.

Context/Requirements

  • ElasticPress is enabled for both admin and AJAX queries.
        add_filter('ep_admin_wp_query_integration', '__return_true');
        add_filter('ep_ajax_wp_query_integration', '__return_true');

Problem

  • When using the built-in tool for removing personal data in WordPress Core on /wp-admin/erase-personal-data.php, all orders (⚠️) are anonymized instead of only the requested ones.

Cause

  1. WooCommerce filters its WC_Order_Query on the 'customer' parameter in
    https://github.com/woocommerce/woocommerce/blob/9e9b4ef844ef015388c21a401aba7bdee11a0d72/plugins/woocommerce/includes/class-wc-privacy-erasers.php#L125-L133

  2. The WooCommerce order data store expands the 'customer' parameter in
    https://github.com/woocommerce/woocommerce/blob/9e9b4ef844ef015388c21a401aba7bdee11a0d72/plugins/woocommerce/includes/data-stores/class-wc-order-data-store-cpt.php#L453-L466

    into a meta query, resulting in the following meta query:

        [meta_query] => Array
            (
                [0] => Array
                    (
                        [relation] => OR
                        [customer_emails] => Array
                            (
                                [key] => _billing_email
                                [value] => Array
                                    (
                                        [0] => example@example.com
                                    )
                                [compare] => IN
                            )
                        [customer_ids] => Array
                            (
                                [key] => _customer_user
                                [value] => Array
                                    (
                                        [0] => 12345
                                    )
                                [compare] => IN
                            )
                    )
            )
    

    ☝️ Note that WooCommerce is using named keys (customer_emails and customer_ids) for the conditions and not indexed keys.

  3. The meta query processing in Indexable only expects indexed keys, and no later branch handles other array keys:

    } elseif ( is_array( $single_meta_query ) && isset( $single_meta_query[0] ) && is_array( $single_meta_query[0] ) ) {

    so the meta query parameters are ignored altogether – resulting in the following, unfiltered ES statement querying all orders:

    GET active-post-1/_search
    {
      "from": 0,
      "size": 10,
      "sort": [
        {
          "post_date": {
            "order": "desc"
          }
        }
      ],
      "query": {
        "match_all": {
          "boost": 1
        }
      },
      "post_filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "post_type.raw": [
                  "shop_order",
                  "shop_order_refund"
                ]
              }
            },
            {
              "terms": {
                "post_status": [
                  "wc-pending",
                  "wc-processing",
                  "wc-on-hold",
                  "wc-completed",
                  "wc-cancelled",
                  "wc-refunded",
                  "wc-failed",
                  "wc-checkout-draft"
                ]
              }
            }
          ]
        }
      }
    }
    
  4. All orders are getting anonymized.
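The check quoted in step 3 can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical helper `sketch_has_indexed_clause()` that mirrors only that one condition; it is not ElasticPress code and the real method does more than this.

```php
<?php
// Hypothetical helper: mirrors only the condition quoted in step 3.
function sketch_has_indexed_clause( $single_meta_query ) {
    return is_array( $single_meta_query )
        && isset( $single_meta_query[0] )
        && is_array( $single_meta_query[0] );
}

$emails = array(
    'key'     => '_billing_email',
    'value'   => array( 'example@example.com' ),
    'compare' => 'IN',
);
$ids    = array(
    'key'     => '_customer_user',
    'value'   => array( 12345 ),
    'compare' => 'IN',
);

// Named keys, as produced by the WooCommerce order data store.
$named = array(
    'relation'        => 'OR',
    'customer_emails' => $emails,
    'customer_ids'    => $ids,
);

// The same two clauses under indexed keys (0 and 1).
$indexed = array( 'relation' => 'OR', $emails, $ids );

var_dump( sketch_has_indexed_clause( $named ) );   // bool(false)
var_dump( sketch_has_indexed_clause( $indexed ) ); // bool(true)
```

With named keys there is no element `0`, so the branch is never entered and both clauses are dropped.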

Proposed solution

  1. Remove the condition that only considers meta query clauses stored under indexed keys.
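
To illustrate the idea of not depending on the array key, here is a minimal sketch. The helper names `sketch_is_first_order_clause()` and `sketch_collect_clauses()` are hypothetical and this is not the code merged in this PR. It assumes that a clause counts as a leaf when it carries a 'key' or a 'value', which is, to my understanding, how WP_Meta_Query in WordPress Core tells leaf clauses from nested groups.

```php
<?php
// Hypothetical helper: a leaf clause carries a 'key' or a 'value'.
function sketch_is_first_order_clause( array $clause ) {
    return isset( $clause['key'] ) || isset( $clause['value'] );
}

// Hypothetical helper: collect every leaf clause, whatever key it sits under.
function sketch_collect_clauses( array $meta_query ) {
    $clauses = array();
    foreach ( $meta_query as $key => $clause ) {
        // Skip the 'relation' entry and any other non-array value.
        if ( 'relation' === $key || ! is_array( $clause ) ) {
            continue;
        }
        if ( sketch_is_first_order_clause( $clause ) ) {
            $clauses[] = $clause;
        } else {
            // A nested group: recurse, again ignoring how its keys are named.
            $clauses = array_merge( $clauses, sketch_collect_clauses( $clause ) );
        }
    }
    return $clauses;
}

// The meta query shape shown in the Cause section.
$meta_query = array(
    array(
        'relation'        => 'OR',
        'customer_emails' => array(
            'key'     => '_billing_email',
            'value'   => array( 'example@example.com' ),
            'compare' => 'IN',
        ),
        'customer_ids'    => array(
            'key'     => '_customer_user',
            'value'   => array( 12345 ),
            'compare' => 'IN',
        ),
    ),
);

$found = sketch_collect_clauses( $meta_query );
echo implode( ',', array_column( $found, 'key' ) ), "\n"; // _billing_email,_customer_user
```

The sketch flattens the clauses and ignores 'relation' for brevity. A real implementation would need to carry the relation through to the generated Elasticsearch filter.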

Notes

  • It is probably true that all documentation and examples about meta query conditions use indexed keys, but neither the docs nor the meta query code states in any way that this is a requirement.
  • Therefore, using named keys like WooCommerce is perfectly legit.

How to test the Change

  1. Enable ElasticPress for admin and ajax requests (see above).
  2. Enable WooCommerce and create some orders with different users.
  3. Use the personal data eraser tool of WordPress Core.

Changelog Entry

Fixed WooCommerce data privacy eraser query deletes all orders if ElasticPress is enabled for admin and Ajax requests.

Credits

GitHub

Props @sun, @bogdanarizancu

WordPress.org

Props tha_sun, bogdanarizancu

Checklist:

  • I agree to follow this project's Code of Conduct.
  • I have updated the documentation accordingly.
  • I have added tests to cover my change.
  • All new and existing tests pass.

@sun
Contributor Author

sun commented Aug 31, 2022

Cypress errors on EP.io are not relevant / not caused by this PR

      -'123456Starting sync…
    Indexing posts…Mapping failed
    cURL error 7: Failed to connect to 172.17.0.1 port 8890: Connection refused
    Number of posts index errors: 74
    Sync complete'

...
Error: Could not fetch indices names.

sun added a commit to makers99/wp-cli-shared-patches that referenced this pull request Sep 1, 2022
@sun
Contributor Author

sun commented Sep 6, 2022

@felipeelia This PR could use some attention before more people run into it, given the severe data loss.

Enabling ElasticPress also for admin and Ajax queries is not the default, of course. But it's fair to assume that larger sites that need Elasticsearch in the frontend will also enable it in the backend.

Normally, all queries integrating with Elasticsearch are just searches and lookups. But in the special case of the data eraser, the lookup results are actually used to delete/anonymize meta data without further validation.
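
One way to picture the missing validation is a guard that re-checks each looked-up order against the erasure request before anything is anonymized. This is only a sketch under stated assumptions: `sketch_filter_orders_for_request()` is a hypothetical function, orders are modeled as plain arrays, and none of it is WooCommerce or ElasticPress code.

```php
<?php
// Hypothetical guard: keep only orders whose billing email or customer ID
// matches the erasure request.
function sketch_filter_orders_for_request( array $orders, $email, $user_id = null ) {
    return array_values(
        array_filter(
            $orders,
            function ( $order ) use ( $email, $user_id ) {
                $email_matches = 0 === strcasecmp( $order['billing_email'], $email );
                $user_matches  = null !== $user_id
                    && (int) $order['customer_id'] === (int) $user_id;
                return $email_matches || $user_matches;
            }
        )
    );
}

// Pretend the lookup returned unrelated orders as well (illustrative data).
$lookup_result = array(
    array( 'id' => 1, 'billing_email' => 'example@example.com', 'customer_id' => 12345 ),
    array( 'id' => 2, 'billing_email' => 'someone@else.test', 'customer_id' => 777 ),
    array( 'id' => 3, 'billing_email' => 'other@example.com', 'customer_id' => 12345 ),
);

$safe_to_erase = sketch_filter_orders_for_request( $lookup_result, 'example@example.com', 12345 );
echo implode( ',', array_column( $safe_to_erase, 'id' ) ), "\n"; // 1,3
```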

@felipeelia felipeelia added this to the 4.4.0 milestone Sep 6, 2022
@felipeelia felipeelia self-assigned this Sep 6, 2022
@felipeelia
Member

Thanks @sun, I'll take a look at it in the next few days.

@chaselivingston chaselivingston modified the milestones: 4.4.0, 4.3.1 Sep 12, 2022
@felipeelia felipeelia merged commit dfc45ab into 10up:develop Sep 16, 2022
felipeelia added a commit that referenced this pull request Nov 7, 2022