The 2020 URL Querystring Data Leaks — Millions of User Emails Leaking from Popular Websites to…

Breaches have been found on websites including Wish.com, JetBlue.com, Quibi.com, WashingtonPost.com, NGPVan.com and numerous other organizations…

Most popular websites on the internet are using 3rd party analytics and advertising Javascript code — and depending on how a website sets up their marketing systems, typically email systems and new user signup flows, the user emails can accidentally and/or purposefully leak to companies across the global data supply chain.

The organizations included in this research have hundreds of millions of emails and real users between them — and only Wish.com, Mailchimp and The Washington Post took this report on their user email breaches seriously — Wish updated their email system within ~72 hours of the report being sent and the other two started taking actions relatively quickly — whereas many other organizations either didn’t respond or have failed to take any actions for weeks or months.

All organizations need to be aware of this significant user data vulnerability, but more importantly, there needs to be significant efforts by organizations sharing user emails in this way, to submit *partner deletion requests* to the 3rd party advertising and analytics companies who received the user emails.

Throughout this research, some of the advertising companies that were tracked receiving the user emails are included — but this should not be considered 100% complete due to the fact that this has been going on for years on some websites, and it’s impossible to know externally all of the organizations who synced data on a specific website or webpage at any historical point.

All organizations included in this research leaking user emails should publicly post the list of all their historical advertising and analytics vendors who could have received the user emails while their respective breaches were active.

One important trend to notice is how often Google Analytics, Google’s DoubleClick, Facebook, and Twitter are ingesting the user emails — these are organizations that should be receiving deletion requests en-masse and they should all have processes to handle this type of effort already (Facebook likely has this tech already based on conversations on this research and additional research from a private report from several years ago).

In this research, there are also “red flag organizations” who have ingested user emails that are small or relatively unknown organizations, yet likely receiving huge amounts of user emails in their request logs. These smaller organizations need a unique type of scrutiny due to the power that an advertising or analytics company can attain from ingesting millions of user emails from their enterprise clients — the Cambridge Analytica effect if you will…

Numerous Enterprise Organizations Leaking User Emails Through 3rd Party Javascript Request Headers Sent via Browsers to 3rd Party Advertising & Analytics Companies

Each of these orgs leaked user emails by unsafely appending the user email to a URL in plain text (or encoded in base64) and then having the user emails leak to 3rd party advertising and analytics companies.

When any 3rd party Javascript code loads on a website, metadata from the user and the website can be transmitted to the 3rd party domain / company that controls that code. Technically this happens through the “Request Headers” that the browser sends with each request, and this data can include what page a user is visiting, what type of device and browser they are using, their location, and other forms of fingerprinting, cookies, and URL querystring / URL parameters that are used by advertising and analytics companies.

This type of user email data in a URL bar synced into Javascript pixels is most typically blocked by a regular person through “Ad blockers” or through browsers like Safari, Brave, and Firefox. Those browsers use Javascript/cookie blocking as a default feature to protect users (each browser handles it slightly differently). This breach and the research included here would impact all Chrome users of these websites who went through these specific user flows and who didn’t proactively block all Javascript (a rarely used option) or use a Chrome “Ad blocker” extension that blocked this type of Javascript. Some people using the other “safe” browsers (Safari/Brave/Firefox) could have been protected from the leak due to their 3rd party Javascript requests being blocked.

Most of the data breaches that were found (some are still live breaches as of publishing) are caused by a sloppy and dangerous growth hack that is used to improve attribution tracking for analytics tools and used to optimize and segment retargeting advertising campaigns.

Several of the breaches involve “plain text” user emails — this is when you can literally read the email in the URL with minimal changes/encodings.

Some of the breaches involve a form of plain text known as “base64 encoding”. In short, base64 is a reversible encoding scheme that is NOT a form of encryption and provides no user protections. A base64 string can be decoded through many tools, and there is even a free service from the s̶p̶i̶e̶s̶ nice folks at GCHQ called CyberChef for parsing custom base64 encodings.
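
To make the point concrete, here is a minimal Python sketch showing that base64 needs no key or secret to reverse. The address user@example.com is a made-up placeholder, not one of the emails from this research.

```python
import base64

# Hypothetical address used only for illustration.
email = "user@example.com"

# Encoding: what a site does before appending the value to a URL.
encoded = base64.b64encode(email.encode("utf-8")).decode("ascii")

# Decoding: what anyone holding the URL can do, with no key or password.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded)
```

Anyone who receives the encoded string, including a 3rd party that only sees it in its request logs, can run the second step.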

Before I get into the details about how this breach happens, and the specific circumstances surrounding the examples, I want to briefly acknowledge and give credit to the team at Wish.com for how quickly they changed their entire email architecture after being informed of their breach — in less than 72 hours Wish had completely rebuilt their email architecture and they had built a completely new auto-login flow via email.

I believe the Wish.com breach was the largest out of all the examples in this research, and it lasted over a year and likely involved hundreds of millions of user emails in a base64 plain-text format being shared with analytics and advertising companies, but their work to quickly escalate the problem, realize the scope, and then pull the trigger to rebuild their systems was a dramatically better response than how other organizations handled these reports. I believe Wish and all organizations in this research should be requesting deletion of user emails from any 3rd party logs held by external advertising and analytics companies, but it appears no organization has submitted this request to their partners, even after being notified of their breaches.

For the most part, most of these user email data breaches are still live as of publishing this research — and in this research I’ll show you how to “breach yourself” by just using current website signup flows and other normal website features on the specific websites in question.

I also want to thank Eliya Stein at Confiant.com for being a sounding board on these technical issues, and helping to provide an additional vet and other important context around the Wish.com breach (those details below).

3rd Party Javascript Collects a “Referrer” URL Field, Which Can Leak User Data and Email Addresses from a Website

This research is focused on a specific type of user data breach that occurs due to how Javascript collects data on a website. When a user loads a web page, the URL that they are visiting, along with any URL parameters (extra tracking codes appended after a “?” in a URL), is shared with any advertising or analytics companies through the Javascript code on that page and through a browser “request header” known as the “Referrer” field (spelled “Referer” in the HTTP header itself).
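
As a rough illustration of what the receiving side can see, the Python sketch below pulls email-like values out of a Referer header. The hosts shop.example and tracker.example, the header dictionary, and the function name emails_in_referer are all hypothetical; this is not code from any of the companies named in this research.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Deliberately loose pattern: something@something.tld
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def emails_in_referer(referer: str) -> list[str]:
    """Return query-string values in a Referer URL that look like emails."""
    query = urlsplit(referer).query
    # parse_qsl percent-decodes values, so "%40" becomes "@".
    return [value for _, value in parse_qsl(query) if EMAIL_RE.fullmatch(value)]


# Hypothetical headers a browser might send when a page loads a 3rd party pixel.
third_party_request_headers = {
    "Host": "tracker.example",
    "Referer": "https://shop.example/confirm?email=user%40example.com&utm_source=Email",
}

found = emails_in_referer(third_party_request_headers["Referer"])
print(found)
```

The page owner never has to send the email to the 3rd party on purpose; putting it in the page URL is enough for it to show up in that party's request logs.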

Quibi Leaking New User Emails on Email Confirmation Webpage to Advertising and Analytics Companies

(Pre-Publishing Note: Quibi reached out hours before publication with an apology and several sentences explaining “how this happened” and what they were doing to fix it. Apparently they no longer leak user emails — I have doubts about some of their statements and will let other reporters publish their remarks)

When you install the Quibi app, you are asked to submit an email to create your account, and then emailed a confirmation link that must be clicked to confirm the account. When a user clicks this email confirmation link, their email address is appended into the URL they are clicking in plain text, and sent to 3rd party advertising and analytics companies.

Quibi was informed of their user email data breach on April 17, 2020 but haven’t responded to the details other than through their automated customer support system.

Here’s a screen shot showing the Quibi New User Email Verification Webpage URLs and how this page was built to leak the user email in plain text to advertising and analytics companies:

That same “Email verification” webpage above from Quibi sends the data to advertising and analytics companies through the referrer fields in the request headers — a screen shot below includes the user email sync from Quibi to Snapchat’s sc-static.net advertising endpoint.

Here’s a screen shot of the Twitter request as it receives the user email in the URL:

Here’s what one of the email confirmation links looks like:

https://quibi.com/email_verified/?email=quibi%40victorymedium.com&message=This%20URL%20can%20be%20used%20only%20once&success=false&_branch_match_id=759077528166021115&utm_source=Email&utm_campaign=Account%20Management&utm_medium=Email%20Verification#

When initially tested, the user email address in plain text format was transmitted to:

1) Google’s DoubleClick.net endpoint

2) Google’s updated ads endpoint @ google.com

3) Google Tag Manager (and therefore potentially custom tags could fire for specific visitors/geos/URL params, thus leaking this to more companies)

4) Twitter ads endpoint

5) Snapchat ads endpoint & the tr.Snapchat.com subdomain

6) Google Cloud infrastructure via cloudfunctions.net

7) CivicComputing.com, which redirects to https://www.civicuk.com/ and appears to be a company based in the United Kingdom, which raises big GDPR red flags

8) Facebook events / custom audiences for ads

9) Google ads conversion pixel

10) Twitter ads conversion pixel

11) Google Analytics

12) Facebook analytics, Google Analytics, Twitter analytics (they fire at the end of the page load again)

The Quibi new account email confirmation flow was tested again on April 26, 2020 and it was confirmed that the user email is still being appended to the email confirmation page URL in plain text and leaked to 3rd party advertising and analytics companies.

Since the original test, several new advertising companies were found receiving the user data including LiveRamp.com, SkimAds, and Tapad — it seems likely that numerous ad tech orgs have been syncing the Quibi new user emails and the list included here could be incomplete.

Quibi’s user data breach is one of the most egregious in this research, because they are a new and extremely well-funded organization and were launched well after both GDPR and CCPA went into effect. In 2020, no new technology organization should be launching with a signup flow that leaks all new user-confirmed emails to advertising and analytics companies, yet that’s what Quibi apparently decided to do.

Out of all the data breaches in this research, the Quibi breach is the hardest to swallow due to how new this organization is, and how much money they had to push into their marketing and advertising to grow new users. It’s an extremely disrespectful decision to purposefully leak all new user emails to your advertising partners, and it seems almost certain that numerous people at Quibi were not only aware of this plan, but helped to architect this user data breach.

It’s 2020, and this type of growth-hack needs to stop being green lit. Quibi needs to explain to their users why this was done and why it hasn’t been changed even after being notified…

The Biggest Breach: Wish.com Likely Leaked Hundreds of Millions of User Emails for Over a Year, With the User Emails Encoded into Base64 Strings

From July 2018 until January 2020 when this research was initially shared with Wish.com, Wish transmitted user emails to at least Google, Facebook, Pinterest, Criteo, PayPal and Stripe, and potentially other companies.

In July 2018, Wish.com deployed code that started their user email breaches. This was tracked due to user emails in base64 format being cached in systems like URLscan.io. The Wish.com developers deployed code that encoded users’ emails in base64 plain text and then appended that string, in a URL parameter named “ee”, to the URLs sent to users via email. When users clicked on any marketing email from Wish, their email was appended to the URL for any page/product-page they clicked through to, and when the user visited the Wish page, their email in base64 format was transmitted to Wish’s 3rd party advertising and analytics partners.

A URLScan.io capture of a Wish.com page view from July 23, 2018 that captured a user’s “ee” parameter and their email encoded in base64 plain text. The ee string is blurred for privacy due to it containing the user email.

Approximately 72 hours after being informed of this research, Wish rebuilt their entire email architecture and stopped appending the “ee” parameter with base64 encoded user emails into marketing emails. It does not appear Wish has informed their users of this user email breach, but they did take the issue very seriously and quickly agreed that the base64 email encoding was a practice they weren’t going to continue. Minimal comments from Wish were received after the research was submitted, but they were more professional than most organizations when confronted with this type of research.

Due to Wish.com being a massive multi-billion dollar company, who in 2015 was Facebook and Instagram’s #1 app advertiser over Christmas, spending upwards of $100 million, and their previous valuations, it’s likely that tens of millions, if not hundreds of millions of user emails were pushed through the “ee” parameter and leaked to advertising and analytics companies.

To repeat: from 2018–2020, most if not all of the Wish.com marketing emails appended user emails in a format that, if the user clicked on the email and they were using a browser that didn’t block 3rd party javascript, then that user had their email in base64 plain text format leaked to 3rd party advertising and analytics companies including Google, Facebook, Pinterest, Criteo, PayPal and Stripe, and potentially other companies.

The URLs being shared by Wish during this period looked like this (my base64 email is replaced below with XXXXX):

https://www.wish.com/feed/xparam-5e1ca48aac2ad7067968f60b?utm_campaign=5e1ca314ac2ad7067968f60a&uuid=214e5a2c231841ca89c7bb953681de36&cmpgnid=5e1ca314ac2ad7067968f60a&ee=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&email_section=core_cids_0&rerank=5a6c31baff96b67f61800650&exzpl=ctp-1&filter=xparam-5e1ca48aac2ad7067968f60b&utm_medium=email&utm_source=Wish+Discount&recvuid=5de37c007ad4122b6cdd8bc6&iscommerc=1

At least thousands of Wish.com users had their base64 email address cached into Google search results, URLScan and other public systems — today, you can still search for this on Google, and a huge portion of the results, you will see are actually the “ee={user-base64-emails}” string that leaks via email clicks: https://www.google.com/search?q=site%3Awish.com+inurl%3Aee

Confirmation from Eliya Stein at Confiant.com, Including Flagging Several New Organizations Receiving Data from Wish.com

Early in this research, Eliya Stein (Twitter, Linkedin, Confiant) was contacted for a quick technical double check, and helped to identify several new service providers that had been receiving data from the Wish.com base64 user data breach. Eliya’s concise report is included here:

It didn’t take long for Wish to send me a marketing email and I was able to confirm the finding immediately.

When a subscriber clicks a link in the email, the destination URL has several parameters in the querystring, including “ee” which is paired with the recipient’s base64 encoded email address.

This means that this entire URL, including the ee parameter can potentially be leaked to any 3rd party resources that are loaded on the page.

In this case, I can confirm that it’s being leaked at least to Facebook, Google, Pinterest, and Criteo per Zach’s observation, but also additional 3rd parties including Paypal & Stripe.

From my observations, it looks like these are mostly tracking endpoints and not actual ad slots on these pages. If they ever introduce display ads connected to rtb on these pages, then the impact of this leak has the potential to be quite large.

I’m not in a great position to comment on GDPR implications, because that’s a little bit outside of my expertise, but for sure it’s a terribly bad practice to pass around PII in plain text in the URL like this, and I do consider base64 encoding to be plain text.

One thing that we’re not able to observe is if and how this data is being abused, but any ad tech company with integrity should scrub data like this if they recognize it as PII.

I’ve included a screenshot of the parameter being leaked to Facebook via the referer.

Wish.com, like the other organizations included in this research, would ideally submit deletion requests to all of their advertising and analytics partners who received data during this period with requests to those partners to delete the request logs containing base64 user emails.

Ideally, Wish.com would also inform their own users about this breach too.

JetBlue.com Still Leaking New User Emails to Advertising and Analytics Partners

JetBlue has known about their ongoing data breach since March 2020 and sent several email responses after being shown this research, but still hasn’t made any changes to their website or stopped the ongoing leak of new user emails to 3rd party advertising and analytics companies.

After being informed of the leak, JetBlue stated they would never do what they are doing because it would be against the law (*NOTE: JetBlue wrote “Federal Passenger Privacy Act” in their response. This may be a reference to a 1974 privacy bill, or, as this Berkeley Law paper on page 14 indicates, JetBlue has sent this statement before and is possibly referencing a nonexistent law), writing this in a recent response:

We regret to hear of any disappointment you experienced when creating a TrueBlue account. We can assure you we don’t share your information. The Federal Passenger Privacy Act* strictly prohibits the release of any information regarding our customers or their travel to any other party. We even require specific security information to verify the identity of our customers before we’re able to discuss their own information.

These details were tweeted out in March and then emailed to JetBlue, which still didn’t get them to make a change:

Howdy Data Supply Researchers, tired of sitting on all this research I have so i'll drip some out right now.

JetBlue claims they don't have an active data breach today, that they don't share the user email on registration w/ ad tech companies.

Test me https://t.co/4aKbLxCRX8 pic.twitter.com/rDxEckRRk2

— ℨ𝔞𝔠𝔥 𝔈𝔡𝔴𝔞𝔯𝔡𝔰 (@thezedwards) March 17, 2020

Here’s the flow of how all new JetBlue users are having their email addresses leaked to 3rd party advertising and analytics companies, in violation of the Federal Passenger Privacy Act* (and potentially other privacy laws)— step one, click “Join” in the menu bar on Jetblue.com from the homepage or any page on the site:

Then, you’ll be prompted to enter your email — whatever you enter here, when you click the next step, your email is passed into the URL and subsequently leaked to the 3rd party advertising and analytics companies:

Here’s a screen shot of the next step, with the user email being passed into the URL — the icon showing “45” is the Ghostery.com count of advertising & analytics companies receiving data on the webpage — it’s not a complete list but this shows dozens of companies are receiving user emails from the current JetBlue.com data leak.

Here’s a screen shot from a previous test showing one of the advertising pixels firing and how it receives the user data through the request headers (notice only 39 pixels were tracked on this page last month, April’s test showed 45):

The companies receiving data from JetBlue include basically all the major advertising companies: Google, Facebook and all the niche but major advertising players. The JetBlue user email leak easily syncs to more 3rd party companies than any other leak in this research.

The Wayback Machine also has many copies of JetBlue’s website. At some point in 2019 they rolled out a new version of their website, and since at least July 9, 2019 they’ve been using the current version of their new account signup flow. You can see and literally test the July 2019 archived version here: http://web.archive.org/web/20190709195758/https://trueblue.jetblue.com/enroll/join-us

Here’s a screen shot of the July 2019 version of the JetBlue.com user account creation 2-step form that leaks user emails on the 2nd step:

JetBlue.com has been leaking user emails for about nine months for people creating new accounts. It’s unclear when JetBlue will update this, but they have rejected the research despite being informed on multiple occasions.

The Bezos-Schmidt-Funded KongHQ.com (Formerly Known as Mashape) Using Common 2-Step Form That Leaks on the 2nd Step

The company formerly known as Mashape, now known as KongHQ, was founded in 2007 and received $1.5 million in seed funding in 2011 from a round of investors that included Jeff Bezos and Eric Schmidt through Innovation Endeavors.

KongHQ has a 2-step signup form similar to the JetBlue leak, but the KongHQ form starts on their homepage. When a user puts their email in the form on the homepage and hits enter, their email is immediately pushed into the URL bar and then transmitted to 3rd party advertising and analytics partners.

KongHQ was informed of this breach back in February 2020 but still haven’t made any changes to their website and 2-step form — their response was similar to JetBlue in totally ignoring the issue.

In the original tests KongHQ transmitted data to:

Google
LinkedIn
Twitter
Facebook
Drawbridge
Mixpanel
CrazyEgg
New Relic
Pardot
Wistia

You can breach your own email address right now by filling out the form on the homepage of KONGHQ.com, but user beware!

After clicking “Request Demo” on the homepage, you are transmitted to the 2nd step of the form, with your email address added into the URL bar to auto-fill the form…

Unfortunately, anywhere you can find a 2-step signup form where the 2nd step has some form of autofill, many of those systems are being built with insecure technology and sometimes the user emails are purposefully leaked to optimize retargeting advertising campaigns or improve analytics attribution data.

Democratic Data Broker NGPVan.com / EveryAction.com (& Their Clients) Have Been Pushing User Emails into Google Analytics & Other Systems for Years

NGPVan.com/EveryAction.com are owned by the same company and provide a wide range of CRM/marketing services for political and nonprofit clients. These platforms have an enormous range of features — and similar to the Mailchimp-Mandrill email breach described in this research, NGPVan created a legacy URL field for “emailAddress” that is appended into URLs, mostly on unsubscribe pages, and this can lead to NGPVan/Everyaction clients leaking user emails to 3rd party advertising and analytics companies.

A typical NGPVan unsubscribe URL that has the user email in it, looks like this (The email is appended at the back of the URL):

email.everyaction(.)com/unsubscribeUnique/3d7e893c-921a-ea11-828b-2818784d6d68/ad179bfc-9c1a-ea11-828b-2818784d6d68?nvep=ew0KICAiVGVuYW50VXJpIjogIm5ncHZhbjovL3Zhbi9FQS9FQTAwMS8xLzU1NDA3IiwNCiAgIkRpc3RyaWJ1dGlvblVuaXF1ZUlkIjogImFkMTc5YmZjLTljMWEtZWExMS04MjhiLTI4MTg3ODRkNmQ2OCIsDQogICJFbWFpbEFkZHJlc3MiOiAiemFjaEB2aWN0b3J5bWVkaXVtLmNvbSINCn0%3D&hmac=73-vUhguqpitg-5DybUg7PmTNqxOTTLllnLe8CYE0y0=&id=107423875&emailAddress=cats%40victorymedium.com
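
The “emailAddress” parameter at the end is not the only place this example URL carries an email. The “nvep” value also appears to be base64, and decoding it suggests a small JSON document that includes an EmailAddress field of its own. A minimal Python sketch, with the value copied from the URL above (the variable names are just for illustration):

```python
import base64
import json
from urllib.parse import unquote

# The "nvep" value from the example unsubscribe URL, split across lines for readability.
NVEP = (
    "ew0KICAiVGVuYW50VXJpIjogIm5ncHZhbjovL3Zhbi9FQS9FQTAwMS8xLzU1NDA3"
    "IiwNCiAgIkRpc3RyaWJ1dGlvblVuaXF1ZUlkIjogImFkMTc5YmZjLTljMWEtZWEx"
    "MS04MjhiLTI4MTg3ODRkNmQ2OCIsDQogICJFbWFpbEFkZHJlc3MiOiAiemFjaEB2"
    "aWN0b3J5bWVkaXVtLmNvbSINCn0%3D"
)

# Undo the URL percent-encoding ("%3D" is "="), then the base64, then parse the JSON.
payload = json.loads(base64.b64decode(unquote(NVEP)).decode("utf-8"))

for key in sorted(payload):
    print(key, "=", payload[key])
```

So removing the “emailAddress” parameter alone would likely not be enough; the same kind of value also travels inside the “nvep” parameter.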

Unfortunately, not only are advertising and analytics companies ingesting the user emails on random unsubscribe pages all across the NGPVan client base, but those same URLs with user emails in plain text are also cached in Google search results, URLscan results, and in other repositories of cached user pages across the internet.

The primary company ingesting the user emails from NGPVan clients appears to be Google via their Google Analytics product, but a Microsoft endpoint also receives data. Here’s what Ghostery picks up on one of the unsubscribe pages with a user email in plain text in it:

And then here’s an additional screen shot of the actual data transfer, showing that Microsoft is also receiving the user emails through their visualstudio.com endpoints.

NGPVan has been appending the user email address to unsubscribe links across their own emails, and client emails for several years — the start date isn’t exactly clear but emails from 2018 have this same plain text email.

Google has also been aware of the NGPVan user email leaks since January, and Google clarified their Google Analytics policy around this type of ingestion, which apparently requires a “mandatory remediation process with the customer where they must stop sending PII to Analytics and ensure all historical PII data must be removed.” This statement was sent by Google on January 13, 2020, and several organizations have been flagged for Google who are sending user emails into Google Analytics, yet it appears no actions have been taken by Google on any of these issues.

NGPVan is currently used on the https://covid19responsefund.org/ website sponsored by the World Health Organization (WHO), the United Nations Foundation, the Swiss Philanthropy Foundation, and with supporters including Google, Facebook, Microsoft, and others. It’s unclear if people who donate money through the NGPVan donation form on the website are subsequently sent emails with their email address leaking via the unsubscribe links, but the form does provide options to join the email lists of several sponsoring organizations…

NGPVan is a for-profit company and just because their clients are largely political campaigns and nonprofits, it doesn’t give them the right to leak user emails to advertising and analytics companies — hopefully this issue is resolved so that as the 2020 campaign heats up and users take advantage of unsubscribe forms more often, those user emails aren’t also leaking en-masse to 3rd party companies.

Growing Child, Popular Magazine for Parents, Leaking Emails on Unsubscribe Page to Google Analytics, Google’s DoubleClick, only Google Pixels Receiving Data

GrowingChild.com is a magazine founded in 1971 that describes itself as “serving millions of families in the United States and around the world…”

Unfortunately for the families who have subscribed to GrowingChild.com newsletters and then decided to unsubscribe, their unsubscribe pages print the user email in plain text into the URL and then share the user email in plain text to Google and several Google products including Google Analytics, Google Doubleclick and several other Google advertising endpoints.

The GrowingChild unsubscribe URLs are built like this: https://growingchild.com/index.php/unsubscribe/unsubscribe.html?email=growingchild@victorymedium.com and the full URL leaks as a referrer to the Google pixels here:

It’s unfortunately too common for unsubscribe pages to be built this way from legacy organizations, but organizations like Google seem to almost capitalize on it sometimes, like in the requests above that trigger across numerous Google advertising endpoints.

MailChimp’s Mandrill Legacy Email Redirect via their API Can Leak Mandrill-Client-User Emails to Advertising and Analytics Companies

Mailchimp’s developer product Mandrill.com was founded in 2012 and claimed 80,000 users by 2015. Lots of developers still use their products, and some clients still use one of their legacy APIs. This legacy Mandrill API has an optional feature that, when used, can potentially expose user email addresses on unsubscribe pages to 3rd party advertising and analytics companies.

This Mandrill API doesn’t automatically leak user data but there is the option for Mandrill clients to redirect an unsubscribe URL sent via their API to include the user’s email address in the URL bar.

Mailchimp was informed of this issue relatively recently, they acknowledged the report and mentioned it was already escalated, but haven’t appeared to make many changes yet besides deleting old support articles which recommended the process that could potentially leak a user email to a 3rd party advertising or analytics company.

Here’s an example MailChimp Mandrill API redirect URL; this will redirect into a business email address for a newsletter:

https://launch.us2.list-manage.com/track/click?u=baefb9fcb23d26e0308254e5c&id=87ad1425d5&e=af88ddc5b8

If you visit “list-manage.com” you’ll be redirected to a Mailchimp error/details page

This “list-manage(.)com” domain is owned by MailChimp, and numerous legacy Mandrill clients can be found who have various endpoints from this domain embedded and cached in URLscan.io, like in this screen shot:

Google has also cached ~47,000 of the Mandrill unsubscribe pages via this search @ https://www.google.com/search?q=site%3Alist-manage.com%20unsubscribe — not all of these results have user emails appended to them, which makes it clear that this is not a feature deployed by all Mandrill clients.

The process to add user emails into the Mandrill redirect URLs was covered in a legacy support article from Mailchimp. This support article was sent to Mailchimp with this research that they haven’t substantially responded to, yet they had time to delete the support article and try to hide what they were recommending to their clients — this page was deleted: https://mandrill(.)zendesk(.)com/hc/en-us/articles/205583017-Can-I-add-an-automatic-unsubscribe-link-to-Mandrill-emails-

Even though Mailchimp deleted this support article sometime in the last week, it’s still available in the Google Cache, where you can see how Mailchimp showed how to put the user email into a specific query string:

MailChimp also scrubbed their larger support article in Mandrill about unsubscribe pages which used to be @ https://mandrill(.)zendesk(.)com/hc/en-us/articles/205582947-About-Unsubscribes — the page is about appending user emails into the URL bar — Google cached that page too here.

Here’s a screen shot of that page before Mailchimp deleted it this past week:

Again, MailChimp received the report, they’ve obviously been scrubbing their content since receiving the research, but haven’t sent any other details about their plans to notify users or rearchitect the Mandrill service.

Washington Post Leaks Some User Emails in Base64 to Service Providers, Appears Not to Send Data to Any External Advertising Companies

The Washington Post was recently alerted to a base64 user email data leak to a limited number of analytics companies, primarily Chartbeat.com (and maybe a few others) — it appears no advertising companies received the base64 user email strings that several of their newsletters append to their unsubscribe links.

The Washington Post escalated the initial report quite fast and noted they were addressing their issues — Wapo’s base64 user email sharing could be resolved by the time of publication or likely sometime soon after.

Here’s one of the unsubscribe links that contains the user’s base64 email string:

Not all of the Washington Post newsletters are built the same way. The leak occurs in the unsubscribe links for the “This Week in Ideas” newsletter and another one of their weekly newsletters. Their core system for newspaper subscribers that sends daily emails does not seem to be built this same way and doesn’t seem to leak user emails.

The user emails are encoded in base64 plain text format and appended into a “bem” URL parameter. You can see one of these unsubscribe links at https://s2.washingtonpost.com/wp-unsubscribe/newsletters?bem=ZWR3YXXXXXXXXXNoLnNjb3R0QGdtYWlsLmNvbQ%3D%3D&nlsendid=5e6e0c0bfe1ff6038cda4f2e

The base64 string is the “bem” param above — mine is slightly obscured above but these can be found across old emails and in some locations on the internet.
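
To show why base64 offers no real protection here, the following is a minimal Python sketch of how any party that receives such a URL could recover the email. The domain, the email address, and the "nlsendid" value are made-up placeholders, and the function name is hypothetical; only the "bem" parameter name comes from the links described above.

```python
import base64
from urllib.parse import urlsplit, parse_qs, urlencode


def decode_bem(url):
    """Return the decoded "bem" value from a URL, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("bem")
    if not values:
        return None
    raw = values[0]
    # Restore any padding that may have been trimmed from the base64 string.
    padded = raw + "=" * (-len(raw) % 4)
    return base64.b64decode(padded).decode("utf-8", errors="replace")


# Hypothetical example values, not taken from any real newsletter.
token = base64.b64encode(b"user@example.com").decode("ascii")
example_url = "https://news.example.com/unsubscribe?" + urlencode(
    {"bem": token, "nlsendid": "abc123"}
)
print(decode_bem(example_url))
```

Because decoding takes only a couple of standard library calls, a base64 email in a query string should be treated the same as a plain text email.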

Facebook Manipulates URL Query Parameters (for Filtering), But Still has System That Can be Broken to Leak Emails to 3rd Party Advertising and Analytics Companies

Nearly all modern websites use Javascript for advertising and analytics tracking, but it’s still very rare for organizations to “sandbox” their partner javascript pixels in a way that ensures that URL parameters don’t get transmitted accidentally to ad tech and analytics partners.

Facebook is one of the few organizations that regularly does “URL referrer filtering” for their javascript partners, and across most (or all?) of their websites. Facebook does this to filter certain URL parameters like the “mkt_tok” parameter reported to them back in 2019 that could leak user emails through Adobe’s one-token-user-authentication architecture.
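
As an illustration of the general idea only, here is a small Python sketch of URL parameter filtering applied before a page URL is handed to partner scripts. This is not Facebook's implementation, which is not public. The blocklist contents, the function name, and the example domain are assumptions made for this sketch.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed blocklist: parameter names known to carry user identifiers.
BLOCKED_PARAMS = {"mkt_tok", "bem", "email"}

# Loose pattern for anything that looks like an email address.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def scrub_url(url):
    """Drop blocked parameters and any value that looks like an email."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in BLOCKED_PARAMS and not EMAIL_RE.search(value)
    ]
    # The fragment is dropped as well, since it can also carry identifiers.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(scrub_url(
    "https://shop.example.com/welcome?utm_source=news"
    "&email=user%40example.com&mkt_tok=abc"
))
```

A blocklist like this only catches parameters someone already knows about, so a stricter design would allow only an explicit list of expected parameters and drop everything else.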

Facebook has never open sourced this filtering product of theirs, likely because it helps to cut down on spam and extra work they need to do internally. Their filters are largely built on their own domain, but they have certain Facebook business marketing pages with 3rd party advertising and analytics pixels. Even though Facebook’s filtering product is deployed there, it still typically transmits user emails in some field into 3rd party systems, possibly in fields that can be easily purged, but ingested nonetheless.

Attached below is a screen shot example showing a type of filter where the unknown/unexpected URL parameters that I put into the Facebook URL were then passed into a unique Adobe Marketo field named “_mchQp” — it’s possible that this is a field Facebook parses to ingest unknown inbound data that may/may not get deleted based on some other criteria.

Organizations that have Javascript advertising and analytics partners need to be aware of their own user data breaches but also plan for ways that attackers could inject bad data into 3rd party systems to corrupt retargeting campaigns or break analytics systems. And at some point, more organizations will need to look to architecture from orgs like Facebook who filter URL parameters and build internal sandboxes to protect user data from flowing across their global data supply chain.

What’s Next? What Questions are Important?

The organizations included in this research can request any changes or comment additions via URLdatabreach@victorymedium.com

Individuals who use any of these services and who believe their emails were leaked should be given an easy process to request the deletion of their user emails that were sent to 3rd party advertising and analytics companies. The organizations involved in this research should provide that process in whatever format they can provide.

Each organization included in this research should change their systems to stop leaking user emails in plain text or base64 plain text formats, notify users who could have been impacted by the leaks, and issue deletion requests for all their users to the 3rd party advertising and analytics organizations they work with.

All organizations should be extremely careful about 2-step forms, email tracking that appends encoded or plain text emails into URLs, and any process that “syncs a user email” to a 3rd party company. This process is almost assuredly not described properly in Terms of Service and Privacy Policies for organizations, and it’s obviously not a process that most users expect to occur.
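
One way to start checking for this is to scan the URLs your own emails and signup flows generate. The Python sketch below flags query parameters whose values are an email address, either in plain text or in standard base64. The function names, the regular expression, and the example URL are assumptions for illustration, and a scan like this would miss other encodings or hashed values.

```python
import base64
import binascii
import re
from urllib.parse import urlsplit, parse_qsl, urlencode

# Loose pattern for a value that is entirely an email address.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def try_base64(value):
    """Return the decoded text if value is valid base64 UTF-8, else None."""
    try:
        padded = value + "=" * (-len(value) % 4)
        return base64.b64decode(padded, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return None


def find_email_params(url):
    """List (parameter name, encoding) pairs that appear to carry an email."""
    findings = []
    for key, value in parse_qsl(urlsplit(url).query):
        if EMAIL_RE.match(value):
            findings.append((key, "plain"))
            continue
        decoded = try_base64(value)
        if decoded and EMAIL_RE.match(decoded):
            findings.append((key, "base64"))
    return findings


# Hypothetical URL built from placeholder values.
encoded = base64.b64encode(b"user@example.com").decode("ascii")
audit_url = "https://example.com/step2?" + urlencode(
    {"a": "user@example.com", "b": encoded, "c": "hello"}
)
print(find_email_params(audit_url))
```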

Unfortunately, as auditors saw with the Cambridge Analytica scandal and Facebook’s inability to confirm that the data was completely deleted, the organizations involved in this research face a similar dilemma tracking down and deleting user emails that were sent to their 3rd party advertising and analytics partners — how can you actually ensure and know this data was deleted? How can users who were involved in these flows ensure their emails are deleted? Who is in charge of requesting that? Each user to every ad tech/analytics company? Each user to the original offending organization? Should organizations proactively request the deletion for all their users? Will the previously leaked user email data just stay with these 3rd party advertising and analytics companies with only a small minority of users requesting deletion?

Finally, many advertising companies have built features to sync user emails into retargeting lists and other audience targeting strategies without properly notifying users. How many of those organizations hold user emails that were given without the user fully understanding what was occurring, or without any ability to delete or modify that information after it was sent?

Hopefully, organizations will start to take a more proactive approach to stopping this type of data supply chain breach, and follow a more responsible plan of action after being notified of significant problems.

Additional questions or concerns? Ping me on twitter @thezedwards.
