
Welcome to the Data Blog 🤖

Thoughts on User Data, Privacy & the Global Data Supply Chain

Do you enjoy niche data supply research and trying to stay on top of various user data news and privacy laws? Check out some of the content below or follow Zach Edwards on Twitter for up-to-date ramblings.

Thoughts on the U.S. <-> China Semiconductor Supply Chain Challenges

Earlier this week I spoke with VOA China about export controls (Google Translate Link) being implemented to prevent the sale of specific types of semiconductors, chips, and finished hardware products between the United States and China. I think it's important for policy makers and the public to understand the importance of these chips, and the complexities of these discussions. In this post, I've attempted to frame a few core concepts, provide a little recent history, and include a few...

ICYMI 2019 SEC NOTICE: Restructuring the Corporate Governance of the Yandex Group with a new Putin-aligned “Public Interest Foundation”

The contents below were directly copied/pasted from this SEC report submitted by Yandex @ https://www.sec.gov/Archives/edgar/data/1513845/000110465919064935/a19-23249_1ex99d2.htm The only change is to highlight several sections in green, yellow and red that should be reviewed by policy makers based on escalating priority levels. ............. LETTER FROM OUR CHAIRMAN November 18, 2019 Dear Shareholders, I am pleased to introduce this Shareholder Circular for Yandex N.V. in relation to two...

SafeGraph Claimed to Not Deanonymize Individuals, Partnered with Former Yandex Data Scraper with a Shell Company Who Openly Scraped U.S. Government Datasets to Build SafeGraph Integration for Ad Targeting

For a significant period of time, and until relatively recently, registered California data broker SafeGraph was providing an “Ad Targeting” user data appending service that openly used a 100+ million person-to-home-ownership database in the United States, openly scraped from U.S. government data sources, which was owned by a Delaware-registered shell company called “BA45.” The research included below has been deeply fact checked and additional documentation is available as-needed to prove the...

Compromised GoDaddy Infrastructure Attacking Numerous U.S. Government Websites to Promote “Canadian Pharmacy” Scam Websites

GoDaddy has responded to this research here, with this statement (lightly redacted, red highlights were added to note ridiculous statements): Zach, [redacted] brought your post to my attention. While this isn't the right forum to dive deep into your article, there's a few things I want to make clear: We will not be filing another SEC incident about a breach any time soon You are seeing Black Hat SEO tactics on several hosting accounts, across different platforms, including several platforms...

Twitter allowing YouTube fingerprint scraping via ~unknown org, Twitter users including BTS fans are one-click away from a URL redirection data scrape

Today on Twitter, you can't tell which YouTube embeds are actually hiding a dark secret that will share your user data to a totally random organization, which isn't Google or Twitter, and that organization has a business model which appears to include sharing or selling user data to advertising companies. Twitter knows about this problem, but the problem exists due to how Twitter embeds a redirected YouTube iFrame, and this problem can happen to a few vendors other than YouTube... Be warned,...

Scraping an ad network and back appending their targeting data into new leads – a common tactic

On every advertising network where you are able to use custom UTM parameters on the link click, you can back-append that social network's or ad network's targeting data into your own user database without the consent or knowledge of users, and it's extremely common for enterprise orgs. On Twitter, the standards are… loose (ditto for Facebook), so I like to urge folks to click on ads occasionally and/or grab the full URL, and take a look at the UTM parameters + values that are in the URL. For bigger...
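
For anyone who wants to try this on a copied ad URL, here is a minimal sketch in Python. The URL, its domain, and the parameter values are all made up for illustration; real campaigns use whatever values the advertiser chose, and the function name `extract_utm_params` is my own.

```python
from urllib.parse import urlsplit, parse_qsl


def extract_utm_params(url: str) -> dict:
    """Return only the utm_* query parameters found in a URL."""
    query = urlsplit(url).query
    return {
        key: value
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key.lower().startswith("utm_")
    }


# Hypothetical ad landing URL; the targeting hint in utm_term is invented.
example_url = (
    "https://shop.example.com/landing?utm_source=twitter&utm_medium=paid_social"
    "&utm_campaign=spring_sale&utm_term=interest_gardening&ref=abc"
)

for name, value in extract_utm_params(example_url).items():
    print(f"{name} = {value}")
```

Anything that shows up in those values can be stored alongside a new lead when the form on the landing page is submitted, which is the back-appending described above.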

Is President Biden giving Google a White House Handout on FLoC?

President Barack Obama's White House was exceptionally close to Google, but until March 2021, most of the world had no clue about the core benefits Google acquired from this relationship, until Politico reported on 312 pages of confidential memos proving that antitrust regulators appointed by Obama declined to sue Google for spurious reasons. In the 4 years since President Obama left office, the world's understanding of Google's past behavior, private lobbying, and problematic advertising practices...

Breitbart.com is Partnering with RT.com & Other Sites via Mislabeled Advertising Inventory

A large group of alt-right sites, low quality publishers, and other websites are mislabeling Ads.Txt publisher relationships and potentially committing a form of advertising fraud. Summary: The Interactive Advertising Bureau’s ads.txt standard is being abused by publishers mislabeling and sharing a “DIRECT” label for account-bidding IDs used in online bidding protocols — with the DIRECT labels being spread out across sometimes hundreds of unrelated websites. This inventory mislabeling creates...
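
A rough sketch of how this kind of overlap can be surfaced, assuming you have already downloaded the ads.txt files yourself. It relies only on the basic ads.txt record layout (ad system domain, publisher account ID, relationship); the site names, ad system, and account IDs below are invented, and the helper names are my own.

```python
from collections import defaultdict


def parse_ads_txt(text):
    """Yield (ad_system, account_id, relationship) for each data record."""
    records = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 3:
            continue  # blank lines, variables like contact=..., malformed rows
        records.append((fields[0].lower(), fields[1], fields[2].upper()))
    return records


def shared_direct_ids(ads_txt_by_site):
    """Map each DIRECT (ad_system, account_id) pair to the sites listing it,
    keeping only pairs that appear on more than one site."""
    sites_by_seat = defaultdict(set)
    for site, text in ads_txt_by_site.items():
        for ad_system, account_id, relationship in parse_ads_txt(text):
            if relationship == "DIRECT":
                sites_by_seat[(ad_system, account_id)].add(site)
    return {seat: sites for seat, sites in sites_by_seat.items() if len(sites) > 1}


# Hypothetical ads.txt contents keyed by the site that served them.
sample = {
    "site-a.example": "adsystem.example, 1234, DIRECT, abc123\n",
    "site-b.example": "# shared seat\nadsystem.example, 1234, DIRECT\n",
    "site-c.example": "adsystem.example, 9999, RESELLER\ncontact=ads@site-c.example\n",
}

for seat, sites in shared_direct_ids(sample).items():
    print(seat, sorted(sites))
```

A shared DIRECT ID is not proof of fraud on its own (sites under common ownership can legitimately share one), so treat the output as a list of relationships to investigate.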

July 2020 Compromised PaF Subdomains (mostly via Microsoft Azure)

This is what many of the compromised subdomain homepages look like: “coming soon” type pages in different languages… Continued from Twitter… please read this thread before engaging… Crowdsourcing research project ahead! Please be extremely careful with the subdomains listed below; many of these are still compromised. If you find a compromised subdomain, please consider reaching out to anyone at that organization who could take down the subdomain. Ping me @thezedwards on Twitter and I’ll help...

Final Statement of Reasons from The California Attorney General for CCPA Raises Important Questions

Big Data Organizations & Service Providers have weeks to get ready. The California Consumer Privacy Act (CCPA) will be enforced starting on July 1, 2020, having gone into effect at the start of 2020, and new guidance from the California Attorney General should quickly become the focus of any digital organization with significant amounts of user data. This blog post is not meant to be an all-encompassing summary of how to get ready for CCPA or the frameworks for sharing and...

Epic Games Ignored Epic Subdomain Takeover on their Authentication Domain, Promoted $1 Million…

A global hacking group took over Epic Games subdomains, then the problem was swept under the rug by Epic Games. At the end of March 2020, Epic Games posted on their Twitter account a $1 million bounty for anyone who could provide information about any corporate astroturfing spreading rumors about Epic Games, particularly with regard to Epic Games’ Houseparty users complaining about being hacked. This unusual ‘commercial smear’ bounty was covered by a variety of reporters, with a limited amount of...

The 2020 URL Querystring Data Leaks — Millions of User Emails Leaking from Popular Websites to…

Breaches have been found on websites including Wish.com, JetBlue.com, Quibi.com, WashingtonPost.com, NGPVan.com and numerous other organizations… Most popular websites on the internet are using 3rd party analytics and advertising Javascript code — and depending on how a website sets up their marketing systems, typically email systems and new user signup flows, the user emails can accidentally and/or purposefully leak to companies across the global data supply chain. The organizations included...
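
To make the leak pattern concrete, here is a small sketch that checks whether a third-party request URL carries an email address in its querystring. The tracker domain, the `dl` parameter, and the address are invented for illustration, and the regular expression is a loose heuristic rather than a full email validator.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Loose pattern: good enough to flag likely emails, not to validate them.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_leaked_emails(request_url):
    """Return (parameter_name, email) pairs found in a URL's querystring.

    parse_qsl percent-decodes each value, so an address sent as
    jane.doe%40example.com is still detected.
    """
    hits = []
    query = urlsplit(request_url).query
    for key, value in parse_qsl(query, keep_blank_values=True):
        for match in EMAIL_RE.findall(value):
            hits.append((key, match))
    return hits


# Hypothetical analytics request: the page URL (with the user's email in it)
# is forwarded to a third party inside the "dl" parameter.
request = (
    "https://tracker.example.net/collect"
    "?dl=https%3A%2F%2Fshop.example.com%2Fwelcome%3Femail%3Djane.doe%40example.com"
    "&uid=42"
)

print(find_leaked_emails(request))
```

Running this over the request URLs captured in a browser's network tab during a signup flow is one way to spot which third parties receive the address. It will miss emails that are hashed or base64-encoded before being sent.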

What your lawyers say you’re doing VS What your growth team and developers are actually doing ….

CCPA and GDPR force companies to “put pen to paper” about their global user data policies and partner data sharing, but as with any system of accountability in a marketplace, the mere existence of regulatory frameworks that force transparency doesn’t ensure that the data being shared is accurate or universal for all users. And with both CCPA and GDPR, there will always be a common mantra of exposure… “your data sharing partners put your business out of compliance…” Over the last 12–24 months,...

Who fixes cached search results? An odd Facebook user vulnerability

There are numerous coding practices and server setups that can result in unexpected cached pages: pages that shouldn't be able to be served to another user. One of the most-abused page caching features is "search index caching", aka using a website's search feature to inject your own content/domains/spam into a permanently cached version of that page, so that other users who stumble across that page via Google or other means could see content on the search result page input by...

Advertising & Analytics Red Team: Attribution Attacks via Facebook’s “fbclid” Parameter

As someone who has been working on enterprise digital stacks for over 12 years and building analytics stacks for over 7 years, I've broken my fair share of client websites with malformed JavaScript. It happens to pretty much every analytics professional *by accident* at semi-regular intervals, but there are very few organizations that have teams trying to take down the data layer on a daily basis *on purpose*: an Advertising & Analytics Red Team. When was the last time you heard about...