
Who is behind this site? A checklist.

By Priyanjana Bengani (@acookiecrumbles) and Jon Keegan (@jonkeegan)
IRE NICAR Conference – March 4, 2022
Slides: English | Russian

The Tow Center would like to thank Dr. Svetlana Borodina and the Harriman Institute for translating this presentation into Russian.

What is this?

This checklist is a reporting tool to help journalists and researchers find out who published a website. It is meant to be used alongside offline reporting techniques.

Following this checklist doesn’t guarantee you can unmask a website owner who doesn’t want to be found, but it can help reveal crucial clues and connections that can serve as leads for further reporting.

🌟 Strong recommendation: while working through this checklist, keep a log of what you do and find. It can be a TextEdit document, a Google Doc, the Notes app, whatever works for you. It is important to be able to retrace your steps.
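
For anyone who prefers a structured log over free-form notes, here is a minimal sketch in Python. It is one possible approach, not part of the original checklist: the function names `log_step` and `read_log`, the field names, and the JSON Lines file format are all assumptions chosen for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_step(log_path, step, finding, source=""):
    """Append one timestamped entry to a JSON Lines log file.

    `step` is the checklist item being worked on, `finding` is what was
    observed, and `source` is the URL or tool the observation came from.
    All three field names are illustrative choices.
    """
    entry = {
        "logged_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "step": step,
        "finding": finding,
        "source": source,
    }
    # Append mode keeps earlier entries intact, so the log stays chronological.
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


def read_log(log_path):
    """Return every logged entry, oldest first."""
    with Path(log_path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Calling `log_step("site-log.jsonl", "Where is it hosted?", "Site sits behind Cloudflare", "builtwith.com")` after each check produces a file that can be re-read in order later, which is the point of the recommendation: being able to retrace your steps.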

Content of the site

Features and Functions
Connections
Photos, pictures and documents

Social networks

If there are any social media profiles mentioned on the site, they are worth investigating.

On the Facebook profile, go to Page Transparency:

Twitter

On Twitter, the account may be part of a pod or network that amplifies it. Using en.whotwi.com, it is worth checking:

Other platforms

Don’t forget to check whether the site has accounts on YouTube, Instagram, Reddit, GitHub…

Infrastructure

  • 🗄 Have you archived the website? (You always should!)

    • you can do this at archive.org or use their browser extension.
    • you can grab the whole website from the Terminal with wget: wget -mpEk followed by the site’s URL
  • 🖥 What does the website use?

    • Does it use WordPress, Squarespace, anything else?
  • ☁️ Where is it hosted?

    • Is it on Google Cloud, AWS, Cloudflare, something else?
  • 🪳 Are there any trackers present?

  • 🛍 How is the site monetized?

    • Are there affiliate links (Amazon, etc.)?
  • 🧬 What are the different tracking IDs, and are they shared with other domains?

  • Are there any relevant subdomains?

  • 📜 Are there historical WHOIS records?

  • ⌛️ Has the site evolved over time?

    • Look at archive.org to see if the domain has changed dramatically and if so, when.
  • 🗑 Did the previous version of the site contain more information?

    • Site owners may remove information once a site has been online for a while.
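
For the tracking-ID question in the list above, a rough manual check is to save the HTML source of two sites and compare the analytics and advertising IDs embedded in each. The sketch below shows one way to do that. The regular expressions are approximations of common Google Analytics, Google Tag Manager and AdSense ID formats, not official specifications, and the function names are hypothetical.

```python
import re
from typing import Dict, Set

# Approximate patterns for common tracking IDs. These are assumptions about
# typical formats and may miss some IDs or produce false positives.
PATTERNS = {
    "google_analytics_ua": re.compile(r"\bUA-\d{4,10}-\d{1,4}\b"),
    "google_analytics_ga4": re.compile(r"\bG-[A-Z0-9]{8,12}\b"),
    "google_tag_manager": re.compile(r"\bGTM-[A-Z0-9]{4,8}\b"),
    "adsense_publisher": re.compile(r"\bca-pub-\d{10,16}\b"),
}


def extract_tracking_ids(html: str) -> Dict[str, Set[str]]:
    """Return every tracking ID found in a page's HTML, grouped by type."""
    return {name: set(rx.findall(html)) for name, rx in PATTERNS.items()}


def shared_ids(a: Dict[str, Set[str]], b: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return only the IDs that appear in both sites, grouped by type."""
    shared = {}
    for name in PATTERNS:
        overlap = a.get(name, set()) & b.get(name, set())
        if overlap:
            shared[name] = overlap
    return shared
```

A shared ID is a lead rather than proof: it suggests the two sites may be run through the same analytics or advertising account, which is worth confirming with the tools listed under "Resources and tools" and with offline reporting.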

Resources and tools

Open Source Intelligence Techniques – Michael Bazzell https://inteltechniques.com/book1.html

Verification Handbook – edited by Craig Silverman https://datajournalism.com/read/handbook/verification-3

Website infrastructure
  • Blacklight: The Markup’s real-time website privacy inspector.
  • builtwith.com: shows a site’s infrastructure, including IP addresses, tracking codes, technology stack, etc. Freemium model.
  • DNSDB Scout: lets you search and “flex search” passive DNS lookups, including IP-to-domain mapping.
  • Dnslytics: offers a range of tools, including reverse tracker and reverse DNS lookups, as well as WHOIS data. Freemium.
  • RiskIQ: a “threat intelligence” tool that lets you get reverse IPs, reverse tracker lookups, WHOIS, SSL, subdomains, and more.
  • Whoxy: a tool that lets you view historical WHOIS records. Free.
  • The Internet Archive browser extension.
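
Besides the browser extension, the Internet Archive exposes a public Wayback Machine availability endpoint that can be queried from a script to find the snapshot closest to a given date. The following is a minimal sketch under that assumption. The function names are hypothetical, and the response fields used here (`archived_snapshots`, `closest`, `available`, `url`, `timestamp`) should be checked against the Internet Archive's current documentation before being relied on.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(site: str, timestamp: str = "") -> str:
    """Build the query URL. `timestamp` is optional, in YYYYMMDD[hhmmss] form."""
    params = {"url": site}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{WAYBACK_API}?{urlencode(params)}"


def closest_snapshot(payload: dict) -> Optional[dict]:
    """Pull the closest snapshot out of a decoded API response, if there is one."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return {"url": closest["url"], "timestamp": closest["timestamp"]}
    return None


def lookup(site: str, timestamp: str = "", timeout: float = 10.0) -> Optional[dict]:
    """Query the endpoint over the network and return the closest snapshot."""
    with urlopen(availability_url(site, timestamp), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup` with the same domain and several different timestamps gives a quick sense of when a site first appeared in the archive and which captures to open when checking how it has changed over time.
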
Social media accounts
  • Sensity AI: check whether an image was generated by a GAN (generative adversarial network). Freemium.
  • whotwi.com: see an at-a-glance profile of any Twitter account. Free.

Check out this checklist on GitHub.



TOP IMAGE: Hana Joy