Who's stealing my content? Good question. Find the scrapers with wpCop's aptly-titled investigative guide How to Find Scraped Content.
In How to Protect Content from Scrapers we took steps to preemptively defend our content using a bunch of techniques and plugins. But what if your content is being abused already? This is when we need a reactive response.
We may wish to accept, or negotiate, cases of fair use and proper attribution while fighting our corner where our work is being used cynically. First though, we need to know who's using what content, if any. Here's how.
Seeking out scrapers
Quite likely, you'll have no clue your content has been collared until you look for copies.
As with most security topics, setting up and mastering a system is what takes the most time but, thereafter, keeping tabs on scrapers makes for light work.
We'll trawl the many research options using the web in a moment but, first, let's look at using WordPress and its site logs to help.
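As a rough illustration of how site logs can help, here is a minimal Python sketch that counts outside sites appearing as the referer in an access log. It assumes your server writes the common "combined" log format and that scrapers are hotlinking your images or linking back to your pages. The sample lines, `example.com`, and the function name `external_referers` are hypothetical, and the results will include legitimate referers such as search engines, so treat them as leads to check by hand.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed "combined" log format:
# ... "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)

# Hypothetical sample lines; in practice, read them from your access log file.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /wp-content/uploads/pic.jpg HTTP/1.1" 200 5120 "http://somescrapersite.com/stolen-post/" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /my-post/ HTTP/1.1" 200 8192 "https://www.example.com/" "Mozilla/5.0"',
    '198.51.100.9 - - [10/Oct/2023:13:57:12 +0000] "GET /feed/ HTTP/1.1" 200 2048 "-" "SomeBot/1.0"',
]


def external_referers(lines, own_host):
    """Count referer hostnames that are not own_host or one of its subdomains."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        host = urlparse(match.group("referer")).hostname
        if host and host != own_host and not host.endswith("." + own_host):
            counts[host] += 1
    return counts


if __name__ == "__main__":
    for host, hits in external_referers(SAMPLE_LOG, "example.com").most_common():
        print(host, hits)
```

Any unfamiliar host that turns up repeatedly against your images or posts is worth a visit to see what it is doing with them.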
Uncovering plagiarism using the web
Turning over a few digital stones to see who is using what and how may make for a fascinating, if time-consuming, exercise. Fortunately we have tools aplenty and, surprise surprise, many of those involve the vast search capacity that is Google.
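By way of example, one common approach is to search for a distinctive sentence from a post as an exact phrase while excluding your own domain from the results. The Python sketch below only builds such a search URL, using Google's quoted-phrase and `-site:` operators. The phrase, `example.com`, and the function name `phrase_search_url` are placeholders, not anything prescribed by the guide.

```python
from urllib.parse import urlencode


def phrase_search_url(phrase, own_domain):
    """Build a Google search URL for an exact phrase, excluding our own site."""
    # Double quotes ask for an exact match; -site: drops results from own_domain.
    query = f'"{phrase}" -site:{own_domain}'
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Hypothetical sentence lifted from one of our posts.
    print(phrase_search_url("a distinctive sentence", "example.com"))
```

Pick a sentence unusual enough that it is unlikely to appear anywhere by coincidence, then paste the resulting URL into a browser.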
The big G aside, here are some more avenues of online detection.
Pinpointing scrapers
When you've found an offending scraper site you can start a dialogue … or you could if there was one of those new-fangled Contact Us buttons. Often, there isn't. As for the About Us button, scraper types are often shy of those too. An e-mail address? Hahaha!
You could chance it and simply send an e-mail to webmaster@somescrapersite.com. Then again, it's better to be precise, if only to show that you know your way around the web. That way you will be taken more seriously.
Run a WHOIS search
Here's a clue. To save repetition, read WHOIS whacking, which explains how to find the right service to gain the most detail for the offending domain. Unless the domain registrant has applied privacy settings, the results will yield many details to get you started. For example:
- The name and address of the domain registrant, or owner
- The domain registrar
- Administrative and technical contacts, addresses, and telephone numbers
- The domain's IP address
- The nameservers, giving a clue as to the web host
From the record you will, at least, be able to track down contact details for the domain's web host by taking a nameserver and running it through a service such as Network-Tools. The resulting DNS records will provide, among other things, e-mail addresses for the host's support and, generally, abuse departments.
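To show the kind of detail worth pulling out of a WHOIS record, here is a minimal Python sketch that picks the registrar, nameservers, and any e-mail addresses from the raw text. The sample record is made up, field names differ between registries and registrars, and the function names `parse_whois` and `contact_leads` are hypothetical, so take this as a starting point, not a universal parser.

```python
import re

# Hypothetical WHOIS output; real records vary by registry and registrar.
SAMPLE_WHOIS = """\
Domain Name: SOMESCRAPERSITE.COM
Registrar: Example Registrar, Inc.
Registrar Abuse Contact Email: abuse@example-registrar.test
Name Server: NS1.EXAMPLE-HOST.TEST
Name Server: NS2.EXAMPLE-HOST.TEST
"""


def parse_whois(text):
    """Collect 'Key: value' lines into a dict of lists, with keys lower-cased."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            record.setdefault(key.strip().lower(), []).append(value.strip())
    return record


def contact_leads(text):
    """Return the registrar, nameservers, and e-mail addresses found in the text."""
    record = parse_whois(text)
    emails = sorted(set(re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)))
    return {
        # Assumes the fields are labelled "Registrar" and "Name Server".
        "registrar": record.get("registrar", []),
        "nameservers": [ns.lower() for ns in record.get("name server", [])],
        "emails": emails,
    }


if __name__ == "__main__":
    for field, values in contact_leads(SAMPLE_WHOIS).items():
        print(field, values)
```

The nameservers are the ones to run through a DNS lookup service, and any abuse address that surfaces is a good first port of call.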
From investigation to acting on scrapers
Cool, we've uncovered any instances of content abuse and we know who to talk to.
So what now?
In Deal with Content Scrapers: Multiple Techniques we'll look at a variety of methods to employ to regain our work – from meow to snarl to bark and bite – and edit some proven stock template letters that we can issue.