
Scrape-O-Rama

In 2002 Google was still in its infancy and still quite wet behind the ears.

So now I have a way to display any of the items from eBay on my website. Cool... Now what?

To scale this thing out, I needed to get Google to index it, but how?

I looked at how eBay was set up. They had categories, but at the time you couldn't scrape these easily, so the category structure wasn't going to work on my sites.

The only option (since the scraper was based on search results) was keywords and phrases. Shitloads of them.

eBamazon

Even back then Amazon had hundreds of thousands of products and I had a way to display them on my sites from the category listings right down to the individual product via David Cusimano's script.

With the combination of the Amazon script and the eBay scraper I created websites that were full of near limitless content. I called it "comparison shopping", which was a stretch. Technically it was a way to see what the price was at Amazon and see if you could beat it on eBay.

For every keyword, category, brand, product name, ISBN, etc... I could create a page of eBay affiliate links that Google would index.

It didn't take long for this to start looking very lucrative.

I remember going away for a couple days and when I got home I logged into my CJ account to see what was going on with the eBay stuff. I saw that I made $1700 during the two days before and was on track to better $1000 for the third day.

I remember thinking two things:

Holy shit this is going to work and I'm gonna need more keywords.

Sources for Keywords

I hunted for keyword lists all over the internet. Car makes and models, brand names, pro sports teams, celebrity names, etc... pretty much anything that would turn up something on eBay.

After learning PHP to keep the eBay scraper running, I was fairly proficient at coding PHP scraper scripts. (eBay changed their page layout several times over the period that I was scraping, so the pattern matching, etc... needed to be tweaked every now and then to keep it running.) If there was a pattern to match, it could be scraped.
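A minimal sketch of that kind of pattern-match scraping. The markup below is a made-up stand-in (the eBay page layout of the time isn't shown here), and `scrape_items()` is a hypothetical helper name. When the layout changes, the regex is the part that has to be tweaked.

```php
<?php
// Hypothetical results markup; NOT the real eBay layout.
$html = <<<'HTML'
<table>
<tr><td><a href="/item/1001">Vintage Fender amp</a></td><td class="price">$120.00</td></tr>
<tr><td><a href="/item/1002">Gibson guitar strap</a></td><td class="price">$15.50</td></tr>
</table>
HTML;

/**
 * Pull link, title and price out of each result row with one regex.
 */
function scrape_items(string $html): array
{
    // One capture group per field: (url) (title) (price)
    $pattern = '~<a href="(/item/\d+)">([^<]+)</a></td><td class="price">\$([\d.]+)</td>~';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $items = [];
    foreach ($matches as $m) {
        $items[] = ['url' => $m[1], 'title' => $m[2], 'price' => (float) $m[3]];
    }
    return $items;
}

$items = scrape_items($html);
```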

I went back to CJ to look for other merchants that had enabled flexible linking and whose sites could be scraped. (This was before most merchants offered datafeeds.) The ones that could be scraped got thrown into the mix (mashup?).

All of this data got crosslinked with the eBay scraper sites to create even more pages of eBay items. Every item or keyword I could add would make the eBay scraper sites grow.

At the height of the scraping I had literally millions of pages indexed in Google.

eBay Affiliate API

Once we started making big checks from the eBay affiliate program we were on their radar. I soon got a call from the head of eBay's affiliate operations.

The woman there asked me how I was driving the traffic to get so many ACRUs (active new member signups). We were sending a shitload of people there every month and they wanted to make sure that it was not adult traffic or some spyware operation.

She forced me to tip my hand to prove that what I was doing was not against their terms of service. I showed her one of the eBamazon scraper sites and that was enough to prove that the traffic was legit.

At the time scraping eBay was against their TOS, but I guess I was driving enough ACRUs that they didn't care or didn't know.

A couple weeks later the woman from eBay called again and put me on a conference call with some other department (I think it was their SEO or marketing department). They had me walk them through the scraper sites again and asked me a bunch of questions about how this whole thing worked.

During this entire call they never asked me how I was getting the eBay data, which I thought was weird.

eBay started keeping in regular contact with me and one day I got a call from them telling me to join their new affiliate API program. eBay's API is basically a back door to all their product data, categories, member profiles, etc...

So I guess they knew about all the scraping and wanted a way to track my operation internally. You see, when you are scraping, they really can't track exactly what you are doing with their data. With an API program they give you a developer key, and every call you make to request data can be logged via your key, so they can build a profile of your usage.

So I called my connection at Cape.com to line up a programmer to build the original eBay PHP API script I used. At the time I didn't know enough about making XML requests and parsing arrays to hack together a script myself and there were no other scripts available at that time to access the API.
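For a rough idea of what "parsing arrays" out of an XML response involves, here is a sketch using PHP's SimpleXML extension. The response shape, the element names and the `parse_items()` helper are all assumptions for illustration. This is not eBay's actual API schema, and it is not the original script.

```php
<?php
// Hypothetical XML response; element names are invented for this example.
$xml = <<<'XML'
<results>
  <item><id>1001</id><title>Vintage Fender amp</title><price>120.00</price></item>
  <item><id>1002</id><title>Gibson guitar strap</title><price>15.50</price></item>
</results>
XML;

/**
 * Turn the XML response into a plain PHP array of items.
 * Returns an empty array when the XML cannot be parsed.
 */
function parse_items(string $xml): array
{
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return [];
    }

    $items = [];
    foreach ($doc->item as $item) {
        $items[] = [
            'id'    => (string) $item->id,
            'title' => (string) $item->title,
            'price' => (float) $item->price,
        ];
    }
    return $items;
}

$parsed = parse_items($xml);
```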

I soon ditched the scraper script and had everything running the new API setup.

Adsense

When Adsense came out in 2003 I integrated it into the spamdexing operation. This worked out great. Another revenue stream to go alongside the other affiliate stuff I had going.

if ebay == "no" then show adsense

I remember one day eBay's API was down so the sites were broken. I went into the code and made it so if the eBay API request failed it would show two large Adsense blocks. The CTR on those ads was phenomenal. So that became the failsafe.
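The failsafe boils down to a single branch. A minimal sketch, assuming a hypothetical `render_listing()` helper that takes the fetch as a callable; the ad markup is a placeholder, not real Adsense code. Treating an empty result the same as a failed request is an assumption made for this example.

```php
<?php
/**
 * Render the listing area: items when the fetch worked, two ad blocks otherwise.
 * $fetch is any callable that returns an array of items, or false on failure.
 */
function render_listing(callable $fetch): string
{
    $items = $fetch();

    // Failsafe: request failed (false) or, by assumption, returned nothing
    if ($items === false || $items === []) {
        return '<div class="ad-block"></div><div class="ad-block"></div>';
    }

    $html = '';
    foreach ($items as $item) {
        $html .= '<li>' . htmlspecialchars($item['title']) . '</li>';
    }
    return '<ul>' . $html . '</ul>';
}
```

Passing the fetch in as a callable keeps the fallback easy to exercise without the remote service being up or down.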

One thing I did notice about Adsense is that it targeted well on search results pages. At that time it was against the Adsense TOS to display Adsense on search results pages, but with some clever URL rewriting the pages appeared to be just static pages and flew under the radar.
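The exact rewriting scheme isn't described, so the following is only a sketch of the general idea: map a keyword to a static-looking `.html` path and back again. Both function names and the URL shape are assumptions, and the web server rule that would route such a path to the search script is not shown.

```php
<?php
/**
 * Turn a search phrase into a static-looking path,
 * e.g. a phrase with spaces becomes a hyphenated .html file name.
 */
function keyword_to_path(string $keyword): string
{
    $slug = strtolower(trim($keyword));
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug); // collapse anything else to a hyphen
    return '/' . trim($slug, '-') . '.html';
}

/**
 * Recover the search phrase from a rewritten path.
 */
function path_to_keyword(string $path): string
{
    $slug = basename($path, '.html');
    return str_replace('-', ' ', $slug);
}
```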

Made for Adsense

Made for Adsense or MFA websites proliferated soon after the program was opened, but they weren't called that yet.

As I noted above, the Adsense ads on search results pages had a great CTR and were well targeted, so this was the new endeavor.

I had already been running my own local search engine at capelinks.com (FDSE) and had a PPC advertising system (Smartsearch CGI) already in place since 2002. This PPC advertising system also had a web search that scraped web results from MSN, ODP, Raging, etc...

I set up several websites using the Smartsearch script via includes, fed them lists of keywords and crosslinked them to the eBay, Amazon and all of the other scraper sites. This created a whole other spamdexing network of search results pages.

At the time Overture (formerly GoTo, the first major PPC search engine) had a tool for getting related keywords, called the Overture Keyword Suggestion Tool. This was a really awesome tool at the time, but now it's part of internet history.

The OKST could be scraped rather easily, so I integrated it into the search results and eBay sites. Every time a search was performed it would get the related keywords via the scraper and add them to the page as it rendered. I also wrote the script to cache the related keywords data to lighten the load on the server.
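Caching of this kind can be sketched with one small file per keyword. This is an illustration, not the original script: `cached_related_keywords()`, the JSON file format and the one-day lifetime are all assumptions, and `$fetch` stands in for whatever actually retrieves the related keywords.

```php
<?php
/**
 * Return related keywords for $keyword, hitting $fetch only on a cache miss.
 * $cacheDir must exist and be writable; $ttl is the cache lifetime in seconds.
 */
function cached_related_keywords(string $keyword, callable $fetch, string $cacheDir, int $ttl = 86400): array
{
    $file = $cacheDir . '/' . md5(strtolower($keyword)) . '.json';

    // Cache hit: file exists and is younger than the TTL
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        $cached = json_decode(file_get_contents($file), true);
        if (is_array($cached)) {
            return $cached;
        }
    }

    // Cache miss: fetch once, then store the result for later requests
    $related = $fetch($keyword);
    file_put_contents($file, json_encode($related), LOCK_EX);
    return $related;
}
```

Keying the file name on a hash of the lowercased keyword means "Red Sox" and "red sox" share one cache entry.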

This type of thing allowed the sites to grow exponentially. It added 25 new pages to the site every time a keyword was searched. Already powered by massive keyword lists and crosslinked with all the other affiliate stuff, this started to push it way beyond what I had expected.

More Like This

Since this was really ramping up, I decided to see how far it would go.

I was surfing around one day and came across a search engine (I don't remember which one), but next to each search result on the page they had a link to "more like this". What this did was feed the page titles of the pages in the search results listings back into the search engine.

I thought this would be a great experiment to expand the spamdexing operation, so for each listing that was displayed on all the search results sites I had, I added the more like this link.
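Mechanically, a "more like this" link is just the result's title fed back in as a new query. A sketch, assuming a hypothetical `more_like_this_link()` helper and a made-up `/search.php?q=` endpoint:

```php
<?php
/**
 * Build a "more like this" link that re-runs the search
 * using a result's page title as the query.
 */
function more_like_this_link(string $title, string $searchPath = '/search.php'): string
{
    $href = $searchPath . '?q=' . urlencode($title);
    return '<a href="' . htmlspecialchars($href) . '">more like this</a>';
}
```

Every result title becomes a new search, and every new search produces more titles, which is why the page count took off.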

Now this experiment put this whole operation way over the top. In early January of 2004 I let it loose on Google.

Plesk under fire. Here's the screenshot from one of the servers:

[Screenshot: spamdexing]

Ooops!

I will say one thing, I could not kill that Smartsearch script.

Now this ran for months, but opened a huge can of worms. For one, all the trademarks that were being searched from the keyword lists, affiliate scrapers, OKST, more like this, etc... were ranking quite well. Google couldn't eat enough of it.

I received a few Cease and Desist letters from outfits like V!ctoria's Secret. Evidently they were annoyed that I was number one for "V!ctoria's Secret Honolulu HI".
(I still have the letter.)

I responded to the C&D stuff immediately and added conditionals to the script so that didn't become a bigger problem. I just rewrote the code to do stuff like this:

if ($search) {
    // ereg_replace() was removed in PHP 7; preg_replace() is the modern equivalent
    $search = preg_replace('/pissedoffbrand1\.com/', '', $search);
    $search = preg_replace('/pissed off brand name/', '', $search);
    $q = urlencode($search);
}

This would prevent the trademark from triggering a search results page for the brand name. For existing pages that were already indexed I had to add code to redirect the page to a 404 error if it contained one of the brands that complained.
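The 404 side of this can be sketched as a blocklist check that runs before the page renders. The function names are hypothetical and the blocked terms are the same placeholders used above; the matching is a simple case-insensitive substring test, which is an assumption about how strict the check needed to be.

```php
<?php
/**
 * True when the query contains any blocked brand term (case-insensitive).
 */
function is_blocked(string $query, array $blocked): bool
{
    foreach ($blocked as $term) {
        if (stripos($query, $term) !== false) {
            return true;
        }
    }
    return false;
}

/**
 * HTTP status the page should answer with for this query.
 */
function status_for(string $query, array $blocked): int
{
    return is_blocked($query, $blocked) ? 404 : 200;
}

// Placeholder terms, as in the article
$blocked = ['pissedoffbrand1.com', 'pissed off brand name'];
```

At the top of the page script, a result of 404 from `status_for()` would be passed to `http_response_code()` and the script would stop before rendering anything.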

I also received threatening phone calls and emails from other jealous affiliates and was ratted out on every affiliate forum by people who were doing the same thing I was, but I was burying them in the search results.

I never got sued and no one responded to my replies to the threatening phone calls and emails, asking them to meet up with me anytime.

Regrets

CapeLinks.com, which had been trumping all the other local tourist sites for all the major Cape Cod searches (cape cod hotels, cape cod rentals, etc...) for years, was banned from the Google index for exactly one year and one day. I set up a Google alert for site:capelinks.com -fdgashdffhyt and it triggered one year and one day from the date it was banned.

My only regrets are that I used my primary domain for this experiment and I didn't do it ten times as hard as I did, since the outcome would have been the same either way and ten times the money would have allowed me to retire.


About

CapeLinks is owned and operated by Darren Vlacich.

Darren is a Certified Professional Webmaster and has been involved in all phases of online marketing and web development since 2000.
