<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>marketingfan.com &#187; supplemental index</title>
	<atom:link href="http://www.marketingfan.com/tags/supplemental-index/feed" rel="self" type="application/rss+xml" />
	<link>http://www.marketingfan.com</link>
	<description></description>
	<lastBuildDate>Tue, 13 Apr 2010 09:53:58 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Google Bowling via Proxy</title>
		<link>http://www.marketingfan.com/google-proxy-bowling</link>
		<comments>http://www.marketingfan.com/google-proxy-bowling#comments</comments>
		<pubDate>Thu, 16 Aug 2007 19:00:22 +0000</pubDate>
		<dc:creator>Christoph C. Cemper</dc:creator>
				<category><![CDATA[bowling]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[google bowling]]></category>
		<category><![CDATA[google proxy bowling]]></category>
		<category><![CDATA[google proxy hacking]]></category>
		<category><![CDATA[google search]]></category>
		<category><![CDATA[hacking]]></category>
		<category><![CDATA[incredibill]]></category>
		<category><![CDATA[index labels]]></category>
		<category><![CDATA[matt twine]]></category>
		<category><![CDATA[penalties]]></category>
		<category><![CDATA[proxies]]></category>
		<category><![CDATA[proxy sites]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[supplemental index]]></category>
		<category><![CDATA[www google]]></category>
		<category><![CDATA[dan thies]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[<p><a href="http://www.marketingfan.com/google-proxy-bowling">Google Bowling via Proxy</a><br/><br/>By <a href="http://www.marketingfan.com">Marketingfan.com Internet Marketing Blog</a></p>
No matter what Google says, evil competitors can "bowl" your site out of the Google search results by using proxy sites... Dan Thies has a great post up, and I thought I could add to it with concrete examples, screenshots, sites and URLs. No need to keep silent on this anymore... ]]></description>
			<content:encoded><![CDATA[	<p><p><a href="http://www.marketingfan.com/google-proxy-bowling">Google Bowling via Proxy</a><br/><br/>By <a href="http://www.marketingfan.com">Marketingfan.com Internet Marketing Blog</a></p><br />
So with <span class="caps">SES </span>San Jose just around the corner, Dan Thies put up a great post detailing <a href="http://www.seofaststart.com/blog/google-proxy-hacking" title="">all the headaches</a> that a website owner could get when looking at his <span class="caps">SERP</span>s a bit closer or with <a href="http://www.marketingfan.com/search-engines/why-removed-supplemental-index-labels-are-good-my-business" title="">the right tools to do so</a> ...</p>

	<p>Dan calls this &#8220;Google Proxy Hacking&#8221;, but frankly, we are not hacking any of Google&#8217;s proxies &#8211; so I&#8217;m talking about <b>Google Bowling via Proxy Sites</b> &#8211; related to the older black hat term &#8220;Google Bowling&#8221; for buying too many / bad links for competitor sites to knock them off the <span class="caps">SERP</span>s. Yes, it IS possible to knock a competitor site off the <span class="caps">SERP</span>s, although <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=34449&#038;query=harm&#038;topic=&#038;type" title="">Google says</a> there is <s>nothing</s> <em>almost nothing</em> a competitor can do to harm you (yeah, right &#8211; the Google folks weakened this message some months ago, because the &#8220;nothing&#8221; was plain wrong &#8211; and they knew it).</p>

	<p>If you read thru <a href="http://www.seofaststart.com/blog/google-proxy-hacking" title="">Dan&#8217;s post</a> you might get <a href="http://www.seofaststart.com/blog/google-proxy-hacking#comment-688" title="">headaches just like this guy</a>  from all those details and the partly <b>wrong promises</b> for a cure for it with two solutions that <span class="caps">BOTH</span> address only the outdated part of the problem.</p>

	<p>So I thought I&#8217;d illustrate what&#8217;s going on and what <b>Google Bowling via Proxies</b> actually looks like:</p>

	<p><img src="/files/u2/proxy-dust-rules1-070816.png" width="716" height="391" alt="proxy-dust-rules1-070816.png" /></p>

	<p>The above results are returned if you search for the unique phrase</p>

	<p><b><br />
<a href="http://www.google.com/search?client=opera&#038;rls=en&#038;q=%22related+details+is+the+CEMPER.COM+expertise+that+you+can+order%22&#038;sourceid=opera&#038;num=10&#038;ie=utf-8&#038;oe=utf-8" title="">related details is the <span class="caps">CEMPER</span>.COM expertise that you can order</a>   </b></p>

	<p>which <s>is</s> was only found on my company site <a href="http://www.cemper.com" title="">cemper.com</a> ... (ok &#8211; now it&#8217;s also found on this marketingfan.com<br />
blog and on <a href="http://www.marketingfan.at" title="">marketingfan.at</a> as soon as we translate it)</p>

	<p><h3><b>But <span class="caps">WTH</span> is Proxy Dust ???</b></h3></p>

	<p>As you can see, this unique phrase, which <a href="http://www.marketingfan.com/search-engines/why-removed-supplemental-index-labels-are-good-my-business" title="">should identify whether my page is healthy</a>, does not show my <a href="http://www.cemper.com" title="">own site</a> but &#8220;one of those <span class="caps">PITA</span> sites&#8221;, run by a guy called Matt Twine from the <span class="caps">UK </span>(if that IS his real name&#8230;)</p>

	<p>and as you can imagine, the URL <a href="http://www.proxydust.com/index.php?q=aHR0cDovL3d3dy5jZW1wZXIuY29t" title="">http://www.proxydust.com/index.php?q=aHR0cDovL3d3dy5jZW1wZXIuY29t</a> serves an <span class="caps">EXACT</span> copy of my company site&#8217;s home page&#8230;</p>
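
	<p>To see which page such a proxy URL is mirroring, you can decode its <b>q</b> parameter, which looks like plain base64. Here is a minimal Python sketch, assuming the proxy script uses unmodified standard base64 (the helper name <b>decode_proxy_target</b> is made up for this illustration):</p>

```python
import base64
from urllib.parse import urlparse, parse_qs


def decode_proxy_target(proxy_url):
    """Return the URL hidden in the proxy's base64 'q' parameter, or None."""
    values = parse_qs(urlparse(proxy_url).query).get("q")
    if not values:
        return None
    token = values[0]
    # Restore '=' padding in case the proxy stripped it.
    token += "=" * (-len(token) % 4)
    try:
        return base64.b64decode(token).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        # Not base64 after all (assumption did not hold for this proxy).
        return None


print(decode_proxy_target(
    "http://www.proxydust.com/index.php?q=aHR0cDovL3d3dy5jZW1wZXIuY29t"
))
```

	<p>For the ProxyDust URL above this prints http://www.cemper.com, so the same trick can be used to scan a list of proxy URLs for your own domain.</p>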

	<p>Did I hear Spam Report? yadda yadda &#8211; don&#8217;t bother &#8211; the Googlers don&#8217;t seem to care, because I submitted that 2 weeks ago&#8230;</p>

	<p><h3><b>But it gets worse</b></h3></p>

	<p>Now clicking that &#8220;filter=0&#8221; to reveal all search results we see this <span class="caps">HUGE</span> list of pages &#8211; cemper.com coming second&#8230;. as a filtered result right after that proxy site used for google bowling&#8230;</p>
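
	<p>If you want to repeat this check for many phrases, the two search URLs (normal and with filter=0) can be built in a few lines of Python. This is only a sketch: it assumes the same www.google.com/search URL pattern as my link above, the function name is invented, and it just builds the URL without fetching anything.</p>

```python
from urllib.parse import urlencode


def exact_phrase_url(phrase, unfiltered=False):
    """Build a Google search URL for an exact (quoted) phrase.

    With unfiltered=True the URL carries filter=0, which reveals
    the results that are normally hidden as near-duplicates.
    """
    params = {"q": f'"{phrase}"'}
    if unfiltered:
        params["filter"] = "0"
    return "http://www.google.com/search?" + urlencode(params)


phrase = "related details is the CEMPER.COM expertise that you can order"
print(exact_phrase_url(phrase))
print(exact_phrase_url(phrase, unfiltered=True))
```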

	<p><img src="/files/u2/proxy-dust-no1-cemper-3more-proxysites-part1.png" width="695" height="485" alt="proxy-dust-no1-cemper-3more-proxysites-part1.png" /></p>

	<p>[... pages cut out here &#8230; ]</p>

	<p><img src="/files/u2/proxy-dust-no1-cemper-3more-proxysites-part2.png" width="695" height="398" alt="proxy-dust-no1-cemper-3more-proxysites-part2.png" /></p>


	<p>But also we have a <a href="http://www.unblockfilters.com/index.php?q=aHR0cDovL3d3dy5jZW1wZXIuY29t" title="">couple more</a> <a href="http://www.glik.us/scgi-bin/nph-noxy.cgi/000110A/http/www.cemper.com" title="">scumbags</a> <a href="http://69.41.173.145/ru/www.cemper.com/" title="">stealing my content</a> and trying to hijack my site&#8230;</p>

	<p>In fact only the <a href="http://www.proxydust.com/index.php?q=aHR0cDovL3d3dy5jZW1wZXIuY29t" title="">ProxyDust copy wins</a> big time over <span class="caps">CEMPER</span>.COM because &#8230; believe it or not&#8230;</p>

	<p><b>that fricking domain registered in January 2007 got a Wikipedia backlink</b></p>

	<p>And <a href="http://www.cemper.com" title="">my site</a> does not.</p>

	<p>I currently think that&#8217;s the main reason why Google chose them over my own site &#8211; which is from 2000, not heavily SEOed, but I bet a good deal more trusted than this Matt &#8220;Thief&#8221; Twine&#8217;s site.</p>

	<p>Well, it might well be that Matt has <span class="caps">NO CLUE</span> about what he does, but all those ads plastered around my site indicate otherwise.</p>

	<p>In fact it appears the whole strategy of running those proxy sites is to earn money from the ads placed on other&#8217;s content and cashing in on their work&#8230;</p>


	<p><h3><b>What we (legit webmasters) can do&#8230; </b></h3></p>

	<p>Frankly, I love <a href="http://www.seofaststart.com/blog/google-proxy-hacking" title="">Dan&#8217;s general post</a> as an introduction to this post, because I would have hated to explain it in all length as he did.   <s>But what he points out as &#8220;solutions&#8221; are somewhat <b>old school methods</b> to identify bots that pretend to be Google, Yahoo or MSNbot&#8230;.  </s></p>

	<p>Dan&#8217;s post <span class="caps">ALSO</span> contains a 2nd method: sending a &#8220;noindex, nofollow&#8221; to <span class="caps">ALL</span> visitors that do <span class="caps">NOT</span></p>

	<p>1) Identify as spiders<br />
2) Pass a &#8220;valid IP address&#8221; test</p>
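
	<p>Roughly, the two tests could be combined like this in Python. This is only a sketch and not Dan&#8217;s code: the reverse-then-forward DNS lookup as the &#8220;valid IP address&#8221; test, the helper names and the hostname suffixes (only Googlebot is listed as an example) are my assumptions, so check them against what each search engine documents.</p>

```python
import socket

# Assumed crawler hostname suffixes, keyed by a user-agent token.
CRAWLER_DOMAINS = {
    "googlebot": (".googlebot.com", ".google.com"),
}


def is_verified_spider(user_agent, ip, reverse=None, forward=None):
    """True only if the visitor claims to be a spider AND its IP checks out."""
    # Resolvers are injectable so the logic can be tested without network access.
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    ua = user_agent.lower()
    for token, suffixes in CRAWLER_DOMAINS.items():
        if token not in ua:
            continue  # test 1 failed: does not identify as this spider
        try:
            host = reverse(ip)
            # test 2: hostname belongs to the engine and resolves back to the IP
            if host.endswith(suffixes) and ip in forward(host):
                return True
        except OSError:
            return False  # DNS lookup failed, treat as unverified
    return False


def robots_meta(user_agent, ip, **resolvers):
    """Content for the robots meta tag served to this visitor."""
    if is_verified_spider(user_agent, ip, **resolvers):
        return "index, follow"
    return "noindex, nofollow"
```

	<p>A proxy that fetches your page, even while passing along a Googlebot user agent, fails the IP test, so the copy it serves carries the noindex tag.</p>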

	<p>Pretty cool &#8211; I think that might work &#8211; and will test this <span class="caps">ASAP</span>, in addition to my own method of blocking those scumbags.</p>


	<p>Further readings:</p>

	<p>I discussed this with <a href="http://incredibill.blogspot.com/2007/07/google-proxy-hijacking-myths-urban.html" title="">IncrediBill last week</a> who has a great post up on identifying fake bots &#8211; but his comment is also just</p>

	<p><blockquote><br />
<span class="caps">PROXYDUST</span> appears to just pass thru the user agent as-is, hard to say without seeing an actual hijacking if they do something special with Googlebot.</p>

	<p>Anyway, they operate out of uk2net and the easiest way to make sure you&#8217;ve got all their IPs is to just block the entire data center.</p>

	<p>inetnum: 83.170.96.0 &#8211; 83.170.111.255<br />
netname: <span class="caps">UK2</span>-NET<br />
route: 83.170.96.0/20<br />
</blockquote></p>
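
	<p>Checking a visitor IP against such a range is simple with Python&#8217;s standard ipaddress module. A small sketch using the route quoted above (the list and function names are just for illustration):</p>

```python
import ipaddress

# The uk2net route quoted above; add further ranges as needed.
BLOCKED_NETWORKS = [ipaddress.ip_network("83.170.96.0/20")]


def is_blocked(ip):
    """True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


net = BLOCKED_NETWORKS[0]
# First address, last address and size of the /20
print(net.network_address, net.broadcast_address, net.num_addresses)
print(is_blocked("83.170.100.5"), is_blocked("83.170.112.1"))
```

	<p>The /20 covers exactly the inetnum range 83.170.96.0 to 83.170.111.255, that is 4096 addresses.</p>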

	<p>and then</p>

	<p><blockquote><br />
Automating it is sometimes proxy and behavior specific, nothing I could tell you how to do in a quick post.</p>

	<p>Some of them actually slip through the cracks for a while until they reveal themselves so it&#8217;s not 100% bulletproof.</p>

	<p>The only way to get most of them is to simply block all hosting centers.<br />
</blockquote></p>

	<p>I actually blocked a <span class="caps">TON</span> of IP ranges, including those of a rogue bot called Twiceler, in the last 2 weeks&#8230;</p>

	<p>but the &#8220;noindex&#8221; hack mentioned above is the next countermeasure&#8230;</p>

	<p><b><span class="caps">I REALLY</span> hope I can generalize this to protect <span class="caps">ALL</span> my sites without having to change all of them&#8230;</b></p>



	<p>And then we got some more cool posts on</p>

	<p><a href="http://hamletbatista.com/2007/07/16/you've-won-the-battle-but-not-the-war-10-ways-to-protect-your-site-from-negative-seo/  ">10 Ways to protect your site from negative <span class="caps">SEO</span></a> where Hamlet uses &#8220;negative <span class="caps">SEO</span>&#8221; for all kinds of actions a competitor could take against you &#8230; frightening &#8230; and <a href="http://hamletbatista.com/2007/07/03/the-never-ending-serps-hijacking-problem-is-there-a-definite-solution/" title="">Never Ending <span class="caps">SERP </span>Hijacking</a>  where he correctly states that the <span class="caps">REAL</span> problem is sites like proxydust that <span class="caps">DO NOT</span> pretend to be Google&#8230;.</p>


	<p><h3>What about you?</h3></p>

	<p>Has <span class="caps">YOUR</span> site been hijacked? Do you know?</p>

	<p>How could you know? Just follow <a href="http://www.jimboykin.com/google-supplemental-results/" title="">Jim&#8217;s post</a>  to find out if a page is in supplemental &#8230; but make sure you look at the results closely&#8230; because what you might find is that somebody is stealing your content&#8230;.</p>

	<p>You should do that for <span class="caps">EVERY PAGE</span> of your site &#8211; best case &#8211; if you <a href="http://www.marketingfan.com/search-engines/why-removed-supplemental-index-labels-are-good-my-business" title="">got the right tools</a> for it&#8230;. but it costs a lot of resources either way &#8211; by hand or by machine tool.</p>


	<p><b><br />
Let me know about <span class="caps">YOUR</span> hijack experiences !<br />
</b><br />
(and I&#8217;m sure people <em>should</em> talk about this at the <span class="caps">SES</span> in San Jose , however I fear they won&#8217;t too much&#8230;)</p>




	<p>Update: You could of course get around the initial problem of having too little trust in Google by <a href="http://www.marketingfan.com/tools/seo-tools/common-forward-links-tool-super-authority-links" title="">getting real juicy authority links</a> <a href="http://www.marketingfan.com/a/search-engines/3-great-uses-for-the-msn-linkfromdomain-command.php" title="">using <span class="caps">MSN</span>&#8217;s linkfromdomain command</a> by effectively even letting your competitor <a href="http://www.marketingfan.com/a/search-engines/research/indirect-linking-truncated-page-rank-and-getting-rid-of-link-buying-penalties.php" title="">link indirectly to you</a>  ... obviously you still want to make sure you get only the <a href="http://www.marketingfan.com/search-engines/seo/link-building/strongest-subpages-suck-where-you-should-really-get-links" title="">juicy pages</a> and not spend your time on dead meat.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.marketingfan.com/google-proxy-bowling/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Why Removed Supplemental Index Labels are good for my business</title>
		<link>http://www.marketingfan.com/why-removed-supplemental-index-labels-are-good-my-business</link>
		<comments>http://www.marketingfan.com/why-removed-supplemental-index-labels-are-good-my-business#comments</comments>
		<pubDate>Sat, 04 Aug 2007 11:50:18 +0000</pubDate>
		<dc:creator>christiner</dc:creator>
				<category><![CDATA[google]]></category>
		<category><![CDATA[hell]]></category>
		<category><![CDATA[link building]]></category>
		<category><![CDATA[payperpost]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[supplemental index]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[<p><a href="http://www.marketingfan.com/why-removed-supplemental-index-labels-are-good-my-business">Why Removed Supplemental Index Labels are good for my business</a><br/><br/>By <a href="http://www.marketingfan.com">Marketingfan.com Internet Marketing Blog</a></p>
Thanks To Google For Removing the Supplemental Index label]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.marketingfan.com/why-removed-supplemental-index-labels-are-good-my-business">Why Removed Supplemental Index Labels are good for my business</a><br/><br/>By <a href="http://www.marketingfan.com">Marketingfan.com Internet Marketing Blog</a></p>
Earlier this week Google removed the &#8220;supplemental index&#8221; labels from the <span class="caps">SERP</span>s, and as with every major move from Google, the whole <span class="caps">SEO </span>scene freaked out about it! 

<p>Me too &#8211; because I have to thank Google for giving me &#8211; and my clients a new competitive advantage.</p>

<p><b>Why do I thank Google for removing interesting signals?</b></p>

<p>Well, until last week every wannabe-SEO and his mother could see (in the <span class="caps">SERP</span>s, see grandfathered sample below)<br />
if a page had a problem with ranking&#8230; </p>

<p><img src="http://farm2.static.flickr.com/1282/968031873_7e98839066_o.jpg" /></p>

<p>There have been huge posts by <a href="http://www.jimboykin.com/damned-to-google-hell-supplemental-results/">Jim</a> , <a href="http://www.seo4fun.com/notes/supplementals.html">Halfdeck</a> and <a href="http://www.seo4fun.com/blog/2007/02/19/why-duplicate-content-causes-supplimental-results.html">Halfdeck again</a> and a lot more on what/why/where supplementals are. Even I posted about <a href="http://www.marketingfan.com/a/45-of-zero-pages-listed-welcome-to-supplemental-hell.php">Supplemental Hell</a> and one <a href="http://www.marketingfan.com/a/buy-blog-posts-get-supplemental.php">PayPerPost link buying penalty</a> bringing pages into the supplemental index.</p>

<p><b>Today however&#8230;</b></p>

<p>People need to put more effort into detecting if a page has a problem due to being in the supplemental index.<br />
That means a certain (large) amount of <span class="caps">SEO</span>s just won&#8217;t be able to do this in their everyday job.</p>

<p>Halfdeck has his own <a href="http://www.seo4fun.com/php/pagerankbot.php">Supplemental Detector</a> which is a fancy <span class="caps">JAVA </span>application that is in fact a &#8220;pagerank emulator&#8221; &#8211; <s>and all pages below a certain threshold are marked as supplemental.  I encourage you to download this data scraper, and I&#8217;m sure it works nicely &#8211; but haven&#8217;t tried it. </s></p>

<p>After playing with Halfdeck&#8217;s Pagerank emulator I must say that it&#8217;s a great way to simulate how the &#8220;link juice&#8221; flows thru your site and where you are actually wasting precious link juice (i.e. on useless stats pages). Halfdeck even implemented a &#8220;backlink emulator&#8221; where you can judge on the effects of an additional PRx link to any page you like&#8230; pretty cool tool &#8211; it just lacks <span class="caps">TBPR </span>live queries, but I hope he can add that in the next version.</p>
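
<p>To give an idea of what such an emulator calculates, here is a tiny Python sketch of the textbook PageRank iteration. It is not Halfdeck&#8217;s code: the damping factor of 0.85 is just the conventional value, and the three-page toy site is made up for this example.</p>

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank for a dict of page -> list of linked pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Each outgoing link passes on an equal share of the page's rank.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                for p in pages:
                    new[p] += damping * rank[page] / n
        rank = new
    return rank


# Toy site: the home page links to a products page and a useless stats page.
site = {"home": ["products", "stats"], "products": ["home"], "stats": []}
before = pagerank(site)

# "Backlink emulator": what happens if the stats page links to products?
site_with_link = {"home": ["products", "stats"], "products": ["home"], "stats": ["products"]}
after = pagerank(site_with_link)

for page in sorted(before):
    print(page, round(before[page], 3), round(after[page], 3))
```

<p>Comparing the two columns shows how a single extra internal link shifts the juice between pages.</p>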

<p>But, in fact I never cared much about the <span class="caps">TOTAL </span>number of supplementals, but always if <strong>a single</strong> page is in supplemental. Why that? </p>

<p>Well, I guess Link Ninja Master Jim Boykin knows why &#8211; it&#8217;s because you don&#8217;t want to get links on pages in the supplemental index because they won&#8217;t get crawled as often.</p>

<p>Jim&#8217;s recent explanation on finding <a href="http://www.jimboykin.com/google-supplemental-results/">if a page is in supplemental</a> pretty well details how to detect if a page is &#8220;healthy&#8221; at all &#8211; i.e. ranks for obscure terms.</p>

<p>If a page does not even rank for an obscure term on it, you don&#8217;t need a link there.</p>


<p>So you ask again, <b>why is this cool for your business?</b></p>

<p>Because the way to check if a page is worthy to spend time to get a link on it has just become a bit harder. You will need more work, time, effort, unless <b>you automate it</b>. Just as we do here.</p>

<p>And this is the perfect situation to use an &#8220;internal tool&#8221; (as many <span class="caps">SEO</span>s have) as a competitive advantage to get more and better links in a shorter time&#8230; heck &#8211; some link builders might spend another couple of clicks on each page to find out if it qualifies for link hunting. </p>

<p>We Don&#8217;t <img src='http://www.marketingfan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>

<p>In fact we already see the &#8220;Supplemental Index&#8221; label on top of our browser toolbar when we visit a page.<br />
We also see the &#8220;Supplemental Index&#8221; label as it used to be printed for <span class="caps">ALL </span>Google users in the past, nicely embedded in the <span class="caps">SERP</span>s.</p>

<p>And in fact Google has bought itself now 10-20 more Google queries per <span class="caps">SERP </span>page my Link Arbeiter team screens when looking for links &#8211; plus we are inflating the pageloads of all those sites we screen by one&#8230;</p>

<p>Do you think that hurts Google? Nah &#8211; enough resources.<br />
Do you think it hurts me? Nah &#8211; just got a bunch more proxy IPs to make up for the bigger Google scraping load. </p>

<p>Do you think it will hurt the <span class="caps">SEO </span>scene building links? Well &#8230;</p>

<p>I would assume a huge bunch of people won&#8217;t even notice a difference &#8211; after all even <span class="caps">SEO</span>moz disqualified <a href="http://www.seomoz.org/blog/answer-these-ten-questions-before-you-charge-for-seo-services" title="even large scale">70% of</a> <span class="caps">SEO </span>companies for not knowing the <span class="caps">SEO </span>basics.</p>

<p>Furthermore a couple of less well-equipped companies will struggle as they will need to do a lot more work (as per Jim&#8217;s description) to get the same results&#8230;. and I mean &#8211; A <span class="caps">LOT MORE.</span></p>

<p>Now this moves the benefits to larger scale companies (as Google is) that DO have an intact <span class="caps">SEO </span>infrastructure for their daily <span class="caps">SEO </span>work.</p>

<p>But the small scale link builders will first have to build that infrastructure, browser plugins and knowledge to make visible what Google has just taken from the public.</p>

<h2><b>Thanks Google !</b></h2>


<p><hr /><br />
Update 2009-09-01: The Link Juice tool and a lot more is now finally available to the public in the <a href="http://www.linkresearchtools.com">Link Research Tools</a> by <span class="caps">CEMPER.COM</span><br />
<hr /></p>]]></content:encoded>
			<wfw:commentRss>http://www.marketingfan.com/why-removed-supplemental-index-labels-are-good-my-business/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Google Infrastructure update &#8211; pagerank, supplementals, indexing</title>
		<link>http://www.marketingfan.com/google-infrastructure-update-pagerank-supplementals-indexing</link>
		<comments>http://www.marketingfan.com/google-infrastructure-update-pagerank-supplementals-indexing#comments</comments>
		<pubDate>Thu, 01 Jan 1970 01:00:00 +0000</pubDate>
		<dc:creator></dc:creator>
				<category><![CDATA[pagerank update]]></category>
		<category><![CDATA[supplemental index]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[<p><a href="http://www.marketingfan.com/google-infrastructure-update-pagerank-supplementals-indexing">Google Infrastructure update &#8211; pagerank, supplementals, indexing</a><br/><br/>By <a href="http://www.marketingfan.com">Marketingfan.com Internet Marketing Blog</a></p>
Google does a "data-push" aka PR update once again...]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.marketingfan.com/google-infrastructure-update-pagerank-supplementals-indexing">Google Infrastructure update &#8211; pagerank, supplementals, indexing</a><br/><br/>By <a href="http://www.marketingfan.com">Marketingfan.com Internet Marketing Blog</a></p>
Today I found that Matt Cutts, Mr. GoogleGuy himself, had two really super posts on his blog:

<p><a href="http://www.mattcutts.com/blog/infrastructure-status-january-2007/">First Pagerank update 2007</a> &#8211; for those obsessed, yes, they are updating it once again&#8230; some new values here and there&#8230; who cares at all?</p>

<p>All remaining words on the green bar are here at <a href="http://www.jimboykin.com/pagerank-4/">Jim&#8217;s blog</a> who is bitching again about people caring too much (or anything at all) about page rank (but he&#8217;s also got a nice post today with his internal <a href="http://www.jimboykin.com/putting-a-price-on-a-link-jims-value-indicators/">link valuation tool</a> )</p>


<p>Matt Cutts also gives another official explanation on what supplemental pages are, how they work and that they are improving their crawl frequency&#8230;</p>

<p>so supplementals are no older than 4 <span class="caps">MONTHS </span>now&#8230;  <img src='http://www.marketingfan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>

<p>Another funny thing is Matt <a href="http://www.mattcutts.com/blog/why-isnt-email-authenticated/">bitching about email authentication</a> including pointing to concepts like domainkeys that I already <a href="http://weblog.cemper.com/a/200611/29-domainkeys-experimental-implementation-worth-the-hassle.php">bitched about</a> some weeks before </p>

<p>Looks like Mr. Cutts was hit by the same amount of <a href="http://weblog.cemper.com/a/200701/10-how-to-get-rid-of-the-re-my-somecrap-spam.php">RE: my crap spam</a> that was hitting my own (G)mail accounts &#8211; undetected <img src='http://www.marketingfan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>


<p><span class="caps">LOL</span></p>]]></content:encoded>
			<wfw:commentRss>http://www.marketingfan.com/google-infrastructure-update-pagerank-supplementals-indexing/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Death of the SEO Copywriters &#8211; Spam Detection with Phrase Based Information Retrieval</title>
		<link>http://www.marketingfan.com/death-of-the-seo-copywriters-spam-detection-with-phrase-based-information-retrieval</link>
		<comments>http://www.marketingfan.com/death-of-the-seo-copywriters-spam-detection-with-phrase-based-information-retrieval#comments</comments>
		<pubDate>Thu, 01 Jan 1970 01:00:00 +0000</pubDate>
		<dc:creator></dc:creator>
				<category><![CDATA[bill slawski]]></category>
		<category><![CDATA[keyword stuffing]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[supplemental index]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[<p><a href="http://www.marketingfan.com/death-of-the-seo-copywriters-spam-detection-with-phrase-based-information-retrieval">Death of the SEO Copywriters &#8211; Spam Detection with Phrase Based Information Retrieval</a><br/><br/>By <a href="http://www.marketingfan.com">Marketingfan.com Internet Marketing Blog</a></p>
Bill Slawski explains a recent patent that shows why spammy pages and low quality content have been going to the supplemental index a lot more often recently...]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.marketingfan.com/death-of-the-seo-copywriters-spam-detection-with-phrase-based-information-retrieval">Death of the <span class="caps">SEO</span> Copywriters &#8211; Spam Detection with Phrase Based Information Retrieval</a><br/><br/>By <a href="http://www.marketingfan.com">Marketingfan.com Internet Marketing Blog</a></p>
Bill Slawski of <span class="caps">SEO</span>bytheSea has a <a href="http://www.seobythesea.com/?p=413">great post up</a> explaining a concept of how search engines (Google, man!) do a phrase based analysis &#8211; of your content to assign quality measures to it and possibly put it into the wastebasket or at least supplemental index.

<p>The idea is that quality documents have a different co-occurrence of certain phrases (&#8220;money-words&#8221;) than spammy or low quality articles you bought for two dollars each from that low-quality writer in India recently who wasn&#8217;t even aware of how to use Word properly, not to speak of creating quality content&#8230; </p>

<p>Certainly a &#8220;SEOed&#8221; article around a phrase, let&#8217;s say &#8220;President of the united states&#8221; would use that term in all variations, word order and such.</p>

<p>A quality article really talking about the President of the united states would probably mention other &#8220;unimportant&#8221; things like names of past presidents, non-important things like amorous adventures, hollywood careers or other generally bad habits of those big guys that nobody would place an Adwords bid on for example.</p>

<p>The search engines simply build a <b>co-occurrence matrix</b> for all phrases in the document and compare those statistics against those of other quality documents.</p>
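
<p>To make the idea concrete, here is a minimal Python sketch of phrase co-occurrence counting. It is only an illustration, not how Google or the patent actually does it: the function name <code>cooccurrence_counts</code>, the sample phrases and the assumption that phrases have already been extracted from each document are all made up for this example.</p>

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count in how many documents each unordered pair of phrases appears together.

    `documents` is a list of phrase lists; phrase extraction itself is assumed
    to have happened elsewhere (hypothetical input format).
    """
    counts = Counter()
    for phrases in documents:
        # Deduplicate and sort so each pair has one canonical (a, b) key.
        unique = sorted(set(phrases))
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return counts


# Made-up sample collection: three tiny "documents".
docs = [
    ["president of the united states", "white house", "past presidents"],
    ["president of the united states", "white house"],
    ["president of the united states", "cheap loans"],
]

counts = cooccurrence_counts(docs)
print(counts[("president of the united states", "white house")])
```

<p>Pairs that show up together across many quality documents would count as &#8220;related&#8221; phrases; a pair seen only once in a single odd document would not.</p>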

<p><blockquote><br />
From the foregoing, the number of the related phrases present in a given document will be known. A normal, non-spam document will generally have a relatively limited number of related phrases, typically on the order of between 8 and 20, depending on the document collection. By contrast, a <b>spam document</b> will have an excessive number of related phrases, for example on the order of between <b>100 and 1000 related phrases</b>. Thus, the present invention takes advantage of this discovery by identifying as spam documents those documents that have a statistically significant deviation in the number of related phrases relative to an expected number of related phrases for documents in the document collection.<br />
</blockquote></p>
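
<p>The &#8220;statistically significant deviation&#8221; from the quote could be checked along these lines. This is a rough Python sketch under my own assumptions: the patent text quoted here does not say which statistical test is used, so the z-score, the cut-off of 3 standard deviations, the function name <code>is_suspicious</code> and the reference numbers are all hypothetical.</p>

```python
from statistics import mean, stdev


def is_suspicious(related_phrase_count, reference_counts, max_z=3.0):
    """Flag a document whose number of related phrases is far above the norm.

    `reference_counts` holds related-phrase counts for documents assumed to be
    normal; `max_z` is an arbitrary cut-off chosen for illustration only.
    """
    mu = mean(reference_counts)
    sigma = stdev(reference_counts)
    if sigma == 0:
        # No spread in the reference set: anything above the mean stands out.
        return related_phrase_count > mu
    # One-sided check: only an excessive number of related phrases is flagged.
    return (related_phrase_count - mu) / sigma > max_z


# Made-up reference collection in the "8 to 20" range mentioned in the quote.
reference = [8, 10, 12, 14, 15, 17, 20]

print(is_suspicious(12, reference))
print(is_suspicious(400, reference))
```

<p>A document with 12 related phrases sits inside the normal range, while one with 400 lands far outside it and would be flagged.</p>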

<p>In short: that patent and Bill&#8217;s wonderfully clear explanation show pretty well that Google &#038; co. DO have the means and technology to judge content quality &#8230;</p>

<p>and that is the death of all <span class="caps">SEO </span>&#8220;copywriters&#8221; who focus only on keyword density, repetition and keyword stuffing.</p>


<p>What does that mean for you if you <span class="caps">HIRE </span>a writer to create content?</p>

<p>DO <span class="caps">NOT </span>over-specify which keyword phrases the writer has to use!</p>

<p>Especially in recent months I have seen content rank <span class="caps">GREAT </span>on Google (if on the right domains) for <b>related phrases</b> rather than for phrases actually used in the content&#8230; you no longer need an exact mention of a keyword phrase for it to be found on Google!</p>

<p><span class="caps">NO, </span>forcing exact keyword phrases in could even harm you nowadays. That&#8217;s the next phase of over-optimization penalties, so create good, natural content and <span class="caps">RANK</span>!</p>]]></content:encoded>
			<wfw:commentRss>http://www.marketingfan.com/death-of-the-seo-copywriters-spam-detection-with-phrase-based-information-retrieval/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
