Getting indexed and ranking with slightly less unique content!
Many webmasters have contacted me recently with the same problem. It’s an old problem that still matters in every search engine. Either their websites weren’t indexed entirely, or particular pages were indexed but couldn’t rank even for their most unique text phrases. In this post I will give some pointers on effortlessly making pages more unique.
Large database-driven websites all have the same problem. Sometimes search engines do not index them fully, and even when they do, many pages still don’t rank in the search results. This is caused by how unique the content is and how important those pages seem to be. Take an average large database-driven website with, for instance, jobs. Such sites contain large numbers of job descriptions that are all formatted in the same way. While the content is somewhat unique, the sheer volume causes part of that website to be seen as duplicate.
So how does duplicate content filtering work in, for instance, Google?
Duplicate content filtering isn’t a black-or-white issue. It has multiple shades of grey: in the worst case it penalizes an entire site, and in the best case it only affects the ranking of a page slightly. Because all forms of duplication across pages can affect your ranking, it is very important to know how to avoid duplicate content. To protect the perceived quality of their search results, every search engine has removing duplicates high on its agenda.
The strongest focus on a search term comes from dedicating an entire page to it. This means creating and filling pages can become a huge task. Computer-generated or scraped text is a very easy way to create pages, but that is where duplicate content filters often kick in. When you want to rank for every combination of “jobs in” and a city you can think of (for instance “jobs in amsterdam”), you probably generate many copies of the same page and replace the city name wherever you can. And that is exactly what search engines want to combat.
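To get a feel for how alike two such generated pages are, here is a minimal sketch that compares them using word shingles and Jaccard similarity. This is a common textbook way to measure near-duplication, not a description of what Google or any other search engine actually runs. The template text, the function names and the shingle size of three words are all assumptions made up for this illustration.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical page template where only the city name changes.
TEMPLATE = (
    "Looking for jobs in {city}? We list vacancies from hundreds of "
    "employers, updated every day. Filter by sector, salary and contract "
    "type, upload your resume and apply online in minutes. Sign up for "
    "email alerts and never miss new jobs in {city}."
)

page_a = TEMPLATE.format(city="Amsterdam")
page_b = TEMPLATE.format(city="Rotterdam")

similarity = jaccard(shingles(page_a), shingles(page_b))
print(f"Shingle overlap between the two city pages: {similarity:.0%}")
```

Swapping only the city name leaves most shingles identical, so the score stays high. Every piece of genuinely different text you add to one of the pages pushes it down.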
Duplicate areas
Many search engines see a page as part of a website and can distinguish between the header, footer, menu, content block, etc. In fixed blocks duplication is very common, because the header and main menu are usually the same across an entire site. In the content block duplication is less common, so any duplication there is something search engines look at more closely. While duplicate areas across the entire page should be limited, the percentage of duplicate text in the content block is extremely important.
The more important the page, the more duplication is tolerated
If your homepage and a page just below it are near duplicates of each other, they can still rank on the part that makes them unique (even on more competitive terms). When the near duplicates are located further down the navigation and receive little linkjuice, the chances of them not ranking, or even being omitted from the index, keep increasing. Linkjuice transfer is very important, and optimizing it can fix many duplication issues.

The illustration above shows two navigational structures starting from the homepage. When the homepage gets an extra link on it, pages further down receive less linkjuice. Less linkjuice means a higher chance of getting caught by duplicate content filters. Put pages higher in your navigation or acquire external links directly to them when you want to make sure they rank in spite of duplication.
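The effect of one extra homepage link can be shown with some simple arithmetic. The sketch below assumes a heavily simplified model in which every page splits its linkjuice evenly over its outgoing links and passes on a fixed fraction of it. The function name, the link counts and the 0.85 damping value (borrowed from the classic PageRank formulation) are assumptions for illustration only; real search engines weigh links in far more complicated ways.

```python
def juice_per_level(links_per_level, start=1.0, damping=0.85):
    """Estimate the linkjuice reaching one page at each depth.

    links_per_level[i] is the number of outgoing links on the page
    at depth i (depth 0 is the homepage). Each page is assumed to
    divide its juice evenly over its links.
    """
    juice = start
    result = []
    for links in links_per_level:
        juice = juice * damping / links
        result.append(juice)
    return result


# Homepage with 4 links, each linking on to 5 deeper pages.
before = juice_per_level([4, 5])
# Same structure after adding one extra link to the homepage.
after = juice_per_level([5, 5])

for depth, (b, a) in enumerate(zip(before, after), start=1):
    print(f"depth {depth}: {b:.4f} before, {a:.4f} after")
```

In this toy model the loss at the top is passed on to every level below it, which is why the deepest pages are the first to fall under the threshold.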
Unique mashups
The “jobs in …” example will be easily detected if the city is the only inserted text. So how can you make such a thing work without having to write loads of text? You create unique mashups!
A mashup is a page assembled from different types of collected content. When you write small pieces of unique text per page and collect all other content in small pieces from many different sources, search engines will love your pages!
In the “jobs in …” example: Write a fifty-word intro about “jobs” for each city you want to focus on. Add a list of about 10 job descriptions per city from your database. Scrape a piece of city information from a city guide. Scrape extra pieces of additional information from other sources, and finally randomize the order of those content blocks. Try to collect a total of about 300 words. Search engines are smart enough to detect this technique, but the people who use it have been ranking for ages. The linkjuice to those pages, the number of sources used and the amount of unique text you write determine whether you rank for all cities.
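The assembly step of that recipe could look roughly like the sketch below. It only covers putting the blocks together and counting words; fetching jobs from your database and collecting the other snippets is left out. All names and the placeholder texts are hypothetical. Two choices are my own assumptions rather than part of the recipe: the hand-written intro always stays first, and the shuffle is seeded per city so a page keeps the same block order between visits.

```python
import random


def build_city_page(intro, job_descriptions, extra_blocks, seed=None, max_jobs=10):
    """Assemble one page: hand-written intro first, other blocks shuffled."""
    rng = random.Random(seed)  # same seed gives the same block order
    jobs_block = "\n".join(job_descriptions[:max_jobs])
    blocks = [jobs_block] + list(extra_blocks)
    rng.shuffle(blocks)
    return "\n\n".join([intro] + blocks)


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


# Placeholder content standing in for real text and collected snippets.
page = build_city_page(
    intro="Placeholder intro about jobs in Amsterdam, written by hand.",
    job_descriptions=[
        "Placeholder job description A",
        "Placeholder job description B",
    ],
    extra_blocks=[
        "Placeholder city guide snippet.",
        "Placeholder snippet from another source.",
    ],
    seed="amsterdam",
)

print(page)
print(word_count(page), "words")
```

A check like `word_count(page) >= 300` before publishing is an easy way to spot cities for which you have not collected enough material yet.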
Keep the good content above the fold
Unique text is very labour-intensive to write, and quality text cannot be automated. But where do you need quality text? Just get your visitors to click a button before they start reading the entire page content, and they won’t notice the low quality.
Concentrate good usability and text quality on the top part of your page. People rarely scroll and read in detail if the function of the page is already clear and the navigation options are very obvious.
Lazy people can still score with automation, but I prefer using cheap copywriters!