May 02, 2008

How to Get Your News on Google News Fast!

Short answer: API

A person from Google (whose name I didn’t catch!) came to the NewsTools 2008 Conference this afternoon and gave a very helpful rundown of how to get your breaking news article onto Google News faster….
 
Here’s what to do:
Know what Google is looking for.
Google crawls pages with these attributes faster:
  • Original content.
  • Multiple authors.
  • Proper attribution.
  • Quick response time.
(Frequency of update may also factor in – such as when articles have a substantial change. Updating every minute and changing one letter will not help.)
 
Most legitimate news sites make the cut!
 
The news crawl is different from the more general Web crawl, which can make search results on news.google.com different from those on google.com. There are two ways to have your news site crawled:
 
Through the standard news crawl, once your news site is validated by Google (according to the four things above), Google crawls it and looks for things like bylines and location, whether the site has images and more. Google News automatically figures out which pages are articles and which aren’t, and puts your articles in Google News accordingly. This can apparently take a while.
 
OR (here’s the faster way!) you can create a really simple, automated sitemap page that gets down to the article level. Sitemaps get your content on Google News much faster! 
 
Sitemaps, Google says, are essentially feeds of semi-structured data for crawlers to ingest. A sitemap needs a simple URL – think www.newspaper.com/sitemap. Sitemaps also need to be efficient – a simple design with only relevant information. The Google sitemap API is at http://code.google.com/more/#products-products-sitemaps
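For the curious, here is a rough sketch of what an automated, article-level sitemap could look like. It is not what the Google speaker showed; it just assumes the general sitemap format published at sitemaps.org (a list of url entries, each with a loc and an optional lastmod). The article URLs and the build_sitemap function name are made up for illustration, and the news-specific schema adds its own elements that are not shown here.

```python
from datetime import date
from xml.etree import ElementTree as ET

# Namespace of the general sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(articles):
    """Return sitemap XML as a string.

    articles: iterable of (url, last_modified) pairs, where
    last_modified is a datetime.date. Hypothetical helper, not a Google API.
    """
    # Emit the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, last_modified in articles:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # lastmod uses the YYYY-MM-DD date format.
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Made-up article URLs on the example domain used above.
    print(build_sitemap([
        ("http://www.newspaper.com/news/2008/05/02/city-council-vote", date(2008, 5, 2)),
        ("http://www.newspaper.com/news/2008/05/02/storm-update", date(2008, 5, 2)),
    ]))
```

In practice the list of articles would come from your content management system, and the file would be regenerated whenever a story is published or substantially changed.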
 
Apparently that’s basically it…. More information on this (much more!) will be available later today on the Journalism That Matters – Silicon Valley (aka NewsTools 2008) notes wiki, which is at www.newstools.org.
 
You can find out if your site is getting indexed by searching for site:URL in the news.google.com search.
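If you check several sites regularly, a tiny helper can turn a site address into the site: query text to paste into the search box. This is only a sketch: site_query is a hypothetical name, and it assumes you want to search by host name rather than by a full article URL.

```python
from urllib.parse import urlparse


def site_query(site_url):
    """Build the 'site:' search text for a site address (hypothetical helper)."""
    # urlparse only recognizes the host when a scheme is present.
    if "://" not in site_url:
        site_url = "http://" + site_url
    host = urlparse(site_url).hostname
    if not host:
        raise ValueError("could not find a host name in the address")
    return f"site:{host}"


if __name__ == "__main__":
    print(site_query("www.newspaper.com/sitemap"))
```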


Posted by Beth Lawton at 5:22 PM

Comments

Re: How to Get Your News on Google News Fast!
Hi Beth--

Good meeting you this week. I wanted to add to the above.

First, Google keeps calling it an "API" but it's really a schema.
There's a sitemap schema for websites in general (which you linked to above), and one specific to news sites, which is referenced here:
http://code.google.com/support/bin/answer.py?answer=42738&topic=10419

Speaking as an engineer familiar with schemas (part of XML), their documentation is very light. It doesn't have any information on its evolution. I asked Dan Meredith during the session whether this was going to be an open process-- and he conceded that, up until now, Google News has only been working with their "major customers" on the next version.

Here's the placeholder on the JTM wiki:
http://www.mediagiraffe.org/wiki/index.php/Jtm-sv-drupal-Drupal_Day_archive_mobile_vieo_google_news_mechanics_policy

More to come. I'd like to see if we can bootstrap a community somehow.

Jon
Posted by Jon Garfunkel on May 4, 2008 at 11:19 AM
