Get Google Search Results With PHP – Google AJAX API And The SEO Perspective

If you’ve ever tried to write a program that fetches search results from Google, you’ll no doubt be familiar with the excruciating annoyances of parsing the results and getting blocked periodically. Run a couple hundred queries in a row and bam! – your script is banned until proven innocent by entering a captcha. Even that would provide only a short reprieve, as you’d soon get blocked again.

Luckily there’s an official Google search API that will let you avoid that hassle. In this post you’ll find an example PHP script and a (mainly) SEO-oriented review of the API.

Using the AJAX API in PHP

I must confess that until yesterday I didn’t know you could use the Google AJAX search API in languages other than JavaScript. The documentation didn’t even mention the possibility when the API was first released. Well, it does now, and PHP is among the supported languages. Oh, the joy.

The API is already pretty well documented, so I won’t waste your time with another lengthy tutorial. Instead, here’s a simple example of how you could use it in PHP:

/**
 * google_search_api()
 * Query Google AJAX Search API
 *
 * @param array $args URL arguments. For most endpoints only "q" (query) is required.  
 * @param string $referer Referer to use in the HTTP header (must be valid).
 * @param string $endpoint API endpoint. Defaults to 'web' (web search).
 * @return object or NULL on failure
 */
function google_search_api($args, $referer = 'http://localhost/test/', $endpoint = 'web'){
	$url = "http://ajax.googleapis.com/ajax/services/search/".$endpoint;
	
	// add the protocol version if the caller didn't supply one
	if ( !array_key_exists('v', $args) ) {
		$args['v'] = '1.0';
	}
	
	$url .= '?'.http_build_query($args, '', '&');
	
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	// note that the referer *must* be set
	curl_setopt($ch, CURLOPT_REFERER, $referer);
	$body = curl_exec($ch);
	curl_close($ch);
	
	// curl_exec() returns false if the request itself failed
	if ( $body === false ) {
		return null;
	}
	// decode and return the response (json_decode() returns NULL on invalid JSON)
	return json_decode($body);
}

$rez = google_search_api(array(
		'q' => 'antique shoes',
 ));

print_r($rez);

That’s it for the programming part.

So should we really throw away our lovingly crafted SERP scrapers and embrace the “official” API? Perhaps not. There are some peculiar things I’ve noticed after trying out the new API.

The Good

Let’s start with the positive aspects. First, it looks like you can indeed safely use the API without getting blocked – I successfully ran about 1800 API queries in ~2 hours. Due to my crappy connection I was unable to test how it would behave if you turn it up to eleven and send hundreds of requests per second, but the rate limiter is definitely more lenient on API users than on plain SERP scrapers. This is a major plus for people who don’t like throttling their software to one request per minute or hunting for working proxies to get around bans.

The API also makes it easy to parse the results. All queries return JSON-encoded data, so you just json_decode() it and go. No need to invent complicated regexps that must be rewritten every time Google changes the HTML structure of the search results page.
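
As a rough illustration of the "decode and go" part, here is a minimal sketch that pulls the URLs and titles out of a decoded response. The $sample JSON is a hand-made stand-in rather than real API output, the field names (responseData, results, url, titleNoFormatting) are as I recall them from the API documentation and should be treated as assumptions, and extract_results() is a hypothetical helper name.

```php
<?php
/**
 * Hypothetical helper: flatten a decoded API response into a list of
 * array('url' => ..., 'title' => ...) rows. Returns an empty array if
 * the response is missing or contains no results.
 */
function extract_results($response){
	$rows = array();
	if ( !is_object($response) || empty($response->responseData->results) ) {
		return $rows;
	}
	foreach ($response->responseData->results as $result) {
		$rows[] = array(
			'url'   => $result->url,               // assumed field name
			'title' => $result->titleNoFormatting, // assumed field name
		);
	}
	return $rows;
}

// Hand-made sample, NOT real API output.
$sample = '{"responseData":{"results":[
	{"url":"http://example.com/a","titleNoFormatting":"Page A"},
	{"url":"http://example.com/b","titleNoFormatting":"Page B"}
]},"responseStatus":200}';

foreach (extract_results(json_decode($sample)) as $row) {
	echo $row['title'], ' - ', $row['url'], "\n";
}
```

Checking for a missing or empty result set up front means a failed request (NULL from json_decode()) simply yields an empty list instead of a pile of notices.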

The Bad

Of course, with a clichéd megacorporation like Google it’s never all fun and games. You can only get 8 search results at a time, and no more than 64 results in total for any particular keyword. Whether this is a problem depends on what you intend to do with the API, but it’s certainly an unpleasant limitation.
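
To make the 8-per-request and 64-in-total limits concrete, here is a sketch that builds the argument arrays for each page of a query. It assumes the API's rsz ("large" meaning 8 results) and start (zero-based offset) parameters as I remember them from the documentation; build_page_args() is a hypothetical helper, and each array it returns could be passed to the google_search_api() function shown earlier.

```php
<?php
/**
 * Hypothetical helper: build one argument array per page of results.
 * Assumes 8 results per request and a hard cap of 64 results in total.
 */
function build_page_args($query, $wanted = 64, $pageSize = 8, $maxTotal = 64){
	// never ask for more than the API is willing to return
	$wanted = max(0, min($wanted, $maxTotal));
	$pages = array();
	for ($start = 0; $start < $wanted; $start += $pageSize) {
		$pages[] = array(
			'q'     => $query,
			'rsz'   => 'large', // assumed: "large" result size = 8 results
			'start' => $start,  // assumed: zero-based offset of the first result
		);
	}
	return $pages;
}

// Asking for 100 results still produces only 8 requests (offsets 0, 8, ... 56).
foreach (build_page_args('antique shoes', 100) as $args) {
	echo $args['start'], "\n";
}
```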

The really peculiar – nay, insidious – thing is how the search results returned by the API differ from normal SERPs. A site that is #10 in a normal Google search may suddenly turn up as #1 in the API results. The typical #5 result may be moved to the second page. Basically, the API results look like they’ve been shuffled around a bit – the same URLs are returned but in a slightly different order. Also, the “estimated result count” provided by the API is consistently much lower than what a normal search shows. All this makes the API useless for rank checking and similar SEO applications.

According to my tests you can’t just write off these discrepancies as a side effect of geo-targeting.

It Depends

Overall, the API is either great or it kind of sucks, depending on what you want to do with it.

At the risk of sounding like a conspiracy theorist, I must say the API seems to be cleverly engineered to be useful for “normal” purposes and somewhat useless for SEO. After all, only SEO workers really need accurate ranking data and more than 64 results per keyword phrase. Typical search engine users rarely move beyond the first page of results, so the limitations don’t hurt them. The various mashup makers that cater to the common user are also unaffected. It’s only the SEOs (and the rare academic researcher) that would be dissatisfied with the imposed constraints.

Of course, I’m sure you can still imagine a few interesting uses for the API 😉

85 Responses to “Get Google Search Results With PHP – Google AJAX API And The SEO Perspective”

  1. I already try your script.. and place in search of wp.. by replacing “shoes” with $s but I ve been banned by Google due to violation. Do you know … what are the reason..?

  2. I was wondering, if you can tell me how to use this to find the rankings for a site.

    For Example, when I run the script I would like the result to be something like:

    Search Term: ‘mgpwr’ with site: ‘mgpwr.co.uk’ is ranked: 1
    Search Term: ‘designer’ with site: ‘mgpwr.co.uk’ is ranked: 13
    Search Term: ‘freelance’ with site: ‘mgpwr.co.uk’ is ranked: 100

    see my point? I know the SOAP API now has been cancelled. I do need something similar to this: http://www.further.co.uk/tools/search-position-check/ but just google UK

    I appreciate all help 🙂

  3. White Shadow says:

    You could iterate over the returned results until you find one where the URL contains the domain name you’re looking for. To determine its position, just keep count of how many results you’ve already examined.

    The format and structure of the results are described on this page. Since the API only returns up to 8 results at a time, you’ll probably need to put the whole thing in a loop – check the first 8 results first, then the next 8, and so on.
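
    The counting loop described in this reply might look roughly like the sketch below. It works on a flat, ordered list of result URLs collected from however many pages have been fetched. find_rank() is a hypothetical helper, the URLs are made up, and it compares host names rather than doing a plain "URL contains the domain" check, so that a domain appearing in someone else's query string doesn't count as a hit.

```php
<?php
/**
 * Hypothetical helper: return the 1-based position of the first result
 * whose host equals $domain or is a subdomain of it, or NULL if the
 * domain does not appear in the list.
 */
function find_rank($urls, $domain){
	$domain = strtolower($domain);
	$suffix = '.' . $domain;
	$position = 0;
	foreach ($urls as $url) {
		$position++; // keep count of how many results we've examined
		$host = parse_url($url, PHP_URL_HOST);
		if ( !is_string($host) ) {
			continue; // skip URLs that can't be parsed
		}
		$host = strtolower($host);
		if ( $host === $domain || substr($host, -strlen($suffix)) === $suffix ) {
			return $position;
		}
	}
	return null;
}

// Made-up URLs for illustration only.
$urls = array(
	'http://example.org/',
	'http://www.mgpwr.co.uk/portfolio',
	'http://example.net/',
);
var_dump(find_rank($urls, 'mgpwr.co.uk')); // int(2)
```

    Keep in mind that the position you get this way is the position in the API results, which, as the post explains, may not match a normal Google search.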

  4. solange says:

    Hello everyone;

    I need a lot using this API in my system. But the principle would be a site, I want to get automatic words and sends them to the API and print the results in a document. Doc.
    Please someone can tell me if I do that?
    Abrs

  5. […] Get Google Search Results With PHP – Google AJAX API And The SEO Perspective. This article also mentions the issue: The really peculiar – nay, insidious – thing is how the search results returned by the API differ from normal SERPs. […]

  6. Adnan says:

    How to use this code in my web (step by step) Pls.

  7. Nicky says:

    Is it also possible to get results with the script with Google Safe Search on?

  8. travesti says:

    Thanks in advance for all of your help, I definitely appreciate it…

  9. SZA says:

    great article, i’ll try that code

  10. Lam Web says:

    Thanks, but i dont know how to extract data with json_decode() ?

  11. Ben says:

    When you guys write tutorials, you need to write them for dummies. There’s too much programming jargon to understand how this works.

  12. Steve says:

    It’s a blog entry about using PHP to retrieve data from an API. What parts of any of this do you guys assume should *not* relate to programming?

    It’s not about “dummies” or not, can’t expect 1 blog entry about a particular topic to teach you a language that you do not know.

  13. Lior says:

    So you don’t have a way to see the google results in order to create a tracking script in PHP? I kinda thought that i could use this API in order to create an “history” of the ranking results in the keywords i promote..

    Can you help me with that?

    Thanks

  14. Bodo says:

    I tried google search API 2 years ago, and finally switched to the yahoo (boss) one.

    Google is not only limitated, it also delivers sometimes NO results, while there exist real results (for example if only 1 or 2 results for that key exist in a normal google search, the api delivers 0)
    yahoo boss has different results, but the API delivers more than google AJAX.

    Try it out… I think it has even less limitations, lets you do 5000 requests a day.

  15. jeff says:

    hi;
    thanks for the script , works great. when i go to Google and search for a give term , I get a different number for the total pages found ( top left ‘About xxxxxx results’) that the value of [‘estimatedResultCount’] in the script . is there a problem with script ?

    thanks

  16. White Shadow says:

    I suspect that’s just Google being Google. As I said in the post, the results returned by the API can differ from what you get when you actually run the search yourself. There’s nothing wrong with the script itself; the big G is just messing around (or geotargeting, or deliberately fuzzing the results to make them less useful to SEOs, etc).

  17. jeff says:

    WOW ! thanks so much White Shadow for the quick reply. tks

  18. Googlehater says:

    That’s a call to the DEPRECATED interface. They can turn off https://ajax.googleapis.com/ajax/services/search/web any time they like (and they will).

  19. bellimbusto says:

    yes, this is the old API, not even working (just tried). You can use new method (requests API key) http://code.google.com/intl/it-IT/apis/customsearch/v1/overview.html

  20. thanks for the pretty code, but how to use it? where i can paste this code? thanks
