Get Google Image Search Results With PHP

Google Image Search doesn’t get as much time in the spotlight as the “normal” Web Search, but it’s still useful for things like finding suitable illustrations for an article (Flickr also comes to mind). Whatever you use it for, you can often get results faster with a bit of automation. So here’s a simple PHP script that can fetch and parse the results of any Image Search query. It’s strictly for educational purposes though, as actually using it would probably constitute a violation of Google’s ToS ;)

PHP Script for Google Image Search

Note that this script requires the eHttpClient cURL class by 5ubliminal.

function googleImageResults($query, $page=1, $safe='off', $dc="images.google.com"){
    $page--;
    $perpage = 21;
    $url=sprintf("http://%s/images?q=%s&gbv=2&start=%d&hl=en&ie=UTF-8&safe=%s&sa=N",
        $dc,urlencode($query),$page*$perpage,$safe);
 
    $hc=new eHttpClient();
    $hc->setReferer("http://".$dc."/");
    $html=$hc->get($url);
    $code = $hc->getInfo(CURLINFO_HTTP_CODE);
    //anything other than "200 OK" (e.g. a 403 ban page) counts as a failure
    if ($code != 200) return false;
 
    //each result sits inside a dyn.Img(...) call in the page source
    if(!preg_match_all('/dyn\.Img\((.+)\);/Uis', $html, $matches, PREG_SET_ORDER))
        return array();
    $results=array();
    foreach($matches as $match){
        if(!preg_match_all( '/"([^"]*)",/i', $match[1], $parts)) continue;
 
        if(!preg_match('/(.+?)&h=(\d+)&w=(\d+)&sz=(\d+)&hl=[^&]*&start=(\d+)(?:.*)/',
            $parts[1][0], $url_parts)
        ) continue;
        $refUrl = urldecode($url_parts[1]);
        $height = intval($url_parts[2]);
        $width = intval($url_parts[3]);
        $rank = intval($url_parts[5]);
        //check if we've already passed the last page of results 
        if($rank < ($page * $perpage + 1)) break;
        $imgUrl = urldecode($parts[1][3]);
        $refDomain = $parts[1][11];
        $imgText = $parts[1][6];
        $imgText = preg_replace('/\\\x(\w\w)/', '&#x\1;', $imgText);
        $imgText = strip_tags(html_entity_decode($imgText));
        $thumbUrl = $parts[1][14].'?q=tbn:'.$parts[1][2].$imgUrl;
 
        $one_result=array(
            'Rank' => $rank,
            'RefUrl' => $refUrl,
            'ImgText' => $imgText,
            'ImgUrl' => $imgUrl,
            'Height' => $height,
            'Width' => $width,
            'Host' => $refDomain,
            'ThumbUrl' => $thumbUrl,
        );
        array_push($results,$one_result);
    }
    return $results;
}
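
One step in the function that may not be obvious is the caption cleanup: the captions in the page source appear to contain JavaScript-style hex escapes such as \x3c, which the script turns into HTML entities, decodes, and then strips of any tags. Here is a minimal standalone sketch of that step. The helper name decodeImageCaption() is made up for this illustration, and the pattern is narrowed to hex digits, which is an assumption about the input rather than something the original regexp enforces.

```php
<?php
// Hypothetical helper (not part of the original script): cleans up one caption.
function decodeImageCaption($raw) {
    // Turn a literal backslash-x-HH sequence into the numeric entity &#xHH;
    $withEntities = preg_replace('/\\\\x([0-9a-fA-F]{2})/', '&#x$1;', $raw);
    // Decode the entities into real characters, then drop any HTML tags.
    return strip_tags(html_entity_decode($withEntities));
}

// Single quotes keep the backslashes literal, as they would be in the page source.
echo decodeImageCaption('\x3cb\x3eHeadcrab\x3c/b\x3e plush'), "\n";
```

Because the entities are decoded before strip_tags() runs, escaped markup such as the bold tags in the sample is removed rather than left in the caption.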

How To Use It

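The parameters are mostly self-explanatory: $query is the search phrase, $page is the page of results to fetch (starting at 1), $safe is the SafeSearch setting passed to Google ('off' by default), and $dc is the Google hostname to query. The function returns an array of results if it’s successful, or an empty array if there are no results for the query. It can also return a boolean false in case of a really bad error (e.g. a “403 Forbidden” response).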

Here’s an example:

$results = googleImageResults('headcrab', 1);
print_r($results);

The output looks something like this:

Array
(
    [0] => Array
        (
            [Rank] => 1
            [RefUrl] => http://bjoern.amherd.net/2006/12/15/headcrab-chappe/
            [ImgText] => Headcrab-Chappe
            [ImgUrl] => http://bjoern.amherd.net/wp-content/uploads/2006/12/headcrab.jpg
            [Height] => 297
            [Width] => 450
            [Host] => bjoern.amherd.net
            [ThumbUrl] => http://tbn0.google.com/images?q=tbn:2drTKLkzK4KZQM:http://bjoern.amherd.net/wp-content/uploads/2006/12/headcrab.jpg
        )

    [1] => Array
        (
            [Rank] => 2
            [RefUrl] => http://www.penny-arcade.com/comic/2004/11/15
            [ImgText] => The Common Headcrab
            [ImgUrl] => http://www.penny-arcade.com/images/2004/20041115h.jpg
            [Height] => 423
            [Width] => 750
            [Host] => www.penny-arcade.com
            [ThumbUrl] => http://tbn0.google.com/images?q=tbn:A2d3zKEpYJe0FM:http://www.penny-arcade.com/images/2004/20041115h.jpg
        )
   ......
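
Since the function can return an array of results, an empty array, or false, a loose check like if(!$results) can’t tell "no results" apart from "request failed". Here is a small sketch of how the three cases might be separated with strict comparisons. The helper describeImageSearchOutcome() is hypothetical and takes an already-fetched return value, so it runs without eHttpClient or a network connection.

```php
<?php
// Hypothetical helper (not part of the original script): summarises the
// value returned by googleImageResults() as a short status string.
function describeImageSearchOutcome($results) {
    // Strict comparison: an empty array is also "falsy", so == would not do.
    if ($results === false) {
        return 'request failed';
    }
    if (count($results) === 0) {
        return 'no results';
    }
    return count($results) . ' result(s), first image: ' . $results[0]['ImgUrl'];
}

echo describeImageSearchOutcome(false), "\n";
echo describeImageSearchOutcome(array()), "\n";
echo describeImageSearchOutcome(array(array('ImgUrl' => 'http://example.com/a.jpg'))), "\n";
```

In real use you would pass in the value returned by googleImageResults('headcrab', 1) instead of the hand-made arrays.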

Notes And Caveats

  • The original eHttpClient class tends to throw some warnings due to omitted function parameters, so I use a slightly modified version in my own projects. Fixing the class is left as an exercise to the reader :P
  • As far as I know, you can’t specify the number of results per page for Image Search. It always returns 21 images, even though it only displays 18 of those to human visitors. Weird.
  • If you send a lot of queries in a short time you will get a temporary ban. Theoretically you could overcome this by using proxies and/or appropriate delays between searches - that is, if you could bring yourself to commit such an insidious breach of the Terms of Service, which I’m not advocating in any way.
  • Whatever you do, don’t do this.
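
For reference, the page arithmetic that follows from the fixed 21-results-per-page behaviour can be sketched on its own. The helper buildImageSearchUrl() and the constant RESULTS_PER_PAGE are names invented for this illustration. The helper only mirrors the URL-building lines of googleImageResults() and sends no request.

```php
<?php
// Page size as observed in the notes above (not a documented Google constant).
const RESULTS_PER_PAGE = 21;

// Hypothetical helper: builds the same URL googleImageResults() would request.
function buildImageSearchUrl($query, $page = 1, $safe = 'off', $dc = 'images.google.com') {
    // Pages are 1-based, offsets are 0-based: page 1 -> 0, page 2 -> 21, ...
    $start = ($page - 1) * RESULTS_PER_PAGE;
    return sprintf('http://%s/images?q=%s&gbv=2&start=%d&hl=en&ie=UTF-8&safe=%s&sa=N',
        $dc, urlencode($query), $start, $safe);
}

echo buildImageSearchUrl('headcrab', 3), "\n";
```

The third page therefore starts at offset 42, which is why the function treats any result ranked below $page * $perpage + 1 as a sign that it has run past the last page.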

Disclaimer

Image search code provided AS-IS, with no warranty of anything. And so on. Good luck.


13 Responses to “Get Google Image Search Results With PHP”

  1. SlightlyShadySEO Says:

    heh and why not do that?
    Aside from slightly obnoxious comments coming through of course.

  2. White Shadow Says:

    That’s just reverse psychology or something ;) Personally, I got me a nice database with the help of your tutorial and this script.

  3. SlightlyShadySEO Says:

    Haha alright :)
    Apparently my sarcasm detection is a bit off today. Glad you enjoyed, and nice writeup yourself!

  4. 5ubliminal Says:

    Yeah … my class throws warnings as I enjoy using php functions with undefined parameters which do throw warnings but allow me to play with them as I need to. Best is to dump them (warnings and notices) using:
    error_reporting(E_ALL^(E_WARNING|E_NOTICE));
    Not every warning is an error but they do look bad on sites :)

    Regards.
    PS: I’m glad you figured out how to use my class well. Many struggle too much with it :)

    @Shady: I did notice the sarcasm but a ;) at the end of the line would have made it more obvious :)

  5. 5ubliminal Says:

    The photos on DA are awesome. Actually … your pussycats make’em look so nice but I’ll give you a bit of credit too.

  6. White Shadow Says:

    I prefer to set error_reporting to E_ALL when developing and get rid of all warnings & notices by changing the code. I think this leads to a more stable implementation, but that’s just IMHO. So I set up default values for the function parameters that your class treats as optional.

    BTW… I have more than 13 cats (seriously).

  7. 5ubliminal Says:

    I write my C++ code warning free but, as PHP has no real rules regarding data types and so on, notices can seem silly sometimes so I just ignore them and make sure it all works.

    13 cats … wow … my folks have 2 of them :)

  8. björn | AMHERD | PHP mit Björn Says:

    [...] why my post about the headcrab, of all things, gets visited so often: my website is being misused as an example of a PHP script’s results. Things [...]

  9. Stickytape Says:

    Hi, firstly thanks for (what I’m sure is) a great script. I’ve been having a look through 5ubliminal’s code as well as yours and I just can’t work out where I’m going wrong. I have 5ubliminal’s class directly above your code and have tested his class which works.

    I have literally copy and pasted your code but all I get returned is Array() when I call :

    $results = googleImageResults('headcrab', 1);
    print_r($results);

    as you recommend.

    Can you perhaps point me in the right direction? http://rafb.net/p/BM6bPe79.html is the sourcecode of what I’m using. Thanks very much in advance

  10. White Shadow Says:

    Google probably changed their code, so one of the regexps in the function no longer works. I’ve modified it and it works again, at least for me. The post has also been fixed, so you can just copy the new version.

    In particular, it was the first preg_match_all() that needed to be changed.

  11. Stickytape Says:

    Hi, just to let you know that the alteration works here also. Thank you very much for the really quick reply and wonderful class. When I get time to actually implement it on my own website and do with it what I want, I shall let you know.

    Thanks again

    Stickytape

  12. fnfzone Says:

    I think it’s nice work if it works. I will check it when I get back home.

    thanks

  13. fnfzone Says:

    It works fine.

    thanks
