How To Retrieve Webpage Meta Tags With PHP’s get_meta_tags

I know that the burning question on all you Web geeks’ minds, the one that keeps you up at night, is “How can I grab the meta content from a website using PHP?” Either that, or “Does a duck’s quack echo?” For purposes of this discussion, though, I’m gonna go with the former.

As any Web guy or gal worth their salt knows, meta tags describe each webpage: its description, its keywords, and much more. (The title itself lives in the separate <title> tag.) Search engines read these tags when they spider your site, often use the description as the page summary in their listings, and use the rest to help categorize your content. (Did I lose you? Go back to Meta Tags 101 and 102 before continuing.) Why would you want to retrieve them? Maybe you want to create your own website directory; this would be a quick way to pop out a nice website list without a ton of cut-and-paste. It could also help you spy on your competitors, to see which keywords you might want to focus on to try and beat them in the rankings game.

This example focuses on the title (taken from the <title> tag) plus the description and keywords meta tags, but it could be modified to grab any other named meta tag, such as author or robots. Here’s the end result:

Title: Yahoo!
URL: http://www.yahoo.com/
Description: Welcome to Yahoo!, the world’s most visited home page. Quickly find what you’re searching for, get in touch with friends and stay in-the-know with the latest news and information.
Keywords: yahoo, yahoo home page, yahoo homepage, yahoo search, yahoo mail, yahoo messenger, yahoo games, news, finance, sport, entertainment
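
A quick aside for anyone wondering what get_meta_tags() actually hands back: it returns an array keyed by each meta tag's name attribute, so description and keywords are only two of the entries you can read. Here is a minimal sketch of that idea. The sample page, its tag values, and the temp file are all made up for illustration; they stand in for a real URL so the snippet runs without a network connection.

```php
<?php
// Hypothetical sample page (not from any real site), used instead of a live URL.
$html = <<<HTML
<html>
<head>
<title>Sample</title>
<meta name="author" content="Jane Doe">
<meta name="robots" content="index, follow">
<meta name="description" content="A sample page">
</head>
<body></body>
</html>
HTML;

// get_meta_tags() reads from a file or URL, so write the sample to a temp file.
$file = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($file, $html);

$meta = get_meta_tags($file);
unlink($file);

// Print every named meta tag that was found, one per line.
foreach ($meta as $name => $content) {
    echo $name . ': ' . $content . "\n";
}
```

Swap the temp file for a URL and the same loop lists whatever named tags that page declares.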

And now, the moment you’ve all been waiting for: the programming for this little gem. (Thanks to WebHole.net for the original code.) Here goes:

<?php

/*Modified from original source:
http://webhole.net/2010/02/21/how-to-extract-meta-data-from-a-page/
*/

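function getMetaData($url){
// get meta tags (get_meta_tags() returns false if the page can't be read)
$meta=get_meta_tags($url);
if($meta===false){ $meta=array(); }
// store page
$page=file_get_contents($url);
// find where the opening title tag begins (case-insensitive)
$titleTag=($page===false) ? false : stripos($page,'<title>');
// find where the closing title tag begins
$titleEnd=($titleTag===false) ? false : stripos($page,'</title>',$titleTag);
// extract title from $page only if both tags were found; 7 is the length of '<title>'
$meta['title']=($titleEnd===false) ? '' : trim(substr($page,$titleTag+7,$titleEnd-$titleTag-7));
// return array of data
return $meta;
}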

$website = "http://www.yahoo.com/";   // change this to the URL you want to scrape

$tags=getMetaData($website);

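// escape scraped values before printing them into HTML; a page may lack either meta tag
echo '<b>Title:</b> <i>'.htmlspecialchars($tags['title']).'</i>';
echo '<br />';
echo '<b>URL:</b> <a href="'.htmlspecialchars($website).'">'.htmlspecialchars($website).'</a><br />';
echo '<b>Description:</b> '.htmlspecialchars(isset($tags['description']) ? $tags['description'] : '');
echo '<br />';
echo '<b>Keywords:</b> '.htmlspecialchars(isset($tags['keywords']) ? $tags['keywords'] : '');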

?>
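
One caveat about the title lookup: searching for the literal string <title> misses pages that write the tag with attributes. If that bites you, a regular expression is one possible alternative. The sketch below is only that, a sketch: extractTitle() is a hypothetical helper name of my own, not part of the original WebHole.net code, and it works on an HTML string you have already fetched.

```php
<?php
// Hypothetical helper: returns the page title, or null if no <title> element is found.
function extractTitle(string $html): ?string
{
    // i = case-insensitive match, s = let "." match newlines inside the title
    if (preg_match('/<title\b[^>]*>(.*?)<\/title>/is', $html, $matches)) {
        // decode entities such as &amp; and trim surrounding whitespace
        return trim(html_entity_decode($matches[1], ENT_QUOTES, 'UTF-8'));
    }
    return null;
}

// Usage with made-up snippets: an upper-case tag with an attribute, then a page with no title.
var_dump(extractTitle('<html><head><TITLE id="t"> Yahoo! </TITLE></head></html>'));
var_dump(extractTitle('<html><head></head></html>'));
```

Returning null for a missing title lets the calling code decide what to show instead of printing a stray chunk of the page.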
