Enumerating note data in Flickr using PHP
Flickr provides a comprehensive API to access its data. It also has a very neat notes facility that lets you attach notes to regions of an image; each note pops up when you mouse over the relevant area. Flickr provides API calls to add, edit and delete notes, but not to enumerate the notes associated with an image. To do that you have to do a little more work.
Flickr provides the following API methods for working with notes:
flickr.photos.notes.add
flickr.photos.notes.edit
flickr.photos.notes.delete
The first API takes the photo id, coordinates (appropriate for a 500 pixel wide image) and text of the note to add. The second takes the note id, new coordinates and text to update the note with. The third takes the note id of the note to delete. As you can see there’s no way, using the API, of getting the ids of notes already associated with an image. The API only allows you to edit and delete notes that you have created using the API.
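To make the three calls concrete, the following is a minimal sketch of the arguments each one expects. The helper name flickr_note_params is hypothetical, and the parameter names (photo_id, note_id, note_x, note_y, note_w, note_h, note_text) are assumptions based on the Flickr method reference, so check them against the current documentation. The API key, authentication and request signing that these write calls need are left out entirely.

```php
<?php
// Hypothetical helper: collects the arguments for one of the notes methods.
// The API key, authentication and signing are deliberately omitted.
function flickr_note_params($method, array $args)
{
    // Arguments each method is assumed to require.
    $required = array(
        'flickr.photos.notes.add'    => array('photo_id', 'note_x', 'note_y', 'note_w', 'note_h', 'note_text'),
        'flickr.photos.notes.edit'   => array('note_id', 'note_x', 'note_y', 'note_w', 'note_h', 'note_text'),
        'flickr.photos.notes.delete' => array('note_id'),
    );
    if (!isset($required[$method])) {
        throw new InvalidArgumentException("Unknown notes method: $method");
    }

    $params = array('method' => $method);
    foreach ($required[$method] as $name) {
        if (!array_key_exists($name, $args)) {
            throw new InvalidArgumentException("Missing argument: $name");
        }
        $params[$name] = $args[$name];
    }
    return $params;
}

// Example: the arguments needed to move and re-word an existing note.
$params = flickr_note_params('flickr.photos.notes.edit', array(
    'note_id'   => '72157624923250854',
    'note_x'    => 253,
    'note_y'    => 56,
    'note_w'    => 94,
    'note_h'    => 120,
    'note_text' => 'WIZ810MJ ethernet module.',
));

// Show the arguments as a URL-encoded query string.
echo http_build_query($params), "\n";
```

Note that both the edit and delete calls need a note id, which is exactly the piece of information the API does not hand back for existing notes.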
However all is not completely lost. Flickr outputs very nice HTML with HTML elements having sensible class names and so on. The layout of a Flickr image page is basically as follows:
id="notes"
id="photo"
id="meta"
id="comments"
id="primary-column"
id="sidebar"
id="shortcuts"
All of the elements are <div> elements except for "notes", which is an unordered list. Each of the notes themselves is a list item. Below is a typical note list item. As you can see, it looks somewhat complicated. This is because it includes the markup needed to render the border of the note's active area together with the resize corners. However, we can ignore most of it.
<li class="box note note-byowner note-highlight" data-note-user_id="38990160" data-note-id="72157624923250854" style="left: 253px; top: 56px; width: 94px; height: 120px; z-index: 2; " id="yui_3_1_0_1_1285322792456209">
  <span class="box-stroke box-stroke-outer" id="yui_3_1_0_1_1285322792456803"></span>
  <span class="box-stroke box-stroke-inner"></span>
  <span class="box-stroke box-stroke-main" id="yui_3_1_0_1_1285322792456784"></span>
  <span class="note-content" id="yui_3_1_0_1_1285322792456788">
    <span class="note-text" id="yui_3_1_0_1_1285322792456792">
      <span class="note-wrap" id="yui_3_1_0_1_1285322792456832">WIZ810MJ ethernet module.</span>
    </span>
  </span>
  <span class="resize resize-nw"><span></span></span>
  <span class="resize resize-ne"><span></span></span>
  <span class="resize resize-se"><span></span></span>
  <span class="resize resize-sw"><span></span></span>
</li>
The only parts of interest are the list item itself and the plain text it contains. The list item includes metadata which is useful to us. The data-note-id value is the note id to use in the Flickr API, the style attribute provides the coordinates and size of the note's active area, and the plain text is the actual note description. This makes it reasonably easy to extract using "screen scraping".
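The style attribute is the fiddliest part to pick apart, so here is a minimal sketch of one way to turn it into numbers. The function name parse_note_style is hypothetical, and the sketch assumes the style always uses pixel values as in the list item above.

```php
<?php
// Hypothetical helper: turns a note's inline style into integer pixel values.
function parse_note_style($style)
{
    $box = array('left' => null, 'top' => null, 'width' => null, 'height' => null);

    foreach (explode(';', $style) as $declaration) {
        // Skip pieces with no "property: value" pair, such as the
        // empty piece left after the trailing semicolon.
        if (strpos($declaration, ':') === false) {
            continue;
        }
        list($key, $value) = explode(':', $declaration, 2);
        $key = trim($key);

        // Ignore properties we are not interested in, such as z-index.
        if (array_key_exists($key, $box)) {
            // The integer cast drops the "px" suffix.
            $box[$key] = (int) trim($value);
        }
    }
    return $box;
}

print_r(parse_note_style('left: 253px; top: 56px; width: 94px; height: 120px; z-index: 2; '));
```

Returning integers rather than strings such as "253px" makes the values easier to pass on to the API or to scale for a different image size.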
The Simple HTML DOM parser consists of a couple of PHP classes that parse HTML and allow you to access the DOM in a way similar to jQuery. It uses the MIT licence and is freely available from SourceForge. Using this parser we can extract the notes information from the Flickr web page for the image.
The URL for an image page on Flickr is:
http://www.flickr.com/photos/<user id>/<photo id>/
The following example code displays the note information for notes related to an image in my Flickr stream.
<?php
// Using the Simple HTML DOM parser.
include_once('simple_html_dom.php');

// Create a new parser
$html = new simple_html_dom();

// Parse the web page for the image we want the notes for
$html->load_file('http://www.flickr.com/photos/39013214@N03/4976074521/');

// Flickr keeps all the notes in a <ul> with an id of 'notes'
$notes = $html->find('#notes', 0);

// Each note is a list item
$note_items = $notes->find('li');

// Iterate through the notes
foreach ($note_items as $item) {
    // Get the id
    $id = $item->getAttribute('data-note-id');

    // Get the coordinates (reset first so values never carry over from the previous note)
    $left = $top = $width = $height = '';
    foreach (explode(";", $item->getAttribute('style')) as $style_item) {
        // Skip the empty piece left after the trailing semicolon
        if (strpos($style_item, ':') === false) {
            continue;
        }
        list($key, $value) = explode(":", $style_item, 2);
        $value = trim($value);
        switch (trim($key)) {
            case 'left'  : $left   = $value; break;
            case 'top'   : $top    = $value; break;
            case 'width' : $width  = $value; break;
            case 'height': $height = $value; break;
        }
    }

    // Get the text
    $text = trim($item->text());

    // Output info
    echo "Id: " . $id . "\n";
    echo "Coords:\n";
    echo " Left : " . $left . "\n";
    echo " Top : " . $top . "\n";
    echo " Width : " . $width . "\n";
    echo " Height: " . $height . "\n";
    echo "Text: " . $text . "\n";
    echo "\n";
}
As you can see, the Simple HTML DOM parser makes things simple. The page for the image is loaded and parsed. Then the first element with the id of "notes" is found (there should be only one). Next, an array of the list item elements inside the "notes" element is obtained and iterated over. For each item, the data-note-id attribute gives the note id; the style is parsed to extract the coordinates and size of the active area; and the plain text, i.e. the note text, is copied.
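The steps above can also be tried without fetching a live page. The following is a minimal sketch that applies the same logic (find the list, then walk its items) using PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of Simple HTML DOM. It runs against a cut-down, hand-written copy of the note markup. The function name extract_notes and the inline fragment are illustrative assumptions, and a real Flickr page may not match this markup.

```php
<?php
// Cut-down copy of a note list; the real page carries far more markup.
$page = '<html><body><ul id="notes">'
      . '<li class="box note" data-note-id="72157624923250854"'
      . ' style="left: 253px; top: 56px; width: 94px; height: 120px; z-index: 2; ">'
      . '<span class="note-content"><span class="note-text">'
      . '<span class="note-wrap">WIZ810MJ ethernet module.</span></span></span></li>'
      . '</ul></body></html>';

// Hypothetical helper: returns the id, raw style and text of every note.
function extract_notes($html)
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $notes = array();

    // Each note is a list item directly inside the element with id "notes".
    foreach ($xpath->query('//ul[@id="notes"]/li') as $item) {
        $notes[] = array(
            'id'    => $item->getAttribute('data-note-id'),
            'style' => trim($item->getAttribute('style')),
            'text'  => trim($item->textContent),
        );
    }
    return $notes;
}

print_r(extract_notes($page));
```

Working from a saved fragment like this is also a convenient way to check the parsing logic whenever Flickr changes its page markup.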
The output from the example code is as follows:
kring:simplehtmldom melanie$ php test.php
Id: 72157624923250854
Coords:
 Left : 253px
 Top : 56px
 Width : 94px
 Height: 120px
Text: WIZ810MJ ethernet module.

Id: 72157624798614851
Coords:
 Left : 254px
 Top : 177px
 Width : 95px
 Height: 77px
Text: Adapter to allow it to be plugged in to breadboard (the WIZ810MJ has 2mm pitch connectors)

Id: 72157624923256462
Coords:
 Left : 205px
 Top : 204px
 Width : 49px
 Height: 67px
Text: 3.3V regulation.

Id: 72157624798617733
Coords:
 Left : 354px
 Top : 233px
 Width : 91px
 Height: 72px
Text: From BBC micro 8 bit "User" parallel port.

Id: 72157624923259400
Coords:
 Left : 103px
 Top : 299px
 Width : 242px
 Height: 64px
Text: Boarduino to provide 5V power.
Using this information you can update the note using the Flickr API or display the image complete with notes on your own page.
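As an illustration of the second option, here is a minimal sketch that turns one note into an absolutely positioned element to lay over the image. The function name render_note, the array layout and the use of the title attribute to show the text on hover are all assumptions made for this example, not something Flickr provides.

```php
<?php
// Hypothetical helper: renders one note as an absolutely positioned box.
// Expects integer pixel values, as used on a 500 pixel wide image.
function render_note(array $note)
{
    return sprintf(
        '<div class="note" data-note-id="%s" title="%s" '
        . 'style="position: absolute; left: %dpx; top: %dpx; width: %dpx; height: %dpx;"></div>',
        // Escape anything that ends up inside an HTML attribute.
        htmlspecialchars($note['id'], ENT_QUOTES),
        htmlspecialchars($note['text'], ENT_QUOTES),
        $note['left'],
        $note['top'],
        $note['width'],
        $note['height']
    );
}

$note = array(
    'id'     => '72157624923250854',
    'left'   => 253,
    'top'    => 56,
    'width'  => 94,
    'height' => 120,
    'text'   => 'WIZ810MJ ethernet module.',
);

echo render_note($note), "\n";
```

For the boxes to line up, they need to sit in a container with position: relative that wraps the 500 pixel wide version of the image, since that is the size the coordinates refer to.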