Enumerating note data in Flickr using PHP

Flickr provides a comprehensive API for accessing its data. It also has a very neat notes facility that lets you attach a note to a region of an image; the note pops up when you mouse over that area. Flickr provides API calls to add, edit and delete notes, but not to enumerate the notes associated with an image. To do that you have to do a little more work.

Flickr provides the following API methods for working with notes:

flickr.photos.notes.add
flickr.photos.notes.edit
flickr.photos.notes.delete

The first API takes the photo id, the coordinates (appropriate for a 500 pixel wide image) and the text of the note to add. The second takes the note id, plus the new coordinates and text to update the note with. The third takes the note id of the note to delete. As you can see, there’s no way, using the API, of getting the ids of notes already associated with an image. In practice the API only lets you edit and delete notes whose ids you already know, i.e. the ones you created using the API.
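Because the note coordinates refer to a 500 pixel wide image, they need scaling if you want to place them over an image shown at a different width. Here is a minimal sketch of that arithmetic. The function name and the array keys are my own, and it assumes the coordinates scale linearly with the image width.

```php
<?php

// Hypothetical helper: scale a note rectangle from the 500 pixel wide
// reference image to an image displayed at $target_width pixels.
// Assumes a simple linear relationship between width and coordinates.
function scale_note_rect(array $rect, $target_width, $reference_width = 500)
{
    $factor = $target_width / $reference_width;
    $scaled = array();
    foreach (array('left', 'top', 'width', 'height') as $key) {
        // Round to the nearest whole pixel
        $scaled[$key] = (int) round($rect[$key] * $factor);
    }
    return $scaled;
}

// Example rectangle, expressed against a 500 pixel wide image
$rect = array('left' => 253, 'top' => 56, 'width' => 94, 'height' => 120);

// The same rectangle for an image displayed 1000 pixels wide
print_r(scale_note_rect($rect, 1000));
```

Going the other way (from a displayed size back to 500 pixels) is the same call with the two widths swapped.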

However, all is not lost. Flickr outputs very nice HTML whose elements have sensible ids, class names and so on. The layout of a Flickr image page is basically as follows:

id="notes"
id="photo"
id="meta"
id="comments"
id="primary-column"
id="sidebar"
id="shortcuts"

All of the elements are <div> elements except for “notes”, which is an unordered list. Each of the notes themselves is a list item. Below is a typical note list item. As you can see, it looks somewhat complicated. This is because it includes the markup needed to render the border of the note’s active area together with the resize corners. However, we can ignore most of it.

<li class="box note note-byowner note-highlight" data-note-user_id="38990160" data-note-id="72157624923250854" style="left: 253px; top: 56px; width: 94px; height: 120px; z-index: 2; " id="yui_3_1_0_1_1285322792456209">
<span class="box-stroke box-stroke-outer" id="yui_3_1_0_1_1285322792456803"></span>
<span class="box-stroke box-stroke-inner"></span>
<span class="box-stroke box-stroke-main" id="yui_3_1_0_1_1285322792456784"></span>
<span class="note-content" id="yui_3_1_0_1_1285322792456788">
<span class="note-text" id="yui_3_1_0_1_1285322792456792">
<span class="note-wrap" id="yui_3_1_0_1_1285322792456832">WIZ810MJ ethernet module.</span>
</span>
</span>
<span class="resize resize-nw"><span></span></span>
<span class="resize resize-ne"><span></span></span>
<span class="resize resize-se"><span></span></span>
<span class="resize resize-sw"><span></span></span>
</li>

The only things of interest are the list item itself and the plain text it contains. The list item carries metadata which is useful to us: the data-note-id value is the note id to use in the Flickr API, the style attribute provides the coordinates and size of the note’s active area, and the plain text is the actual note description. This makes the information reasonably easy to extract using “screen scraping”.
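To make the style part concrete, here is a small sketch that pulls the four values out of a style string like the one in the list item above. The function name is made up for illustration, and it assumes the values are always given in pixels.

```php
<?php

// Hypothetical helper: turn an inline style string into an array of
// integer pixel values for the four properties that describe a note.
function parse_note_style($style)
{
    $rect = array('left' => null, 'top' => null, 'width' => null, 'height' => null);
    foreach (explode(';', $style) as $declaration) {
        // Skip empty fragments, e.g. the one after the trailing ';'
        if (strpos($declaration, ':') === false) {
            continue;
        }
        list($property, $value) = explode(':', $declaration, 2);
        $property = trim($property);
        // Ignore properties we do not care about, such as z-index
        if (array_key_exists($property, $rect)) {
            // Casting "253px" to int yields 253 (assumes pixel units)
            $rect[$property] = (int) trim($value);
        }
    }
    return $rect;
}

$style = 'left: 253px; top: 56px; width: 94px; height: 120px; z-index: 2; ';
print_r(parse_note_style($style));
```

Any property missing from the style string is left as null, so the caller can tell "not present" apart from a genuine zero.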

The Simple HTML DOM parser consists of a couple of PHP classes that parse HTML and allow you to access the DOM in a way similar to jQuery. It uses the MIT licence and is freely available from SourceForge. Using this parser we can extract the notes information from the Flickr web page for the image.

The URL for an image page on Flickr is:

http://www.flickr.com/photos/<user id>/<photo id>/
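If you need to build that URL for several photos, a tiny helper along these lines would do. The function name is hypothetical, and it assumes both ids are already in the form Flickr uses in its URLs.

```php
<?php

// Hypothetical helper: build the photo page URL from the user id and
// the photo id, following the pattern shown above.
function flickr_photo_page_url($user_id, $photo_id)
{
    return 'http://www.flickr.com/photos/' . $user_id . '/' . $photo_id . '/';
}

echo flickr_photo_page_url('39013214@N03', '4976074521'), "\n";
```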

The following example code displays the note information for notes related to an image in my Flickr stream.

<?php
 
// Using the SimpleHTMLDOM parser.
include_once('simple_html_dom.php');
 
// Create a new parser
$html = new simple_html_dom();
 
// Parse the web page for the image we want the notes for
$html->load_file('http://www.flickr.com/photos/39013214@N03/4976074521/');
 
// Flickr keeps all the notes in a <ul> with an id of 'notes'
$notes = $html->find('#notes', 0);
 
// Each note is a list item; a page without notes has no such list
$note_items = $notes ? $notes->find('li') : array();
 
// Iterate through the notes
foreach ($note_items as $item) {
	// Get the id
	$id = $item->getAttribute('data-note-id');
 
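	// Get the coordinates (reset first so values never carry over between notes)
	$left = $top = $width = $height = null;
	foreach (explode(";", $item->getAttribute('style')) as $style_item) {
		// Skip empty fragments such as the one after the trailing ';'
		if (strpos($style_item, ":") === false) {
			continue;
		}
		list($key, $value) = explode(":", $style_item, 2);
		$value = trim($value);
		switch (trim($key)) {
			case 'left'  : $left   = $value; break;
			case 'top'   : $top    = $value; break;
			case 'width' : $width  = $value; break;
			case 'height': $height = $value; break;
		}
	}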
 
	// Get the text
	$text = $item->text();
 
	// Output info
	echo "Id: " . $id . "\n";
	echo "Coords:\n";
	echo "  Left  : " . $left . "\n";
	echo "  Top   : " . $top . "\n";
	echo "  Width : " . $width . "\n";
	echo "  Height: " . $height . "\n";
	echo "Text: " . $text . "\n";
	echo "\n";
}

As you can see, the Simple HTML DOM parser makes things simple. The page for the image is loaded and parsed. Then the first element with the id of “notes” is found (there should be only one). Then an array of the list item elements belonging to the “notes” element is obtained and iterated over. For each item, the data-note-id attribute gives the note id; the style is parsed to extract the coordinates and size of the active area; and the plain text, i.e. the note text, is copied.
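If you would rather not pull in an extra library, the same steps can be sketched with PHP's built-in DOM extension instead. This swaps Simple HTML DOM for DOMDocument and DOMXPath, and it runs against a cut-down sample of the note markup rather than a live page. The sample string and the function name are my own assumptions.

```php
<?php

// Cut-down sample of the note markup, wrapped in the "notes" list.
$sample = '<ul id="notes">'
    . '<li class="box note" data-note-id="72157624923250854"'
    . ' style="left: 253px; top: 56px; width: 94px; height: 120px;">'
    . '<span class="note-content"><span class="note-text">'
    . '<span class="note-wrap">WIZ810MJ ethernet module.</span>'
    . '</span></span></li></ul>';

// Hypothetical helper: return id, style and text for each note found.
function extract_notes($html)
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $notes = array();
    // Each <li> directly under the list with id "notes" is one note
    foreach ($xpath->query('//ul[@id="notes"]/li') as $li) {
        $notes[] = array(
            'id'    => $li->getAttribute('data-note-id'),
            'style' => $li->getAttribute('style'),
            'text'  => trim($li->textContent),
        );
    }
    return $notes;
}

print_r(extract_notes($sample));
```

The trade-off is a little more ceremony in exchange for having no external dependency; the jQuery-like selectors of Simple HTML DOM are certainly shorter to write.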

The output from the example code is as follows:

kring:simplehtmldom melanie$ php test.php
Id: 72157624923250854
Coords:
  Left  : 253px
  Top   : 56px
  Width : 94px
  Height: 120px
Text: WIZ810MJ ethernet module.

Id: 72157624798614851
Coords:
  Left  : 254px
  Top   : 177px
  Width : 95px
  Height: 77px
Text: Adapter to allow it to be plugged in to breadboard (the WIZ810MJ has 2mm pitch connectors)

Id: 72157624923256462
Coords:
  Left  : 205px
  Top   : 204px
  Width : 49px
  Height: 67px
Text: 3.3V regulation.

Id: 72157624798617733
Coords:
  Left  : 354px
  Top   : 233px
  Width : 91px
  Height: 72px
Text: From BBC micro 8 bit "User" parallel port.

Id: 72157624923259400
Coords:
  Left  : 103px
  Top   : 299px
  Width : 242px
  Height: 64px
Text: Boarduino to provide 5V power.

Using this information you can update the note using the Flickr API or display the image complete with notes on your own page.
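As a rough sketch of the first option, the scraped values can be turned into the arguments for a flickr.photos.notes.edit call. The function name and the placeholder API key are hypothetical, and the parameter names (note_id, note_x, note_y, note_w, note_h, note_text) are as I understand them from Flickr's API documentation, so check them before relying on this. Authentication and request signing, which Flickr requires for write calls, are left out entirely.

```php
<?php

// Hypothetical helper: assemble the arguments for flickr.photos.notes.edit
// from values scraped off the page. Authentication and signing are NOT
// handled here; this only builds the unsigned argument list.
function build_note_edit_params($api_key, $note_id, array $rect, $text)
{
    return array(
        'method'    => 'flickr.photos.notes.edit',
        'api_key'   => $api_key,
        'note_id'   => $note_id,
        'note_x'    => (int) $rect['left'],   // "253px" becomes 253
        'note_y'    => (int) $rect['top'],
        'note_w'    => (int) $rect['width'],
        'note_h'    => (int) $rect['height'],
        'note_text' => $text,
    );
}

// Values as scraped from the page, units included
$rect = array('left' => '253px', 'top' => '56px', 'width' => '94px', 'height' => '120px');

$params = build_note_edit_params(
    'YOUR_API_KEY',            // placeholder, not a real key
    '72157624923250854',
    $rect,
    'WIZ810MJ Ethernet module.'
);

// Show the arguments as a query string
echo http_build_query($params), "\n";
```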
