The document discusses Yahoo!'s Placemaker API and how it can be used to extract geographic location information and references from text. Some key points:
- Placemaker uses Yahoo!'s GeoPlanet data to identify places, people, and things within text and link them to their geographic coordinates.
- It returns details on places found, including WOEIDs (unique IDs), names, and coordinates. It also returns references to places found in the text.
- The document provides examples of calling the Placemaker API using PHP and parsing the XML response to extract location information from documents.
1. Place not Space; Geo without MapsFOWA London, October 2009Gary Gale, Yahoo! Geo Technologies
2. PLACES, PEOPLE and THINGS atibens on Flickr : http://www.flickr.com/photos/atibens/2616899638/
3. Knowing where our users are, and the places that are important to them Knowing the geographic context of everything we index, manage and publish Knowing geographic locations, and the names of places We Connect Places, People and Things
5. 85% of all data stored is unstructured This doubles every 3 months 80% of all data contains a geo reference Source: Gartner Group Mr Faber on Flickr : http://www.flickr.com/photos/mrfaber/247946146/
6. MINE THAT CONTENT tjblackwell on Flickr : http://www.flickr.com/photos/tjblackwell/3652375290/
11. Placemaker Parameters appid 100% mandatory inputLanguage en-US, fr-CA, … outputType XML or RSS documentContent text to geoparse documentTitle optional title documentURL URL to geoparse documentType MIME type of doc autoDisambiguate remove duplicates focusWoeid filter around a WOEID
13. Unique Permanent Global Language Neutral London = Londra = Londres = ロンドン United States = États-Unis = StatiUniti = 미국 Ensures that geography can be employed consistently and globally straup on Flickr : http://www.flickr.com/photos/straup/3504862388/
26. <reference> <woeIds>44418</woeIds> <start>1079</start> <end>1089</end> <isPlaintextMarker>1</isPlaintextMarker> <text><![CDATA[London, UK]]></text> <type>plaintext</type> <xpath><![CDATA[]]></xpath> </reference> <reference> <woeIds>44418</woeIds> <start>1116</start> <end>1126</end> <isPlaintextMarker>1</isPlaintextMarker> <text><![CDATA[London, UK]]></text> <type>plaintext</type> <xpath><![CDATA[]]></xpath> </reference> Two references for WOEID 44418 Two references for WOEID 44418
27. // turn into an PHP object and loop over the results $places = simplexml_load_string($placemaker, 'SimpleXMLElement', LIBXML_NOCDATA); if($places->document->placeDetails){ $foundplaces = array(); // create a hashmap of the places found to mix with // the references found foreach($places->document->placeDetails as $p){ $wkey = 'woeid'.$p->place->woeId; $foundplaces[$wkey]=array( 'name'=>str_replace(', ZZ','',$p->place->name).'', 'type'=>$p->place->type.'', 'woeId'=>$p->place->woeId.'', 'lat'=>$p->place->centroid->latitude.'', 'lon'=>$p->place->centroid->longitude.'’ ); } }
The placeDetails container defines a place – there’s one per place found – it holdsWOEID – the unique identifier for the placetype – the place type name for the placename – the fully qualified name of the placecentroid – the centroid coordinatesmatchType – the type of match (0=text/text & coordinates, 1=coordinates only)weight – relative weight of the place within the documentconfidence – confidence that the document mentions the place
WOEID – list of WOEIDs referencing the placestart & end – index of first and last character in the place reference or -1 if type is XPathisPlaintextMarker – flag indicating if the reference is plain texttext – the actual place referencetype – type of reference – plaintext, Xpath, Xpathwithcountsxpath – xpath of the reference
So now we have the places, their references and their WOEIDs we can easily hook into services which understand WOEIDsSuch as FlickrBecause not only does Flickr love you, it also knows about WOEIDs, as this YQL fragment shows
But what about those services that don’t speak WOEID fluently?Looking back at the place definitions, we have WOEIDs.Well each WOEID has metadata attributes associated with it, such as the centroid of a place with the longitude and latitudeAnd because geo should be technologically agnostic, so must we, so with these coordinates we can use other services, such as Google Earth