Notes on Freebase, the free database of everything
10 Sep
I recently spent a year working for a real estate search engine company that dealt with property listings in a number of different countries, ranging from England to Indonesia. While I was there I learnt a few things about location data which I think have some bearing on Freebase.
The most important thing is that concepts of location are culturally dependent. In Australia, like the US, we have States. In Canada it’s provinces. In the UK it’s counties. In Fiji, they don’t have any such administrative divisions, just a bunch of islands. If you’ve ever tried to order something online to ship to another country you’ll know what it’s like. “State? I don’t have a state! And what’s this zipcode thing?”
When the original version of the real estate search database was designed — long before my time — the people involved were only really thinking about Australia. They decided that every listing would have a suburb, a state, and a postcode of 4 digits. Obviously this soon started breaking when the company started spreading into other countries. When the new database design was made, just about the only thing they found common among all the culturally diverse ideas of location was this: Locations may contain, or be contained by, other locations. You can see this reflected in Freebase’s Location type — along with a few other attributes, such as “adjoins” and “area”.
It’s only when you start getting into culturally specific ideas of location that you see things like “capital city” or “postal code” or “governor” — attributes that reflect anything other than the pure geometry of the space.
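To make that concrete, here is a minimal sketch of the idea in Python. It is not the company's schema or Freebase's actual Location type; the class names, the `add_child` and `ancestors` helpers, and the placeholder place names are all made up for illustration. The base type knows only about containment, and the culturally specific attribute (a postcode) lives on a subtype.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity hashing, so locations can go in sets
class Location:
    """A place described only by what it contains and what contains it."""
    name: str
    contains: "set[Location]" = field(default_factory=set)
    contained_by: "set[Location]" = field(default_factory=set)

    def add_child(self, child: "Location") -> None:
        # Record the relationship in both directions.
        self.contains.add(child)
        child.contained_by.add(self)

    def ancestors(self) -> "set[Location]":
        # Walk upwards through every parent; a location may have several.
        seen = set()
        stack = list(self.contained_by)
        while stack:
            loc = stack.pop()
            if loc not in seen:
                seen.add(loc)
                stack.extend(loc.contained_by)
        return seen


@dataclass(eq=False)
class AustralianSuburb(Location):
    """Hypothetical culturally specific subtype: adds a postcode."""
    postcode: str = ""


# Placeholder names, not real places.
state = Location("Some State")
council_a = Location("Council A")
council_b = Location("Council B")
suburb = AustralianSuburb("Some Suburb", postcode="0000")

state.add_child(council_a)
state.add_child(council_b)
# Nothing forces a strict tree: one suburb can sit inside two councils.
council_a.add_child(suburb)
council_b.add_child(suburb)

print(sorted(loc.name for loc in suburb.ancestors()))
```

Because containment is stored as a set in each direction rather than as a single parent pointer, the model does not assume a strict hierarchy.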
I’ve been messing around a bit with Australian-specific location types, which include “Location” or, when appropriate, “Administrative Division”. You can see them here: Australian State, Australian Territory, and Australian Municipality (which I might rename to Local Government Area, I’m not sure.)
Tags: australia, culture, freebase, location, metaweb, postal codes
5 Responses for "Freebase Barbie says, “Location is hard!”"
The base location names imported from Wikipedia seem to be for the suburb. Having a suburb that’s split between councils contained by both seems to work, but it does make me wonder what the location hierarchy for Australia should look like. There’s generally not a correspondence between state electoral districts and local government areas, so that’s out. Should states contain towns, cities and suburbs directly, or should they contain areas, including e.g. Perth Metropolitan Area?
I know; I’m not quite sure what to do about that. There’s some discussion going on (I think in the forums in FB itself, somewhere, perhaps in Location?) about wishing there was some way to represent polygons in FB. That seems like just about the only way to represent a lot of Australian geography, since it’s not strictly hierarchical. Argh!
> wishing there was some way to represent polygons in FB.
I’ve done some experiments adding KML or GeoRSS /type/content objects to locations, along with a bounding rect CVT, with mixed success. My goal was to write a “hello world” of sorts: a mashup that identified which locations applied to a query using the bounding rect, then plotted the GeoRSS for those locations on a Google Map.
I’m sure someone more capable than I could really make this work.
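For what it's worth, the bounding-rect step might look roughly like the Python sketch below. This is not the Freebase API or the actual CVT; `BoundingRect`, `locations_for_query`, the region names and the coordinates are all invented for illustration, and it assumes no rectangle crosses the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingRect:
    """Hypothetical bounding rectangle, in decimal degrees."""
    south: float
    west: float
    north: float
    east: float

    def intersects(self, other: "BoundingRect") -> bool:
        # Two rectangles overlap when they overlap on both axes.
        # Assumes neither rectangle wraps across the 180 degree meridian.
        return (
            self.west <= other.east
            and other.west <= self.east
            and self.south <= other.north
            and other.south <= self.north
        )


def locations_for_query(query: BoundingRect, rects: dict) -> list:
    """Return the names of all locations whose rectangle touches the query."""
    return [name for name, rect in rects.items() if rect.intersects(query)]


# Invented regions and coordinates; overlapping regions need not nest.
rects = {
    "Voting district": BoundingRect(-32.0, 115.7, -31.8, 115.9),
    "Census region": BoundingRect(-31.9, 115.8, -31.7, 116.0),
    "School district": BoundingRect(-33.5, 118.0, -33.0, 118.5),
}
query = BoundingRect(-31.95, 115.85, -31.85, 115.95)

matches = locations_for_query(query, rects)
print(matches)
```

A bounding rect only narrows the candidates; deciding whether a point really falls inside an irregular region would still need the polygon itself.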
Many of the interesting location queries I can think of don’t assume hierarchical relationships between the regions (voting districts x census regions x school districts, etc.)
And let’s not get started with enclaves and exclaves!