Notes on Freebase, the free database of everything
8 May
As many of you know (all, like, 12 subscribers or whatever there are here), I took a job with Metaweb, who run Freebase, in January this year. So since then I’ve been in charge of the Freebase blog and haven’t been blogging here.
I’ve been wondering what to do with this domain, and I’ve decided to keep it running and to use it for personal/unofficial notes on Freebase-related topics. While the Freebase blog is a sort of official channel, I’ll use freebasing.org for less formal topics.
So, I suppose a disclaimer is in order here: Views and opinions posted on this blog do not necessarily represent those of my employer.
28 Nov
A handful of new Freebase interest groups have been added lately:
Remember, if you join any of those groups, to subscribe to the RSS feed for discussions.
Tags:community freebase globalism metaweb mjt modelling
13 Nov
I’m just pottering around filling out my new Historic Period type, and looking up the dynasties of Ancient Egypt. For each dynasty, I can note its start and end dates, which other dynasties preceded and succeeded it, and what important events occurred or people lived in the period.
The fun comes when you get to filling in the important people. Take Nefertiti for example. Spouse: Amenhotep IV. Employment history: Consort. But who’s her employer? I put down Amenhotep, but I’m not quite sure that’s right. If Nefertiti, as consort, was employed by the pharaoh of the time, then is Prince Philip employed by Queen Elizabeth II?
I love it when Freebase gets mushy around the edges.
Tags:egypt freebase metaweb nefertiti
6 Nov
Do you have any idea how little it takes to turn Freebase into a social networking application?
A few weeks ago I created a type called Freebase Interest Group and now I’m pleased to be able to announce that it’s been promoted into the public namespace as /freebase/freebase_interest_group
What’s an interest group?
It’s a group of people who have an interest in a subject, a project, or any other common area. You can create one simply by clicking on “Add a new Freebase interest group” on the type page. Currently there are groups for Australian Freebasers, Women of Freebase, Perl people on Freebase, and Users interested in argument modelling. The first three of those are ones I set up as part of the demo, but you should feel free to create anything that you want.
What does an interest group do?
It depends on the group. It might just be a way to find like-minded Freebasers, or it might be Command HQ for a Freebase project. Let’s say you wanted to rally all the trainspotters on Freebase and get them to fill out data about trains: setting up a Trainspotters interest group would be one way to go about it. You’d convince other trainspotters to join the group, fill out the topic’s article with your plan of action, note any types or topics of interest, and discuss the ongoing work.
How do I keep up with my interest groups?
You can subscribe to the RSS feed for the discussion on any interest group. While looking at the topic page for a given interest group, you should see the orange RSS logo in your browser. Just subscribe to “Discussions about (your interest group)” and any discussions will show up in your usual feed reader.
Alternatively, you can “watch” discussions, and they’ll show up on your Freebase homepage.
Tags:community freebase metaweb
23 Oct
You know how when you look at a topic, some fields are in the main area of the screen just near the image, and some are on the right hand side?
I used to think that anything that permitted multiple values went on the right, but it turns out that’s not so. Here’s the skinny on what’s really happening, thanks to Jeff:
You can actually specify which column a property appears in in the schema editor. Left is “horizontal” (because multiple values are displayed next to each other, separated by a comma), right is “vertical” (because multiple values are displayed above and below each other). The default layout is that properties for which the expected type is a primitive type (date/time, integer, text, etc.) display on the left, and everything else displays on the right.
(from this discussion about countries’ official languages.)
So, to summarise:
System primitives on the left.
Horizontal multiples on the left.
Vertical multiples on the right.
Non-primitives, even if single, on the right.
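If it helps to see that rule as code, here’s a minimal sketch in Python. It isn’t how Freebase actually implements it: the function name default_column, the override argument (standing in for a column picked by hand in the schema editor), and the short list of primitive type ids are all assumptions made up for illustration.

```python
# Illustrative set of primitive type ids; an assumption, not a complete list.
PRIMITIVE_TYPES = {"/type/datetime", "/type/int", "/type/text"}


def default_column(expected_type, override=None):
    """Return "left" or "right" for a property.

    override stands in for a column chosen by hand in the schema editor;
    when it is not given, the default rule from the quote applies.
    """
    if override in ("left", "right"):
        return override
    # Primitives default to the left column, everything else to the right.
    return "left" if expected_type in PRIMITIVE_TYPES else "right"


print(default_column("/type/datetime"))
print(default_column("/people/person"))
print(default_column("/people/person", override="left"))
```

The override comes first because, as Jeff says, the schema editor setting wins over the default.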
8 Oct
On Wikipedia, one of the great tensions is between deletionists and inclusionists. The debate is over whether Wikipedia should include everything, or only things that are important enough for a “real” encyclopedia.
On Freebase, we don’t have this problem. Freebase is about as inclusionist as it’s possible to get. I usually express this to people by saying, “As far as I can tell, they want 6 billion person topics — or probably twice that, to allow for dead people.”
On the other hand, the first tension I’m seeing emerge on Freebase is that between normalisers and normal people, and I’m seeing it play out in a few discussions that are going on at the moment.
In the comments on my post about gender I linked to a thread on the data modelling mailing list: Should some of the Person attributes be moved to Organism?
Here’s what I said in comments:
Aristotle: There was a related discussion thread: Should some of the properties of Person be moved to Organism?
Someone facetiously commented, “I immediately began wondering which celebrities would get marked as hermaphroditic or asexually reproducing”, which of course is one of the risks of over-generalising.
I think there’s a tension in Freebase, taking a very broad view, between models that tend towards abstraction and those that tend towards how people think day-to-day. We encountered this when Jeff P was modelling Government, and had an abstraction which was kind of theoretically valid across many government forms, but left some of us (esp. those from countries whose governments use the Westminster system) scratching our heads and saying, “but nobody in Australia/Canada/UK *talks* like that!”
In the case of sex/gender of human beings, there’s probably 99% of cases where people just want to be able to say male/female, and confusing them with a list that includes “intersex” and “asexually reproducing” would just boggle them. Male/Female/Other is a reasonably well-precedented pattern for use in forms etc (though I’d want to get some trans people’s views on that to make sure it’s what they’d want) while not being overwhelmingly technical.
As mentioned, we had a similar problem with governments. Jeff Prucher had put together an abstract model that was theoretically workable but which didn’t match well with the way I thought about government. I found myself having to twist my brain to make things fit. I’d get there eventually, but I couldn’t enter the data naturally and smoothly. It was like having to translate into a different language on the fly.
Jeff’s since put together a second government test domain which seems to work better for a wider range of government styles. The differences are relatively small, as far as I can tell without being able to look at the first model (it’s been made private again), but the difference to me as an end user is like night and day.
From what I recall of the two government models and the changes between them, and from other normalisers-vs-normals discussions that I’ve seen, I’d like to propose some techniques that seem to minimise the conflict:
Anyone got any other suggestions?
2 Oct
If you go to edit a Person in Freebase, there’s a field for “gender”. The current choices are “male” or “female”, and it’s a fundamental Freebase system type, which means that you can’t add new ones. Nor can you choose more than one.
Regardless of whether you buy into it or not (and you won’t be surprised to hear that I do), there are many people, groups, and indeed entire cultures that have more complex models of gender than a simple binary model.
I’m not actually an expert on this, just a reasonably well-informed layperson, but obviously it’s way more complicated than the simple male/female binary currently expressed on Freebase.
My feeling is that as a first step, “Gender”, the system type, should have an “Other” option (if not a more complete list). The next step, I suspect, is to create a “Transgender Person” type, which allows for gender identity to change over time.
I’m playing with some of this in a personal domain if you’d like to come comment or take a look.
Tags:freebase gender genderqueer intersex metaweb queer transgender
1 Oct
MQL, the Metaweb Query Language, is what we use to talk to any Metaweb database such as Freebase. It’s the guts of any Metaweb API, such as mjt or my own Metaweb CPAN module for Perl, but you can also use it standalone in the Query Editor.
I want to run a series of tutorials on MQL, taking a cookbook approach. Essentially, as I learn stuff, I’ll post it here in tutorial form.
First step, open up the Query Editor in your browser. The empty box on the left is where you type your MQL. Then you click “Read” and the response will show up on the right.
Here’s a simple query to cut and paste:
{
  "query":[{
    "type":"/people/person",
    "name":null
  }]
}
This query is asking for a list of all objects of type “/people/person”, and saying that we want to know their names. The overall format of the query uses the JavaScript Object Notation, or JSON. For the most part, the format consists of name/value pairs, and some syntax for grouping lists of those pairs together.
Let’s take our example and break it down further.
"query":[{ ... }]
All MQL queries need this wrapper around them, to group the query parameters together.
(If you’re using an API rather than the Query Editor, you may also need another envelope around this; see the API docs for details.)
"type":"/people/person"
Here we limit our search results by the type of object we’re interested in. The type is expressed in the form “/domain/type”, which is how the system refers to types under the hood. If you’re not sure of the underlying system name for a type, you can find it by browsing the Data page and getting to the type you want, then looking at the URL. The last two segments of the URL will be the domain and type, e.g. “http://www.freebase.com/view/filter/people/person” gives “/people/person”.
"name":null
This says, “We want to know the name, but we don’t have any particular constraints on it.” So the results we get will list all the names of every object returned.
Now you can click the “read >>” button and see the results. It should look something like this:
{
  "code":"/api/status/ok",
  "result":[{
    "name":"Jack Abramoff",
    "type":"/people/person"
  },{
    "name":"Bob Ney",
    "type":"/people/person"
  },{
    "name":"David Safavian",
    "type":"/people/person"
  },{
    "name":"Kåre Kristiansen",
    "type":"/people/person"
  },
… and so on. You can click on different views, eg. “Tree View”, “JSON View”, etc. The one I’ve pasted is the JSON view.
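If you’d rather pull the names out of a response in code than read them off the screen, here’s a minimal sketch in Python. It assumes the response is a JSON object with the "code" and "result" keys seen in the output above. The function name extract_names and the trimmed-down sample string are made up for illustration, and nothing here actually talks to Freebase.

```python
import json

# Sample response text, cut down to two of the results shown in the post.
SAMPLE_RESPONSE = """
{
  "code": "/api/status/ok",
  "result": [
    {"name": "Jack Abramoff", "type": "/people/person"},
    {"name": "Bob Ney", "type": "/people/person"}
  ]
}
"""


def extract_names(response_text):
    """Return the "name" of every result, or raise if the status isn't ok."""
    response = json.loads(response_text)
    if response.get("code") != "/api/status/ok":
        raise ValueError("query failed: %r" % response.get("code"))
    return [item["name"] for item in response["result"]]


names = extract_names(SAMPLE_RESPONSE)
print(names)
```

Checking "code" first means a failed query surfaces as an error rather than as an empty list of names.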
So that’s the most basic form of query. Now you can try some queries of your own:
"id":null
"limit":10
"sort":"name"
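If you’re building these queries from a program rather than typing them into the Query Editor, the three additions above can be sketched like this in Python. It’s a rough sketch only: build_query is a made-up helper name, and putting "limit" and "sort" inside the same clause as "type" and "name" is my assumption about where they belong, so check the API docs before relying on it.

```python
import json


def build_query(type_id, **extras):
    """Wrap a single query clause in the {"query": [ ... ]} envelope."""
    clause = {"type": type_id, "name": None}  # None is written out as JSON null
    clause.update(extras)  # e.g. id=None, limit=10, sort="name"
    return {"query": [clause]}


query = build_query("/people/person", id=None, limit=10, sort="name")
query_text = json.dumps(query, indent=2)
print(query_text)
```

The printed text is what you’d paste into the left-hand box of the Query Editor.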
What I’ve covered in this tutorial corresponds roughly to the most basic parts of section 3 of the API docs. If you want to read more, that’s where to do it.
Tags:api freebase json metaweb mql programming query tutorial
30 Sep
The other day, while organising the Melbourne, Australia Freebase gathering, I found myself wanting to know all the Freebase users in Melbourne and, ideally, their email addresses.
Well, the first part is easy enough. Here’s the MQL for it:
{
  "query":[{
    "location":"Melbourne, Australia",
    "name":null,
    "type":"/freebase/user_profile"
  }]
}
The email address, however, proved impossible. Freebase doesn’t store email addresses anywhere publicly available, and I guess I can see why. The potential for spam is greater even than putting an email address on an ordinary website, because the spammers wouldn’t even have to crawl the site. They could just programmatically ask for a list of email addresses:
{
  "query":[{
    "name":null,
    "type":"/internet/email_address"
  }]
}
Or, if they were actually trying to market something to people who cared, say women of a certain age group:
{
  "query":[{
    "name":null,
    "type":"/internet/email_address",
    "person": {
      "sex":"female",
      "a:date_of_birth>":1970,
      "b:date_of_birth<":1980
    }
  }]
}
(Code is an example only, untested and untestable, and almost certainly syntactically incorrect in the date constraints because I don’t think dates work like that.)
Now I, for one, would actually prefer to get spam about drugs for menstrual cramps rather than ones offering me a larger penis, but most other people seem to disagree, and consider it a huge invasion of privacy to think of spammers being able to search on their personal details.
On the other hand, there are some valid reasons why you might want to store email addresses in Freebase. Contacting Freebase users is one reason: I would quite like other Freebasers to be able to email me if they want to get in touch privately, and I expect that many of those Melbourne users would’ve liked to know about the upcoming meeting. You’d want the publicity of a user’s email address to be configurable by them, and private by default, but it’s by no means an obscure use case.
Another reason is that some types might fundamentally be about email. I was modelling CPAN Authors a while ago, and one of the very few data points recorded about contributors to CPAN is their email address. That information is already available in a public database of sorts, so there’s no real expectation of privacy there. Or what about mailing lists? A model for “mailing list” might reasonably want to store “posting address”, “subscription address”, and “admin address”. Or it might be nice to store email contacts for political representatives, so that their constituents can email them about issues of importance.
I’m sure there are a thousand reasons to want to record email addresses in Freebase, and there’s only one reason not to… but it’s a biggie. I wonder what solutions to this problem will emerge?
Tags:anti spam email freebase metaweb privacy spam
25 Sep
I’m in the early stages of planning a Melbourne Freebase meetup/gathering/hack/play/etc session, to be held at Horse Bazaar in the CBD, probably on October 9th. More details to follow, but if you happen to be in Melbourne and are interested in attending, drop me a line at skud@infotrope.net (or comment here).
Tags:freebase meetups metaweb user groups