On Wikipedia, one of the great tensions is between deletionists and inclusionists. The debate is over whether Wikipedia should include everything, or only things that are important enough for a “real” encyclopedia.

On Freebase, we don’t have this problem. As far as I can tell, Freebase is about as inclusionist as it’s possible to get. I usually express this to people by saying, “As far as I can tell, they want 6 billion person topics — or probably twice that, to allow for dead people.”

On the other hand, the first tension I’m seeing emerge on Freebase is that between normalisers and normal people, and I’m seeing it play out in a few discussions that are going on at the moment.

In the comments on my post about gender I linked to a thread on the data modelling mailing list: Should some of the Person attributes be moved to Organism?

Here’s what I said in comments:

Aristotle: There was a related discussion thread: Should some of the properties of Person be moved to Organism?

Someone facetiously commented, “I immediately began wondering which celebrities would get marked as hermaphroditic or asexually reproducing”, which of course is one of the risks of over-generalising.

I think there’s a tension in Freebase, taking a very broad view, between models that tend towards abstraction and those that tend towards how people think day-to-day. We encountered this when Jeff P was modelling Government, and had an abstraction which was kind of theoretically valid across many government forms, but left some of us (esp. those from countries whose governments use the Westminster system) scratching our heads and saying, “but nobody in Australia/Canada/UK *talks* like that!”

In the case of the sex/gender of human beings, in probably 99% of cases people just want to be able to say male/female, and presenting them with a list that includes “intersex” and “asexually reproducing” would just boggle them. Male/Female/Other is a reasonably well-precedented pattern for use in forms etc. (though I’d want to get some trans people’s views on that to make sure it’s what they’d want) while not being overwhelmingly technical.

As mentioned, we had a similar problem with governments. Jeff Prucher had put together an abstract model that was theoretically workable but which didn’t match well with the way I thought about government. I found myself having to twist my brain to make things fit. I’d get there eventually, but I couldn’t enter the data naturally and smoothly. It was like having to translate into a different language on the fly.

Jeff’s since put together a second government test domain which seems to work better for a wider range of government styles. The differences are relatively small, as far as I can tell without being able to look at the first model (it’s been made private again), but the difference to me as an end user is like night and day.

From what I recall of the two government models and the changes between them, and from other normalisers-vs-normals discussions that I’ve seen, I’d like to propose some techniques that seem to minimise the conflict:

  • Talk like a lay-person. The use of good type and attribute names, ones that don’t require translation by users, makes a model friendlier to “normals”. Even if you make no other design changes, this can make a big difference.
  • Support the vast majority. Like Perl’s motto of “make the easy things easy and the hard things possible”, you should first try to support the vast majority of cases in a way that’s easy to use, then deal with the outliers. Make your main type support the vast majority, and make a second type for the rare misfits and co-type as necessary.
  • Love your CVTs. The extensive use of Compound Value Types seems to be a common feature of models that feel natural to normals while still being elegant data modelling solutions.
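
To make the three techniques a bit more concrete, here is a minimal Python sketch. It is not Freebase's schema or API: the `Topic` and `PositionHeld` classes, the `add_type` method, and the type and property names ("Person", "Politician", `gender`, `positions`) are all invented for illustration. It only assumes that a topic can carry more than one type and that a CVT is a nameless bundle of values attached to a topic.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class PositionHeld:
    """CVT-style record (hypothetical): it has no name of its own and only
    makes sense as a bundle of values hanging off a topic."""
    office: str
    from_year: int
    to_year: Optional[int] = None  # None means "still in office"


@dataclass
class Topic:
    name: str
    # Type name -> that type's property values. A topic can carry several types.
    types: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def add_type(self, type_name: str, **properties: Any) -> None:
        """Co-type the topic: attach a type, leaving the other types untouched."""
        self.types.setdefault(type_name, {}).update(properties)


# The common case stays simple: one main type, plain-language property names.
leader = Topic("Example Person")
leader.add_type("Person", gender="Female")

# The less common case is co-typed on, with a CVT holding the compound value.
leader.add_type(
    "Politician",
    positions=[PositionHeld(office="Prime Minister", from_year=2001)],
)
```

Someone entering an ordinary person only ever sees the "Person" properties; the extra type and its CVT appear only on the topics that need them.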

Anyone got any other suggestions?
