Using freebase to help with dbpedia searches
I wrote this response to a thread on a mailing list, but I can't find my reply anywhere in SourceForge's online archive (I did receive it by email). I would have expected it to appear on this archived thread.
So here it is again, at a place I can link to.
On 21 Apr 2008, at 14:40, robl wrote:
SELECT * FROM pages WHERE page_title LIKE "Queen%Elizabeth"
This would perform a case insensitive match on Queen(anything) Elizabeth (at least in mySQL).
...
Is there quick way to do what I want ? Are there any indexes I could
apply to improve things (I have already created the indexes
specified at
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com's%20BLOG%20%5B127%5D/1298)
?
Or do I need to create a conventional SQL table of resource names and
then do a SQL LIKE query on those ?
You might also want to check out freebase. Here's the approach I'm about
to attempt, myself. Start with a reconciliation query:
http://sandbox.freebase.com/dataserver/reconciliation/?name=Queen+Elizabeth&types=%2Fpeople%2Fperson&responseType=html
- the reconciliation service handles misspellings and other variations
- s/html/json/ for the machine readable version
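Here's a minimal sketch of building that reconciliation URL in Python. The
endpoint and the three parameter names are copied from the example URL above;
the helper name reconciliation_url and its defaults are my own, and since the
service is new, the endpoint and parameters may well change.

```python
from urllib.parse import urlencode

# Endpoint copied from the example URL above. The service is new, so this
# location and its parameters are assumptions that may not stay valid.
RECONCILIATION_ENDPOINT = "http://sandbox.freebase.com/dataserver/reconciliation/"


def reconciliation_url(name, type_id="/people/person", response_type="json"):
    """Build a reconciliation query URL (hypothetical helper, not a freebase API)."""
    params = {
        "name": name,                   # free-text name to reconcile
        "types": type_id,               # freebase type to restrict matches to
        "responseType": response_type,  # "html" for browsing, "json" for code
    }
    # urlencode escapes spaces as "+" and "/" as "%2F", as in the example URL.
    return RECONCILIATION_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(reconciliation_url("Queen Elizabeth", response_type="html"))
```

Called with response_type="html", this reproduces the URL above exactly; the
default gives the json version.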
Then look at the freebase page or perform a query:
http://www.freebase.com/view/en/elizabeth_ii_of_the_united_kingdom
That page has this link:
http://en.wikipedia.org/wiki/index.html?curid=12153654
On that page, we have
<a href="http://en.wikipedia.org/wiki/Elizabeth_II_of_the_United_Kingdom">article</a>
Maybe freebase can just hand us that link instead of the curid one. I
haven't gotten to that part of my code yet. I don't know how often the
last path segment of the freebase URI matches the WP article name, so
deriving the name from it seems like the least reliable option.
Following freebase's designated WP link is probably more robust.
Finally, take the wiki name, and make a dbpedia URI:
http://dbpedia.org/page/Elizabeth_II_of_the_United_Kingdom
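A minimal sketch of that last step, assuming the Wikipedia link has the
/wiki/<title> shape shown above and that dbpedia uses the same title under
http://dbpedia.org/page/. The function name dbpedia_uri_from_wikipedia is
hypothetical, and the curid-style link is rejected because it carries no
article name.

```python
from urllib.parse import urlparse

WIKI_PREFIX = "/wiki/"
DBPEDIA_PAGE_PREFIX = "http://dbpedia.org/page/"


def dbpedia_uri_from_wikipedia(url):
    """Turn a Wikipedia article URL into a dbpedia page URI (hypothetical helper)."""
    path = urlparse(url).path
    if not path.startswith(WIKI_PREFIX):
        raise ValueError("not a /wiki/ article URL: " + url)
    title = path[len(WIKI_PREFIX):]
    # The curid form (/wiki/index.html?curid=...) has no article name in it.
    if not title or title == "index.html":
        raise ValueError("URL does not contain an article name: " + url)
    # Assumption: dbpedia reuses the Wikipedia title unchanged.
    return DBPEDIA_PAGE_PREFIX + title


if __name__ == "__main__":
    print(dbpedia_uri_from_wikipedia(
        "http://en.wikipedia.org/wiki/Elizabeth_II_of_the_United_Kingdom"))
```

For the article link above this prints the dbpedia URI given in the text.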
You probably noticed that elizabeth_ii_of_the_united_kingdom wasn't the
first result for 'Queen Elizabeth' of type /people/person. I'm not sure
if freebase considers that a bad result page or not. The reconciliation
service is new, so now's probably a great time to tell them how
important good results are to you :)