Diarybot
I started a new project using jabber and SIOC. It's a jabber bot that records sioc:Post entries in an RDF store.
I also used wokkel, twisted, and rdflib.
The idea is that you make a new bot account and have some users subscribe to it. Anything they say to the bot is logged (the diary part), and the bot will send out nag messages if it hasn't heard from anyone in a while. Each user's posts are broadcast to all the other users.
Kelsi and I tried it out, and I noticed an unintended usage pattern: when I posted something, she followed up with an elaboration. This is natural since she gets my message as a broadcast from the bot, and answering it makes a new post. Someday when I get around to finding or writing a SIOC viewer for this stuff, it would be nice to connect the entries that were made very close together.
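A rough sketch of how a viewer might connect entries that were made close together. The 10-minute gap, the tuple layout, and the second (follow-up) post are assumptions for illustration; only the first and last posts come from the log.

```python
from datetime import datetime, timedelta

def group_nearby_posts(posts, max_gap=timedelta(minutes=10)):
    """Group (created, creator, content) tuples into runs where each post
    follows the previous post in the run by at most max_gap."""
    groups = []
    for post in sorted(posts, key=lambda p: p[0]):
        if groups and post[0] - groups[-1][-1][0] <= max_gap:
            groups[-1].append(post)   # close enough: same conversation
        else:
            groups.append([post])     # too far apart: start a new run
    return groups

posts = [
    (datetime.fromisoformat("2009-09-19T20:59:33-07:00"), "drewp", "good evening diary"),
    # hypothetical follow-up, not in the real log
    (datetime.fromisoformat("2009-09-19T21:02:10-07:00"), "kelsi", "an elaboration"),
    (datetime.fromisoformat("2009-09-19T22:18:00-07:00"), "drewp", "weather is looking fine"),
]
groups = group_nearby_posts(posts)
```

Each run is compared against its most recent post, so a long back-and-forth stays in one group even if it lasts longer than the gap.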
Here's what the log looks like in n3:
@prefix : <http://rdfs.org/sioc/ns#> .
@prefix XML: <http://www.w3.org/2001/XMLSchema#> .
@prefix dc: <http://purl.org/dc/terms/> .
<http://example.com/forum> :container_of [
    a :Post;
    dc:created "2009-09-19T20:59:33.33-07:00"^^XML:dateTime;
    dc:creator "drewp@jabber.bigasterisk.com/Coccinella@dash";
    :content "good evening diary" ],
  [
    a :Post;
    dc:created "2009-09-19T22:18:00.12-07:00"^^XML:dateTime;
    dc:creator "drewp@jabber.bigasterisk.com/Coccinella@dash";
    :content "weather is looking fine" ] .
SIOC says to use sioc:has_creator which points to a sioc:User, but I haven't gotten all those connections done yet. The code also doesn't know what forum URI to use for the bot. The log store is only appended to, but the code currently rewrites the whole thing every time, which is a waste (and a risk!).
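One way to avoid rewriting the whole log would be to append each post as N-Triples lines, since that format is one statement per line. This is only a sketch, not what the bot does: the function names are made up, the file is assumed to be N-Triples instead of n3, and the literal escaping covers just the common cases.

```python
import uuid

SIOC = "http://rdfs.org/sioc/ns#"
DC = "http://purl.org/dc/terms/"
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def nt_literal(text):
    """Quote a string as an N-Triples literal (common escapes only)."""
    for raw, escaped in (("\\", "\\\\"), ('"', '\\"'),
                         ("\n", "\\n"), ("\r", "\\r"), ("\t", "\\t")):
        text = text.replace(raw, escaped)
    return '"%s"' % text

def append_post(path, forum_uri, created, creator, content):
    """Append one sioc:Post to the log file, leaving earlier lines untouched."""
    post = "_:b" + uuid.uuid4().hex  # fresh blank node label for each post
    lines = [
        "<%s> <%scontainer_of> %s ." % (forum_uri, SIOC, post),
        "%s <%s> <%sPost> ." % (post, RDF_TYPE, SIOC),
        "%s <%screated> %s^^<%sdateTime> ." % (post, DC, nt_literal(created), XSD),
        "%s <%screator> %s ." % (post, DC, nt_literal(creator)),
        "%s <%scontent> %s ." % (post, SIOC, nt_literal(content)),
    ]
    # Mode "a" only ever adds to the end, so a crash can't damage old entries.
    with open(path, "a", encoding="utf-8") as log:
        log.write("\n".join(lines) + "\n")
```

The random blank node label matters because all the posts end up in one document, where reusing a label would merge two posts into one node.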
I'm thinking about adding a separate mode where each person writes his own local diary with one bot, and there's no broadcasting or sharing of messages between the users. In other words, it would be like you made separate bot accounts bot_user1, bot_user2, etc, except you wouldn't have to actually make all those accounts.
Newsbruiser atom feeds fixed
Something was wrong with the atom feed for one of my newsbruiser instances, and I think the working feed had some problems under bloglines. I'm trying to get off newsbruiser anyway, so I finally wrote an RDF exporter program for it. With an RDF version of the entries, it's really easy to make an atom feed (and hopefully I got it right).
My exporter program imports the actual newsbruiser code, fakes enough startup and config material to get going, and then adds statements about the notebook and entries to an rdflib graph. Man, does newsbruiser spend a ton of code on config and option nonsense.
My blog is currently 963 statements, and Kelsi's blog is 1011 statements. I just store them as NT files. They're about 300kB each.
The atom generator code uses rdflib but not newsbruiser. It's this SPARQL query:
SELECT ?e WHERE {
?blog sioc:links_to ?e .
?e td:modified ?mod .
} ORDER BY desc(?mod) LIMIT 5
with one more pattern if I'm trying to make a feed for a particular category. Then I use lxml's ElementMaker like the slides told me to.
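For a feel of the last step, here is a minimal sketch that turns query results into Atom XML. It swaps in the standard library's ElementTree in place of lxml's ElementMaker, and the entry dict keys ("uri", "title", "modified") are assumptions, not the real generator's data model. It also leaves out elements a complete feed needs, such as an author.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

def atom_feed(feed_id, title, updated, entries):
    """Build a small Atom document from a list of entry dicts."""
    ET.register_namespace("", ATOM)  # serialize Atom as the default namespace
    feed = ET.Element("{%s}feed" % ATOM)

    def add(parent, tag, text=None, **attrib):
        el = ET.SubElement(parent, "{%s}%s" % (ATOM, tag), attrib)
        el.text = text
        return el

    add(feed, "id", feed_id)
    add(feed, "title", title)
    add(feed, "updated", updated)
    for e in entries:  # expected newest-first, like the ORDER BY in the query
        entry = add(feed, "entry")
        add(entry, "id", e["uri"])
        add(entry, "title", e["title"])
        add(entry, "updated", e["modified"])
        add(entry, "link", href=e["uri"])
    return ET.tostring(feed, encoding="unicode")
```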
SPARQL endpoint for movie showtimes
I put up my first SPARQL endpoint, which is a standard way to allow queries about structured data. The dataset is a tiny collection of movie showtimes for the current day and for movies playing near my house.
I built it in response to this plea for more fresh data. My data is not especially exciting, but I did want to have a SPARQL endpoint for that project anyhow. Now I need to make the rest of the project use its own endpoint, to completely decouple the data-gathering code from the display code.
Here is a table of some other endpoints, although I think it's silly to try to maintain a wiki page of such things. That's what automatic search engines are for.
Even better is this recent slideshow that describes several SPARQL queries on interesting real-world data sources.
SHDH 29
I went to SHDH at Sandbox Suites last night. Here I am talking with Joel about the signin system:
I am trying to get the SHDH crew to use FOAF and RDF as the way to keep track of guests and their interests. FOAF includes one nice idea which is to treat email addresses sort of like passwords and only store their hashes. I wrote a demo of how to take an email address, hash it, use an RDF search engine to find documents with that hash, and search those documents for a full name to go with the email.
getNameFromEmail code and docs
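The hashing step is small enough to sketch here. FOAF's mbox_sha1sum is the SHA1 of the mailbox URI, including the "mailto:" prefix. The lookup below is only a stand-in for the RDF search engine step: the people list and the name_for_email helper are hypothetical, and this is not the getNameFromEmail code itself.

```python
import hashlib

def mbox_sha1sum(email):
    """Hash an address the way foaf:mbox_sha1sum does: SHA1 of the mailto: URI."""
    return hashlib.sha1(("mailto:" + email.strip()).encode("utf-8")).hexdigest()

def name_for_email(email, people):
    """Return the name stored with a matching hash, or None if nobody matches."""
    wanted = mbox_sha1sum(email)
    for person in people:
        if person["mbox_sha1sum"] == wanted:
            return person["name"]
    return None

# Hypothetical stand-in for documents found by an RDF search engine.
people = [
    {"name": "Example Guest", "mbox_sha1sum": mbox_sha1sum("guest@example.com")},
]
```

The signin system would only need to keep the hash, so the guest list could be published without giving the addresses to spammers.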
Then I looked at CruiseControl for a while. It might be cool, but it unfortunately falls in the class of tools where the configuration is so hard, you might be able to use standard unix tools to get most of what you want a lot faster. Nagios is also like this (but munin does ok).
SIOC competition
I won 3rd place in the SIOC Data Competition. One of the challenging things was finding the announcement email among spam that made it to my inbox:
Microsoft Windo (3.7K) Winning number................................YM/09788/60
National Lotter (1.7K) Winning Notice !
Breslin, John (1.7K) Congratulations! 3rd Prize in the boards.ie SIOC Data Competition
admin@national- (8.9K) Final Winning Notification
vinyl907@netvig (2.6K) Email Address Have Won
united nations (3.5K) PART-PAYMENT VALUED $8.3M UNITED STATES DOLLARS
UEFA LEAGUE PRO (3.2K) END OF THE YEAR AWARD!!!
Here's my entry, which has links to its own source code and the other tools I used.
Graphs from sparql results
This is a response to Download SPARQL results directly into a spreadsheet
So far you've motivated seeing the results of a query in a table and making a graph from them. I'd like to have both of those capabilities in a webapp. E.g. I should be able to embed a live graph in my own page like this:
<img src="http://sparqlgrapher.com/svg/example.com/query=SELECT+?date+?price+{...}">
Visiting my hypothetical sparqlgrapher.com directly would give you a UI to layout and customize the graph. When you're done, you'd take that url and embed it elsewhere (or just take a copy of the image, if you want a one-off).
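A real version of that URL would need the query escaped, since "?" and "{" aren't safe to leave raw in a path. Here's a small sketch of building it. The path layout just copies the made-up example above, and graph_image_url is a hypothetical helper, since sparqlgrapher.com doesn't exist.

```python
from urllib.parse import quote_plus

def graph_image_url(endpoint_host, sparql):
    """Build an embeddable image URL for a query against endpoint_host."""
    # quote_plus turns spaces into "+" and escapes "?", "{", "}" and friends.
    return "http://sparqlgrapher.com/svg/%s/query=%s" % (
        endpoint_host, quote_plus(sparql))

url = graph_image_url("example.com", "SELECT ?date ?price { ?s ?p ?o }")
```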
rdflib vs jena graph creation APIs
I actually looked at the jena RDF API today, and I was interested to see how graph creation compares to rdflib's style, which is the one I normally use.
From the Jena introduction (minus the model setup and some comments):
String personURI = "http://somewhere/JohnSmith";
String givenName = "John";
String familyName = "Smith";
String fullName = givenName + " " + familyName;
Resource johnSmith
= model.createResource(personURI)
.addProperty(VCARD.FN, fullName)
.addProperty(VCARD.N,
model.createResource()
.addProperty(VCARD.Given, givenName)
.addProperty(VCARD.Family, familyName));
An rdflib python port of that:
johnSmith = URIRef("http://somewhere/JohnSmith")
givenName = "John"
familyName = "Smith"
fullName = givenName + " " + familyName
graph.add((johnSmith, VCARD['FN'], Literal(fullName)))
name = BNode()
graph.add((johnSmith, VCARD['N'], name))
graph.add((name, VCARD['Given'], Literal(givenName)))
graph.add((name, VCARD['Family'], Literal(familyName)))
If I were making a new version of the rdflib API, here's what I'd consider:
- Don't expose the strings of URIRefs very easily. It should be somewhat hard to examine or operate on a URIRef's string value. This is first to encourage good practices (no more "print u.split('/')[-1]" or "if 'foo' in u:") but more importantly to allow for optimizations in query engines. A backend should be able to return URIRefs containing its internal ids while a query is running, and then any URIs that make it to the result set can be looked up (if needed). Actually getting the string form of a URIRef would still be possible of course, but it might require an explicit method call. Most people's URIRef serializations are in __repr__ calls and output formats, so this change shouldn't be that noticeable.
- Literal can still be a string subclass; BNode should not be.
- Like jena, don't require Literal() on strings unless they need a lang or datatype. People forget Literal() sometimes anyway, which rdflib sometimes handles. The rest of the time it corrupts your database. "hello" is pretty clearly the same as Literal("hello"), so I think it's fine to support "hello" the same way that rdflib now supports 5 to be a xsd:int literal. Another choice would be to error quickly on strings, which would be a good move if people were providing URIRefs as strings accidentally.
- Support graph construction APIs with named methods, like graph.node(uri1).addProperty(pred1, "value1"). These are nice for new users since they bring terminology in quickly, and some of the condensed forms seem cool. I don't like jena's 'createNode' method name, though, since as far as your graph is concerned, nothing got created. The fact that a java Node object was created is not important. Other possibilities:
- graph.get(s).addEdge(p1, o1).addEdge(p2, BNode().addEdge(p3, o3)) # (the BNode makes a little temp graph for a moment)
- graph.add(s.edge(p1, o1).edge(p2, o2)) # (another temp graph, as we evaluate the inner expression)
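To make the named-method idea concrete, here is a toy sketch of the graph.node(uri).addProperty(...) form. None of this is rdflib or jena API: Graph, Node, URI, and BNode are stand-in classes, bare Python strings are treated as literals, and the blank node comes from the graph instead of from a temp graph as in the two variants above. The VCARD namespace is assumed to be the one jena's vocabulary class uses.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class URI:
    value: str

@dataclass(frozen=True)
class BNode:
    id: int

class Graph:
    def __init__(self):
        self.triples = set()
        self._ids = itertools.count()

    def node(self, uri):
        """Wrap an existing term; nothing is added to the graph yet."""
        return Node(self, uri)

    def bnode(self):
        return Node(self, BNode(next(self._ids)))

class Node:
    def __init__(self, graph, term):
        self.graph, self.term = graph, term

    def addProperty(self, pred, obj):
        if isinstance(obj, Node):      # allow nesting another chained node
            obj = obj.term
        self.graph.triples.add((self.term, pred, obj))
        return self                    # returning self is what enables chaining

VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"  # assumed namespace
g = Graph()
(g.node(URI("http://somewhere/JohnSmith"))
  .addProperty(URI(VCARD + "FN"), "John Smith")
  .addProperty(URI(VCARD + "N"),
               g.bnode()
                .addProperty(URI(VCARD + "Given"), "John")
                .addProperty(URI(VCARD + "Family"), "Smith")))
```

The inner chain runs first, so the blank node's two statements land in the graph before the VCARD N edge that points at it.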
New home page
I played with a bunch of New Fangled Web Technologies and redid my home page. Almost everything is dynamically derived from data sources that I presumably keep up to date for other reasons. The foaf part and projects list part aren't done yet. I also haven't removed all the zope pages yet, unfortunately. (Zope turned out not to be a good system for making a low-maintenance site that lasts for 10+ years.)
I hope to have a DOAP document for each project, which will make them easy to list on my home page as well as other project-list systems.
Using freebase to help with dbpedia searches
I wrote this response to a thread on a mailing list, but I can't find anyplace where sourceforge has my reply online (I did receive it in an email). I would have expected it on this archived thread.
So here it is again, at a place I can link to.
On 21 Apr 2008, at 14:40, robl wrote:
SELECT * FROM pages WHERE page_title LIKE "Queen%Elizabeth"
This would perform a case insensitive match on Queen(anything)
Elizabeth
(at least in mySQL).
...
Is there quick way to do what I want ? Are there any indexes I could
apply to improve things (I have already created the indexes
specified at
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/
kidehen@openlinksw.com's%20BLOG%20%5B127%5D/1298)
?
Or do I need to create a conventional SQL table of resource names and
then do a SQL LIKE query on those ?
You might also want to check out freebase. Here's the approach I'm about
to attempt, myself. Start with a reconciliation query:
http://sandbox.freebase.com/dataserver/reconciliation/?name=Queen+Elizabeth&types=%2Fpeople%2Fperson&responseType=html
- the reconciliation service handles misspellings and other variations
- s/html/json/ for the machine readable version
Then look at the freebase page or perform a query:
http://www.freebase.com/view/en/elizabeth_ii_of_the_united_kingdom
That page has this link:
http://en.wikipedia.org/wiki/index.html?curid=12153654
On that page, we have
<a href="http://en.wikipedia.org/wiki/Elizabeth_II_of_the_United_Kingdom">article</a>
Maybe freebase can just hand us that link instead of the curid one. I
haven't gotten to that part of my code yet. I don't know how often the
last word of the freebase URI is in sync with the WP one, but that seems
like it would be the least reliable. Following freebase's designated WP
link is probably more robust.
Finally, take the wiki name, and make a dbpedia URI:
http://dbpedia.org/page/Elizabeth_II_of_the_United_Kingdom
You probably noticed that elizabeth_ii_of_the_united_kingdom wasn't the
first result for 'Queen Elizabeth' of type /people/person. I'm not sure
if freebase considers that a bad result page or not. The reconciliation
service is new, so now's probably a great time to tell them how
important good results are to you :)
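The last step of that approach, going from the Wikipedia article link to a dbpedia URI, might look like this sketch. The function name is made up, and it assumes the article name is simply whatever follows /wiki/ in the path. It refuses curid-style links, since those don't carry the article name.

```python
from urllib.parse import urlparse

def dbpedia_page_for(wikipedia_url):
    """Map an en.wikipedia.org article URL to the matching dbpedia page URL."""
    parsed = urlparse(wikipedia_url)
    prefix = "/wiki/"
    if not parsed.path.startswith(prefix) or parsed.query:
        # e.g. /wiki/index.html?curid=... has no article name to reuse
        raise ValueError("not a plain article URL: %r" % wikipedia_url)
    return "http://dbpedia.org/page/" + parsed.path[len(prefix):]

page = dbpedia_page_for(
    "http://en.wikipedia.org/wiki/Elizabeth_II_of_the_United_Kingdom")
```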
Notes from the talks at Semantic Web: Are Scalable Graph Data Applications Possible?
I was looking forward to more Oracle demos and roadmap-type discussion, but instead the highlight was allegro.
jeff from oracle says:
fraud detection is using graphs
we're generating data faster than we can process it
business value comes from: reduce cost of operations; aid decision-making; improve the transparency of business operations (e.g. for businesses that need to meet regs)
nice slide on DB approaches, broken into disk/ram, native/layered, etc
siderean is a company doing in-mem, multiple machine storage
david from mulgara
key to web scaling is the late binding of address to resource. Allows the information mgmt technique of the web to scale well
the next gen mulgara version, which the team was meeting about this week in SF, will use lots of disk, perhaps 40G ram, and store 100B (?) triples
vertica:
SQL DBMS, focus on analytics
50+ customers: verizon, comcast, level3
came from the cstore project, MIT
MIT library catalog is rdf, 50M triples: Barton Dataset
uniprot protein dataset is 262M triples. vertica serves that dataset for public querying
jans aasman, franz inc:
23 years old company, 2 yr with a triple store
customers do 'event handling and activity recognition'
50 customers, plus free download
monterey aquarium doing Marine Metadata Interoperability Project
los alamos is studying who reads what publications, graph structures in readership
sun doing baetle, the bug tracking one
japan telecom KDDI is doing spam and fraud detection with allegro. they need to determine what is spam across their busy network. they create new spamassassin rules over time.
OFFIS using rdf for info about power grid usage
allegro loads 1e9 quads in 8 hours
has sesame interface
supports xml schema datatypes, e.g. range queries on dates. Literals can be stored as their own numbers
'social network analytics library' for degrees, cliques, group stats
quick loads from oracle for temporary dbs used for analytics (coming soon)
RDFS++ reasoner for the usual inferences
temporal reasoning (allen's temporal logic, for intervals)
their time/space handling helps with event search. one query involves a place and radius, person connections, other event details
police, e.g., need temporal reasoning
"homeland security is interested in every type of imaginable event"
"find all meetings that happened in december within 5 miles of berkeley that was attended by the most important person in Jans' friends and friends of friends"
they have a custom query language for their various datatypes and their capabilities, like (geo-box-around !geoname:Berkeley ?event 5 miles)
even 3 months of american phone call records is already petabytes
jans' thesis was about car driving behavior
GPS (maybe plus phone) leads to rich data about people- work, purchasing, etc
My question, which I didn't get to ask: How do the approaches compare in terms of latency for very small queries? Many of my queries are not batched together well, or my app needs to make a lot of decisions during the graph traversal.
Goals for a wiki system
Some goals for a better wiki system:
- HTML markup only! No restructured text, ad-hoc formats, etc. The lock-in is so awful in all of today's wikis. WYSIWYG editors are mature now- users should be able to pick WYSIWYG or full html.
- no training or external docs should be required to use all the features
- don't lose my scroll position when I want to edit some text I'm looking at
- encourage linking and creation of RDF data, probably with rdfa
- immediately exploit those links and metadata with easy searching/filtering/etc. Use FOAF data to limit my search to pages written by people I know, etc.
- keep the ownership clear (this is the opposite of some wiki philosophies). Spam cleanup should be very simple.
- encourage edits and especially annotations
A common case seems to be "add a new page and list it in some existing TOC section". Another one is "add a new section (paragraph or more) to this page". Editing words within an existing section that you didn't write, that might be rare.
I still like tinymce, although nelix_ isn't a fan.
Wikis that I use (that I'm trying to be better than) are: twiki, zwiki, confluence.
Related: rdf blog engine ideas
Notes from Intelligence at the Interface
Event: http://sdforum.org/index.cfm?fuseaction=Calendar.eventDetail&eventId=13012&nodeID=1
tom gruber, tomgruber.org
Progress in the user experience on the web, if we look at what the user has to do and what the rest of the system has to do:
breadcrumbs (just links, user does everything) -then-> portals (user picks yahoo, yahoo does more work) -then-> search (user queries, search engine does work) -then-> room service (agents)
examples:
'sandy' is an email reminder assistant
'farecast' for airfare. suggests alternate cheaper flights, trends. Looked cool from the screenshot and description.
Tom remarked at the end that finally, intelligence and computation will be able to be what we compete with, instead of just having "brand bullies" :)
And, "each time AI does a job well, it always disappears"
twine, nova
Remember when you started using delicious? it took 5 mins to learn most of the functionality, but then several days to notice that this is really worthwhile and it's going to help a lot. I expect a similar, but stronger, effect from twine. You learn the mechanics of checking information in, then after doing it for a while you notice which of your former laborious tasks have melted away. I also have high hopes for systems connected to twine. It's like a more polished version of piggybank. And they're going to add in recommendations, which may bring the 'smarts' closer to what magitti or calo is doing.
check Nova's blog for slideshow
semweb says, put metadata in the data so new software can reuse the past work (naturally!)
seems very close to that friendlist thing from that other blogger i read, i forget the exact name
builds a 'semantic interest profile' about you. picks people/places/organizations/topics you're interested in
create a 'twine' (like squidoo lens, page about a topic). The twines had surprising urls: like http://twine.com/twine/my-house, right at the global level. Are the urls different depending on who's logged in? Or does Nova's own stuff just go to the top? :)
A bookmarklet opens a transparent frame right on top of an external page you want to tag. From there, it's like delicious, but gathers a bit more data automatically.
When he used the bookmarklet on an amazon page, twine pulled some more fields from the page about the book
on the marked pages, twine finds words and topics and makes the links
edit-in-place UI to fix the fields of the data it found; add more fields. like freebase
they do some auto-summary of text from a wikipedia page
query is like newegg power search (or most semweb stuff for that matter), pick a type, add your filters
email in your own items to your 'recent items' list, just like a ticketing system would accept new tickets. URLs in the mail get crawled and those sites show up in your items too. (calo had a more turbocharged version of this, where they'd go hunting for info about everything and build big profiles about users and stuff)
goal of twine is organization. is this automating my tasks? the users will reveal what is valuable to automate.
PARC magitti
Finally, some novel UI work on a phone-based UI. It looked really nice-- low on sparkles and icons, high on usability. The app itself (recommendations and guides for your leisure time) seems good, and it was amazing to see a Japanese paper-printing company looking for ways to get into new media. Feels like the only stories I hear in the USA are about old companies putting their effort in keeping their old businesses going (e.g. big oil). Anyway, there was some cool personal activity prediction stuff like where they look at your messages and your past trends to guess what you want to do -right now-. I hope to get into exactly that kind of thing on my home automation project.
the name = magic + (something) + digital graffiti
19-25 year olds have 2x as much free time as other youth (japan, at least)
important for them to know what everyone else is thinking
predicts what to do, e.g. 'eat' (when it thinks you're hungry based on time, place, your emails, your explicit queries). Nice.
it reads emails only to guess what kind of activity you're currently doing. 11% of the test email dataset had information related to leisure activities (which is all magitti cares about). That seems low to me. Maybe that's all the ones they were able to correctly process (or maybe there's something I'm not estimating right about the emails of 20-somethings in Japan)
look at your past behavior to learn your patterns of eat/see/shop/... They can make plots based on day-of-week and time. This is what I want for my home automation.
ppl want to use the phone UI with one hand. 6 big buttons surrounding the content
pie menu on the phone. 4 quadrants only, sometimes more narrow ones for the border buttons. They looked really usable.
see yelp-style ratings on businesses, takes your star rating as you look at the page. collaborative recommendation stuff
- the action buttons were arranged like this:
- 'M' [camera capture] [settings] (some content here) [any [eat]] [your location] 'clock'
hit the lower-left one to change your activity from 'any' to something else. Even if you don't say anything, they still list good ideas from their best matches of your activity, place, reviews, etc
you can force the activity ('shopping for clothes') and it refilters.
- From the QA session: "what does the next 10 years of AI look like?"
- answer: "busy"
yahoo
The phone-photo-tag part of this demo gave the most feeling of "you are looking into the future of technology" of all the presentations tonight. The UI was not elaborate. Mainly, it's that your phone camera is helping you tag your photos in real time (like delicious, except it knows your position and millions of past flickr tags too) and it's readily presenting you with other photos of interest. Everyone using this would essentially be running their own little version of justin.tv (photos, not video). The heavily-assisted tagging helps you organize your photos, and therefore organize your memories. Valuable! The speaker mentioned an example of looking up where you last had dinner with that friend. Since it was so easy at the time, you would have taken a photo and tagged it with the friend and the restaurant. Problem solved.
flickr photo locations plus tags shows popular tags on the map. 'tagmaps' from yahoo research berkeley. pretty cool to zoom in and out. using 4M photos, last year's data
upcoming version has 30M photos. Sometimes, these tags annotate world maps better than the pros do.
autotag your vacation photos by using the place of the photo
see the 'fireeagle' project for how web apps can know your location
I don't have live notes about the best demos, since I had to change seats to see the screen. The phone app that shows various feeds of pics included "wallet" (the photos you often show people), "my wife" (the photos she's taking now), "any flickr photos tagged with 'happy' near this location".
when reviewing all the tags on flickr, they consider the time too so as to figure out which things are actually events ('bluegrass festival') and not places ('the mission'). This is like a topic I got into at a semweb meetup once: with just the tags on delicious, could you produce the names of all the states and their capitals? (I think yes)
CALO
The calo express part of the demo was pretty nice. It's a much smarter desktop search that would easily beat whatever you're using now. Especially what I'm using now, which is nothing (and I've tried a few OSS projects a little). Things took a turn for the industrial-strength-awesome when it got into the meeting planning and recording features, mainly for the amount of tech they're throwing at the problem. The AI testing stuff was also amazing, and it helped connect the project back to real life: if they don't make a certain amount of progress in their AI evaluations, they don't get funded for the next year.
This is a big research project that covers CPOF (recently in a Wired article) and has some kind of cross funding and sharing with many other projects, including twine.
cognitive assistant that learns and organizes
SRI, darpa
includes Command Post of the Future
builds 'relational model' of user's world. not sure if it's rdf
guesses what emails are about, what tasks they go with. you give feedback
'meeting understanding'. remote people are in everyone's headsets. CALO writes transcript, action items, Q/A pairs.
when he comes to a mtg, calo knows what all the people have been doing
has some kind of chat bot for scheduling a meeting (and other tasks, apparently). you use limited natural language
AI uses 'probable beliefs', revises them as new facts come in. 'probabilistic consistency engine' can update knowledge with new facts.
each year, they test the system (like an SAT test) and it has to improve. questions like "what to do when tom can't make a meeting: A. reschedule; B. tell tom; ...". They compare the baseline untrained CALO to an instance with 16 users for 2 weeks, and note whether calo does better at the test after that learning.
they have a full self-contained office environment, and a lite version (used by DARPA). lite one has almost no interface
the lite version does: google desktop search PLUS nlp (!). calo found someone's home page, pulled number and address and job title. Noted the person's publications and web pages to see what the person does.
followup query: "people with expertise in learning" then ".. that work at SRI" to narrow it down
A query for "slides about iris" finds individual slides in past presentations. then you search for similar slides to a near-match. Apparently the normal desktop searches look for keywords and stuff in a whole .ppt, which is obviously not as useful.
make a new presentation just based on title. digs up all relevant slides
'preppak' for a meeting. finds all documents that are required or recommended for the meeting
in the meeting, you can watch the transcript, which knows the person since everyone wears a mic. Testing within the government
calo is a personal assistant, doesn't share much with groups. some things (e.g. meeting schedule) are shared. you don't reveal all your meeting time prefs, but the calos negotiate it
Watching the X screen power state
http://cvs.bigasterisk.com/viewcvs/room/sys.dpms?rev=1.1&view=auto
New program to watch whether my screens are powered on or DPMS-sleeping. I also track the idle time and the currently-focused window, since I happened to find code for those while I was working out the DPMS. The result is a little RDF graph:
@prefix _4: <http://dash/>.
@prefix _5: <http://bigasterisk.com/computerIdleState/power/>.
@prefix idle: <http://bigasterisk.com/computerIdleState/>.
_4:console idle:focusClass "rxvt";
idle:focusName "XTerm";
idle:focusWindowName "drewp@dash:/my/proj/room <>";
idle:lastNonIdle "1196593439.27"^^<http://www.w3.org/2001/XMLSchema#float>;
idle:power _5:On.
(I know the URLs and date formats are poor right now)
The program should be easy to run if you're on X and you have rdflib, py2.5, and python-xlib.
This results format is part of my new plan to have each program regenerate entire graphs of whatever they measure. I'm thinking of sending the graphs around with jabber, using pubsub to send them only when they change. That would be unlike https://stpeter.im/?p=1328 which uses SPARQL as you'd expect.
For example, if the user (or DPMS timer) turned off the screen, the last triple in my example would change to the _5:Suspend node. Other listeners who have subscribed to the computerIdleState graph will get an updated version of it.
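The "send only when it changes" part could be as simple as remembering the last graph and comparing. This sketch stands in for the jabber pubsub piece with a plain callback. ChangePublisher is a made-up name, and triples are plain tuples of strings (no blank nodes, which would need more careful comparison).

```python
class ChangePublisher:
    """Forward a regenerated graph to `send` only if it differs from the last one."""

    def __init__(self, send):
        self.send = send
        self._last = None

    def update(self, triples):
        current = frozenset(triples)
        if current == self._last:
            return False          # same statements as before: stay quiet
        self._last = current
        self.send(current)        # e.g. publish to a pubsub node
        return True

sent = []
publisher = ChangePublisher(sent.append)
publisher.update({("console", "power", "On")})
publisher.update({("console", "power", "On")})       # unchanged, not sent
publisher.update({("console", "power", "Suspend")})  # screen turned off
```

Since each program regenerates its whole graph every time, the comparison is on the full set of statements and no diffing is needed.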
The reason I started tracking screen state was simply to measure how many hours my 300W monitors are on per month. Either this program, or some listener one, will have to log that data somewhere. Of course, there are obvious uses for logging idle time patterns too, and that measurement probably wants a bit more compression. (Example app: tell my friends on jabber that I'm out, but my average return time for Tuesdays is 9:45pm.)
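Given a log of power-state changes, the hours-on number is a short sum. A sketch, assuming the log is a time-sorted list of (unix time, state) pairs with "On" as the powered state. The log format and function names are hypothetical, and 300 is the monitor wattage mentioned above.

```python
def hours_on(events, end):
    """Total hours spent in the 'On' state.

    events: time-sorted list of (unix_time, state) changes
    end: unix time where the measurement period stops
    """
    total_seconds = 0.0
    # Pair each change with the next one; the last state runs until `end`.
    for (start, state), (stop, _) in zip(events, events[1:] + [(end, None)]):
        if state == "On":
            total_seconds += stop - start
    return total_seconds / 3600.0

def kilowatt_hours(hours, watts=300):
    return hours * watts / 1000.0

events = [(0, "On"), (3600, "Suspend"), (7200, "On")]
on = hours_on(events, end=9000)   # 1 hour, then another half hour
energy = kilowatt_hours(on)
```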
I really have to move this old home automation project off CVS and onto darcs. I don't mind the conversion, but I want to keep at least some of the cvsweb urls working since I think I've pasted them into a lot of postings all over the web.
RDF reasoning for home automation
Updated: fixed FuXi link
I'm trying to do my home automation with RDF and reasoning. RDF is the unified way to write all the configurations, and I'm hoping to use a logic engine (maybe FuXi or Euler) to write the control systems. Hopefully those will make it easy for humans or computers to edit the setup.
I look forward to being able to ask an N3 proof system "why is the porch light on?" and having it tell me "the web said the sun has set by now, you tripped a motion sensor within the last 15 minutes, and there was no other light shining in this area, therefore I turned on the porch light".
Tonight I cobbled together the first working version of some home automation components talking RDF. A bluetooth dongle constantly searches for devices, and if it finds one, it states that [the bluetooth sensor] [senses] [the URI for the device]. Here's that program.
(BTW, avoid bluetooth chips by Integrated System Solution Corp and prefer ones by Cambridge Silicon Radio. The ISSC one I got has the lousy address 11:11:11:11:11:11 that's hard to change since I'm not using windows. Also, this bluetooth intro is really good.)
Next in my home automation system, a reasoning program hears about new statements and executes the right logic to produce more statements about what should happen. This program is a stub for now- it just turns the presence of my phone into a statement to power the door lock. But devices.n3 suggests what some of the logic might eventually look like.
Finally, an output program has been watching for statements about pins on the parallel port it controls. The reasoning program said to put power on bit2, so this program sets the output accordingly. On that pin is a circuit with an optoisolator, a triac, a transformer, and the electric strike that releases the door.

When the real logic is in place, the proof system should be able to say "I unlocked the door because someone friendly was nearby, because Drew is friendly and Drew carries a phone with the bluetooth address I saw".
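As a toy illustration of a conclusion that carries its own "because", here is a hand-written rule over tuples. It is not N3, FuXi, or Euler, and every name in it (bluetoothSensor, drewPhone, doorStrike, the predicates) is made up for the example.

```python
facts = {
    ("bluetoothSensor", "senses", "drewPhone"),
    ("drew", "carries", "drewPhone"),
    ("drew", "a", "FriendlyPerson"),
}

def infer_unlock(facts):
    """Return (conclusion, supporting_facts) pairs for the door-unlock rule:
    a sensed device carried by a friendly person powers the door strike."""
    results = []
    for (sensor, pred, device) in facts:
        if pred != "senses":
            continue
        for (person, pred2, carried) in facts:
            friendly = (person, "a", "FriendlyPerson")
            if pred2 == "carries" and carried == device and friendly in facts:
                conclusion = ("doorStrike", "power", "on")
                support = [(sensor, pred, device), (person, pred2, carried), friendly]
                results.append((conclusion, support))
    return results

for conclusion, support in infer_unlock(facts):
    print(conclusion, "because", support)
```

A real reasoner would derive the same kind of support list from general rules, which is what would let it answer "why is the door unlocked?" without any door-specific code.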
Data table with tabulator
Here's how to use tabulator to render a simple data table.
My test data might be a bit confusing since the terms overlap with tabulator terms. I'm trying to compare query runtimes of various queries on different databases. The result I'm trying to produce is a table showing how long each database took on each query.
Here's my mockup data in n3:
@prefix : <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:result rdfs:label "test result" .
:db rdfs:label "database" .
:time rdfs:label "elapsed time" .
<> :result
[a :Result; :query :q1; :db :rdflibBdb; :time ".5"],
[a :Result; :query :q2; :db :rdflibBdb; :time ".6"],
[a :Result; :query :q3; :db :rdflibBdb; :time ".7"],
[a :Result; :query :q1; :db :db2; :time ".8"],
[a :Result; :query :q2; :db :db2; :time ".9"],
[a :Result; :query :q3; :db :db2; :time ".11"] .
Note the line which associates the 6 results with this document. Without that link, tabulator won't put the results in its outline.
I used cwm to create an XML version of that data, which you can view in tabulator with the following link. [Update: there was no need to convert; tabulator can read n3 thanks to a version of the cwm parser translated to js with pyjs!]
Tabulator has a query-building interface where you click on predicates and other nodes to constrain your result rows, but I couldn't figure out how to make the table I wanted. Instead, I used the SPARQL tab at the bottom and wrote my own query:
SELECT ?query ?bdb ?db2
WHERE
{
?v1 <http://example.org/db> <http://example.org/db2> .
?v0 <http://example.org/db> <http://example.org/rdflibBdb> .
<http://bigasterisk.com/post-rdf/timing-results6.rdf> <http://example.org/result> ?v0 .
?v1 <http://example.org/query> ?query .
?v0 <http://example.org/time> ?bdb .
?v0 <http://example.org/query> ?query .
?v1 <http://example.org/time> ?db2 .
}
In english, that says "find queries with results for the two databases, and report their times in columns named after the databases". You can load tabulator with my datafile and that query together:
tabulator with mockup data and query
You have to click the radiobutton next to 'Query' to see the results.
Now I'll actually write my database benchmark, and I'll have it output result sets for each db. I should be able to combine the result sets together and display them in a table with the method described above. The biggest issue with abusing tabulator in this way is that I have to grow my query for each new database I test. Also, that query won't display a row unless it has results from all databases. It would be nice to have all cells optional, so I can still see a row if it only has a result from one database.
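Outside of tabulator, the table with optional cells is a small pivot. This sketch works on plain dicts shaped like the mockup results and doesn't touch tabulator at all. The pivot function and the dict keys are assumptions, and the q3 result for db2 is left out on purpose to show an empty cell.

```python
def pivot(results, databases):
    """Turn a flat list of results into rows of [query, time per database].

    A database with no result for a query gets None, so the row still shows."""
    times = {}
    for r in results:
        times.setdefault(r["query"], {})[r["db"]] = r["time"]
    return [[query] + [times[query].get(db) for db in databases]
            for query in sorted(times)]

results = [
    {"query": "q1", "db": "rdflibBdb", "time": ".5"},
    {"query": "q2", "db": "rdflibBdb", "time": ".6"},
    {"query": "q3", "db": "rdflibBdb", "time": ".7"},
    {"query": "q1", "db": "db2", "time": ".8"},
    {"query": "q2", "db": "db2", "time": ".9"},
]
table = pivot(results, ["rdflibBdb", "db2"])
```

Adding a database only means adding a name to the list, so nothing like the hand-grown query is needed for each new one.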
unicode rdf symbol
I discovered today that Unicode comes with an rdf symbol (almost): ༜
That's ༜, "TIBETAN SIGN RDEL DKAR GSUM".
Use this if your font doesn't show the character.
RDF literals as subjects
Any proposal about allowing RDF literals as subjects, especially one that's for language purposes (in this case it's direction support), needs to address why RDF's current design has the exceptional 'language' attribute on literals. If your proposal is so good, why didn't RDF allow arcs from literals in the first place and avoid the language/datatype special cases altogether?
I really know nothing about the direction support issue, but if it's one of the last few language-specific issues and it really ought to be separate from the 'language' attribute, I am inclined to prefer one more special case on literals than a total redo of the constraints on rdf graphs. My main concern with literals as subjects is that people will treat them like "casual" URIs that aren't universally unique.
posted gasuse
http://gasuse.bigasterisk.com now runs the latest gasuse code, including some SVG line graphing. The data is fixed RDF (read from xml). Next comes authentication so I can start adding new records in the field from my cell phone.