DBpedia is an extract of structured information from Wikipedia. The structured data can be retrieved using SPARQL, an SQL-like query language for RDF.
There is already an R package for this kind of query, named SPARQL.
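For comparison, a direct query with the SPARQL package might look roughly like the sketch below. The endpoint URL, the dbo:Country class and the helper name run_query are assumptions chosen for illustration, and the network call is left commented out so the snippet also runs offline.

```r
# Assumption: the public DBpedia endpoint; adjust if you use a mirror.
endpoint <- "http://dbpedia.org/sparql"

# A small SPARQL query: English labels of ten resources typed as countries.
# dbo:Country is used here only as an illustrative class.
query <- "
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?country ?name WHERE {
  ?country a dbo:Country ;
           rdfs:label ?name .
  FILTER(lang(?name) = 'en')
}
LIMIT 10
"

# Hypothetical helper: sends the query and returns the result data frame.
# SPARQL::SPARQL() returns a list whose 'results' element holds the rows.
run_query <- function(endpoint, query) {
  SPARQL::SPARQL(url = endpoint, query = query)$results
}

# Requires the SPARQL package and network access:
# head(run_query(endpoint, query))
```

Writing the query text by hand for every request is what the predefined queries described next are meant to avoid.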
There is also an S4 class, Dbpedia, in my datamart package that aims to support the creation of predefined parameterized queries.
Here is an example that retrieves data on German Federal States:
> library(datamart) # version 0.5 or later
> dbp <- dbpedia()
# see a list of predefined queries
> queries(dbp)
[1] "Nuts1" "PlzAgs"
# lists Federal States
> head(query(dbp, "Nuts1"))[, c("name", "nuts", "gdp")]
                    name nuts    gdp
1                Hamburg  DE6  94.43
2      Baden-Württemberg  DE1 376.28
3 Mecklenburg-Vorpommern  DE8  35.78
4        Rheinland-Pfalz  DEB 107.63
5              Thüringen  DEG  49.87
6                 Berlin  DE3  101.4
It is straightforward to extend the Dbpedia class with further queries. More challenging, in my opinion, is figuring out useful queries. Some examples can be found at Bob DuCharme's blog, in the article by Jos van den Oever at kde.org, in a discussion on a mailing list, in a tutorial at the W3C, at Kingsley Idehen's blog, and at DBpedia's wiki.
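To give an idea of what a "predefined parameterized query" means in practice, here is a minimal sketch in plain R. It does not use the datamart API: make_query, the $(name) placeholder syntax and the example template are all hypothetical and only illustrate the idea of a query template that is filled in at call time.

```r
# Hypothetical helper, not part of datamart: turns a SPARQL template with
# $(name) placeholders into a function of those named parameters.
make_query <- function(template) {
  function(...) {
    params <- list(...)
    out <- template
    for (nm in names(params)) {
      placeholder <- paste0("$(", nm, ")")
      # fixed = TRUE: treat the placeholder as a literal string, not a regex
      out <- gsub(placeholder, as.character(params[[nm]]), out, fixed = TRUE)
    }
    out
  }
}

# Example template: labels of one resource in one language.
label_query <- make_query("
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/$(resource)> rdfs:label ?label .
  FILTER(lang(?label) = '$(lang)')
}")

# Fill in the parameters and print the resulting SPARQL text.
cat(label_query(resource = "Hamburg", lang = "de"))
```

The substitution is purely textual, so parameter values should come from trusted input; a real implementation would also want to validate or escape them.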