Author: Martyna Pawletta (KNIME)
Today: Ontologies – or let’s see if we can serve pizza via the semantic web and KNIME Analytics Platform. Will they blend?
In philosophy, ontology is the study of concepts that directly relate to “being”, i.e. concepts that relate to existence and reality, as well as the basic categories of being and their relations. In information science, an ontology is a formal description of knowledge as a set of concepts within a domain. In an ontology we have to specify the different objects and classes and the relations (or links) between them. Ultimately, an ontology is a reusable knowledge representation that can be shared.
Fun reference: The Linked Open Data Cloud has an amazing graphic showing how many ontologies (linked data) are available on the web.
The Semantic Web and the collection of related Semantic Web technologies like RDF (Resource Description Framework), OWL (Web Ontology Language) or SPARQL (SPARQL Protocol and RDF Query Language) offer a set of tools with which linked data can be queried, shared and reused across applications and communities. Ontologies, and OWL in particular, play a key role in this area.
So where does the OWL come into this? Well, no, we don’t mean the owl as a bird here, but you see the need for ontologies, right? “OWL” can have different meanings, and this is one of the reasons why creating ontologies for specific domains makes sense.
Ontologies can be very domain specific and not everybody is an expert in every domain - but it’s a relatively safe bet to say that we’ve all eaten pizzas at some point in time - so let’s call ourselves pizza experts. Today’s challenge is to extract information from an OWL file containing information about pizza and traditional pizza toppings, store this information in a local SPARQL Endpoint, and execute SPARQL queries to extract some yummy pizza, em - I mean data. Finally, this data will be displayed in an interactive view which allows you to investigate the content.
Topic. KNIME meets the Semantic Web.
Challenge. Extract information from a Web Ontology Language (OWL) file.
Access Mode / Integrated Tool. KNIME Semantic Web/Linked Data Extension.
The ontology used in this blog post and in the demonstrated workflow is an example ontology that has been used in different versions of the Pizza Tutorial run by the University of Manchester. See more information on GitHub here.
Reading and querying an OWL file
In the first step, the Triple File Reader node extracts the content of the pizza ontology from the OWL file and reads all triples into a data table. Each triple consists of three columns: a subject (URI), a predicate (URI) and an object (URI or literal), or sub, pred, obj for short. The predicate denotes the relationship between the subject and the object. As shown in the screenshot below (Fig. 1), the pizza FruttiDiMare is a subClassOf the class NamedPizza and has two labels: a preferred and an alternative one.
Figure 1. Screenshot showing the output of the Triple File Reader node containing a subject, predicate and object column.
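To make the sub/pred/obj structure concrete, here is a minimal Python sketch of what such a table of triples looks like and how the facts about one subject can be collected. It is not what the Triple File Reader does internally. The pizza namespace, the use of SKOS label predicates and the two label values are assumptions made for illustration and are not taken from the screenshot.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    sub: str   # subject URI
    pred: str  # predicate URI
    obj: str   # object: URI or literal


# Namespaces (the pizza namespace is assumed here for illustration)
PIZZA = "http://www.co-ode.org/ontologies/pizza/pizza.owl#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Three rows as they might appear in the output table; the label
# values are placeholders, not copied from the ontology.
triples = [
    Triple(PIZZA + "FruttiDiMare", RDFS + "subClassOf", PIZZA + "NamedPizza"),
    Triple(PIZZA + "FruttiDiMare", SKOS + "prefLabel", "Frutti Di Mare"),
    Triple(PIZZA + "FruttiDiMare", SKOS + "altLabel", "Seafood Pizza"),
]


def describe(rows: List[Triple], subject: str) -> Dict[str, List[str]]:
    """Group all objects of one subject by their predicate."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        if row.sub == subject:
            grouped[row.pred].append(row.obj)
    return dict(grouped)


facts = describe(triples, PIZZA + "FruttiDiMare")
for pred, objs in facts.items():
    print(pred, "->", objs)
```

Every statement about FruttiDiMare is just one more row with the same subject, which is why a plain three-column table is enough to hold a whole ontology.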
Once the Triple File Reader is executed, a SPARQL Endpoint can be created using the Memory Endpoint together with the SPARQL Insert node. This allows the execution of SPARQL queries. Note that our Triple File Reader does not officially support the OWL format. KNIME can read RDF files and, because OWL files are very similar, we can read these files too. However, not all information is necessarily retrieved, as OWL can have additional parameters.
The example in Figure 2 shows a SPARQL Query node that contains a query to extract a basic list of all pizzas included in the OWL file.
A recommendation here: if the ontology you want to query is new to you, I would highly recommend quickly exploring its structure and classes first in another tool like Protégé. This makes it easier to write SPARQL queries later.
Figure 2. Example workflow that shows how to read an OWL file, insert extracted triples into a SPARQL endpoint and execute a SPARQL query to extract all kinds of pizzas from the pizza ontology.
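For readers who want to see the idea behind "insert triples, then query them" outside of KNIME, the following Python sketch emulates it with a tiny in-memory store. The MemoryStore class, the sample triples and the SPARQL text in QUERY are illustrative assumptions; the query is a plausible way to list all pizzas, not a copy of the one used in the workflow.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
PIZZA = "http://www.co-ode.org/ontologies/pizza/pizza.owl#"  # assumed namespace

# Roughly the query this sketch emulates (illustrative only)
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX pizza: <http://www.co-ode.org/ontologies/pizza/pizza.owl#>
SELECT ?pizza WHERE { ?pizza rdfs:subClassOf pizza:NamedPizza }
"""


class MemoryStore:
    """Hypothetical stand-in for an in-memory endpoint."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def insert(self, rows: Iterable[Triple]) -> None:
        self._triples.extend(rows)

    def match(self, sub: Optional[str] = None, pred: Optional[str] = None,
              obj: Optional[str] = None) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a variable."""
        for s, p, o in self._triples:
            if sub in (None, s) and pred in (None, p) and obj in (None, o):
                yield (s, p, o)


store = MemoryStore()
store.insert([  # sample data, not the full ontology
    (PIZZA + "Margherita", RDFS_SUBCLASS, PIZZA + "NamedPizza"),
    (PIZZA + "FruttiDiMare", RDFS_SUBCLASS, PIZZA + "NamedPizza"),
    (PIZZA + "SloppyGiuseppe", RDFS_SUBCLASS, PIZZA + "NamedPizza"),
    (PIZZA + "MozzarellaTopping", RDFS_SUBCLASS, PIZZA + "CheeseTopping"),
])

# Same pattern as the WHERE clause: ?pizza rdfs:subClassOf pizza:NamedPizza
pizzas = sorted(
    s.split("#")[-1]
    for s, _, _ in store.match(pred=RDFS_SUBCLASS, obj=PIZZA + "NamedPizza")
)
print(pizzas)  # ['FruttiDiMare', 'Margherita', 'SloppyGiuseppe']
```

A real endpoint supports far more than single-pattern matching, but the core of a basic SELECT is the same: fix the predicate and object, leave the subject open, and collect whatever fits.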
The SPARQL Query node has a checkbox on the top right (see Fig. 2) saying “Preserve N-Triples format”. Selecting this makes a difference in terms of what the output data will look like. The N-Triples format needs to be kept if the triples will later be inserted into an endpoint.
The example below shows the effect of not checking (top) or checking (bottom) the N-Triples checkbox. If the box is left unchecked, the angle brackets around URIs are not preserved, and for literals the quotes and the language tag (here @en) are removed.
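The difference between the two output styles can be sketched in a few lines of Python. The function below is a hypothetical helper, not part of KNIME, that turns an N-Triples term into the plain value you would see with the checkbox unchecked. It is a simplification that ignores escape sequences inside literals.

```python
import re

# A literal in N-Triples: "value", optionally followed by a language
# tag (@en) or a datatype (^^<uri>). Escapes are not handled here.
_LITERAL = re.compile(
    r'^"(?P<value>.*)"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?$', re.DOTALL
)


def strip_ntriples(term: str) -> str:
    """Return the plain value of an N-Triples URI or literal."""
    if term.startswith("<") and term.endswith(">"):
        return term[1:-1]  # drop the angle brackets around a URI
    found = _LITERAL.match(term)
    if found:
        return found.group("value")  # drop quotes and tag/datatype
    return term  # already plain, leave untouched


print(strip_ntriples("<http://example.org/Margherita>"))
print(strip_ntriples('"Margherita"@en'))
```

Going in this direction loses information: once the brackets and quotes are gone, a URI and a literal can no longer be told apart reliably, which is why the preserved format matters when triples are written back.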
There are different ways in KNIME to visualize data. In the case of ontologies, it really depends on what you are aiming to do. Here we will extract a bit more information than in the first example and create an interactive view within a component that allows us to explore the content of the pizza ontology.
In addition to the pizza labels, we now use two SPARQL Query nodes (see Fig. 3) to extract further information, such as the toppings per pizza type and their spiciness. We also query for pizza toppings that are a subclass of the class VegetableToppings and, using the Constant Value Column node, create a flag indicating whether a topping is a vegetable or not.
Figure 3. Example workflow showing how the basic example from Fig. 2 can be extended and an interactive view created.
Finally, we create an interactive view where the extracted data can be explored (see Fig. 4). To open the interactive view, right-click the “Interactive View” component and select Interactive View.
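The flagging step boils down to checking each topping against the list of vegetable toppings returned by the second query. A minimal Python sketch of that logic follows; the rows and the vegetable set are illustrative sample data, and the function name is hypothetical rather than anything from the workflow.

```python
from typing import Iterable, List, Set, Tuple

# (pizza, topping) pairs as they might come out of the first query
pizza_toppings = [
    ("Margherita", "MozzarellaTopping"),
    ("Margherita", "TomatoTopping"),
    ("SloppyGiuseppe", "OnionTopping"),
    ("SloppyGiuseppe", "HotSpicedBeefTopping"),
]

# Toppings returned by the vegetable subclass query (sample values)
vegetable_toppings = {"TomatoTopping", "OnionTopping"}


def flag_vegetables(
    rows: Iterable[Tuple[str, str]], vegetables: Set[str]
) -> List[Tuple[str, str, bool]]:
    """Append a flag that is True when the topping is a vegetable."""
    return [(pizza, topping, topping in vegetables) for pizza, topping in rows]


for pizza, topping, is_vegetable in flag_vegetables(pizza_toppings, vegetable_toppings):
    print(pizza, topping, is_vegetable)
```

In the workflow the same outcome is reached with nodes instead of code, but the result is the same extra column that the interactive view can filter on.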
Figure 4. Interactive view showing extracted data
Is it real?!
When I first looked at the dataset using the view, I saw the “Sloppy Giuseppe” pizza and immediately had to google it, as it was completely new to me. I saw the toppings but wondered whether this was really something Italian. This gave me the idea of adding another feature here in addition to the tables and charts.
If you now click on the Pizza name, a new window will open showing Google search results for that specific pizza type. I did this using the String Manipulation node, which creates a link. To make sure the link opens in a new window and not in your current view the “target=_blank” option needs to be included.
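The link itself is just an HTML anchor with the pizza name URL-encoded into a search address. As a rough Python equivalent of what the String Manipulation expression produces (the exact expression used in the workflow is not shown here, and the function name and search URL pattern are assumptions):

```python
from html import escape
from urllib.parse import quote_plus


def search_link(pizza_name: str) -> str:
    """Build an HTML link that opens a Google search in a new window."""
    url = "https://www.google.com/search?q=" + quote_plus(pizza_name)
    # target="_blank" makes the browser open the result in a new window
    return f'<a href="{escape(url)}" target="_blank">{escape(pizza_name)}</a>'


print(search_link("Sloppy Giuseppe"))
```

Encoding the name matters as soon as it contains spaces or special characters; without it the generated link could break for some pizza names.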
Today we showed how to extract data from an OWL file, create a SPARQL Endpoint and run SPARQL queries against it. Finally, we generated a view where the content can be explored.
After playing with such yummy data… hungry now? Let’s order a pizza then 😉
The example workflow shown in this article, Exploring a Pizza Ontology with an OWL file, can be downloaded from the KNIME Hub here.
Will They Blend? Experiments in Data & Tool Blending
In the Will They Blend blog series we experiment with the most interesting blends of data and tools. Whether it’s mixing traditional sources with modern data lakes, open-source devops on the cloud with protected internal legacy tools, SQL with noSQL, web-wisdom-of-the-crowd with in-house handwritten notes, or IoT sensor data with idle chatting, we’re curious to find out: will they blend? Want to find out what happens when website texts and Word documents are compared?
Follow us here and send us your ideas for the next data blending challenge you’d like to see at email@example.com.