Semantic Technologies and Applications COMP5860M
John Stell
Room 9.15, School of Computing
j.g.stell@leeds.ac.uk
Lecture 1: January 2020
Today
Aim: to introduce the module
Module organisation
My research and how it relates to the module
Idea of the semantic web
Idea of ontology
Examples of ontology use
Module Syllabus
Knowledge and data formalisms; ontology engineering; semantic enrichment and retrieval.
Applications including linked data, semantic data browsers, smart social spaces (e.g. semantic wikis, semantic blogs, social networking).
There will be a practical component with hands-on experience of applying semantic web technologies in a specific domain (e.g. decision making, learning, health, e-business, digital libraries).
Learning Activities
Lectures
Monday 4–5
Tuesday 4–5
Coursework
Practical experience with key aspects
Lab sessions to help with coursework – NOT THIS WEEK
VLE Minerva
Reading Materials
Semantic Technology and Applications
Lecture 01
Coursework:
2 pieces, total 40% of module
Examination:
2 hours, open book, 60% of module
Books:
Allemang, D. & Hendler, J. (2011). Semantic Web for the Working Ontologist, Elsevier.
Hitzler, P., Krötzsch, M. & Rudolph, S. (2010). Foundations of Semantic Web Technologies, Chapman & Hall.
Halpin, H. (2012). Social Semantics: The Search for Meaning on the Web, Springer.
Heath, T. & Bizer, C. (2011). Linked Data: Evolving the Web into a Global Data Space, Morgan & Claypool.
Links:
Semantic Web portal on W3C: https://www.w3.org/standards/semanticweb/
My Research: Space in Digital Humanities
At Dalemain, about three miles from Penrith, a Stream is crossed, called Dacre, which, rising in the moorish country about Penruddock, flows down a soft sequestered Valley, passing by the ancient mansions of Hutton John and Dacre Castle.
. . . and from some of the fields near Dalemain, Dacre Castle, backed by the jagged summit of Saddleback, and with the Valley and Stream in front of it, forms a grand picture. There is no other stream that conducts us to any glen or valley worthy of being mentioned, till you reach the one which leads you up to Airey Force, and then into Matterdale, before spoken of.
My Research: Qualitative Space
My Research: Logic in Knowledge Representation
Part: P(x,y) ≡ ∀z (z C x → z C y)
Proper part: PP(x,y) ≡ P(x,y) ∧ x ≠ y
Overlap: O(x,y) ≡ ∃z (P(z,x) ∧ P(z,y))
External connection: EC(x,y) ≡ x C y ∧ ¬O(x,y)
Non-tangential proper part: NTPP(x,y) ≡ PP(x,y) ∧ ¬∃z (EC(z,x) ∧ EC(z,y))
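As a rough illustration, the definitions above can be evaluated mechanically over a small finite model. The sketch below makes several assumptions that are not part of the lecture: regions are closed intervals with integer endpoints, two regions are connected (C) when they share at least one point, and the quantifiers range over a hypothetical finite universe U of such intervals. Because U is finite, regions touching its outer edge can behave oddly, so the examples stay in the interior.

```python
from itertools import combinations

# Assumption: a region is a closed interval (start, end) with integer
# endpoints and start < end. This toy model is for illustration only.

def make_universe(n):
    """All intervals with endpoints in 0..n; quantifiers range over these."""
    return list(combinations(range(n + 1), 2))

def C(x, y):
    """Connection: the two closed intervals share at least one point."""
    return x[0] <= y[1] and y[0] <= x[1]

def P(x, y, U):
    """Part: every region connected to x is also connected to y."""
    return all(C(z, y) for z in U if C(z, x))

def PP(x, y, U):
    """Proper part: x is part of y and differs from y."""
    return P(x, y, U) and x != y

def O(x, y, U):
    """Overlap: some region is part of both x and y."""
    return any(P(z, x, U) and P(z, y, U) for z in U)

def EC(x, y, U):
    """External connection: connected but not overlapping."""
    return C(x, y) and not O(x, y, U)

def NTPP(x, y, U):
    """Non-tangential proper part: no region is EC to both x and y."""
    return PP(x, y, U) and not any(EC(z, x, U) and EC(z, y, U) for z in U)

U = make_universe(8)
big, inner, edge, right = (2, 6), (3, 5), (2, 4), (4, 6)

for label, value in [
    ("PP(inner, big)", PP(inner, big, U)),
    ("NTPP(inner, big)", NTPP(inner, big, U)),
    ("NTPP(edge, big)", NTPP(edge, big, U)),
    ("EC(edge, right)", EC(edge, right, U)),
    ("O(edge, right)", O(edge, right, U)),
]:
    print(label, value)
```

Here `inner` lies strictly inside `big`, whereas `edge` shares the left endpoint of `big`, so an interval ending at that endpoint is externally connected to both.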
What has that got to do with semantic technology?
modelling using relationships between things
logical specification of properties
classification of things (code as artwork?)
digital description of meaning
https://www.eighteenthcenturypoetry.org/
An example of a ‘Knowledge Graph’ from this site (Google popularized this term, but it’s a little vague)
We don’t study 18th-century poetry in this module, but we do study some of the technology used to study it today.
dct:creator
https://www.dublincore.org/specifications/dublin-core/dcmi-terms/terms/creator/
Definition: An entity responsible for making the resource.
Type of Term: Property
Comment: Recommended practice is to identify the creator with a URI. If this is not possible or feasible, a literal value that identifies the creator may be provided.
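The recommended practice in the comment above (a URI for the creator where possible, a literal otherwise) can be sketched as two triples in N-Triples syntax. This is a minimal sketch in plain Python; the example.org identifiers and the helper names `iri`, `literal` and `creator_triple` are hypothetical and chosen only for illustration.

```python
# Namespace of the DCMI Metadata Terms, which dct: abbreviates.
DCT = "http://purl.org/dc/terms/"

def iri(value):
    """Render an IRI in N-Triples syntax."""
    return f"<{value}>"

def literal(text):
    """Render a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def creator_triple(resource, creator, creator_is_uri=True):
    """Build one dct:creator statement about `resource`."""
    obj = iri(creator) if creator_is_uri else literal(creator)
    return f"{iri(resource)} {iri(DCT + 'creator')} {obj} ."

# Hypothetical identifiers, not taken from any real dataset.
poem = "http://example.org/poem/1"

# Preferred: the creator is identified by a URI.
print(creator_triple(poem, "http://example.org/person/42"))

# Fallback: a literal value that identifies the creator.
print(creator_triple(poem, "Anonymous", creator_is_uri=False))
```

A URI lets other datasets refer to the same creator unambiguously, which a bare name cannot guarantee.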
Example CIDOC-CRM www.cidoc-crm.org
“The CIDOC Conceptual Reference Model (CRM) is a theoretical and practical tool for information integration in the field of cultural heritage. It can help researchers, administrators and the public explore complex questions with regards to our past across diverse and dispersed datasets.”
It provides “definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation and of general interest for the querying and exploration of such data.”
Example CIDOC-CRM www.cidoc-crm.org
“. . . intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework for evidence-based cultural heritage information integration. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice of conceptual modelling. In this way, it can provide the ‘semantic glue’ needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.”
Example CIDOC-CRM www.cidoc-crm.org
“. . . consists of the CRMbase standard which provides the basic classes and relations devised for the cultural heritage world. This base ontology is complemented by a series of modular extensions to the basic model. Such extensions are designed to support different types of specialized research questions and documentation such as bibliographic documentation or geoinformatics. The CIDOC CRM extensions are developed in partnership with the research communities in question.”
Fragment of CIDOC-CRM Class Hierarchy
E20 – – – – – – Biological Object
E21 – – – – – – – Person
E22 – – – – – – Human-Made Object
E24 – – – – – Physical Human-Made Thing
E22 – – – – – – Human-Made Object
E25 – – – – – – Human-Made Feature
E78 – – – – – – Curated Holding
E26 – – – – – Physical Feature
E27 – – – – – – Site
E25 – – – – – – Human-Made Feature
E90 – – – – Symbolic Object
E73 – – – – – Information Object
E29 – – – – – – Design or Procedure
E31 – – – – – – Document
E32 – – – – – – – Authority Document
E33 – – – – – – Linguistic Object
E34 – – – – – – – Inscription
E35 – – – – – – – Title
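One way to see what the hierarchy fragment encodes is to store its direct superclass links and follow them transitively. The sketch below is illustrative only: the `SUPERCLASSES` table is an assumption read off the nesting of the fragment (for example E32 under E31 under E73 under E90), it omits superclasses that lie outside the fragment, and the function names are hypothetical.

```python
# Direct superclass links visible in the fragment (class code -> parents).
# Classes outside the fragment are deliberately left out.
SUPERCLASSES = {
    "E21": ["E20"],          # Person < Biological Object
    "E22": ["E24"],          # Human-Made Object < Physical Human-Made Thing
    "E25": ["E24", "E26"],   # Human-Made Feature has two parents here
    "E78": ["E24"],          # Curated Holding
    "E27": ["E26"],          # Site < Physical Feature
    "E73": ["E90"],          # Information Object < Symbolic Object
    "E29": ["E73"],          # Design or Procedure
    "E31": ["E73"],          # Document
    "E32": ["E31"],          # Authority Document
    "E33": ["E73"],          # Linguistic Object
    "E34": ["E33"],          # Inscription
    "E35": ["E33"],          # Title
}

def ancestors(cls):
    """Return every class reachable by following superclass links."""
    found = set()
    stack = list(SUPERCLASSES.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(SUPERCLASSES.get(parent, []))
    return found

def is_subclass(cls, other):
    """True if cls equals other or inherits from it, directly or not."""
    return cls == other or other in ancestors(cls)

print(sorted(ancestors("E32")))
print(is_subclass("E35", "E90"))
```

Note that E25 Human-Made Feature appears twice in the fragment because it has two superclasses, which a simple tree cannot express but a parent table can.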
crm:E33 Linguistic Object
This class comprises identifiable expressions in natural language or languages.
Instances of E33 Linguistic Object can be expressed in many ways: e.g. as written texts, recorded speech or sign language. However, the CRM treats instances of E33 Linguistic Object independently from the medium or method by which they are expressed.
Expressions in formal languages, such as computer code or mathematical formulae, are not treated as instances of E33 Linguistic Object by the CRM.
http://www.cidoc-crm.org/Entity/e33-linguistic-object/version-6.2
crm:E73 Information Object
This class comprises identifiable immaterial items, such as poems, jokes, data sets, images, texts, multimedia objects, procedural prescriptions, computer program code, algorithms or mathematical formulae, that have an objectively recognizable structure and are documented as single units.
An E73 Information Object does not depend on a specific physical carrier, which can include human memory, and it can exist on one or more carriers simultaneously.
http://www.cidoc-crm.org/Entity/e73-information-object/version-6.2.2
In this module
Examples, e.g. CRM, are used to illustrate ideas, but you are not expected to become experts in the technical details of specialized ontologies
We are interested in
key features found in all ontologies (e.g. classes and properties)
how ontologies are constructed
how they are evaluated
critical views on how useful they are
how ontologies are represented and manipulated
knowledge graphs, taxonomies, things other than formal ontologies, history and development
concepts needed in many ontologies (e.g. space and time)
Semantic Web Vision
“The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation”.
Tim Berners-Lee, James Hendler, Ora Lassila. Scientific American, May 2001
Interoperability Problem
Semantic Web Vision
Basis of the Semantic Web?
Idea: extend Web beyond HTML pages to support machine-understandable data.
Aim: be able to use the Web like a global distributed knowledge store which applications can use automatically
Some basic means to achieve this:
Uniform Resource Identifiers (URIs): a global identification mechanism
Resource Description Framework (RDF): basic data model and XML serialisation for publishing data
Web Ontology Language (OWL): beyond RDF for more expressive knowledge representation
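To make the first two ingredients concrete, here is a minimal sketch of data as (subject, predicate, object) triples named by URIs, with a simple pattern match over them. It is a simplification under stated assumptions: real RDF also has literals and blank nodes, the example.org namespace and resource names are hypothetical, and `match` is an illustrative helper rather than any library's API.

```python
EX = "http://example.org/"  # hypothetical namespace for this sketch
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A tiny graph: every statement is a (subject, predicate, object) triple,
# and every name is a globally unique URI.
triples = {
    (EX + "dacre_castle", RDF_TYPE, EX + "Castle"),
    (EX + "dacre_castle", EX + "near", EX + "dalemain"),
    (EX + "dalemain", EX + "near", EX + "penrith"),
}

def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# All "near" statements, whatever their subject and object.
for subject, _, obj in match(triples, p=EX + "near"):
    print(subject, "near", obj)
```

Because the names are URIs, triples published by different sources about the same resource can be merged by taking the union of the sets.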
Semantic Web Tower and STA Content
Q. What does the acronym “OWL” stand for?
A. Actually, OWL is not a real acronym. The language started out as the “Web Ontology Language” but the Working Group disliked the acronym “WOL”. We decided to call it OWL. The Working Group became more comfortable with this decision when one of the members pointed out the following justification for this decision from the noted ontologist A.A. Milne who, in his influential book “Winnie the Pooh” stated of the wise character OWL:
“He could spell his own name WOL, and he could spell Tuesday so that you knew it wasn’t Wednesday…”
https://www.w3.org/2003/08/owlfaq
Do you trust this website about who is a noted ontologist?
Can’t we just use AI to understand ordinary web pages, instead of putting data into a special format?
SNOMED CT
www.england.nhs.uk/digitaltechnology/digital-primary-care/snomed-ct/
“SNOMED CT is a structured clinical vocabulary for use in an electronic health record.
It is the most comprehensive and precise clinical health terminology product in the world, forming an integral part of the electronic care record. It represents care information in a clear, consistent, and comprehensive manner.
The move to a single terminology, SNOMED CT, for the direct management of care of an individual, across all care settings in England, is recommended by the National Information Board. . . .”
SNOMED CT . . . /digital-primary-care/snomed-ct/
“The benefits of using SNOMED CT in electronic care records:
vital information can be shared consistently within and across health and care settings
comprehensive coverage and greater depth of details and content for all clinical specialities and professionals
it includes diagnosis and procedures, symptoms, family history, allergies, assessment tools, observations, devices
clinical decision making is supported
it facilitates analysis to support more extensive clinical audit and research
reduced risk of misinterpretations of the record in different care settings”
SNOMED CT https://termbrowser.nhs.uk/
Pizza
“The Meaning Of Subclass: All individuals that are members of the class TomatoTopping are members of the class VegetableTopping and members of the class PizzaTopping as we have stated that TomatoTopping is a subclass of VegetableTopping which is a subclass of PizzaTopping.”
mowl-power.cs.man.ac.uk/protegeowltutorial/resources/ProtegeOWLTutorialP4_v1_3.pdf
Pizza
mowl-power.cs.man.ac.uk/protegeowltutorial/resources/
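The subclass statement from the Pizza tutorial can be read as a rule about individuals: whatever belongs to TomatoTopping also belongs to VegetableTopping and to PizzaTopping. The following is a minimal sketch of that reading, not how Protege or an OWL reasoner works internally. The individuals `tomato1` and `napkin1`, the class `Napkin`, and the function names are hypothetical.

```python
# Stated subclass axioms from the tutorial example (class -> superclass).
SUBCLASS_OF = {
    "TomatoTopping": "VegetableTopping",
    "VegetableTopping": "PizzaTopping",
}

def classes_of(asserted_class):
    """Return the asserted class followed by all of its superclasses."""
    chain = [asserted_class]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain

def members(cls, asserted):
    """Individuals that belong to cls, directly or via subclass axioms."""
    return sorted(ind for ind, c in asserted.items() if cls in classes_of(c))

# Hypothetical individuals, each asserted to belong to one class.
asserted = {"tomato1": "TomatoTopping", "napkin1": "Napkin"}

print(classes_of("TomatoTopping"))
print(members("PizzaTopping", asserted))
```

Nothing is inferred about `napkin1`, because no subclass axiom links its class to PizzaTopping.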
Ending
Ontologies model the world
Classes and relationships are key aspects
Determining appropriate classes and relationships depends on the application area; it is usually hard and takes effort
Ontologies can be represented and manipulated using RDF, OWL, SPARQL and software such as Protege (all of which we study in more detail in the module)
Some of the vision of the Semantic Web has not been realized, but many semantic technologies are important in the real world.