Avoid Another IoT Tower of Babel!

Lindsay Frost, Duncan Bees, and Martin Bauer
September 18, 2018

 

Every reader of this newsletter knows that the world is facing ever-higher pressure to use technologies such as the Internet of Things (IoT) and Artificial Intelligence (AI) to help solve problems caused by global trends such as increasing urbanization, failing road and building infrastructure, and a greatly increased need for care of the elderly. Everyone also knows the story of an older infrastructure project that did not go well: the Tower of Babel [1]. There, an attempt to reach heaven by building a single tower, motivated by pride, was punished by the people being condemned to speak in mutually incomprehensible languages and being dispersed around the world.

IoT today is like that post-Babel world, where each disparate group started again to build its own vertical tower, each solution striving to be the first or the best, with data collected in a plethora of proprietary formats, and with huge variations in how the meaning of the data is defined and exchanged. Simply transporting the data using Internet Protocol channels over 3G, 5G, broadband WLAN or fibre is not the problem. The problem arises when giving or selling the data to a recipient for whom it is unclear what the information means, how it was collected, when, where, with what quality, under what license, and so on.

Figure 1: Tower of Babel, Pieter Bruegel the Elder, 1563 [2].

 

The Linked Data approach for sharing data on the web, famous for the Berners-Lee “Five Stars” description [3] of best practice (i.e. open license, machine-readable, non-proprietary format, open standards for identification and definition, linked to other widely accepted definitions) does help, but has not yet been sufficiently widely adopted in IoT.

There are, however, many initiatives to re-use and link to appropriate definitions, wherever a community of users sees an urgent need. For example, www.schema.org is a collaborative approach for metadata in webpages, supported by internet search-engine companies. Another example is the Open Metadata Registry [4], which provides a means to register and discover machine-readable vocabularies: currently about 460. The Basel University Library helpfully lists about 2800 other sets of definitions [5], called ontologies. In the European Union, with so many borders and languages for information to cross, the European Parliament and Council stepped in during 2007 to require national governments to at least use a common set of geo-spatial and environmental data definitions: that INSPIRE Directive [6] has triggered many activities.

But in a post-Babel world, insisting that all partners use a common vocabulary, as attractive and efficient as that would be, is slow to succeed. Another approach is to provide a way for each partner to point to (link to) the definitions, licensing, etc. used in their data. This makes the metadata details and assumptions, which are implicit within the original “vertical silo” of the collecting organisation, accessible to third parties and therefore makes reliable sharing of the information possible. This greatly simplifies integration work in e.g. Smart City projects. It would also enormously reduce time wasted “cleaning up” raw data for the “data lakes” currently extolled as training grounds for AI recommendation systems.

This ‘link to definitions’ approach is taken by the ETSI Industry Specification Group for cross-cutting Context Information Management, called for short ETSI ISG CIM [7]. An ISG is a special working group of ETSI that is set up to allow participation by non-ETSI members, but with IPR and transparency of operations handled according to usual ETSI rules.

ETSI ISG CIM sees a need for an API (which the group calls NGSI-LD) to exchange definitions, metadata and context, not only for IoT information but also for Linked Data from numerous municipal and national repositories and for diverse input from mobile devices. The API must also handle requests from users (or their apps) to access the data, and allow software to check the provenance (source, licensing, “freshness”, etc.) of information. In real-world systems, tracking the usage of data is also important, not only for commercial billing or government usage-based budgeting, but also for exploring patterns of usage, planning improvements and checking consistency. Figure 2 shows the six kinds of information currently considered and indicates the role of NGSI-LD in enabling context information discovery and exchange.

Figure 2: ETSI ISG CIM view of integration of six kinds of information.

 

The use cases considered by ETSI ISG CIM (Smart City, Smart Agriculture and other systems), plus best practice in semantic web applications, require the NGSI-LD API to be agnostic to the kind of data architecture used (centralized, distributed, federated) and to enable migration from one architecture to another. This in turn implies being extremely flexible about the kind of “object identifier” used in the system, so Uniform Resource Identifiers (URIs) [8] are used. These need to be unique text strings, but do not require special hierarchical naming (familiar from URLs) and also do not mandate that the resource must be reachable via the public internet, although of course that is always an option.
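
As a rough illustration of how little such identifiers demand, the following Python sketch performs a light sanity check that a string looks like an absolute URI, using only the standard library. The function names are hypothetical, the check is far weaker than full validation against the URI syntax in [8], and the four-part "urn:ngsi-ld:<Type>:<LocalId>" layout is assumed only from the identifiers appearing in this article's example, not quoted from the specification.

```python
from urllib.parse import urlparse


def is_absolute_uri(identifier: str) -> bool:
    """Return True if the string has a URI scheme followed by some content.

    This is a light sanity check only, not full RFC 3986 validation.
    """
    parsed = urlparse(identifier)
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


def split_ngsi_ld_urn(identifier: str):
    """Split 'urn:ngsi-ld:<Type>:<LocalId>' into (Type, LocalId), else None.

    Assumption: this four-part layout is only the naming pattern seen in the
    article's example, not a rule taken from the specification.
    """
    parts = identifier.split(":", 3)
    if len(parts) == 4 and parts[0] == "urn" and parts[1] == "ngsi-ld":
        return parts[2], parts[3]
    return None
```

Both a URN such as "urn:ngsi-ld:Vehicle:A4567" and a web address such as "http://example.org/cim/vehicle.jsonld" pass the first check, while a bare local name such as "A4567" does not. Only the former is reachable by nothing more than agreement between the parties, which is the flexibility described above.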

There was also a strong desire to leverage expertise in the semantic web and the developer communities by, firstly, basing the exchange protocol on JSON-LD [9] and, secondly, choosing a formatting which is also “human friendly”. A preliminary version of the specification was published for comment in March 2018 [10]. A simple example is shown in Figure 3, stating that a person ‘Bob’ has reported that a certain vehicle, of brand ‘Mercedes’, was observed at a certain time to be parked at a certain location ‘Downtown1’. The actual definitions of all those terms are to be found in re-usable form at the locations referenced by the JSON-LD term ‘@context’.

 

{
  "id": "urn:ngsi-ld:Vehicle:A4567",
  "type": "Vehicle",
  "brandName": {
    "type": "Property",
    "value": "Mercedes"
  },
  "isParked": {
    "type": "Relationship",
    "object": "urn:ngsi-ld:OffStreetParking:Downtown1",
    "observedAt": "2017-07-29T12:00:04",
    "providedBy": {
      "type": "Relationship",
      "object": "urn:ngsi-ld:Person:Bob"
    }
  },
  "@context": [
    "http://uri.etsi.org/ngsi-ld/coreContext.jsonld",
    "http://example.org/cim/commonTerms.jsonld",
    "http://example.org/cim/vehicle.jsonld",
    "http://example.org/cim/parking.jsonld"
  ]
}

Figure 3: Example of an NGSI-LD statement.
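
To give a feel for how a developer might consume such a statement, here is a minimal Python sketch that parses the Figure 3 entity with the standard json module and lists every Property and Relationship it contains, including nested ones such as ‘providedBy’. The helper name walk_attributes is hypothetical and not part of NGSI-LD; the sketch treats the statement as plain JSON and does not fetch or interpret the ‘@context’ documents, which a real JSON-LD processor would do.

```python
import json

# The entity from Figure 3, reproduced as a JSON string.
ENTITY_JSON = """
{
  "id": "urn:ngsi-ld:Vehicle:A4567",
  "type": "Vehicle",
  "brandName": {"type": "Property", "value": "Mercedes"},
  "isParked": {
    "type": "Relationship",
    "object": "urn:ngsi-ld:OffStreetParking:Downtown1",
    "observedAt": "2017-07-29T12:00:04",
    "providedBy": {"type": "Relationship", "object": "urn:ngsi-ld:Person:Bob"}
  },
  "@context": [
    "http://uri.etsi.org/ngsi-ld/coreContext.jsonld",
    "http://example.org/cim/commonTerms.jsonld",
    "http://example.org/cim/vehicle.jsonld",
    "http://example.org/cim/parking.jsonld"
  ]
}
"""


def walk_attributes(node, path=()):
    """Yield (path, kind, target) for each Property or Relationship under node.

    Hypothetical helper: 'target' is the 'value' of a Property or the
    'object' URI of a Relationship. Nested attributes are visited too.
    """
    for name, value in node.items():
        if not isinstance(value, dict):
            continue  # plain members such as "id", "type", "@context"
        kind = value.get("type")
        if kind == "Property":
            target = value.get("value")
        elif kind == "Relationship":
            target = value.get("object")
        else:
            continue  # not an attribute this sketch understands
        yield path + (name,), kind, target
        # Attributes may themselves carry attributes (e.g. providedBy).
        yield from walk_attributes(value, path + (name,))


entity = json.loads(ENTITY_JSON)
for attr_path, kind, target in walk_attributes(entity):
    print(".".join(attr_path), kind, target)
```

Run against the Figure 3 entity, the loop reports three attributes: the Property ‘brandName’, the Relationship ‘isParked’, and the nested Relationship ‘isParked.providedBy’ that points to ‘Bob’. The nesting is what lets the statement record who provided an observation alongside the observation itself.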


The next steps for the ETSI ISG CIM work are to enhance the specification’s query capabilities, security, and other aspects, with publication as a document GS CIM 009 Version 1.1.0 expected by December 2018. An ‘NGSI-LD Primer’, introducing usage of the specification, will also be published. Additional feedback from the IoT community is very welcome (e.g. using NGSI-LD@etsi.org). The next public presentations will occur at the ETSI IoT Week conference in October [11].

References

[1] See https://en.wikipedia.org/wiki/Tower_of_Babel
[2] Pieter Bruegel the Elder. “The Tower of Babel”, 1563, Oil on wood panel, 1.14 m (44.9 in) by 1.55 m (61 in). Public domain image, via Wikimedia Commons, Accessed 26th August 2017 at https://commons.wikimedia.org/wiki/File:Pieter_Bruegel_the_Elder_-_The_Tower_of_Babel_(Vienna)_-_Google_Art_Project_-_edited.jpg
[3] Berners-Lee, Tim, ‘Linked Data’, Published 27th July 2006. Accessed 26th August 2018 at https://www.w3.org/DesignIssues/LinkedData.html
[4] Open Metadata Registry, see http://metadataregistry.org/about.html . Accessed 26th August 2018.
[5] Basel Register of Thesauri, Ontologies & Classifications (BARTOC) is a database of Knowledge Organization Systems and KOS related Registries, developed by the Basel University Library, Switzerland. See http://bartoc.org/en
[6] Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 establishing an Infrastructure for Spatial Information in the European Community (INSPIRE). Accessed 26th August 2018 at  https://eur-lex.europa.eu/legal-content/en/TXT/?uri=CELEX:32007L0002
[7] European Telecommunications Standards Institute (ETSI) Industry Specification Group (ISG) for cross-cutting Context Information Management (CIM). See https://portal.etsi.org/CIM
[8] Berners-Lee, Tim, R. Fielding, L. Masinter. ‘Uniform Resource Identifier (URI): Generic Syntax’. Published January 2005. Accessed 27th August 2018 at  https://tools.ietf.org/html/rfc3986. See also https://tools.ietf.org/html/rfc7320
[9] JSON-LD specification is currently in version 1.0, published in January 2014 at https://www.w3.org/TR/json-ld/  Note that improvements are being considered in the W3C JSON-LD Working Group https://www.w3.org/2018/json-ld-wg/ 
[10] ETSI ISG CIM ‘Context Information Management Application Programming Interface (API)’. Published March 2018 at https://docbox.etsi.org/ISG/CIM/Open/ISG_CIM_NGSI-LD_API_Draft_for_public_review.pdf
[11] ETSI IoT Week is an evolution of the annual M2M/IoT workshops, for anyone involved in standards-enabled IoT technologies and deployments. See https://www.etsi.org/etsi-iot-week-2018

 


 

Lindsay Frost is Chief Standardization Engineer at NEC Laboratories Europe. He was elected chairman of ETSI ISG CIM in February 2017, elected to the Board of ETSI in November 2017 and is ETSI delegate to the sub-committee of the EC Multi-Stakeholder Platform (Digitizing European Industry) and to the CEN-CENELEC-ETSI Sector Forum on Smart and Sustainable Cities and Communities. He began his career as a research manager in experimental physics facilities in Australia, Germany and Italy, before joining NEC in 1999 where he has managed R&D teams for 3GPP, WiMAX, fixed-mobile convergence and WLAN. Contact him at Lindsay.Frost@neclab.eu

 

Duncan Bees carries out technical and business projects in telecommunications, IoT, media streaming, and broadband infrastructure in Vancouver, Canada. He has led wireless baseband signal processing development teams, product planning for communications semiconductors, strategic planning for the Digital Living Network Alliance, and was Chief Technology and Business Officer of the Home Gateway Initiative (HGI). He holds the degrees of Master of Electrical Engineering (digital signal processing) from McGill University, and a Bachelor of Applied Science from the University of British Columbia. Contact him at duncan@duncanbeestechnologies.com

 

Martin Bauer is Senior Researcher at NEC Laboratories Europe. He has been working on activities related to Internet of Things, Context Management and Semantics for more than 15 years. Recently he has been working on IoT-related European projects in the area of smart city and autonomous driving, as well as IoT-related standardization activities, in particular oneM2M and ETSI ISG CIM. He is the AIOTI WG03 sub-group chair for the semantic interoperability topic. He holds a doctorate degree in Computer Science from Stuttgart University and Master of Science degrees in Computer Science from both the University of Oregon and Stuttgart University. Contact him at Martin.Bauer@neclab.eu