The STAR Research Demonstrator cross-searched extracts of excavation datasets held in different database schemas, including Raunds Roman, Raunds Prehistoric, Museum of London, Silchester Roman (LEAP IADB) and Stanwick sampling. These datasets have been mapped to the core ontology (the CRM-EH extension of the CIDOC CRM), and RDF semantic web representations of key elements (contexts, groups, samples, finds and their properties) have been extracted.
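As a rough illustration of the mapping idea, the sketch below converts records from two differently structured sources into one shared set of subject-predicate-object triples, so that a single lookup spans both. The column names, the `ex:` identifiers and the sample records are all hypothetical; they are not the CRM-EH identifiers or the schemas of the contributing datasets.

```python
# Hypothetical records from two sources with different column names.
SOURCE_A = [{"ctx_no": "1001", "ctx_type": "hearth"}]
SOURCE_B = [
    {"ContextID": "77", "Category": "pit"},
    {"ContextID": "78", "Category": "hearth"},
]

# Per-source mapping from local column names to shared (placeholder) properties.
MAPPINGS = {
    "a": {"id": "ctx_no", "ex:contextType": "ctx_type"},
    "b": {"id": "ContextID", "ex:contextType": "Category"},
}


def to_triples(source, records):
    """Convert one source's records to (subject, predicate, object) triples."""
    mapping = MAPPINGS[source]
    triples = []
    for record in records:
        subject = f"ex:{source}/context/{record[mapping['id']]}"
        triples.append((subject, "rdf:type", "ex:Context"))
        for predicate, column in mapping.items():
            if predicate != "id":
                triples.append((subject, predicate, record[column]))
    return triples


def find_subjects(triples, predicate, value):
    """Return the subjects carrying the given predicate/value pair, sorted."""
    return sorted(s for s, p, o in triples if p == predicate and o == value)


# One combined store, one query across both sources.
store = to_triples("a", SOURCE_A) + to_triples("b", SOURCE_B)
hearths = find_subjects(store, "ex:contextType", "hearth")
print(hearths)
```

Once both sources share the same predicates, the search no longer needs to know which database a record came from.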
The Demonstrator ran until March 2017, when the underlying server was decommissioned. However, there are videos demonstrating each of the controls in the Demonstrator (see links below), and the Internet Archaeology summary paper gives details of various scenarios explored with the Demonstrator and includes a video of it in operation (section 3.1 – Hearths).
Natural Language Processing information extraction techniques have been applied to some key concepts in the grey literature (an extract of the OASIS corpus operated by the Archaeology Data Service), producing semantic metadata in the same CRM-EH based representation as the extracted data. This allows unified searching of the different datasets and the grey literature in terms of the semantic structure of the core CRM-EH ontology.
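To make the idea of text-derived semantic metadata concrete, here is a deliberately simplified stand-in: a plain vocabulary lookup that emits triples for terms found in a report. It is not the project's NLP pipeline; the vocabulary, the `ex:mentions` predicate and the sample sentence are hypothetical.

```python
import re

# Hypothetical controlled vocabulary: surface term -> placeholder concept identifier.
VOCABULARY = {
    "hearth": "ex:concept/hearth",
    "pit": "ex:concept/pit",
    "pottery": "ex:concept/pottery",
}


def annotate(doc_id, text):
    """Emit (document, predicate, concept) triples for vocabulary terms in text."""
    triples = []
    for term, concept in VOCABULARY.items():
        # Whole-word, case-insensitive match that also accepts a plural "s".
        if re.search(rf"\b{re.escape(term)}s?\b", text, flags=re.IGNORECASE):
            triples.append((doc_id, "ex:mentions", concept))
    return triples


report = "Two hearths were recorded, one cut by a later pit."
annotations = annotate("ex:report/1", report)
print(annotations)
```

Because the output has the same triple shape as data extracted from the excavation databases, both kinds of record could in principle be held and queried together.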
A previous Pilot system supported free text search (with interactive query expansion via STAR terminology services), and browsing of the ontology. The current Demonstrator focuses upon (SPARQL-based) semantic search, drawing on use cases and feedback on the Pilot system from project workshops. While a free text search on Note fields is included, the main emphasis is on structured semantic search via controlled identifiers.
The Demonstrator seeks to hide the complexity of the underlying ontology. An interactive query builder offers search (and browsing) for Samples, Finds, Contexts or interpretive Groups with their properties and relationships (including stratigraphic relationships where present in the datasets). As the user selects from the interface, an underlying semantic query is automatically constructed in terms of the corresponding ontological entities.
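The step from interface selections to a semantic query might look roughly like the sketch below. It is a minimal illustration rather than the Demonstrator's actual query builder: the `ex:` namespace, class names and property names are placeholders (the real CRM-EH identifiers differ), and filter values are assumed to come from controlled lists rather than free user input.

```python
# Placeholder class names; the real CRM-EH identifiers differ.
ENTITY_CLASSES = {
    "Context": "ex:Context",
    "Find": "ex:Find",
    "Sample": "ex:Sample",
    "Group": "ex:Group",
}

PREFIXES = "PREFIX ex: <http://example.org/star#>\n"


def build_query(entity, filters=None, limit=100):
    """Build a SPARQL SELECT for one entity type with optional property filters.

    filters maps a (placeholder) property to a controlled term in prefixed
    form, for example {"ex:hasType": "ex:hearth"}.
    """
    if entity not in ENTITY_CLASSES:
        raise ValueError(f"unknown entity type: {entity}")
    # Start with the type constraint, then add one pattern per selected filter.
    patterns = [f"?item a {ENTITY_CLASSES[entity]} ."]
    for prop, term in sorted((filters or {}).items()):
        patterns.append(f"?item {prop} {term} .")
    body = "\n  ".join(patterns)
    return f"{PREFIXES}SELECT DISTINCT ?item WHERE {{\n  {body}\n}}\nLIMIT {limit}"


query = build_query("Context", {"ex:hasType": "ex:hearth"})
print(query)
```

The user only ever picks "Context" and "hearth" from the interface; the graph patterns are assembled behind the scenes.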
A set of browser-based interactive controls was developed for searching and browsing the extracted archaeological data. The controls are designed to be browser-agnostic, and the Demonstrator will run in most commonly available internet browsers. Each control is populated via web service calls querying the remotely held data. The controls are combined to form the overall demonstration search interface. The following links serve to demonstrate various aspects of the full search system.
The intended users of the STAR Demonstrator are members of the digital archaeology research community. It is not intended as an operational system, but rather as a research demonstration of semantic cross search of different datasets and grey literature via a core ontology. Previously, cross search was not possible: each dataset remained isolated in its own silo and no link was made to grey literature. The user interface is intended to demonstrate the research aims and combines various interface elements that would be separated and given more space in an operational interface. Specialised spatial and temporal elements could also be added – see STAR work on time periods.
The server is selected as appropriate for demonstration use. Currently, only the first 100 results are returned. While the user interface permits complex queries to be constructed, there are known performance issues with SPARQL generally and particularly with free text queries – here on the Note fields (free text Note search performs best as part of a semantic search). Thus, a query may occasionally time out (returning an error message). We are investigating alternative SPARQL platforms where such problems may be ameliorated.
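The behaviour described here (a cap of 100 results, and an error message on timeout) can be sketched as a small wrapper around whatever function actually sends the query. The `execute` callable, the 30-second default and the wording of the message are assumptions made for illustration; the Demonstrator's real implementation is not documented here.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

MAX_RESULTS = 100  # only the first 100 results are returned


def run_query(execute, query, timeout_s=30.0, max_results=MAX_RESULTS):
    """Run execute(query) within a time budget; return (rows, error_message)."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(execute, query)
    try:
        rows = future.result(timeout=timeout_s)
    except FutureTimeout:
        return [], "The query timed out; try narrowing the search."
    finally:
        # Do not block the caller while a slow query finishes in the background.
        pool.shutdown(wait=False)
    return list(rows)[:max_results], None


def slow_query(_query):
    """Hypothetical endpoint call that takes longer than the time budget."""
    time.sleep(0.2)
    return []


# A fast query with many rows is truncated; a slow one yields an error message.
fast_rows, fast_error = run_query(lambda q: range(250), "SELECT ...")
slow_rows, slow_error = run_query(slow_query, "SELECT ...", timeout_s=0.05)
print(len(fast_rows), fast_error, slow_error)
```

Note that the timeout only stops the caller from waiting; the abandoned query still runs to completion in its worker thread, so a real deployment would also want a server-side limit.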
The STAR Demonstrator operates over extracts from the following excavation datasets: Raunds Roman, Raunds Prehistoric, Museum of London Archaeology (MOLA), Silchester Roman (IADB) and Stanwick sampling. The Demonstrator also operates over an extract of archaeological grey literature from the Online AccesS to the Index of archaeological investigationS (OASIS), operated by the Archaeology Data Service. The data held within the STAR system are extracts from these databases provided solely for project demonstration purposes and have been experimentally mapped to the ontologies and terminology systems employed by STAR. Thus the displayed data should be regarded as neither necessarily current nor complete and do not constitute any definitive publication of the original datasets.
The STAR project gratefully acknowledges these data contributions, together with the various terminology resources (vocabularies, thesauri and CRM-EH ontology) supplied by English Heritage. Please refer to the contacts acknowledged below for further information on the original datasets.
Gill Campbell (Stanwick sampling), Phil Carlisle (EH NMR Thesauri), Vicky Crosby (Raunds Roman), Keith May (Raunds Prehistoric, CRM-EH extension of CIDOC CRM), English Heritage; Pete Rauxloh (MOLA), Museum of London Archaeology; Mike Rains (IADB), York Archaeological Trust; Julian Richards (OASIS), Archaeology Data Service.