STAR (Semantic Technologies for Archaeological Resources) is an AHRC-funded project (2007-2010), in collaboration with English Heritage, applying semantic and knowledge-based technologies to the digital archaeology domain. The project aims to develop new methods for linking digital archive databases, vocabularies and the associated grey literature, exploiting the potential of a high-level core ontology and natural language processing techniques.
Doug Tudhope is PI and Ceri Binding is the Research Fellow on the project. Andreas Vlachidis and Renato Souza are project members within the Hypermedia Research Unit. Keith May is a project collaborator from English Heritage.
The project has now concluded but further research continues to build on STAR outcomes.
STAR outputs include a Research Demonstrator, SKOS-based Semantic Terminology Services and other applications, together with various data outputs, including RDF representations and NLP indexing of grey literature.
An early case study on STAR has been published by the AHRC ICT Methods Network.
Increasingly within archaeology, the Web is used for dissemination of datasets. This contributes to the growing amount of information on the ‘deep web’, which a recent Bright Planet study estimated to be 500 times larger than the ‘surface web’. However, Google and other web search engines are ill-equipped to retrieve information from the richly structured databases that are key resources for humanities scholars. Important archaeological results and reports are also appearing as grey literature, before or instead of traditional publication. Typically these are not indexed or made available for searching other than as ordinary web documents. With conventional search engines it is difficult to link these reports to datasets, or indeed to search them using terminology other than that employed by the authors.
Cultural heritage and memory institutions generally are seeking to expose databases and repositories of digitised items, previously confined to specialists, to a wider academic and general audience. The mapping from lay (or related subject area) terminology to technical vocabularies in a particular domain is a critical problem. There is a need for tools to help formulate and refine searches and navigate through the information space of concepts used to describe a collection. Different people use different words for the same concept, or may employ slightly different concepts, and this ‘vocabulary problem’ is a barrier to widening scholarly access.
The sector has a rich tradition of employing Knowledge Organisation Systems (KOS – such as thesauri) to assist semantic interoperability. However, such vocabulary tools are often not fully integrated into searching and indexing systems and online practice has tended to mimic traditional print environments. The full potential of these knowledge resources in online environments has not been tapped.
Scholarly cross-domain research often involves multi-concept expressions of the research question or information need. Conventional tools do not facilitate the necessary generalisation of the search statement when an exhaustive search is required.
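As a rough illustration of what generalising a multi-concept search statement through a thesaurus can look like, the following minimal sketch expands each concept in a query to include its alternative labels and all narrower concepts. The vocabulary entries, the `THESAURUS` structure and the function names are hypothetical and chosen only for illustration; they are not taken from the STAR vocabularies or services.

```python
# Hypothetical toy thesaurus: each concept maps to its alternative labels
# ("alt") and its narrower concepts ("narrower"). Illustrative data only.
THESAURUS = {
    "cut feature": {"alt": [], "narrower": ["pit", "posthole"]},
    "pit": {"alt": [], "narrower": []},
    "posthole": {"alt": ["post hole"], "narrower": []},
    "hearth": {"alt": ["fireplace"], "narrower": []},
}


def expand(term, thesaurus):
    """Return the term together with its alternative labels and all
    narrower concepts (and their alternative labels), transitively."""
    expanded = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in expanded:
            continue  # already visited; also guards against cycles
        expanded.add(current)
        entry = thesaurus.get(current)
        if entry is None:
            continue  # unknown term or plain label: keep it as typed
        expanded.update(entry["alt"])
        stack.extend(entry["narrower"])
    return expanded


def expand_query(concepts, thesaurus):
    """Expand every concept of a multi-concept query independently.
    A search system could then OR the terms within each set and AND
    the sets together."""
    return [expand(concept, thesaurus) for concept in concepts]


if __name__ == "__main__":
    for terms in expand_query(["cut feature", "hearth"], THESAURUS):
        print(sorted(terms))
```

In this sketch a search for "cut feature" would also match records indexed under the narrower concepts, and a search for "hearth" would also match the alternative label, which is one simple way of addressing the vocabulary problem described earlier.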
The overall aim of the project is to investigate the potential of semantic terminology tools for widening and improving access to digital archaeology resources, including disparate data sets and associated grey literature.
Connecting archaeological data and grey literature via semantic cross search (open access)
Vlachidis, A., Tudhope, D., Binding, C. & May, K. 1 Jul 2011. Internet Archaeology. 30
Negation detection and word sense disambiguation in digital archaeology reports for the purposes of semantic annotation (final version)
Vlachidis, A. & Tudhope, D. 2015. Program-Electronic library and information systems. 49, 2, p. 118-134 17 p.
A knowledge-based approach to Information Extraction for semantic interoperability in the archaeology domain (author version)
Vlachidis, A. & Tudhope, D. 2016. Journal of the Association for Information Science and Technology. 67, 5, p. 1138-1152 15 p.