Welsh Natural Language Toolkit Phase 2 (WNLT2)

Background

WNLT2 was a follow-on project (2016-2017) from the Welsh Natural Language Toolkit (WNLT), a Welsh Government funded project under the Welsh-language technology and digital media grant. Daniel Cunliffe was PI, with co-I Douglas Tudhope and RA Daniel Williams.

The aim of WNLT2 is to extend the accessibility of the WNLT toolkit to a wide range of developers and applications.

WNLT2 further develops the suite of WNLT open source software modules, which enable Welsh-language computational linguistic and text mining applications via a set of core Natural Language Processing (NLP) tools.

Objectives

The particular objectives of WNLT2 are to:

  • Develop a standalone version of WNLT with its own GUI, an API and a command line interface
  • Develop a tweet analysis module so that Welsh-language tweets can be analysed
  • Expand the Named Entity Recognition (NER) module to cover a wider range of entities
  • Demonstrate an example NER application over the Papurau Bro community newspapers

Final workshop

A workshop was held at USW (Trefforest) to discuss the results with participants including representatives from Welsh universities, SMEs and the GATE NLP community. The presentations and final user guide are available:
WNLT2 Toolkit 

WNLT user guide 2.3

Outputs

The modules are written in Java and ‘wrapped’ for execution under the General Architecture for Text Engineering (GATE) framework. They are disseminated under the GNU Lesser General Public License (LGPL).

The WNLT2 modules can be downloaded from SourceForge at
https://sourceforge.net/projects/wnlt-project/