Open Mining INfrastructure for TExt and Data
Recent years have witnessed an upsurge in the quantity of digital research data, offering new insights and opportunities for improved understanding. Text and data mining is emerging as a powerful tool for exploiting structured and unstructured content and data, analysing them at multiple levels and in several dimensions to discover hidden and new knowledge. However, text mining solutions are not easy to discover and use, nor can end users easily combine them. OpenMinTeD aspires to enable the creation of an infrastructure that fosters and facilitates the use of text mining technologies in the world of scientific publications, builds on existing text mining tools and platforms, and renders them discoverable through appropriate registries and interoperable through a standards-based interoperability layer. It supports the training of text mining users and developers alike and demonstrates the merits of the approach through several use cases identified by scholars and experts from different scientific areas, ranging from generic scholarly communication to literature related to life sciences, food and agriculture, and social sciences and humanities. Through its infrastructural activities, OpenMinTeD’s vision is to make operational a virtuous cycle in which a) primary content is accessed through standardised interfaces and access rules, b) by well-documented and easily discoverable text mining services that process, analyse, and annotate text, c) to identify patterns and extract new, meaningful, actionable knowledge, which is then used d) for structuring, indexing, and searching content and, in tandem, e) as new knowledge that helps draw new relations between content items and triggers a new mining cycle. To achieve its goals, OpenMinTeD brings together different stakeholders: content providers and scientific communities, text mining and infrastructure builders, legal experts, data and computing centres, industrial players, and SMEs.