The lexicons we support
342 and counting
UKC - A diversity-aware conceptualization of the world
A conceptualization of what people perceive as “the world”, which enables the integration of lexicons as they are used in the world's languages and knowledge domains.
Vision and Mission
The Universal Knowledge Core (UKC) is a multilingual, high-quality, large-scale, diversity-aware, machine-readable lexical resource based on psycholinguistic principles.
The key design principle underlying the UKC is to maintain a clear distinction between the language(s) used to describe the world as it is perceived and what is being described, i.e., the world itself. The Concept Core (CC) is the UKC representation of the world. It is a semantic network whose nodes are language-independent concepts, each characterized by a unique identifier that distinguishes it from every other concept. The edges of the network are semantic relations that relate the meanings of concepts; these relations extend those used by the Princeton WordNet (PWN) (e.g., hyponym, meronym).
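As a rough illustration of the Concept Core described above, the following Python sketch models concepts as bare identifiers and relations as labelled edges between them. The class name `ConceptCore`, its methods, the integer identifiers, and the edge direction (an edge labelled "hyponym" from A to B is read as "B is a hyponym of A") are all assumptions made for this example; they are not the UKC's actual data model or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConceptCore:
    """Toy semantic network: nodes are language-independent concept IDs."""

    concepts: set[int] = field(default_factory=set)
    # (relation name, source concept) -> set of target concepts
    relations: dict[tuple[str, int], set[int]] = field(default_factory=dict)

    def add_concept(self, concept_id: int) -> None:
        # Each concept has a unique identifier, so duplicates are rejected.
        if concept_id in self.concepts:
            raise ValueError(f"concept {concept_id} already exists")
        self.concepts.add(concept_id)

    def relate(self, relation: str, source: int, target: int) -> None:
        # Relations only connect concepts that are already in the network.
        if source not in self.concepts or target not in self.concepts:
            raise KeyError("both endpoints must be existing concepts")
        self.relations.setdefault((relation, source), set()).add(target)

    def targets(self, relation: str, source: int) -> set[int]:
        # Return a copy so callers cannot mutate the network by accident.
        return set(self.relations.get((relation, source), set()))


# Hypothetical identifiers, chosen only for the example.
VEHICLE, CAR, WHEEL = 1, 2, 3

cc = ConceptCore()
for concept_id in (VEHICLE, CAR, WHEEL):
    cc.add_concept(concept_id)

cc.relate("hyponym", VEHICLE, CAR)  # CAR is a hyponym of VEHICLE
cc.relate("meronym", CAR, WHEEL)    # WHEEL is a meronym (part) of CAR
```

Note that no words appear anywhere in this structure: the names `VEHICLE`, `CAR`, and `WHEEL` are only Python variable names for readability, which mirrors the separation between the world being described and the languages used to describe it.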
The Language Core (LC) is the UKC component that stores the words, senses, synsets, glosses, and examples for all the languages supported by the UKC. In the LC, each synset is associated with exactly one language and, within that language, with at least one word. Synsets are linked to concepts, with the constraint that each synset is linked to one and only one concept.
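The constraints on synsets stated above can be sketched in Python as follows. This is a minimal illustration, not the UKC's implementation: the `Synset` class, its field names, the helper `synsets_by_concept`, the concept identifier `42`, and the sample entries are all hypothetical. Storing a single `language` and a single `concept_id` per synset makes "exactly one language" and "one and only one concept" hold by construction, while the "at least one word" rule is checked explicitly.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    language: str           # exactly one language per synset
    words: tuple[str, ...]  # at least one word in that language
    concept_id: int         # exactly one linked concept
    gloss: str = ""

    def __post_init__(self) -> None:
        if not self.words:
            raise ValueError("a synset needs at least one word")


def synsets_by_concept(synsets: list[Synset]) -> dict[int, list[Synset]]:
    """Group synsets by the single concept each one is linked to."""
    index: dict[int, list[Synset]] = {}
    for synset in synsets:
        index.setdefault(synset.concept_id, []).append(synset)
    return index


CAR = 42  # hypothetical concept identifier

# Illustrative entries only; they are not taken from the UKC data.
synsets = [
    Synset("en", ("car", "automobile"), CAR, "a motor vehicle with four wheels"),
    Synset("it", ("automobile", "macchina"), CAR),
]

index = synsets_by_concept(synsets)
languages_for_car = sorted(s.language for s in index[CAR])
```

In this sketch a concept may be reached from synsets in several languages, whereas each synset points back to a single concept, so the link is many-to-one from synsets to concepts.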
So far, the UKC has grown in two ways: by importing freely available, high-quality resources such as WordNets and dictionaries, and through a collaborative platform on which linguistic experts and native speakers continuously build and maintain the lexical resource for individual languages. As of June 2018, it contains 342 languages, 1,679,508 words, 2,512,704 senses, and more than 120,000 concepts.
Language Coverage per continent