A large-scale multilingual database of derivational and inflectional morphology

MorphyNet in a nutshell
MorphyNet is a large multilingual morphological database and semantic network. In particular, MorphyNet:
- provides >13 million inflections and >700 thousand derivations for 15 languages: Catalan, Czech, English, Finnish, French, German, Hungarian, Italian, Mongolian, Polish, Portuguese, Russian, Serbo-Croatian, Spanish, and Swedish;
- is extracted from Wiktionary and is thus of very high quality (precision evaluated at >98%);
- NEW (December 2022): includes an additional 90 thousand derivations in 271 languages, inferred automatically by combining MorphyNet with the Universal Knowledge Core, with quality evaluated at >95%;
- NEW (December 2022): semantically disambiguates about 130 thousand derivations (source and target words) against concepts of the Universal Knowledge Core and against Princeton WordNet synsets, thus forming a semantic network of derivational relations;
- is available for free under CC BY-SA 3.0.
Our SIGMORPHON 2021 paper, MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology, provides details on the generation of MorphyNet data.
How to download
MorphyNet can be downloaded from the download section on GitHub. It is released under a CC BY-SA 3.0 licence. Please cite the paper below if you use the resource.
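Once downloaded, the data can be read with a few lines of code. The sketch below is a minimal example, assuming the derivational files are tab-separated with six columns (source word, target word, source part of speech, target part of speech, morpheme, affix type). This column layout, the `Derivation` record, the `read_derivations` helper, and the sample rows are assumptions made for illustration and should be checked against the files in the GitHub repository.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Derivation(NamedTuple):
    # Hypothetical record type; the field names are assumptions,
    # not an official MorphyNet schema.
    source: str
    target: str
    source_pos: str
    target_pos: str
    morpheme: str
    kind: str


def read_derivations(handle) -> Iterator[Derivation]:
    """Yield one Derivation per well-formed six-column TSV row."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 6:
            continue  # skip blank or malformed lines
        yield Derivation(*row)


# Illustrative rows only, not copied from the released data.
SAMPLE = "time\ttimeless\tN\tJ\tless\tsuffix\nhappy\tunhappy\tJ\tJ\tun\tprefix\n"

derivations = list(read_derivations(io.StringIO(SAMPLE)))
for d in derivations:
    print(f"{d.source} -> {d.target} ({d.kind} '{d.morpheme}')")
```

To read a real file, pass a handle opened with `open(path, encoding="utf-8", newline="")` instead of the in-memory sample.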
How to cite
Please cite our paper if you use MorphyNet:
Khuyagbaatar Batsuren, Gábor Bella, and Fausto Giunchiglia. MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology. Proceedings of SIGMORPHON 2021.
Comparison to UniMorph and UDer
MorphyNet is similar to UniMorph in that it contains inflectional data extracted from Wiktionary. Compared with UniMorph, it covers fewer languages, but for those languages it provides better coverage (by about 50% overall).
Compared with UDer, MorphyNet covers a somewhat different set of languages; for the languages covered by both resources, the overlap is minor (less than 10% of MorphyNet).
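Note that an overlap expressed as a share of MorphyNet is not symmetric: the same shared entries would make up a different share of the other resource. The sketch below illustrates this kind of measure on toy word pairs. The function name `overlap_share` and the sample pairs are hypothetical, and the evaluation procedure used in the paper may differ.

```python
def overlap_share(resource: set, other: set) -> float:
    """Fraction of entries in `resource` that also occur in `other`."""
    if not resource:
        return 0.0
    return len(resource & other) / len(resource)


# Toy (source, target) pairs for illustration; not real resource contents.
morphynet_pairs = {
    ("time", "timeless"),
    ("happy", "unhappy"),
    ("read", "reader"),
    ("kind", "kindness"),
}
uder_pairs = {("read", "reader"), ("write", "writer")}

share = overlap_share(morphynet_pairs, uder_pairs)
print(f"{share:.0%} of the first toy set also appears in the second")
```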