MorphyNet

A large-scale multilingual database of derivational and inflectional morphology

MorphyNet in a nutshell

MorphyNet is a large morphological database that:

  • currently covers 15 languages: Catalan, Czech, English, Finnish, French, German, Hungarian, Italian, Mongolian, Polish, Portuguese, Russian, Serbo-Croatian, Spanish, and Swedish;
  • provides both derivational and inflectional data;
  • was mostly extracted from Wiktionary and is thus of high quality;
  • is available for free.

Our SIGMORPHON 2021 paper, MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology, provides details on the generation of MorphyNet data.

How to download

Click here to access the download section on GitHub. MorphyNet is released under a CC BY-SA 3.0 license. Please cite the paper below if you use the resource.

How to cite

Please cite our paper if you use MorphyNet:

Khuyagbaatar Batsuren, Gábor Bella, and Fausto Giunchiglia. MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology. Proceedings of SIGMORPHON 2021.

Comparison to UniMorph and UDer

MorphyNet is similar to UniMorph in that it contains inflectional data extracted from Wiktionary. Compared to UniMorph, it covers fewer languages (focusing on the major ones), but for those languages it provides better coverage (by about 50% overall).

Compared to UDer, MorphyNet covers a somewhat different set of languages and, for the languages covered by both resources, overlaps with it only slightly (less than 10% of MorphyNet).

All in all, MorphyNet complements the resources above rather than competing with them.
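
The overlap with UDer is stated relative to MorphyNet, that is, as the share of MorphyNet entries that also appear in the other resource. The sketch below shows one way such a share could be computed. The function name overlap_share and the word pairs are invented for illustration; they are not taken from MorphyNet, UDer, or the paper, and the paper's actual comparison method may differ.

```python
def overlap_share(reference, other):
    """Return the share of `reference` entries that also occur in `other`."""
    reference, other = set(reference), set(other)
    if not reference:
        return 0.0  # avoid division by zero for an empty reference set
    return len(reference & other) / len(reference)


# Toy (source word, derived word) pairs, invented for illustration only.
morphynet_pairs = {
    ("teach", "teacher"),
    ("happy", "unhappy"),
    ("read", "reader"),
    ("kind", "kindness"),
}
uder_pairs = {
    ("teach", "teacher"),
    ("write", "writer"),
}

share = overlap_share(morphynet_pairs, uder_pairs)
print(f"{share:.0%} of the toy MorphyNet pairs also appear in the toy UDer set")
```

Note that the measure is asymmetric: swapping the two arguments gives the share relative to the other resource, which is generally a different number.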