Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I am looking for something like un-stemming. Is there a way to get the list of all possible words that share a common stem? Something like:

>>> get_leaf_words('play')
['player', 'play', 'playing', ...]


1 Answer

Solution to the above question: the word_forms library (https://github.com/gutfeeling/word_forms). Thanks to @Divyanshu Srivastava.

>>> from word_forms.word_forms import get_word_forms
>>> get_word_forms("president")
{'n': {'presidents', 'presidentships', 'presidencies', 'presidentship', 'president', 'presidency'},
 'a': {'presidential'},
 'v': {'preside', 'presided', 'presiding', 'presides'},
 'r': {'presidentially'}}
>>> get_word_forms("elect")
{'n': {'elects', 'electives', 'electors', 'elect', 'eligibilities', 'electorates', 'eligibility', 'elector', 'election', 'elections', 'electorate', 'elective'},
 'a': {'eligible', 'electoral', 'elective', 'elect'},
 'v': {'electing', 'elects', 'elected', 'elect'},
 'r': set()}
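
The question asked for a flat list, while get_word_forms returns sets grouped by part of speech. A minimal sketch of merging them follows; flatten_forms is a hypothetical helper name (not part of word_forms), and the sample dictionary is copied from the "president" output above so the snippet runs without the library installed.

```python
def flatten_forms(forms_by_pos):
    """Merge a {part_of_speech: set_of_words} mapping into one sorted list.

    flatten_forms is a hypothetical helper, not part of word_forms.
    """
    merged = set()
    for words in forms_by_pos.values():
        merged.update(words)
    return sorted(merged)


# Sample data copied from the get_word_forms("president") output above.
sample = {
    'n': {'presidents', 'presidentships', 'presidencies',
          'presidentship', 'president', 'presidency'},
    'a': {'presidential'},
    'v': {'preside', 'presided', 'presiding', 'presides'},
    'r': {'presidentially'},
}

all_forms = flatten_forms(sample)
print(all_forms)
```

With the library installed, the same helper should work on the return value of get_word_forms directly, assuming it keeps the dict-of-sets shape shown above.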


Previous Answer:

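Reverse stemming is not possible in general. Most stemmers derive the base form by applying a rule set to the original word, and that mapping is lossy: many different words collapse to the same stem, so the stem alone does not tell you which words produced it.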

But there is reverse lemmatization, which is called realization (or "surface realization").

You can use one of the publicly available lemmatization datasets or dictionaries to do that.

Example: https://raw.githubusercontent.com/richardwilly98/elasticsearch-opennlp-auto-tagging/master/src/main/resources/models/en-lemmatizer.dict [Apache OpenNLP]
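
A rough sketch of how such a dictionary could be inverted into a lemma-to-forms lookup. It assumes each line is tab-separated as word, POS tag, lemma; that layout is an assumption about the file above and should be checked against the actual data. build_lemma_index is a hypothetical helper name, and the sample lines are made up for illustration rather than taken from the file.

```python
from collections import defaultdict


def build_lemma_index(lines):
    """Invert 'word<TAB>pos<TAB>lemma' lines into {lemma: set_of_forms}.

    The three-column layout is an assumption; malformed lines are skipped.
    """
    index = defaultdict(set)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        word, _pos, lemma = parts
        index[lemma].add(word)
        index[lemma].add(lemma)  # the lemma is itself a valid form
    return index


# Illustrative lines only (not copied from the real dictionary).
sample_lines = [
    "plays\tVBZ\tplay\n",
    "playing\tVBG\tplay\n",
    "played\tVBD\tplay\n",
    "players\tNNS\tplayer\n",
]

index = build_lemma_index(sample_lines)
print(sorted(index["play"]))  # ['play', 'played', 'playing', 'plays']
```

To use a real file, pass an open file object instead of sample_lines. Note that a lemma dictionary groups inflections only, so "players" lands under "player" rather than "play".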

I could not find a Python library that does this directly. The closest I found is pynlg, which, as far as I can tell, is a Python port of the Java SimpleNLG realizer.

Furthermore, if you have a large enough vocabulary of original words, you can build the reverse dictionary yourself: run every word through the lemmatizer or stemmer and group the words by the result.
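
The reverse-dictionary idea could look roughly like this. toy_stem is a hypothetical, deliberately naive suffix stripper used only to keep the sketch self-contained; it is not the Porter algorithm. In practice you would likely pass a real stemmer instead, for example the stem method of NLTK's PorterStemmer. build_reverse_stems and the vocabulary list are likewise made up for illustration.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical naive stemmer: strip one common suffix, keep >= 3 chars."""
    for suffix in ("ing", "ers", "er", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_reverse_stems(words, stem=toy_stem):
    """Group words by their stem, giving {stem: set_of_original_words}."""
    reverse = defaultdict(set)
    for word in words:
        lowered = word.lower()
        reverse[stem(lowered)].add(lowered)
    return reverse


# Illustrative vocabulary; a real one would come from your own corpus.
vocabulary = ["play", "player", "players", "playing", "played", "plays", "plate"]

reverse = build_reverse_stems(vocabulary)
print(sorted(reverse["play"]))
# ['play', 'played', 'player', 'players', 'playing', 'plays']
```

The lookup can only return words that were in the vocabulary you fed in, so coverage depends entirely on how large and representative that word list is.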

