replaced_words is not correct #103
Comments
I believe this is because I missed updating. I have tried it on my side with the following code:
import pkg_resources
from symspellpy import SymSpell
sym_spell = SymSpell(max_dictionary_edit_distance=2, prefix_length=7)
dictionary_path = pkg_resources.resource_filename(
    "symspellpy", "frequency_dictionary_en_82_765.txt"
)
bigram_path = pkg_resources.resource_filename(
    "symspellpy", "frequency_bigramdictionary_en_243_342.txt"
)
sym_spell.load_dictionary(dictionary_path, term_index=0, count_index=1)
sym_spell.load_bigram_dictionary(bigram_path, term_index=0, count_index=2)
input_term = (
    "whereis th elove GPS hehad dated forImuch of thepast who "
    "couqdn'tread in sixtgrade and 16 microstru cture him"
)
suggestions = sym_spell.lookup_compound(
    input_term, max_edit_distance=1, ignore_non_words=True
)
for suggestion in suggestions:
    print(suggestion)
for k, v in sym_spell.replaced_words.items():
    print(f"origin: {k}, modify: {v.term}, edit_distance: {v.distance}")
and managed to get the following output
and it seems to address the issue.
Thanks. It works. Could I ask one more question? Is there a way to get the start and end index of the original word?
Unfortunately, there's no way to do that in symspellpy right now. You'll have to implement some custom post-processing functions in your project for that.
Thanks for your reply, and thanks for the package.
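For anyone who needs the start and end index, the custom post-processing mentioned above could look roughly like the sketch below. It is not part of symspellpy: `locate_replaced_words` is a hypothetical helper, the sample mapping is made up for illustration, and it assumes the keys of `replaced_words` are lowercased, whitespace-separated tokens of the input, which may not hold for every input.

```python
import re


def locate_replaced_words(text, replaced):
    """Return (original, replacement, start, end) for each token of `text`
    whose lowercased form is a key of `replaced`.

    `replaced` is a plain dict mapping an original token to its replacement.
    `end` is exclusive, so text[start:end] gives back the original token.
    """
    spans = []
    # Walk the whitespace-separated tokens and keep their character offsets.
    for match in re.finditer(r"\S+", text):
        key = match.group().lower()
        if key in replaced:
            spans.append(
                (match.group(), replaced[key], match.start(), match.end())
            )
    return spans


# Hypothetical sample data, not actual symspellpy output.
text = "whereis th elove"
replaced = {"whereis": "where is", "th": "the"}
result = locate_replaced_words(text, replaced)
for original, replacement, start, end in result:
    print(f"{original!r} -> {replacement!r} at [{start}, {end})")
```

With symspellpy, the mapping could presumably be built from the attributes used in the code above, e.g. `{k: v.term for k, v in sym_spell.replaced_words.items()}`. Tokens with attached punctuation would not match under this simple whitespace split and would need extra handling.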
We can see that "Ngumpak Dalem" is changed to "ngumpakdalem". But when I print replaced_words, the entry "origin: ngumpak, modify: n tumpak, edit_distance: 2" is not what I expected.