Jupyter notebook crashes when using dam_lev with transpose_costs #16
Comments
I have the same problem.
I'm not sure that we'll be able to look into the issue anytime soon, but if you are able to find the cause, we would certainly take a look at a PR. 😄
The problem might be with the line endings. Linux uses \n, while Windows uses \r\n.
This should solve it. The problem was caused by negative indexing, which caused the memory error on Windows. I presume this didn't cause a crash on Linux, and that's why it could work there?
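As a generic illustration of the kind of bug described in the last comment (not the library's actual code, and not the change that fixed it): the textbook Damerau-Levenshtein recurrence refers to a sentinel row and column at index -1. In Python a negative index silently wraps to the end of the sequence, while compiled code with bounds and wraparound checks turned off can read outside the buffer, which may crash on one platform and appear to work on another. A minimal unit-cost sketch that shifts every index by one, so that no negative index is ever formed, might look like this; `dam_lev_unit` and its variable names are hypothetical.

```python
def dam_lev_unit(a, b):
    """Unit-cost Damerau-Levenshtein distance (textbook algorithm).

    Hypothetical sketch: D[i + 1][j + 1] stores the textbook value d[i, j],
    so the sentinel row and column d[-1, *] and d[*, -1] live at index 0
    and no negative index is ever used.
    """
    la, lb = len(a), len(b)
    maxdist = la + lb
    D = [[0] * (lb + 2) for _ in range(la + 2)]
    D[0][0] = maxdist
    for i in range(la + 1):
        D[i + 1][0] = maxdist   # sentinel column, d[i, -1]
        D[i + 1][1] = i         # d[i, 0]
    for j in range(lb + 1):
        D[0][j + 1] = maxdist   # sentinel row, d[-1, j]
        D[1][j + 1] = j         # d[0, j]

    last_row = {}  # last 1-based row where each character of a occurred
    for i in range(1, la + 1):
        last_col = 0  # last 1-based column in this row where a[i] matched b[j]
        for j in range(1, lb + 1):
            k = last_row.get(b[j - 1], 0)  # 0 means "never seen"
            l = last_col
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            else:
                cost = 1
            D[i + 1][j + 1] = min(
                D[i][j] + cost,       # substitution, d[i-1, j-1]
                D[i + 1][j] + 1,      # insertion, d[i, j-1]
                D[i][j + 1] + 1,      # deletion, d[i-1, j]
                # transposition, d[k-1, l-1]; hits a sentinel when k or l is 0
                D[k][l] + (i - k - 1) + 1 + (j - l - 1),
            )
        last_row[a[i - 1]] = i
    return D[la + 1][lb + 1]
```

With this sketch, `dam_lev_unit('ABNANA', 'BANANA')` evaluates to 1, since a single transposition of the first two characters turns one string into the other.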
I'm using dam_lev in a Jupyter notebook (5.4.0) with Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)] on Windows 10. Running the code below, I get the error:
Process finished with exit code -1073741819 (0xC0000005).
I looked this up and it is an access violation (a memory error?). The code works fine on Linux, and if I don't use transpose_costs it also runs on Windows. I've checked that I have all of the required versions of numpy and cython.
Would you suggest anything? Should I build it myself? Use it from Cython?
Thanks,
Bob
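The two forms of the exit code quoted above are the same 32-bit value, read once as a signed integer and once as unsigned hex; 0xC0000005 is the Windows status code for an access violation. A small check of that arithmetic, using hypothetical helper names:

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


print(hex(to_unsigned32(-1073741819)))  # 0xc0000005
print(to_signed32(0xC0000005))          # -1073741819
```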
=============================================================
import numpy as np
from weighted_levenshtein import lev, osa, dam_lev
ins_costs = np.ones(128, dtype=np.float64)
del_costs = np.ones(128, dtype=np.float64)
sub_costs = np.ones((128, 128), dtype=np.float64)
tp_costs = np.ones((128, 128), dtype=np.float64)
# insert costs that should be nearly free
ins_costs[ord('-')] = 0.1
ins_costs[ord('%')] = 0.1
ins_costs[ord(' ')] = 0.1
ins_costs[ord('.')] = 0.1
ins_costs[ord('/')] = 0.1
ins_costs[ord('#')] = 0.1
ins_costs[ord('&')] = 0.1
ins_costs[ord('(')] = 0.1
ins_costs[ord(')')] = 0.1
ins_costs[ord('+')] = 0.1
ins_costs[ord('?')] = 0.1
ins_costs[ord(',')] = 0.1
ins_costs[ord("'")] = 0.1
# delete costs that should be nearly free
del_costs[ord('-')] = 0.1
del_costs[ord('%')] = 0.1
del_costs[ord(' ')] = 0.1
del_costs[ord('.')] = 0.1
del_costs[ord('/')] = 0.1
del_costs[ord('#')] = 0.1
del_costs[ord('&')] = 0.1
del_costs[ord('(')] = 0.1
del_costs[ord(')')] = 0.1
del_costs[ord('+')] = 0.1
del_costs[ord('?')] = 0.1
del_costs[ord(',')] = 0.1
del_costs[ord("'")] = 0.1
# substitutions that should cost less than 1
sub_costs[ord('C'), ord('S')] = 0.5
sub_costs[ord('S'), ord('C')] = 0.5
sub_costs[ord('O'), ord('0')] = 0.1
sub_costs[ord('0'), ord('O')] = 0.1
# transpositions that should cost less than 1
tp_costs[ord('I'), ord('E')] = 0.1
tp_costs[ord('E'), ord('I')] = 0.1
tp_costs[ord('A'), ord('E')] = 0.2
tp_costs[ord('E'), ord('A')] = 0.2
print(dam_lev('ABNANA', 'BANANA',
              transpose_costs=tp_costs,
              substitute_costs=sub_costs,
              insert_costs=ins_costs,
              delete_costs=del_costs))
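
One way to try a reproduction like the one above without taking down the notebook kernel is to run it in a child interpreter and inspect the return code. This is only a sketch: `run_isolated` is a hypothetical helper, and the snippets passed to it below are stand-ins, not the failing call.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run a Python snippet in a child interpreter.

    Returns (returncode, stdout, stderr). A crash in the child then shows
    up as a non-zero return code instead of killing the calling process.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,  # requires Python 3.7+
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Stand-in snippets; in practice the string would hold the reproduction code.
ok_code, ok_out, _ = run_isolated("print(1 + 1)")
bad_code, _, _ = run_isolated("import sys; sys.exit(3)")
print(ok_code, ok_out.strip(), bad_code)
```

The design choice here is isolation rather than a fix: the parent only learns that the child failed and with which code, which is enough to compare behavior with and without transpose_costs.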