case where the path is the same in the links.txt #339

Open
anandijain opened this issue Dec 18, 2024 · 2 comments

@anandijain

hey again, love dlm! and thanks for your help in the past. my links.txt has urls like:

https://static.case.law/us/1/CasesMetadata.json
https://static.case.law/us/2/CasesMetadata.json
https://static.case.law/us/3/CasesMetadata.json
...

the problem i ran into was

[2024-12-18 00:13:52] Skipping CasesMetadata.json because the file is already completed [299.39KiB]
[2024-12-18 00:13:52] Skipping CasesMetadata.json because the file is already completed [299.39KiB]
[2024-12-18 00:13:52] Skipping CasesMetadata.json because the file is already completed [299.39KiB]
...

the problem is that these files are actually distinct, but i couldn't find a way to tell dlm the names that i wanted to use or to canonicalize them to be unique in some way.
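
For illustration, here is a small sketch of why the names collide. It assumes the output name is taken from the last path segment of each URL, which is what the log above suggests but is not confirmed from dlm's source; `last_segment_name` is a hypothetical helper, not part of dlm.

```python
from collections import Counter
from posixpath import basename
from urllib.parse import urlsplit

urls = [
    "https://static.case.law/us/1/CasesMetadata.json",
    "https://static.case.law/us/2/CasesMetadata.json",
    "https://static.case.law/us/3/CasesMetadata.json",
]


def last_segment_name(url: str) -> str:
    # Assumption: the file name is just the last segment of the URL path.
    return basename(urlsplit(url).path)


names = [last_segment_name(u) for u in urls]

# Any name that appears more than once would be treated as the same file.
collisions = {name: count for name, count in Counter(names).items() if count > 1}
print(collisions)  # {'CasesMetadata.json': 3}
```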

i don't want to complicate this beautifully simple package too much, but this case might be useful to handle

@agourlay
Owner

agourlay commented Jan 2, 2025

Thanks for opening this!

I believe this is a recurring issue that could be handled by dlm natively.

My first idea would be to use the segments in the URL to name the files.

e.g.

https://static.case.law/us/1/CasesMetadata.json
https://static.case.law/us/2/CasesMetadata.json
https://static.case.law/us/3/CasesMetadata.json

to

us-1-CasesMetadata.json
us-2-CasesMetadata.json
us-3-CasesMetadata.json

I guess there are edge cases requiring additional character sanitization to make sure the names are valid 🤔
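
A minimal sketch of the segment-based naming, written in Python purely for illustration (it is not dlm's implementation). The function name `name_from_segments`, the `-` separator, the conservative character whitelist, and the `index` fallback are all assumptions made for this example.

```python
import re
from urllib.parse import unquote, urlsplit


def name_from_segments(url: str, sep: str = "-") -> str:
    """Build a file name by joining the URL path segments (hypothetical helper)."""
    path = urlsplit(url).path  # query string and fragment are ignored
    # Split on "/" and percent-decode each non-empty segment.
    segments = [unquote(s) for s in path.split("/") if s]
    # Assumed sanitization rule: keep letters, digits, ".", "_" and "-",
    # and replace every other character with "_".
    cleaned = [re.sub(r"[^A-Za-z0-9._-]", "_", s) for s in segments]
    # Drop segments made only of dots, such as "." and "..".
    cleaned = [s for s in cleaned if s.strip(".")]
    # Assumed fallback name for URLs without a usable path.
    return sep.join(cleaned) or "index"


print(name_from_segments("https://static.case.law/us/1/CasesMetadata.json"))
# us-1-CasesMetadata.json
```

This alone does not guarantee uniqueness: with these rules `/us/1/x` and `/us-1/x` would still map to the same name, so a real implementation would probably need a collision check on top.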

WDYT?

@anandijain
Author

great solution to me! means not adding more args
