URLs are not added to list of documents to be scanned #951

Open
luzidl opened this issue Dec 17, 2024 · 1 comment
Labels
bug Something isn't working

Comments


luzidl commented Dec 17, 2024

The following issue is seriously affecting use of the tool at the moment. When adding URLs from websites, any URL that shares part of its structure with an already-added URL is skipped. See the example below:

For example, I added the URL

https://www.bkw.ch/de/energie/energiebeschaffung-fuer-geschaeftskundinnen-und-geschaeftskunden/energy-relax

then added the URL

https://www.bkw.ch/de/energie/energiebeschaffung-fuer-geschaeftskundinnen-und-geschaeftskunden/energy-relax-in-tranchen

but the latter is never added to the list of documents to be scanned. It seems that every URL starting with the same first portion (https://www.bkw.ch/de/energie/energiebeschaffung-fuer-geschaeftskundinnen-und-geschaeftskunden/) is not added either!
I have also seen this on websites that have a page counter at the end of the URL, e.g. page=1 ... n.
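
To make the symptom concrete, here is a minimal sketch of one way it could arise. It is not the tool's actual code, and the real cause may well be different; all function names are hypothetical. It contrasts a duplicate check that compares by prefix (which would reject the second URL above) with one that compares the full URL, query string included.

```python
from urllib.parse import urlsplit, urlunsplit


def is_duplicate_by_prefix(new_url, known_urls):
    """Hypothetical buggy check: a URL counts as known if it merely
    starts with an already-known URL."""
    return any(new_url.startswith(known) for known in known_urls)


def normalize(url):
    """Lower-case scheme and host, drop a trailing slash and the fragment,
    but keep the full path and the query string (e.g. page=2)."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def is_duplicate_exact(new_url, known_urls):
    """Assumed alternative: only identical normalized URLs are duplicates."""
    target = normalize(new_url)
    return any(normalize(known) == target for known in known_urls)
```

With the first URL already in the list, `is_duplicate_by_prefix` reports the second URL as a duplicate, while `is_duplicate_exact` does not. Because the query string is kept, `page=1` and `page=2` also stay distinct under the exact check.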

Can you please fix this bug as soon as possible?

Thanks in advance!

@kartikpersistent kartikpersistent added the bug Something isn't working label Dec 17, 2024
Contributor

jexp commented Dec 18, 2024

For the time being, as a workaround, you can save the webpages as PDFs and upload those.

We're looking into it.

Sometimes the actual text of a webpage is not added because it is generated by JavaScript and is not in the actual HTML, but that doesn't seem to be the case here.
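
A quick way to rule the JavaScript case in or out is to check whether a phrase you can see in the browser also appears in the raw HTML. The sketch below is only an illustration using Python's standard library; the helper names are made up and it is not how the tool itself extracts text.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def static_text(html):
    """Return the text present in the HTML as served, before any JavaScript runs."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def phrase_in_static_html(html, phrase):
    """True if the phrase appears (case-insensitively) in the static text."""
    return phrase.lower() in static_text(html).lower()
```

The HTML string can be obtained with, for example, `urllib.request.urlopen(url).read().decode("utf-8", errors="replace")`. If a phrase that is visible in the browser is missing from the result, the page is probably rendering that text with JavaScript.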

3 participants