Possible duplication bug after deleting previous processed images #1648

Open
igot8here opened this issue Dec 18, 2024 · 3 comments

@igot8here

Hydrus version

v600

Qt major version

Qt 6

Operating system

Windows 10

Install method

Extract

Install and OS comments

No response

Bug description and reproduction

I encountered this problem before with a much bigger image set and wanted to recreate it with a smaller one.

I imported a few images of a character from the gallery using the danbooru search tag and the gelbooru search tag. Out of 54 images, 25 were 'already in db', probably because they were completely identical to images from the other booru, so I was left with 29 images in the inbox. I then used the duplicates page and went through a standard duplicate filtering process. No problems so far.

I then deleted everything: all the images in the inbox, trash, etc.

I then redid the whole process and it all went the same until I tried to go through the duplicate filtering process. As you can see in the first image, there are identical images. These are the same images I filtered out when I did the process the first time.

[Image: 32435436 (1)]

However, on the duplicates page I can't launch the filter, and the search in the preparation tab didn't find anything either. I tried refreshing and it didn't help.

[Image: 32435436 (2)]
[Image: 32435436 (3)]

When I first encountered this problem, increasing the search distance worked even though the images were completely different. I assume the same thing would happen here, but I can't check as I'm on a different PC. So the search function still works, but for some reason it just doesn't consider the pair of files shown above.

Log output

No response

@igot8here igot8here added the bug label Dec 18, 2024
@hydrusnetwork
Owner

Thank you for this report. I talked with another guy last week about a similar thing, and I may have fixed the problem in v603. I think that re-importing a file (in recent weeks? maybe longer?) may not have (re)integrated the file into the similar files search system correctly.

You still have these files that are out of the system, so let's fix them:

  • hit up database->files maintenance->manage scheduled jobs and go to the 'add new work' tab
  • run a search for 'system:filetype is image' and click the special 'run this search' button
  • set the job type to 'check for membership in the similar files search system' and hit 'add job'
  • if you like, go to the 'scheduled work' tab and see your new jobs and hurry them along

Fingers crossed that will fix you up. Let me know if it does or does not work, and if it does the trick I think I will schedule that job for everyone for the v604 update to make sure everyone's holes are fixed here.

@igot8here
Author

igot8here commented Jan 3, 2025

> Fingers crossed that will fix you up. Let me know if it does or does not work, and if it does the trick I think I will schedule that job for everyone for the v604 update to make sure everyone's holes are fixed here.

I was away for the last couple of weeks, so I only had the chance to test it now with v603. It doesn't seem to work. I followed the bullet points and waited a couple of hours in case the process was running in the background, but nothing changed. As the images below show, there's nothing in scheduled work despite the pop-up saying the job was added. I assumed it had just finished quickly, but I waited just in case.

[Image: Screenshot (17288)]
[Image: Screenshot (17289)]

I also tried other job types to check whether it is only the "check for membership in the similar files search system" job that fails to appear in scheduled work, and the other jobs do appear.

[Image: Screenshot (17291)]

I do have "exclude previously deleted files" unticked in import options, as otherwise the images wouldn't show up at all when reimporting them from the gallery after they had been deleted.

@hydrusnetwork
Owner

Thank you for this update. You are using the UI correctly and your understanding of what should be happening is correct. I'd normally expect the job to take a few seconds to clear out, so you not seeing anything is odd. Maybe the small number of files means it is being cleared out in 0.1s in the first block somehow.

I have re-read your report properly and I am sorry to say I think I read it wrong the first time around. I was thinking of the other guy's issue and did not recognise that you had processed these files before deleting and re-importing them.

The issue I fixed was:

  • import some files that are potential dupes
  • delete them
  • reimport them
  • they do not appear as potential dupes any more

Your problem is (let me know if this is not correct):

  • import some files that are potential dupes
  • set them as duplicates with the filter
  • delete them
  • reimport them
  • they do not appear as potential dupes

This second situation is intended behaviour, as duplicate status is designed to survive file deletion. If you would like to re-process the duplicate relationships of files that you have previously set as duplicates, I think you'll want to chase down this ugly menu path:

[Image]

If this is the correct understanding of your problem, I apologise again--I read your report too quickly. If I still do not understand, please let me know.
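
To illustrate the intended behaviour described above, here is a toy Python model. It is only a sketch and not Hydrus's actual code or schema: the class `ToyDuplicateStore`, its methods, the `similar` function, and the file names are all hypothetical. It assumes that duplicate decisions are stored separately from the files themselves, so deleting and reimporting a file does not bring the pair back as a potential duplicate until its relationships are explicitly reset.

```python
from itertools import combinations


class ToyDuplicateStore:
    """Hypothetical toy model; not how Hydrus really stores anything."""

    def __init__(self):
        self.current_files = set()   # files currently in the client
        self.decided_pairs = set()   # pairs already processed in the filter

    def import_file(self, file_id):
        self.current_files.add(file_id)

    def delete_file(self, file_id):
        # Deleting a file leaves its duplicate decisions untouched.
        self.current_files.discard(file_id)

    def set_duplicate(self, a, b):
        self.decided_pairs.add(frozenset((a, b)))

    def reset_relationships(self, file_id):
        # Forget every decision involving this file.
        self.decided_pairs = {p for p in self.decided_pairs if file_id not in p}

    def potential_pairs(self, looks_similar):
        # Only similar pairs without an existing decision are offered.
        return [
            (a, b)
            for a, b in combinations(sorted(self.current_files), 2)
            if looks_similar(a, b) and frozenset((a, b)) not in self.decided_pairs
        ]


def similar(a, b):
    # Stand-in for the real similar files search (assumption).
    return {a, b} == {"danbooru_copy", "gelbooru_copy"}


store = ToyDuplicateStore()
files = ["danbooru_copy", "gelbooru_copy"]

for f in files:
    store.import_file(f)
first_pass = store.potential_pairs(similar)    # the pair is offered

store.set_duplicate(*files)                    # process it in the filter
for f in files:
    store.delete_file(f)
for f in files:
    store.import_file(f)                       # reimport after deletion
second_pass = store.potential_pairs(similar)   # nothing is offered

store.reset_relationships("danbooru_copy")
after_reset = store.potential_pairs(similar)   # the pair is offered again
```

In this model, `second_pass` is empty because the earlier decision still exists, which matches the second scenario listed above; only the explicit reset makes the pair searchable again.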
