Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I've written a script that identifies duplicate and near-duplicate images based on some criteria. The results are placed in a dictionary where the keys represent an image and the values are duplicate images. For example, for image0 there are duplicate images 1-5. Now, I'm trying to make a list of candidates to delete based on my dictionary. I'd like to keep the first image that appears in the dictionary (image0), delete images 1-5, and then skip keys 1-5 because those images have already been removed. How would I do this? Or is there a better way to go about identifying candidates for deletion?

Example Dictionary:

{0: [1, 2, 3, 4, 5],
 1: [0, 2, 3, 4, 5],
 2: [0, 1, 3, 4, 5],
 3: [0, 1, 2, 4, 5],
 4: [0, 1, 2, 3, 5],
 5: [0, 1, 2, 3, 4],
 6: [7, 8, 9, 10, 11],
 7: [6, 8, 9, 10, 11],
 8: [6, 7, 9, 10, 11],
 9: [6, 7, 8, 10, 11],
 10: [6, 7, 8, 9, 11],
 11: [6, 7, 8, 9, 10],
 12: [13, 14, 15, 16, 17],
 13: [12, 14, 15, 16, 17],
 14: [12, 13, 15, 16, 17],
 15: [12, 13, 14, 16, 17],
 16: [12, 13, 14, 15, 17],
 17: [12, 13, 14, 15, 16],
 18: [19, 20, 21, 22, 23],
 19: [18, 20, 21, 22, 23],
 20: [18, 19, 21, 22, 23],
 21: [18, 19, 20, 22, 23],
 22: [18, 19, 20, 21, 23],
 23: [18, 19, 20, 21, 22],
 24: [25, 26, 27, 28, 29],
 25: [24, 26, 27, 28, 29],
 26: [24, 25, 27, 28, 29],
 27: [24, 25, 26, 28, 29],
 28: [24, 25, 26, 27, 29],
 29: [24, 25, 26, 27, 28],
 30: [31, 32, 33, 34, 35],
 31: [30, 32, 33, 34, 35],
 32: [30, 31, 33, 34, 35],
 33: [30, 31, 32, 34, 35],
 34: [30, 31, 32, 33, 35],
 35: [30, 31, 32, 33, 34],
 36: [37, 38, 39],
 37: [36, 38, 39],
 38: [36, 37, 39],
 39: [36, 37, 38],
 40: [41, 42, 43],
 41: [40, 42, 43],
 42: [40, 41, 43],
 43: [40, 41, 42],
 44: [45, 46],
 45: [44, 46],
 46: [44, 45]}


1 Answer

The logic is straightforward. Keep a set of delete candidates and iterate over the keys in the dict. For each key, check whether it is already in the set. If it is, skip it: it has already been marked for deletion. If it isn't, the key is an image you want to keep, and its value is the list of images you want to delete, so add all of those to the set of delete candidates. This relies on the dict iterating in insertion order, which Python guarantees from version 3.7 onward.

If you really want a list as the result, at the end you can convert the set to a list.

Here's the code to do that:

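# data is the dictionary from the question
dups = set()

for i in data:
    # A key already marked as a duplicate is skipped
    if i not in dups:
        # i is kept; everything listed under it is a delete candidate
        dups.update(data[i])

# Sets are unordered, so sort for a stable, readable list
print(sorted(dups))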

Result:

[1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 25, 26, 27, 28, 29, 31, 32, 33, 34, 35, 37, 38, 39, 41, 42, 43, 45, 46]
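
If it also helps to see which images survive, the same loop can record the kept keys alongside the delete candidates. The following is only a sketch: the function name `split_keep_delete` and the small `sample` dictionary are made up for illustration, and it assumes the duplicate relation is symmetric, as it is in the example dictionary from the question.

```python
def split_keep_delete(dup_map):
    """Return (keep, delete) lists from a {image: [duplicates]} mapping.

    Hypothetical helper; assumes that if A lists B as a duplicate,
    B also lists A.
    """
    keep = []
    delete = set()
    for image, duplicates in dup_map.items():
        if image in delete:
            continue  # already marked as a duplicate of an earlier image
        keep.append(image)        # first image seen in its group is kept
        delete.update(duplicates)  # the rest of the group is marked
    return keep, sorted(delete)


# Small made-up input in the same shape as the question's dictionary
sample = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1],
    3: [4],
    4: [3],
    5: [],
}

keep, delete = split_keep_delete(sample)
print(keep)    # [0, 3, 5]
print(delete)  # [1, 2, 4]
```

An image with no duplicates (key 5 here) ends up in the keep list and adds nothing to the delete set, so it needs no special handling.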
