list intersection algorithm implementation only using python lists (not sets)

Question

Welcome To Ask or Share your Answers For Others

list intersection algorithm implementation only using python lists (not sets)

asked Jan 31, 2022 in Technique[技术] by 深蓝 (71.8m points)

I've been trying to write down a list intersection algorithm in python that takes care of repetitions. I'm a newbie to python and programming so forgive me if this sounds inefficient, but I couldn't come up with anything else. Here, L1 and L2 are the two lists in question, and L is the intersection set.

Iterate through L1
Iterate through L2
If element is in L1 and in L2
add it to L
remove it from L1 and L2
iterate through L
add elements back to L1 and L2

I'm 100% sure this is not the algorithm used within Mathematica to evaluate list intersection, but I can't really come up with anything more efficient. I don't want to modify L1 and L2 in the process, hence me adding back the intersection to both lists. Any ideas? I don't want to make use of any built in functions/data types other than lists, so no import sets or anything like that. This is an algorithmic and implementation exercise, not a programming one, as far as I'm concerned.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

203 views

1 Answer

深蓝 · Answer 1 · 2022-01-31T07:23:28+0000

Here is a faster solution:

def intersect_sorted(a1, a2):
  """Yields the intersection of sorted lists a1 and a2, without deduplication.

  Execution time is O(min(lo + hi, lo * log(hi))), where lo == min(len(a1),
  len(a2)) and hi == max(len(a1), len(a2)). It can be faster depending on
  the data.
  """
  import bisect, math
  s1, s2 = len(a1), len(a2)
  i1 = i2 = 0
  if s1 and s1 + s2 > min(s1, s2) * math.log(max(s1, s2)) * 1.4426950408889634:
    bi = bisect.bisect_left
    while i1 < s1 and i2 < s2:
      v1, v2 = a1[i1], a2[i2]
      if v1 == v2:
        yield v1
        i1 += 1
        i2 += 1
      elif v1 < v2:
        i1 = bi(a1, v2, i1)
      else:
        i2 = bi(a2, v1, i2)
  else:  # The linear solution is faster.
    while i1 < s1 and i2 < s2:
      v1, v2 = a1[i1], a2[i2]
      if v1 == v2:
        yield v1
        i1 += 1
        i2 += 1
      elif v1 < v2:
        i1 += 1
      else:
        i2 += 1

It runs in O(min(n + m, n * log(m))) time where n is the minimum of the lengths and m is the maximum. It iterates over both lists at the same time, skipping as many elements in the beginning as possible.

An analysis is available here: http://ptspts.blogspot.ch/2015/11/how-to-compute-intersection-of-two.html

Categories

list intersection algorithm implementation only using python lists (not sets)

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags