How to Process Iterators in Parallel Using zip

Introduction

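A list comprehension makes it easy to take a source list and derive a new list by applying an expression to each element. For example, suppose I want to multiply every element of a list by 5. Here I do it with a plain for loop.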

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
multiply_by_5 = []
for x in a:
    # Append each element multiplied by 5
    multiply_by_5.append(x * 5)
print(f"Output \n *** {multiply_by_5}")

Output

*** [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]

With a list comprehension, I can get the same result by specifying the expression and the input sequence to loop over.

# List comprehension
multiply_by_5 = [x * 5 for x in a]
print(f"Output \n *** {multiply_by_5}")

Output

*** [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]

Now, suppose you have two lists whose elements you want to add together.

# 1. Create two lists of numbers
list1 = [100, 200, 300, 400]
list2 = [500, 600, 700, 800]

# 2. Add the two lists element by element to create a new list
list3 = []

# Using a loop over the indexes
for i in range(len(list1)):
    added_value = list1[i] + list2[i]
    list3.append(added_value)
print(f"Output \n*** {list3}")

Output

*** [600, 800, 1000, 1200]

The important point is that each item in the derived list of sums (list3 in our case) is tied to the items in the source lists by its index.

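Now for zip: here is the zip-based solution for the same two lists of integers. One contains 100, 200, 300 and 400, and the other contains 500, 600, 700 and 800. We could of course define them afresh and assign them to variables, and they do not have to be lists. They can be other sequences, such as tuples.

What we want is to zip the elements of these lists together pairwise, so that 100 from list1 is paired with 500 from list2, and so on. zip yields each pair as a tuple, and as we iterate over them we unpack each tuple into the variables a and b.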
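To see what zip produces before any unpacking, it can help to materialise the pairs. The following is a minimal sketch that reuses the values from above; the tuple `tuple2` and the names `pairs` and `sums` are illustrative and not part of the original example.

```python
list1 = [100, 200, 300, 400]
tuple2 = (500, 600, 700, 800)  # a tuple instead of a list; any iterable works

# zip is lazy, so wrap it in list() to materialise the pairs
pairs = list(zip(list1, tuple2))
print(pairs)

# Each pair unpacks into two names while iterating
sums = [a + b for a, b in pairs]
print(sums)
```

Each element of `pairs` is a two-item tuple, which is why `for a, b in ...` can unpack it directly.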

# Pair the elements with zip and add each pair
list4 = [(a + b) for a, b in zip(list1, list2)]
print(f"Output \n*** {list4}")

Output

*** [600, 800, 1000, 1200]

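The solution above looks really neat, but there is a serious pitfall you need to know about before using it in your own code.

When the input iterators have different lengths, the zip built-in behaves in a way that may surprise you. Let's try it.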

# add a new number to the list
list1.append(1000)
print(f"Output \n*** Length of List1 is {len(list1)} , Length of List2 is {len(list2)}")

# run the zip against the list for addition.
list5 = [(a + b) for a, b in zip(list1, list2)]
print(f"*** {list5}")

Output

*** Length of List1 is 5 , Length of List2 is 4
*** [600, 800, 1000, 1200]

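When we print the sums in list5, you will notice that the number appended to list1 is missing: even though it is in list1, it has no counterpart in list2, so it does not appear in zip's output.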

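That is how zip works. It keeps yielding tuples until any one of the iterators is exhausted. So even though list1 still has an element left, list2 is exhausted first and the loop ends.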

Surprisingly, no exception is raised to tell you about it. So you have to be very careful with zip in production code.

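You can handle this case with the zip_longest function from Python's itertools module.

zip_longest keeps going even after one of the iterators is exhausted, filling in None for the missing values by default.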

from itertools import zip_longest

list6 = []
for a, b in zip_longest(list1, list2):
    if b is None:
        # list2 ran out first; a has no partner
        print(" << do your logic here >> ")
    elif a is None:
        # list1 ran out first; b has no partner
        print(" << do your logic here >> ")
    else:
        list6.append(a + b)
print(f"Output \n*** {list6}")

Output

<< do your logic here >>
*** [600, 800, 1000, 1200]
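When a missing value has a natural default, `zip_longest` also accepts a `fillvalue` argument, which can replace the `None` checks. The following is a minimal sketch, assuming 0 is an acceptable stand-in for a missing number; the name `list7` is illustrative.

```python
from itertools import zip_longest

list1 = [100, 200, 300, 400, 1000]
list2 = [500, 600, 700, 800]

# fillvalue=0 pads the shorter list, so the extra 1000 is kept as 1000 + 0
list7 = [a + b for a, b in zip_longest(list1, list2, fillvalue=0)]
print(list7)
```

Whether padding with 0 is appropriate depends on the data; for sums it keeps the unmatched element, but for other operations a different default or explicit handling may be safer.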

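Conclusion:

The zip function is very handy when you want to iterate over several iterators in parallel.
When given iterators of different lengths, zip stops at the shortest one and silently drops the rest.
Use zip_longest if you need to work with iterators of different lengths.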
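As a closing illustration of iterating over several iterators in parallel, zip is not limited to two inputs. The following is a minimal sketch with three equal-length lists; the sample data and the names `names`, `prices`, `quantities` and `totals` are hypothetical.

```python
names = ["alpha", "beta", "gamma"]  # hypothetical sample data
prices = [10, 20, 30]
quantities = [1, 2, 3]

# Each iteration yields one 3-tuple, unpacked into three names
totals = {name: price * qty for name, price, qty in zip(names, prices, quantities)}
print(totals)
```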
