Compare the memory usage of itertools.chain(list1, list2) versus list1 + list2. Why is chain preferred for large datasets?
Python interview question for Advanced practice.
Answer
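list1 + list2 (concatenation) builds a new list that holds copies of the references from both source lists; the elements themselves are not copied. If list1 and list2 have 1 million items each, the result is a new 2-million-item list, so the storage used for references roughly doubles while all three lists are alive (list1 + list2 + result). Building it takes O(n+m) time and O(n+m) extra memory. itertools.chain(list1, list2) returns an iterator instead. It yields items from list1 until it is exhausted, then yields items from list2. It never allocates a combined sequence, so it uses O(1) extra memory regardless of input size. This is critical when processing large datasets or streams, as long as a single pass is enough: a chain object cannot be indexed, has no len(), and can only be iterated once.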
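The difference can be observed with a rough sketch like the one below. It assumes CPython, where sys.getsizeof reports the size of the container object itself (the list's reference array or the chain object), not the elements it refers to. The list sizes and variable names are illustrative only.

```python
import sys
from itertools import chain

list1 = list(range(100_000))
list2 = list(range(100_000))

# Concatenation allocates a brand-new list with 200,000 references.
combined = list1 + list2

# chain only stores references to its inputs; no combined list is built.
lazy = chain(list1, list2)

# Size of the container objects themselves (CPython-specific measurement).
combined_size = sys.getsizeof(combined)
lazy_size = sys.getsizeof(lazy)
print(f"list1 + list2: {combined_size} bytes")
print(f"chain(list1, list2): {lazy_size} bytes")

# Both approaches visit the same items; chain produces them one at a time.
total = sum(lazy)
```

The size of the chain object stays the same no matter how long the inputs are, while the size of the concatenated list grows with n+m.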
Explanation
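chain also works with infinite iterators, because it only pulls items on demand. The + operator cannot do this: it does not accept an iterator at all (it raises TypeError), and materializing an infinite iterator with list() first would never finish and would eventually exhaust memory.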
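A minimal sketch of the infinite case, using itertools.count as a stand-in for an unbounded stream. The names header, endless, and first_five are made up for illustration.

```python
from itertools import chain, count, islice

header = ["start"]
endless = count(1)  # infinite iterator: 1, 2, 3, ...

# chain is lazy, so combining with an infinite iterator returns immediately.
stream = chain(header, endless)

# islice takes only the first five items and stops pulling after that.
first_five = list(islice(stream, 5))

# The + operator rejects an iterator operand outright.
try:
    header + endless
    plus_failed = False
except TypeError:
    plus_failed = True
```

Calling list(endless) to make + work is not an option here, since that call would never return.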