I have two dataloaders and I would like to merge them without redefining the underlying datasets, in my case train_dataset and val_dataset:

from torch.utils.data import DataLoader

train_loader = DataLoader(train_dataset, batch_size=512, drop_last=True, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=512, drop_last=False)

Wanted result:

train_loader = train_loader + val_loader 
Question from: https://stackoverflow.com/questions/65621414/how-to-merge-two-torch-utils-data-dataloaders-with-a-single-operation

1 Answer

Data loaders are iterators. You can implement a function that returns an iterator which yields the dataloaders' contents, one dataloader after the other.

Given a number of iterators itrs, the function iterates over each iterator in turn, yielding one batch at a time. A possible implementation is as simple as:

def itr_merge(*itrs):
    # Exhaust each iterator in order, yielding one batch at a time
    for itr in itrs:
        for v in itr:
            yield v

Here is a usage example:

>>> import torch
>>> from torch.utils.data import DataLoader, TensorDataset
>>> dl1 = DataLoader(TensorDataset(torch.zeros(5, 1)), batch_size=2, drop_last=True)
>>> dl2 = DataLoader(TensorDataset(torch.ones(10, 1)), batch_size=2)

>>> for x in itr_merge(dl1, dl2):
...     print(x)
[tensor([[0.], [0.]])]
[tensor([[0.], [0.]])]
[tensor([[1.], [1.]])]
[tensor([[1.], [1.]])]
[tensor([[1.], [1.]])]
[tensor([[1.], [1.]])]
[tensor([[1.], [1.]])]
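
Tying this back to the question, the helper can be applied directly to the original train_loader and val_loader. The sketch below assumes those loaders are defined as in the question and uses a placeholder loop body; the standard library's itertools.chain is shown as an equivalent alternative.

from itertools import chain

# Consume both loaders as if they were a single one (itr_merge is defined above)
for batch in itr_merge(train_loader, val_loader):
    pass  # replace with the actual training/processing step

# Equivalent alternative using the standard library
for batch in chain(train_loader, val_loader):
    pass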
