fastMRI

About volume sampler

Hi, I have a question about VolumeSampler in fastmri.data.volume_sampler.py.
We can see the following code in VolumeSampler.
```python
# need to send equal number of samples to each process - take the max
self.num_samples = max([len(l) for l in indices])

# add extra samples to match num_samples
indices = indices + indices[: self.num_samples - len(indices)]
```
Is this step of padding the indices to match num_samples necessary? It seems to me that VolumeSampler also works fine without this additional code.

Hello @osteology, this code was inserted per this GitHub issue: https://github.com/facebookresearch/fastMRI/issues/48.

Some users ran into problems when the processes did not all receive the same number of samples, and I think it’s a well-known issue with NCCL. Note that the update matches the PyTorch DistributedSampler implementation:

https://pytorch.org/docs/stable/_modules/torch/utils/data/distributed.html#DistributedSampler
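To make the padding concrete, here is a small sketch of the idea. It is my own illustration in plain Python, not the fastMRI source: the helper names `split_volumes_across_ranks` and `pad_to_equal_length` and the toy volume names are made up. It assigns whole volumes to ranks, then repeats a few leading indices on the shorter ranks so every process iterates over the same number of samples. In distributed training each rank takes part in the same collective gradient synchronization at every step, so a rank that runs out of batches early can leave the others waiting.

```python
from typing import List


def split_volumes_across_ranks(volume_of_slice: List[str], num_replicas: int) -> List[List[int]]:
    """Assign whole volumes to ranks round-robin and return slice indices per rank.

    volume_of_slice[i] is the (hypothetical) volume name that slice i belongs to.
    """
    volume_names = sorted(set(volume_of_slice))
    rank_of_volume = {name: i % num_replicas for i, name in enumerate(volume_names)}
    rank_indices: List[List[int]] = [[] for _ in range(num_replicas)]
    for slice_idx, name in enumerate(volume_of_slice):
        rank_indices[rank_of_volume[name]].append(slice_idx)
    return rank_indices


def pad_to_equal_length(rank_indices: List[List[int]]) -> List[List[int]]:
    """Repeat leading indices on shorter ranks so all ranks have the same length.

    Uses the same slicing as the quoted line, so it only fills the whole gap
    when the deficit is no larger than the rank's own list.
    """
    num_samples = max(len(indices) for indices in rank_indices)
    return [indices + indices[: num_samples - len(indices)] for indices in rank_indices]


# Toy dataset: volume "a" has 3 slices, volume "b" has 2 slices, 2 processes.
volume_of_slice = ["a", "a", "a", "b", "b"]
unpadded = split_volumes_across_ranks(volume_of_slice, num_replicas=2)
padded = pad_to_equal_length(unpadded)
print(unpadded)  # [[0, 1, 2], [3, 4]]
print(padded)    # [[0, 1, 2], [3, 4, 3]]
```

Without the padding, rank 1 would finish after two samples while rank 0 still has a third one to process. The price is that slice 3 is seen twice on rank 1, so anything that aggregates per-slice results has to account for the duplicate.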

If you’re using the MriModule PyTorch Lightning module, the extra slices should be handled here: