Is there any documentation on multi-GPU inference? Does it just work automatically, or does it need to be configured? The docs mention eight 40 GB A100s being used for a single prediction. Does NVLink matter?