The MoCo paper shuffles samples between GPUs to prevent information from leaking through BatchNorm. Are we supposed to implement this? If so, how?
You are allowed to request up to 2 GPUs on the course server, so you can implement this feature and use 2 GPUs for shuffling. If you are able to achieve good results without this feature, that is OK too. Since the wait time for a 2-GPU instance may be long, we advise using a single-GPU instance for debugging and the initial implementation, and switching to 2 GPUs only at later stages.
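
As a rough illustration of the shuffling idea, here is a minimal sketch of the index bookkeeping only, written in plain Python so it can be checked on a single machine without any GPU. The function names (`make_shuffle`, `split_across_gpus`, `shuffled_key_forward`) and the `encode_chunk` callback are hypothetical and are not part of the course code or of the MoCo reference implementation. In a real 2-GPU setup you would presumably gather the batch across processes, draw the permutation with something like `torch.randperm`, and run the key encoder on each GPU's slice.

```python
import random


def make_shuffle(batch_size, seed=None):
    """Return a random permutation of batch indices and its inverse."""
    rng = random.Random(seed)
    perm = list(range(batch_size))
    rng.shuffle(perm)
    # inverse[old_pos] = position of that sample after shuffling
    inverse = [0] * batch_size
    for new_pos, old_pos in enumerate(perm):
        inverse[old_pos] = new_pos
    return perm, inverse


def split_across_gpus(items, num_gpus):
    """Split a list into equal contiguous chunks, one per (simulated) GPU."""
    assert len(items) % num_gpus == 0, "batch size must be divisible by num_gpus"
    per_gpu = len(items) // num_gpus
    return [items[g * per_gpu:(g + 1) * per_gpu] for g in range(num_gpus)]


def shuffled_key_forward(keys_in, encode_chunk, num_gpus=2, seed=None):
    """Shuffle the key batch, encode each chunk separately, then restore order.

    encode_chunk stands in for the key encoder running on one GPU; its
    BatchNorm statistics would be computed over that chunk only.
    """
    perm, inverse = make_shuffle(len(keys_in), seed)
    shuffled = [keys_in[i] for i in perm]
    chunks = split_across_gpus(shuffled, num_gpus)
    encoded = [y for chunk in chunks for y in encode_chunk(chunk)]
    # Undo the shuffle so key i lines up with query i again.
    return [encoded[inverse[i]] for i in range(len(keys_in))]
```

The point of the round trip is that each key ends up normalized with a different set of batch-mates than its matching query, while the returned keys are back in the original order so the contrastive loss can still pair query `i` with key `i`.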