Wrap your optimizer in a distributed optimizer, which allows replicas to communicate with each other during the optimization step:
```python
import engineml.tensorflow as eml
import tensorflow as tf

opt = tf.train.AdamOptimizer()
opt = eml.optimizer.distribute(opt)
```
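The wrapped optimizer is used like any other TensorFlow optimizer. Below is a minimal sketch with a toy model; the loss graph is illustrative and not part of the Engine ML API, and the exact communication pattern (assumed here to be gradient averaging across replicas) is an assumption based on the description above.

```python
import engineml.tensorflow as eml
import tensorflow as tf

# Toy model: fit y = 2x with a single weight (illustrative only).
x = tf.constant([[1.0], [2.0], [3.0]])
y = tf.constant([[2.0], [4.0], [6.0]])
w = tf.Variable([[0.0]])
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

opt = tf.train.AdamOptimizer()
opt = eml.optimizer.distribute(opt)

# The distributed optimizer is a drop-in replacement: minimize() builds a
# train op whose gradients are shared across replicas (typically averaged)
# before being applied.
train_op = opt.minimize(loss)
```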
When training your model across multiple replicas, you are scaling the effective batch size. To compensate for this larger batch size, it is best practice to also scale your optimizer's learning rate.
Linear Scaling Rule: When the minibatch size is multiplied by k, multiply the learning rate by k.
```python
learning_rate = 0.01
learning_rate = eml.optimizer.scale_learning_rate(learning_rate)
```
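Putting the two pieces together, a sketch of the full setup is shown below. The scaling factor applied by `eml.optimizer.scale_learning_rate` is assumed here to be the number of replicas, per the linear scaling rule; for example, with a base rate of 0.01 and 4 replicas, the scaled rate would be 0.04.

```python
import engineml.tensorflow as eml
import tensorflow as tf

# Base learning rate tuned for a single replica.
learning_rate = 0.01

# Linear scaling rule: multiply the base rate by the number of replicas,
# e.g. 0.01 * 4 = 0.04 with 4 replicas (assumed behavior of this helper).
learning_rate = eml.optimizer.scale_learning_rate(learning_rate)

# Pass the scaled rate to the optimizer, then wrap it for distributed training.
opt = tf.train.AdamOptimizer(learning_rate=learning_rate)
opt = eml.optimizer.distribute(opt)
```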