A sigmoid function has a distinctive S shape; it is a real function that is differentiable at any real input value, and its range is between 0 and 1. It is an activation function of the following form:
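$$\sigma(x) = \frac{1}{1 + e^{-x}}$$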
Its first derivative, which is used during the backpropagation phase of training, has the following form:
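$$\sigma'(x) = \sigma(x)\left(1 - \sigma(x)\right)$$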
However, a sigmoid function can cause the vanishing gradient problem, or saturation of the gradient, and it is known for slow convergence. Therefore, in practical use, the sigmoid is not recommended as an activation function; ReLU has become more popular.
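As a minimal sketch of why the gradient vanishes (using NumPy, with function names chosen here for illustration), the following shows that the sigmoid's derivative peaks at 0.25 and falls off quickly for large absolute inputs, so the repeated multiplications performed during backpropagation shrink the gradient toward zero, while ReLU passes a gradient of 1 for positive inputs:

```python
import numpy as np

def sigmoid(x):
    # Logistic function: 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    # First derivative: sigma(x) * (1 - sigma(x)); its maximum is 0.25 at x = 0
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_prime(x):
    # ReLU derivative: 1 for positive inputs, 0 otherwise
    return (x > 0).astype(float)

x = np.array([-6.0, -2.0, 0.0, 2.0, 6.0])
print(sigmoid_prime(x))  # [0.0025 0.105  0.25   0.105  0.0025] -- saturates for large |x|
print(relu_prime(x))     # [0. 0. 0. 1. 1.]

# Backpropagation multiplies these derivatives layer by layer; after 10 sigmoid
# layers the gradient is at most 0.25**10, which illustrates the vanishing effect.
print(0.25 ** 10)        # ~9.5e-07
```

Because each layer's contribution to the gradient is bounded by 0.25, deep stacks of sigmoid layers train very slowly in the early layers, which is the practical reason ReLU is usually preferred.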