Linnan Zhang
You may often see people use an extra step called pre-training before the actual training of the model. Submit a report about what pre-training in a CNN is and why we use it.
We can place a prior probability distribution over the parameters of the model we are going to fit to the data. This prior encodes our beliefs about which parameter values are reasonable before we have seen any data, and it can be described as weak or strong.
A weak prior is a prior distribution with high entropy, such as a Gaussian distribution with high variance. With such a prior, the data can move the parameters more or less freely.
A strong prior is a prior distribution with low entropy, such as a Gaussian distribution with low variance. Such a prior plays a more active role in determining where the parameters end up.
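An infinitely strong prior places zero probability on some parameter values, so those values are ruled out no matter how strongly the data supports them. For a parameter under such a prior, the data is effectively irrelevant.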
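The difference between weak, strong, and infinitely strong priors can be illustrated with a minimal sketch. It assumes a much simpler setting than a CNN: a single parameter (the mean of a Gaussian with known noise variance) with a Gaussian prior centered at 0. The function names, the observations, and the chosen prior widths are all made up for illustration.

```python
import math

def gaussian_entropy(sigma):
    """Differential entropy (in nats) of a 1-D Gaussian with standard deviation sigma."""
    if sigma == 0.0:
        return float("-inf")  # limit as the distribution collapses to a point
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

def posterior_mean(data, prior_mean, prior_sigma, noise_sigma=1.0):
    """Posterior mean of the parameter under a Gaussian prior and Gaussian noise."""
    if prior_sigma == 0.0:
        # Infinitely strong prior: all probability mass sits on prior_mean,
        # so the data cannot move the parameter at all.
        return prior_mean
    n = len(data)
    sample_mean = sum(data) / n
    prior_precision = 1.0 / prior_sigma ** 2
    data_precision = n / noise_sigma ** 2
    # Precision-weighted average of the prior mean and the sample mean.
    return (prior_precision * prior_mean + data_precision * sample_mean) / (
        prior_precision + data_precision
    )

# Hypothetical observations with a sample mean of about 3.0.
data = [2.9, 3.1, 3.0, 3.2, 2.8]

for label, sigma in [("weak", 10.0), ("strong", 0.1), ("infinitely strong", 0.0)]:
    estimate = posterior_mean(data, prior_mean=0.0, prior_sigma=sigma)
    print(f"{label:>17} prior: entropy={gaussian_entropy(sigma):.3f}, "
          f"posterior mean={estimate:.3f}")
```

Under these assumptions, the weak prior leaves the estimate close to the sample mean, the strong prior pulls it most of the way toward the prior mean of 0, and the infinitely strong prior keeps it exactly at 0 regardless of the data.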
Taking this extra step gives us a clearer picture of the model and of how much each parameter can be influenced by the data. This saves time when we later adjust the model to suit the data.