If you re-test the load, then you have effectively increased the sample size. For magnum loads, or any load that heats the barrel too much, I would just suggest waiting to let the barrel cool to the same point each time so that heat does not change the results.
I would not consider a sample of 3 to represent the performance of a large number of rounds with any significant level of confidence. You can get pretty deep into the statistics involved, but to predict the performance of a large number of rounds, say 100, you need to test more than 3.
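To put rough numbers on that, here is a small simulation sketch. It assumes velocities are normally distributed around 2900 fps with a "true" SD of 10 fps; both figures are made up for illustration, and the function names are my own. It simulates many test strings of a given length and shows how widely the SD reported by each string can vary.

```python
import random
import statistics

def sd_estimates(shots_per_string, true_sd=10.0, mean_fps=2900.0,
                 trials=2000, seed=1):
    """Simulate many test strings and return the sample SD of each one.

    Assumes normally distributed velocities; true_sd and mean_fps are
    illustrative values, not measurements from a real load.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        string = [rng.gauss(mean_fps, true_sd) for _ in range(shots_per_string)]
        results.append(statistics.stdev(string))
    return results

def middle_90(values):
    """Return the approximate 5th and 95th percentiles of a list of values."""
    ordered = sorted(values)
    lo = ordered[int(0.05 * len(ordered))]
    hi = ordered[int(0.95 * len(ordered))]
    return lo, hi

for n in (3, 10, 30):
    lo, hi = middle_90(sd_estimates(n))
    print(f"{n:2d} shots: 90% of strings report an SD between "
          f"{lo:.1f} and {hi:.1f} fps")
```

Under these assumptions the 3-shot strings should report a much wider range of SDs than the 30-shot strings, even though the underlying load never changes.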
I am not saying SD is not a valid way to measure a load's consistency, simply that low ES is a tougher standard to meet, like using a 5-shot group instead of a 3-shot group. My reason for using ES is mostly this: SD is a statistical way to predict deviation from the norm given a certain data set, while ES is the actual observed velocity spread. I place more value on my actual measured numbers than on a prediction.
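For anyone who wants to see the two numbers side by side, here is a minimal sketch that computes both from one string. The velocities are hypothetical chronograph readings, not data from a real load, and the SD shown is the ordinary sample standard deviation (your chronograph may use a slightly different formula).

```python
import statistics

# Hypothetical chronograph readings in fps, for illustration only.
velocities = [2895, 2903, 2899, 2910, 2892]

def extreme_spread(fps):
    """ES: fastest shot minus slowest shot in the string."""
    return max(fps) - min(fps)

def sample_sd(fps):
    """SD: sample standard deviation (n - 1 in the denominator)."""
    return statistics.stdev(fps)

es = extreme_spread(velocities)
sd = sample_sd(velocities)
print(f"ES = {es} fps, SD = {sd:.1f} fps")  # ES = 18 fps, SD = 7.0 fps
```

Both figures come from the same five shots. ES looks only at the fastest and slowest, while SD uses every shot in the string.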
I don't mean to say that ES is the whole story. I suppose you could have a load with very consistent velocities that once in a while throws one shot 100 fps high or low. Both the SD and ES might be fine, but I would not trust it.
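A rough way to see how a short string can hide that kind of flyer: if each shot independently has some fixed chance of being a flyer, the chance that a string contains none at all is (1 - rate) raised to the number of shots. The 1-in-20 flyer rate below is purely an assumption for illustration, as is the function name.

```python
def chance_of_missing_flyer(shots, flyer_rate):
    """Probability that a string of `shots` rounds contains no flyer.

    Assumes every shot is independent and has the same flyer_rate.
    """
    return (1.0 - flyer_rate) ** shots

flyer_rate = 1 / 20  # assumed for illustration only, not a measured figure

for shots in (3, 5, 10, 20, 50):
    p = chance_of_missing_flyer(shots, flyer_rate)
    print(f"{shots:2d} shots: {p:.0%} chance the string shows no flyer")
```

With that assumed rate, a 3-shot string misses the flyer most of the time, so its SD and ES would both look fine.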
The bottom line, I think, is that you have to test and test and test to be sure you won't get some unpleasant surprise on the shot that counts.