Abstract

Background
Serum follicle-stimulating hormone (FSH) levels are used clinically to evaluate infertility and pituitary and gonadal disorders. As research collaborations across institutions become more frequent, inter-laboratory validation of FSH assays is essential.

Methods
An inter-laboratory validation of three commercial FSH immunoassays was performed using human serum samples stored frozen at -25 degrees C for varying lengths of time (2 batches of 15 samples each). Percentage differences and Bland-Altman limits of agreement were calculated.

Results
With the same assay manufacturer, inter- and intra-laboratory consistency of FSH values was much higher after shorter-term storage (frozen for less than 11 months; mean percentage degradation less than 4%) than after long-term storage (2-3 years; mean percentage degradation = 23%). When assay results from different manufacturers were compared, overall long-term degradation was similar to that seen with the same manufacturer (-25%); however, degradation was greater when the original FSH value was greater than 20 mIU/mL than when it was less than 10 mIU/mL (p < 0.001, trend test).

Conclusion
The findings suggest that serum samples degrade at some point between 11 months and 2-3 years of frozen storage at -25 degrees C, which can lead to unstable FSH measurements. Inter-laboratory variability due to frozen storage time and to manufacturer differences in assay results should be accounted for when designing and implementing research or clinical quality control activities that involve serum FSH at multiple study sites.
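
The two agreement statistics named in the Methods can be illustrated with a minimal sketch. The abstract does not state the exact formulas used, so this assumes the conventional definitions: percentage difference as the change of the repeat measurement relative to the original value, and Bland-Altman limits of agreement as the mean difference plus or minus 1.96 sample standard deviations of the paired differences. The function names and the sample values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def percentage_differences(original, repeat):
    """Percent change of each repeat measurement relative to its original value."""
    return [100.0 * (r - o) / o for o, r in zip(original, repeat)]


def bland_altman_limits(a, b, z=1.96):
    """Return (bias, lower, upper) for paired measurements a and b.

    bias is the mean of (b - a); lower and upper are the approximate
    95% limits of agreement, bias -/+ z * SD of the differences.
    """
    diffs = [y - x for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Made-up FSH values in mIU/mL, for illustration only (not study data).
original = [5.0, 8.0, 12.0, 20.0, 40.0]
repeat = [4.8, 7.6, 10.8, 16.0, 30.0]

pct = percentage_differences(original, repeat)
bias, lower, upper = bland_altman_limits(original, repeat)

print("percentage differences:", [round(p, 1) for p in pct])
print("bias and limits of agreement:", round(bias, 2), round(lower, 2), round(upper, 2))
```

In this made-up example the percentage loss grows with the original value, which is the kind of pattern a trend test across concentration ranges would examine. Because the absolute differences also grow with concentration here, limits of agreement computed on the raw scale would be wide, and computing them on percentage differences may be more informative in such a case.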