Remote photoplethysmography (rPPG) is susceptible to motion artifacts and to physiological variation between individuals in complex environments. This paper proposes a remote heart rate estimation method based on frequency regulation and multi-scale spatio-temporal modeling. To suppress artifact noise, a frequency-regulated normalization module is designed to emphasize the dominant heart rate frequency while attenuating noise. To handle individual physiological variation, a multi-level spatio-temporal feature fusion module captures physiological information through multi-scale convolutions and cross-layer integration. In addition, a dynamic weighting spatio-temporal feature module is introduced during spatio-temporal modeling to strengthen long-term dependency modeling. Experimental results show that the proposed method performs well in cross-dataset evaluation. When trained on the PURE dataset and tested on the UBFC-rPPG dataset, the mean absolute error decreases from 1.31 to 1.28; when trained on the UBFC-rPPG dataset and tested on the PURE dataset, it decreases from 0.97 to 0.82. These results outperform existing state-of-the-art methods and demonstrate the generalization capability of the model across datasets. By combining frequency regulation with multi-scale spatio-temporal modeling, this work extends the modeling methodology for rPPG-based heart rate estimation and improves the stability and usability of remote heart rate estimation under complex interference and cross-scenario conditions.
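To make the two quantities named in the abstract concrete, the following is a minimal sketch of how a dominant heart rate frequency can be read from a pulse signal and how the mean absolute error is computed. It is not the proposed frequency-regulated normalization module. It shows only the conventional spectral-peak readout and the evaluation metric, under assumptions that do not come from the paper: a 0.7 to 3.0 Hz heart rate band, a 30 Hz sampling rate, a synthetic sinusoidal pulse, and the hypothetical helper names `dominant_heart_rate_bpm`, `mean_absolute_error`, and `synthetic_pulse`.

```python
import math


def dominant_heart_rate_bpm(signal, fs, low_hz=0.7, high_hz=3.0):
    """Return the heart rate (beats per minute) at the strongest DFT bin in the band.

    The band limits are an assumption of this sketch, not values from the paper.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    best_freq, best_power = 0.0, -1.0
    # DFT bin k corresponds to frequency k * fs / n; only in-band bins are evaluated.
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if freq < low_hz or freq > high_hz:
            continue
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0


def mean_absolute_error(predicted, reference):
    """Average absolute difference between predicted and reference heart rates."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def synthetic_pulse(bpm, fs=30.0, seconds=10.0):
    """Hypothetical test signal: a pulse sinusoid plus a slow 0.2 Hz drift."""
    n = int(fs * seconds)
    f = bpm / 60.0
    return [
        math.sin(2 * math.pi * f * t / fs) + 0.3 * math.sin(2 * math.pi * 0.2 * t / fs)
        for t in range(n)
    ]


if __name__ == "__main__":
    estimates = [dominant_heart_rate_bpm(synthetic_pulse(b), fs=30.0) for b in (72, 90)]
    print(estimates, mean_absolute_error(estimates, [72.0, 90.0]))
```

With a 10 s window the frequency resolution of this readout is 0.1 Hz, that is, 6 beats per minute, so the sketch only illustrates the principle. The slow drift term falls outside the assumed band and is ignored by the peak search, which is the simplest form of the noise suppression that the abstract attributes to its learned module.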