Objective: To identify host-derived biomarkers for active pulmonary tuberculosis that could improve early diagnosis and treatment monitoring.

Methods: Two whole-blood gene expression datasets from the GEO database (GSE19444 and GSE42830; 37 tuberculosis patients and 50 healthy controls) were analyzed jointly. Differentially expressed genes (DEGs) were screened with GEO2R. Gene Ontology enrichment, pathway analysis, and protein-protein interaction network construction were performed with Metascape and STRING to identify key gene modules. Core genes were selected based on MCODE scores (≥10) and t-tests/ANOVA, and their expression trends and diagnostic performance were validated in independent datasets (GSE152532 and GSE19435). The sensitivity and specificity of candidate biomarkers were assessed using receiver operating characteristic (ROC) curves.

Results: A total of 70 common DEGs were identified (59 upregulated and 11 downregulated), significantly enriched in type Ⅱ interferon signaling pathways and innate immune responses. Module analysis revealed 16 core genes, including RTP4, GBP4, and TRIM22, that formed a high-confidence interaction network. In the validation datasets, RTP4, GBP4, and TRIM22 were significantly overexpressed in tuberculosis patients (P<0.05), and their expression decreased over the course of treatment. ROC curve analysis indicated that RTP4 had the best diagnostic performance, with an area under the curve (AUC) of 0.952 (sensitivity 100%, specificity 79%), while the three-gene combination (TBscore) had an AUC of 0.933 (sensitivity 100%, specificity 81%).

Conclusion: RTP4, GBP4, and TRIM22 are potential diagnostic biomarkers for active pulmonary tuberculosis. Their combined model (TBscore) showed high sensitivity and specificity, and their expression levels changed with treatment, suggesting that they may reflect treatment efficacy and provide new targets for precise diagnosis and treatment of tuberculosis.
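To make the ROC-based evaluation concrete, the following is a minimal Python sketch of how an AUC, sensitivity, and specificity can be computed for a single candidate marker. It is not the authors' pipeline. The expression values are hypothetical, the function names `roc_auc` and `youden_cutoff` are illustrative, and choosing the cutoff by Youden's index is an assumption, since the abstract does not state how thresholds were selected.

```python
from itertools import product


def roc_auc(scores, labels):
    """AUC as the probability that a random case scores higher than a
    random control, counting ties as 0.5 (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) for the cutoff that
    maximizes Youden's J = sensitivity + specificity - 1.
    A sample is called positive when its score is >= cutoff.
    On ties in J, the lowest cutoff is kept."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1:]


# Hypothetical log-expression values for one gene (NOT data from the study).
expr = [8.1, 7.9, 8.6, 7.4, 9.0, 6.2, 6.8, 7.5, 6.0, 6.5]
is_tb = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # 1 = tuberculosis, 0 = healthy control

auc = roc_auc(expr, is_tb)
cutoff, sens, spec = youden_cutoff(expr, is_tb)
print(f"AUC={auc:.3f} cutoff={cutoff} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The rank-based AUC avoids any dependence on a particular cutoff, whereas sensitivity and specificity are always tied to one. This is why a marker can reach 100% sensitivity at one threshold while its AUC remains below 1.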