Predicting Lung Disease Using Machine Learning: A Comparative Study of Logistic Regression, k-Nearest Neighbors, and Naive Bayes
Abstract
Machine learning (ML) has become a transformative technology in domains such as healthcare, finance, and manufacturing. This paper reviews recent advances in ML, focusing on key algorithms, applications, and challenges. Specifically, it applies three common ML algorithms, Logistic Regression (LR), k-Nearest Neighbors (KNN), and Naive Bayes (NB), to predict lung disease from a dataset of 1,000 instances. The study shows how each algorithm classifies lung disease cases and compares their effectiveness using performance metrics. Challenges remain, including the need for large labeled datasets, limited model interpretability, computational complexity, and biased predictions. The paper also discusses future research directions, including few-shot learning, explainable AI, transfer learning, edge computing, and ethical AI, that aim to address these limitations and extend the use of ML in healthcare and beyond.
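The comparison of performance metrics mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function name `binary_metrics`, the label encoding (1 = disease, 0 = no disease), and the ten labels and predictions are all hypothetical and were invented for this example. It only shows how accuracy, precision, recall, and F1 can be derived from a confusion matrix for each classifier.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics for a binary task (assumed: 1 = disease, 0 = no disease)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    # Fall back to 0.0 when a denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions, invented purely for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predictions = {
    "LR":  [1, 0, 1, 0, 0, 0, 1, 0, 1, 1],
    "KNN": [1, 0, 0, 0, 0, 1, 1, 0, 1, 0],
    "NB":  [1, 1, 1, 1, 0, 0, 1, 0, 1, 1],
}

results = {name: binary_metrics(y_true, y_pred) for name, y_pred in predictions.items()}
for name, m in results.items():
    print(f"{name}: acc={m['accuracy']:.2f} prec={m['precision']:.2f} "
          f"rec={m['recall']:.2f} f1={m['f1']:.2f}")
```

In a screening setting, recall on the disease class is often weighted more heavily than raw accuracy, since a missed case is usually costlier than a false alarm. Reporting all four metrics side by side keeps that trade-off visible.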
All articles published in the International Journal of Future Research and Innovation (IJFRI) are licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Authors retain copyright and grant IJFRI the non-exclusive right to publish, distribute, and archive the work.