Random forest method for interpreting results obtained by bioluminescence analysis of saliva in personalized diagnostics

UDC: 
519.254:519.257
Authors: 

G.V. Zhukova1, P.A. Martyshuk1, E.R. Afer1, A.N. Shuvaev1, N.A. Rozanova2, D.V. Sergeev2, V.A. Kratasyuk1, 3

Organization: 

1Siberian Federal University, 79 Svobodnyi Av., Krasnoyarsk, 660041, Russian Federation
2Research Center for Neurology, 80 Volokolamskoe highway, Moscow, 125367, Russian Federation
3Institute of Biophysics, Siberian Department of the RAS, separate division of the Federal Research Center Krasnoyarsk Scientific Center of the Siberian Department of the Russian Academy of Sciences, 50 Akademgorodok, build. 50, Krasnoyarsk, 660036, Russian Federation

Abstract: 

The development of personalized medicine and biotechnology depends directly on obtaining relevant data, which in turn depend largely on the individual characteristics of the patients examined. The permissible ranges of indicators commonly used in conventional medicine do not always describe a patient’s health adequately. Data analysis techniques are therefore needed that can account for the variable individual characteristics of patients and their lifestyles.

The aim of this study is to determine whether the Random Forest method can be used to analyze biomedical data so that the results of personalized diagnostic tests are interpreted correctly. Bioluminescent testing serves as an example because it estimates the effects of various characteristics of the examined patients and their living conditions. The method helps minimize the risk of incorrect diagnosis and adjust monitoring schemes for specific patients.

This study relies on results obtained by assessing the workloads of railway workers with the bioluminescence method. A patient’s health is assessed by the effect of the patient’s saliva on the luminescence intensity of the bi-enzyme system NAD(P)H:FMN oxidoreductase + luciferase. The assay is integral: it responds to many factors, each of which can influence the result. The effectiveness of various data analysis methods is assessed on an example group of traffic controllers employed by the Krasnoyarsk Branch of Russian Railways JSC. Both statistical methods and the Random Forest machine learning algorithm were used for data analysis.

Our study has shown that the Random Forest method is advisable for assessing the significance of certain biochemical saliva indicators in predicting the health of railway workers. The method makes it possible to identify the most significant factors and to plot the partial influence of individual factors on the target variable. These results allow the health diagnostics system based on integral bioluminescence analysis to be optimized. The Random Forest method can become a component of a personalized bioluminescent biosensor for assessing the effects of stress and workload on the body.

Keywords: 
personalized diagnostics, machine learning, data analysis, multifactorial analysis, saliva, bioluminescence, biosensor, signal systems
Received: 
30.06.2025
Approved: 
30.06.2025
Accepted for publication: 
30.06.2025
