Factors That Can Trigger Depression

An Application of Machine Learning to National Health Survey Data in Brazil


  Haydée M. C. Silveira Batista, Federal University of Rio de Janeiro
  Andrea Borges Paim, Federal University of Rio de Janeiro
  Brenda Santos Siqueira, Federal University of Rio de Janeiro
  Nelson Francisco Favilla Ebecken, Federal University of Rio de Janeiro
  Ana Claudia Dias, Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET)




Depression, PNS, Machine learning, Logistic regression, Modeling


According to data from the last National Health Survey (PNS), conducted in 2013 by the Brazilian Institute of Geography and Statistics (IBGE) in partnership with the Ministry of Health, 7.6% of people aged 18 and over had received a diagnosis of depression. Drawing on this survey, the study aimed to identify factors that may be relevant to a possible diagnosis of depression, using machine learning techniques. Binary logistic regression was chosen as the machine learning technique. Variables were selected with forward and backward selection methods, alongside a model specified by the researcher, yielding seven different models. Model performance was evaluated by comparing metrics such as Cox-Snell R² and Nagelkerke R², which gave remarkably close results. Based on these models, 37 explanatory variables were selected and entered into a new logistic regression model. The results showed that some variables significantly increased the odds of a positive diagnosis of depression, while others were associated with reduced odds of this diagnosis.
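
As a rough illustration of the two metrics named above, the sketch below computes Cox-Snell R² and Nagelkerke R² from the log-likelihoods of a null (intercept-only) model and a fitted model, using the standard textbook definitions. It is not the authors' code: the function names, the toy outcomes, and the predicted probabilities are all hypothetical, and the `odds_ratio` helper only shows the usual way a logistic regression coefficient is read as a change in odds.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of binary outcomes y under probabilities p."""
    return sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )


def pseudo_r2(y, p_model):
    """Return (Cox-Snell R², Nagelkerke R²) for predicted probabilities p_model."""
    n = len(y)
    base_rate = sum(y) / n
    # Null model: every case gets the overall share of positive outcomes.
    ll_null = log_likelihood(y, [base_rate] * n)
    ll_model = log_likelihood(y, p_model)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Cox-Snell cannot reach 1; Nagelkerke rescales it by its maximum value.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell, cox_snell / max_cox_snell


def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coefficient)


# Hypothetical toy data: 1 = diagnosed, 0 = not diagnosed.
y = [1, 0, 0, 1, 0, 0, 0, 1]
p = [0.8, 0.2, 0.1, 0.6, 0.3, 0.2, 0.1, 0.7]

cs, nk = pseudo_r2(y, p)
print(f"Cox-Snell R2 = {cs:.3f}, Nagelkerke R2 = {nk:.3f}")
```

An odds ratio above 1 (a positive coefficient) corresponds to a variable that raises the odds of a diagnosis, and a value below 1 to one that lowers them.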


    Doctoral student in Computational Systems / Civil Engineering at the Federal University of Rio de Janeiro (UFRJ/COPPE). Master's degree in Management Systems from Fluminense Federal University (UFF). Degree in Business Management from Cândido Mendes University (UCAM). Fluent in English, French, Spanish and Portuguese.

    Received a doctorate in Civil Engineering from the Federal University of Rio de Janeiro in 1977. Full Professor at the Federal University of Rio de Janeiro since 1990. Works in interdisciplinary areas of Engineering and Petroleum Engineering, with an emphasis on Computational Systems. In his Lattes curriculum, the most frequent terms describing his scientific, technological and artistic-cultural output are: Data Mining, Structures, Offshore Structures, Neural Networks, Nonlinear Analysis, Large Scale Computation, Computational Methods, Finite Element Method and Structural Analysis. Senior member of the IEEE and ACM. In 1985 he took part in the Parallel Computing Project and in 1995 in the creation of the High Performance Computing Center (Núcleo de Computação de Alto Desempenho) at COPPE. Develops models for complex systems and large volumes of data, and integrates computational ideas and tools.


SILVEIRA BATISTA, Haydée M. C.; PAIM, Andrea Borges; SIQUEIRA, Brenda Santos; EBECKEN, Nelson Francisco Favilla; DIAS, Ana Claudia. Factors That Can Trigger Depression: An Application of Machine Learning to National Health Survey Data in Brazil. P2P & INOVAÇÃO, Rio de Janeiro, RJ, v. 7, n. 2, p. 164–185, 2021.

