Journal of the Korean Society of Civil Engineers

  1. Member · Department of Civil Systems Engineering, Ajou University, M.S. (Ajou University · hjjs1201@ajou.ac.kr)
  2. Life Member · Corresponding Author · Professor, Department of Civil Systems Engineering, Ajou University, Ph.D. (Corresponding Author · Ajou University · conc@ajou.ac.kr)



Machine learning, Decision tree, Random forest, Safety grade of bridges, Maintenance

1. Introduction

As the service life of infrastructure increases, the importance of establishing preventive maintenance systems to secure the safety of deteriorated facilities is being emphasized. According to the National Bridge Standard Data of the Ministry of Land, Infrastructure and Transport (MOLIT, 2021b), as shown in Fig. 1, aged bridges with a service period of 30 years or more are expected to account for 42.5% of all bridges by 2030, so the number of bridges requiring repair and strengthening is expected to increase steadily. Proactive maintenance is therefore needed to prepare for the decline in the safety and serviceability of bridges and for the rising cost of repair and strengthening.

Under the "Special Act on the Safety Control and Maintenance of Establishments" (hereafter, the Special Act) (MOLIT, 2021c), bridges classified as Class 1 to 3 facilities must be inspected and examined periodically according to their safety grade. The types of inspection and examination are the periodic safety inspection, the full safety inspection, and the full safety examination. The periodic safety inspection is a visual survey, whereas the full safety inspection combines a close visual survey with simple measurements and tests. In the full safety examination, detailed data for evaluating the condition and safety of the facility are obtained through a thorough visual survey together with various measuring and testing equipment. Based on the inspection and examination results, the safety grade of a bridge is classified into five levels, A (excellent), B (good), C (fair), D (poor), and E (unsatisfactory), and the inspection and examination cycle for each safety grade is shown in Table 1 (MOLIT, 2021a).

As the recent collapse of the Jeongja Bridge showed, concern about aged bridges that have been in service for more than 30 years is growing, and the need to establish rational maintenance measures covering inspection, examination, repair, and strengthening is being emphasized more than ever (Yonhapnews, 2023). Because the safety and serviceability of bridges are managed mainly through the safety grade, a reliable determination of the safety grade is very important. In addition, repair and strengthening actions that follow periodic safety inspections and examinations are essential for maintaining or raising the safety level of a facility (Kang et al., 2016), so complying with the inspection schedule is important in facility maintenance. In Korea, however, the focus has been on corrective maintenance after problems occur, so the development of management systems has been insufficient, and status entry and operating conditions are also managed inaccurately (Kim and Yoon, 2018). As of 2021, of the 5,600 bridges on general national highways that are Class 1 to 3 facilities subject to the Special Act, 101 bridges (1.8%) had not been inspected or had not met the inspection schedule. Moreover, because safety inspection and examination are not mandatory for bridges outside the scope of the Special Act, there are many additional bridges that have not been inspected for a long time. For such bridges it is difficult to take proactive repair and strengthening actions based on an understanding of their performance and condition, which raises concerns about a decline in safety and durability.

In addition, following the 2018 revision of the Special Act, small bridges that were previously outside its scope are newly designated as Class 3 facilities once their service period exceeds 10 years. Although the number of bridges subject to the Special Act, and hence to safety inspection, is steadily increasing as a result, appropriate and systematic safety management is not being achieved because of shortages of management personnel and maintenance budget (Lee et al., 2019a). Furthermore, a comparison of bridge inspection cycles in Korea and abroad shows that the Korean cycles are relatively short, which aggravates the shortage of management personnel (Lee and Kim, 2015). Thus, the shortage of specialists and funds caused by the growing number of bridges to be inspected and by the short inspection cycles makes rational inspection and examination difficult (Kang, 2016).

As a countermeasure, studies are under way that apply artificial intelligence and probabilistic methods to bridge maintenance systems in order to predict the degree of deterioration of individual bridge members, with the aim of substituting for personnel and reducing costs.

In Korea, an artificial neural network model that predicts the degree of bridge damage (Oh et al., 2010) and a model that predicts the condition of bridge members using a Bayesian approach (Lee et al., 2018) have been proposed. Studies abroad include the prediction of member-level condition ratings of bridges using artificial intelligence (Bektas et al., 2013; Nguyen and Dinh, 2019). Although there are many studies that predict the condition and degree of damage of individual members, the member-level condition, unlike the safety grade, can hardly be regarded as representing the safety and durability of the entire bridge. There is also a study on a prediction model for the Bridge Condition Index (BCI) (Martinez et al., 2020); because the BCI is expressed numerically, regression techniques of machine learning suited to numerical targets were used. However, the safety grade used for bridges in Korea is not expressed numerically, so such regression techniques are not suitable. A rule-based classification method has been tried in Korea to estimate the safety grade of bridges (Chung et al., 2016); in that study, bridges of grade C or lower were labeled P (Poor) and bridges of grades A and B were labeled G (Good), and these two merged groups were predicted through binary classification. However, each of grades A and B contains a large number of bridges, and maintenance requires budget allocation and repair and strengthening actions appropriate to each grade, so it is desirable to predict grades A and B separately as well. This limitation can be attributed to the use of a relatively simple binary classification algorithm that distinguishes only two categories.

To overcome these limitations of previous studies, this study aimed to build a rational safety grade prediction model for bridges using artificial intelligence, so that the safety grade of bridges that have not been inspected or have missed the inspection cycle can be identified promptly, and proactive and economical maintenance plans can be established by predicting the safety grade at a given time. Specifically, multi-class classification models based on the decision tree and random forest algorithms of machine learning were used to predict the safety grade of bridges in three groups, grade A, grade B, and grades C and D, and the prediction performance was comprehensively evaluated and analyzed using various metrics. Fig. 2 summarizes the methodology used in this study to derive the bridge safety grade prediction model.

Fig. 1. Bridges with a Service Period of More than 30 Years
../../Resources/KSCE/Ksce.2023.43.3.0397/fig1.png
Fig. 2. Development and Evaluation of Classification Model
../../Resources/KSCE/Ksce.2023.43.3.0397/fig2.png
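Table 1. Status and Inspection/Examination Cycle According to Safety Grade

| Grade | Status | Periodic safety inspection | Full safety inspection | Full safety examination |
| --- | --- | --- | --- | --- |
| A | The best condition without problems | At least once every half year | At least once every 3 years | At least once every 6 years |
| B | Minor damages in supplementary members | At least once every half year | At least once every 2 years | At least once every 5 years |
| C | Minor damages in main members or extensive damages in supplementary members | At least once every half year | At least once every 2 years | At least once every 5 years |
| D | Major damages in main members | At least 3 times per year | At least once every year | At least once every 4 years |
| E | Serious damages in main members and immediate prohibition of usage of the bridge | At least 3 times per year | At least once every year | At least once every 4 years |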

2. Development of Decision Tree-Based Classification Models

2.1 Decision Tree and Random Forest

In this study, decision tree-based classification models of machine learning were built for 8,850 bridges on general national highways, using bridge information from the National Bridge Standard Data (MOLIT, 2021b) as of the first half of 2021 and from the Facility Management System (FMS, 2021). Machine learning is a branch of artificial intelligence in which a computer learns from data by itself and proposes a solution (Géron, 2019). The National Bridge Standard Data contain both categorical data, which fall into specific categories, and continuous data, which are expressed numerically, so a machine learning technique that can handle both types is required. In addition, the prediction target, the safety grade, is divided into grades A, B, C, D, and E, so binary classification is inadequate and multi-class classification must be possible. Classification algorithms include the SVM (Support Vector Machine), which can use both continuous and categorical data and performs binary classification; logistic regression, which can use only categorical data and performs binary classification; and decision tree-based algorithms, which can use both continuous and categorical data and support multi-class classification (Scikit-learn developers, 2007-2022). This study used decision tree-based algorithms, which were judged to be the most suitable for the National Bridge Standard Data.

Of the several decision tree-based algorithms, this study used the two most representative ones, the Decision Tree and the Random Forest. Both algorithms build a set of rules based on decision trees and classify the data by successively narrowing them down in the direction of decreasing impurity. Because the reduction in impurity achieved by the split criterion of each variable is calculated, the variables that are important for classification and prediction can be identified, which is a useful property (Kazemitabar et al., 2017). The structure of a decision tree is shown in Fig. 3: classification starts at the root node, proceeds through the intermediate nodes according to the variables, and the final classification result is obtained at the leaf nodes.

However, because the Decision Tree uses a single tree, overfitting may occur as the tree becomes deeper, which can reduce predictive power. Overfitting refers to the phenomenon in which a model learns the training data excessively and its ability to predict new data deteriorates. The Random Forest compensates for this weakness of the Decision Tree by applying an ensemble technique: it generates many decision trees, secures diversity, and thereby prevents overfitting. Among ensemble techniques, bagging (bootstrap aggregating) is generally used; bagging generates diverse decision trees by randomly sampling the data. As shown in Fig. 4, the Random Forest therefore generates many decision trees at random through bagging and classifies by majority vote or averaging. Because this prevents overfitting, its performance is generally better than that of the Decision Tree. Its drawback is that the analysis time becomes longer as the amount of data increases (Géron, 2019).

Both the Decision Tree and the Random Forest are trained to split the data in the direction that reduces impurity as much as possible, and entropy and the Gini index are used as the criteria. Entropy quantifies impurity as the uncertainty of a random variable and is calculated by Eq. (1); a larger entropy value indicates higher impurity (for two classes, the maximum is 1). The Gini index measures the extent to which the probability distribution is concentrated in particular categories and is obtained from Eq. (2). As with entropy, a larger Gini index indicates higher impurity.

(1)
${Entropy}(A)= -\sum_{k=1}^{m}p_{k}\log_{2}(p_{k})$
(2)
$Gini(A)= 1-\sum_{k=1}^{m}p_{k}^{2}$

Here, $A$ is the whole set of categories, $m$ is the number of categories to be classified, and $p_{k}$ is the probability that a data point belongs to category $k$. Entropy and the Gini index do not differ greatly, but the Gini index is faster to compute because it does not use a logarithm. In addition, the Gini index tends to isolate the most frequent category in the decision tree, whereas entropy tends to produce a somewhat more balanced tree and often performs better (Provost and Fawcett, 2013). Although the Gini index is commonly used for faster analysis, in this study, to predict the safety grade of bridges more accurately, entropy and the Gini index were compared through validation with GridSearchCV, a tool in the Scikit-learn Python library that compares combinations of parameters, and the criterion with the better performance was used to calculate impurity (Scikit-learn developers, 2007-2022).
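As a small illustration of Eqs. (1) and (2), the sketch below computes both impurity measures for two made-up nodes. The function names and the node contents are assumptions introduced here for illustration and are not taken from the study. For $m$ equally likely categories, entropy reaches $\log_{2} m$ and the Gini index reaches $1-1/m$, so with the three grade groups used here the entropy of a perfectly mixed node exceeds 1.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Return the relative frequency p_k of each class among the labels."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def entropy(labels):
    """Eq. (1): -sum(p_k * log2(p_k)) over the classes present."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))


def gini(labels):
    """Eq. (2): 1 - sum(p_k ** 2) over the classes present."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


# Hypothetical node contents (illustrative only, not data from the study).
pure_node = ["B"] * 6
mixed_node = ["A", "A", "B", "B", "C+D", "C+D"]

print("pure node :", entropy(pure_node), gini(pure_node))
print("mixed node:", entropy(mixed_node), gini(mixed_node))
```

A split is preferred when the weighted impurity of the child nodes is lower than that of the parent node, whichever of the two measures is used.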

Fig. 3. Structure of Decision Tree-based Algorithm
../../Resources/KSCE/Ksce.2023.43.3.0397/fig3.png
Fig. 4. Structure of Random Forest Algorithm
../../Resources/KSCE/Ksce.2023.43.3.0397/fig4.png

2.2 Data Preprocessing

Unrefined data may contain missing values or outliers, and an imbalanced data distribution may degrade performance when a model is developed. To improve the performance of a classification model, data preprocessing is therefore required, including the removal of missing values, the removal, reduction, and addition of variables, and the sampling of imbalanced data.

2.2.1 Addition, Removal, and Reduction of Variables and Multicollinearity

From the National Bridge Standard Data, variables unrelated to the safety grade (name of the managing agency, telephone number of the managing agency, data reference date) and variables with many missing values (bridge repair and strengthening history, bridge repair and strengthening cost, clearance height limit under the bridge) were removed. Variables with overlapping meanings were represented by a single variable: vehicle traffic load and design live load were unified as design live load, and facility class and inspection type were unified as facility class. In addition, the year of completion and the inspection date were reduced to a single variable, the service period, by calculating the difference between the two.

Variables with only one unique value do not affect model training and are removed, whereas variables with an excessive number of unique values degrade the learning ability of the model and must be removed or reduced. First, the variables for whether seismic design was applied and whether seismic performance was secured had a single unique value, because all bridges were marked as "not applicable" in the data, and they were removed. Seismic design is expected to have been applied to some recently constructed bridges, so the entries are presumed to be missing. Next, among the variables with many unique values (road name of the location, lot number of the location, city/county/district name, province name, longitude of the bridge start point, latitude of the bridge start point, longitude of the bridge end point, latitude of the bridge end point, superstructure type), the superstructure type was retained because its unique values could be reduced, and the others were removed. Hur et al. (2010) evaluated the bridge type, one of the factors determining the safety grade, by classifying it according to the material used; following that study, the unique values of the superstructure type were reduced to RC (Reinforced Concrete) bridge, PSC (PreStressed Concrete) bridge, steel bridge, and others.
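The screening by number of unique values described above can be sketched as follows. This is a minimal illustration under stated assumptions: the column names, the sample records, and the `max_unique` threshold are hypothetical, and the study does not report a numerical cutoff for "too many" unique values.

```python
def unique_value_counts(rows):
    """Count the distinct values in each column of a list of dict records."""
    columns = rows[0].keys()
    return {col: len({row[col] for row in rows}) for col in columns}


def screen_columns(rows, max_unique):
    """Return (constant columns, high-cardinality columns).

    max_unique is a hypothetical threshold chosen for illustration.
    """
    counts = unique_value_counts(rows)
    constant = [col for col, n in counts.items() if n == 1]
    high_cardinality = [col for col, n in counts.items() if n > max_unique]
    return constant, high_cardinality


# Made-up records with hypothetical column names.
bridges = [
    {"seismic_design": "N/A", "lot_number": "12-1", "superstructure": "RC"},
    {"seismic_design": "N/A", "lot_number": "98-4", "superstructure": "PSC"},
    {"seismic_design": "N/A", "lot_number": "7-22", "superstructure": "RC"},
]

constant, high_cardinality = screen_columns(bridges, max_unique=2)
print("single unique value:", constant)
print("too many unique values:", high_cardinality)
```

Columns flagged as constant would be dropped, while a high-cardinality column is either dropped or, as with the superstructure type, mapped to a smaller set of categories.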

In addition, the average daily traffic (ADT) of each bridge and bridge location information, which are not included in the existing data but are expected to have a large influence on the bridge grade, were collected and added as variables. Because fatigue damage of bridges is affected by the characteristics of the traffic load (Lee et al., 2010), the related ADT of each bridge was obtained from the Road Bridge and Tunnel Status Report (MOLIT, 2021d). Because carbonation and chloride attack can accelerate the loss of durability of concrete and steel, the bridge location was classified into urban areas subject to carbonation, coastal areas subject to chloride attack, and other areas, and reflected as a variable. For chloride attack in coastal areas, with reference to the commentary on the Standard Specification for Concrete (KCI, 2009), areas within 250 m of the coastline on the west and south coasts and within 1,000 m of the coastline on the east coast were regarded as being under the influence of chloride attack. For carbonation, cities or autonomous districts with large populations, as well as manufacturing and industrial complexes, were regarded as being under the influence of carbonation.
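The location rule above can be written as a short function. The sketch below is only an illustration under stated assumptions: the function and argument names are hypothetical, the distance limits are treated as inclusive, and chloride attack is given precedence when a bridge is both near the coast and in an urban or industrial area, since the text does not say how such overlaps were resolved.

```python
# Distance limits from the coastline (m) stated in the text, by coast.
CHLORIDE_LIMIT_M = {"west": 250.0, "south": 250.0, "east": 1000.0}


def location_category(distance_to_coast_m, coast, in_urban_or_industrial_area):
    """Classify a bridge location as 'chloride', 'carbonation', or 'other'.

    Assumption: chloride attack takes precedence over carbonation when
    both conditions hold; the source does not specify this.
    """
    if distance_to_coast_m <= CHLORIDE_LIMIT_M[coast]:
        return "chloride"
    if in_urban_or_industrial_area:
        return "carbonation"
    return "other"


# Hypothetical bridges for illustration.
print(location_category(200.0, "west", in_urban_or_industrial_area=True))
print(location_category(800.0, "east", in_urban_or_industrial_area=False))
print(location_category(800.0, "south", in_urban_or_industrial_area=True))
print(location_category(5000.0, "east", in_urban_or_industrial_area=False))
```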

Although the repair and strengthening history in the National Bridge Standard Data has many missing values, the repair and strengthening history in the Facility Management System (FMS, 2021) has relatively few, so the latter was used to add variables related to repair and strengthening. Because the types of repair and strengthening carried out during the service period were not fully recorded for many bridges, the number of repair and strengthening actions and whether any repair or strengthening had been carried out within the last 2 years were identified and added as variables. The 2-year period was adopted as the criterion because the full safety inspection cycle is 2 years for grade B and C bridges, which require repair or strengthening because of existing defects even though the need is not urgent.

The variables used should be independent variables with almost no correlation among them; a model affected by multicollinearity, in which the variables are highly correlated, is undesirable because its performance may deteriorate (Dormann et al., 2013). Therefore, the Pearson correlation coefficient of Eq. (3) was calculated for the continuous variables, and highly correlated variables were removed.

(3)
$\rho_{X,\: Y}=\dfrac{Cov(X,\: Y)}{\sigma_{X}\sigma_{Y}}=\dfrac{E[(X-\mu_{X})(Y-\mu_{Y})]}{\sigma_{X}\sigma_{Y}}=\dfrac{\sum_{i=1}^{N}(X_{i}-\mu_{X})(Y_{i}-\mu_{Y})}{\sigma_{X}\sigma_{Y}N}$

Here, $\rho_{X,\: Y}$ is the Pearson correlation coefficient between the two variables, $X$ and $Y$ are the values of each variable, $Cov(X,\: Y)$ is the covariance between the variables, $\sigma_{X}$ and $\sigma_{Y}$ are the standard deviations of each variable, $\mu_{X}$ and $\mu_{Y}$ are the means of each variable, and $N$ is the number of observations. The Pearson correlation coefficients between the variables are visualized in Fig. 5. An absolute value of the Pearson correlation coefficient of 0.3 or more is commonly taken to indicate a clear correlation between variables (Ratner, 2009). The variables with an absolute Pearson correlation coefficient of 0.3 or more were therefore considered to be represented by bridge width, bridge length, and service period, and all of them other than these three were removed.
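The correlation screening of Eq. (3) can be sketched in a few lines of Python. The feature names and values below are made up for illustration (`span_count` in particular is a hypothetical variable, not one from the study); only the 0.3 threshold comes from the text.

```python
import math


def pearson(x, y):
    """Eq. (3): population covariance divided by the product of the
    population standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


def correlated_pairs(features, threshold=0.3):
    """List feature pairs whose |rho| meets or exceeds the threshold."""
    names = list(features)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            rho = pearson(features[first], features[second])
            if abs(rho) >= threshold:
                pairs.append((first, second, rho))
    return pairs


# Hypothetical continuous features for five bridges.
features = {
    "length_m": [10.0, 20.0, 30.0, 40.0, 50.0],
    "span_count": [1.0, 2.0, 3.0, 4.0, 5.0],
    "width_m": [12.0, 9.0, 11.0, 10.0, 12.0],
}

for first, second, rho in correlated_pairs(features):
    print(f"{first} vs {second}: rho = {rho:.3f}")
```

For each flagged pair, one variable would be kept as the representative and the other removed, which is the role played by bridge width, bridge length, and service period in the text.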

Through the above procedure, a total of nine variables were derived for model development: facility class, bridge length, bridge width, separation of up and down lines, superstructure type, service period, average daily traffic of each bridge, bridge location, and whether repair or strengthening was carried out within the last 2 years. Including the safety grade, which is the target to be predicted, a total of 10 types of data were therefore used to build the model.

Fig. 5. Pearson Correlation Coefficient of Features
../../Resources/KSCE/Ksce.2023.43.3.0397/fig5.png

2.2.2 Data Sampling

To develop a machine learning model, the data are divided into training data and test data; the algorithm learns from the training data to build the model, and the performance of the model is evaluated with the test data. As shown in Fig. 2, the training and test data are generally split at a ratio of about 7:3, and this ratio was also adopted in this study.
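A 7:3 split can be sketched with the standard library alone. The function name, the random seed, and the 1,000 placeholder records are assumptions made for illustration; the text does not state whether the study's split was stratified by grade, and this sketch uses a plain random shuffle.

```python
import random


def split_train_test(records, test_ratio=0.3, seed=0):
    """Shuffle a copy of the records and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder records standing in for bridge rows.
records = list(range(1000))
train, test = split_train_test(records)
print(len(train), len(test))
```

Only the training portion is later resampled to balance the grades; the test portion keeps its original distribution so that it reflects the real population.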

The distribution of the safety grades of the 8,850 bridges on general national highways is shown in Table 2. There are no grade E bridges on general national highways, and grade C and D bridges are far fewer than grade A and B bridges; the number of grade D bridges is especially small. Because these data are imbalanced, with an unequal distribution across groups, a model trained on them as they are would learn mainly grades A and B and hardly learn grades C and D, which could degrade prediction performance. Since grade C and D bridges require particular attention in maintenance, it is desirable to train the model so that it predicts them correctly. To mitigate the imbalance, grades C and D were therefore merged into one group, and sampling techniques were applied to equalize the distribution of the groups (A, B, C+D) and improve the prediction performance for each grade.

As shown in Fig. 6, sampling techniques include under-sampling, over-sampling, and combined sampling. Under-sampling randomly selects data from the majority group so that its size matches that of the minority group; it condenses the data into a meaningful subset, but important information may be lost. Over-sampling, conversely, amplifies the data of the minority group to match the majority group; it prevents the loss of information, but the increase in repeated or similar data may cause overfitting and degrade model performance. Combined sampling merges the two techniques to compensate for the drawbacks of each (Lee et al., 2019b). In this study, under-sampling, over-sampling, and combined sampling were all applied to the modeling and the results were compared. Of the several under- and over-sampling methods, random under-sampling, which randomly reduces the majority group to the size of the minority group, and random over-sampling, which randomly increases the minority group to the size of the majority group, were used. For combined sampling, SMOTETomek sampling was used, which combines the TomekLinks under-sampling method with the SMOTE over-sampling method. This technique treats pairs of closely located data points from the minority and majority groups as noise and removes them, and then synthesizes artificial data for the minority group to increase its size, thereby resolving the data imbalance.
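Random under-sampling and random over-sampling can be illustrated with the standard library as below. This is a minimal sketch under stated assumptions: the function names, the seed, and the toy class counts are hypothetical, and SMOTETomek is not reproduced here because it requires nearest-neighbor computations that are better left to a dedicated library.

```python
import random
from collections import Counter, defaultdict


def _group_by_label(X, y):
    """Collect the feature rows of each class label."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return groups


def random_under_sample(X, y, seed=0):
    """Randomly shrink every class to the size of the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_label(X, y)
    target = min(len(rows) for rows in groups.values())
    X_res, y_res = [], []
    for label, rows in groups.items():
        for row in rng.sample(rows, target):
            X_res.append(row)
            y_res.append(label)
    return X_res, y_res


def random_over_sample(X, y, seed=0):
    """Randomly duplicate rows until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_label(X, y)
    target = max(len(rows) for rows in groups.values())
    X_res, y_res = [], []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_res.append(row)
            y_res.append(label)
    return X_res, y_res


# Toy training labels with an imbalance loosely resembling Table 2.
y_train = ["A"] * 25 + ["B"] * 65 + ["C+D"] * 8
X_train = [[i] for i in range(len(y_train))]

_, y_under = random_under_sample(X_train, y_train)
_, y_over = random_over_sample(X_train, y_train)
print(Counter(y_under))
print(Counter(y_over))
```

Under-sampling leaves every group with as many rows as the smallest group, whereas over-sampling grows every group to the size of the largest one by repeating rows, which is why the text associates the former with information loss and the latter with overfitting.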

Fig. 6. Data Sampling Technique to Improve Imbalanced Data: (a) Random Under-sampling, (b) Random Over-sampling, (c) SMOTETomek Sampling
../../Resources/KSCE/Ksce.2023.43.3.0397/fig6.png
Table 2. Distribution of Safety Grade of Bridges Located in National Roads

| Safety grade | Number of bridges | Percentage of bridges (%) |
| --- | --- | --- |
| A | 2,252 | 25.4 |
| B | 5,775 | 65.3 |
| C | 696 | 7.9 |
| D | 8 | 0.1 |
| E | 0 | 0 |
| None (No inspection or examination) | 119 | 1.3 |
| Sum | 8,850 | 100 |

2.3 Models with Optimal Parameters

If the Decision Tree and Random Forest libraries are used as they are, the parameters are not set to values suited to the data, and model performance is reduced. A parameter here means a setting of the model that the user can adjust directly, and optimal parameters must be selected to improve prediction performance (Provost and Fawcett, 2013; Truică and Leordeanu, 2017). Strictly speaking, parameters whose values are set directly by the user are hyperparameters and should be distinguished from ordinary parameters, but they are referred to here as parameters, following common usage in previous studies.

In this study, the optimal parameters were selected using GridSearchCV of Scikit-learn. GridSearchCV takes the parameters used for classification in turn, generates every combination within the specified parameter ranges, and evaluates each one. Because it simultaneously performs cross-validation, which repeats training and validation to prevent overfitting and improve generalization, the parameter values with the best prediction performance within the specified ranges can be identified (Scikit-learn developers, 2007-2022). The parameters tuned for the Decision Tree and the Random Forest are listed in Table 3. Among them, max_depth, min_samples_split, and min_samples_leaf prevent the decision tree from growing too deep, which reduces the effect of overfitting and secures model performance, so they must be tuned. The optimal parameters of the Decision Tree and Random Forest algorithms derived in this study for each sampling technique are shown in Tables 4 and 5, respectively.
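To give a sense of what "every combination within the specified ranges" means, the sketch below enumerates the Decision Tree grid using the ranges listed in Table 4. It is not the study's code: the helper name `iter_combinations` is hypothetical, the labels `"auto"` and `"log"` are copied from the table as written and are not checked against any particular Scikit-learn version, and no model is trained here.

```python
from itertools import product

# Search ranges as listed in Table 4 for the Decision Tree.
param_grid = {
    "criterion": ["entropy", "gini"],
    "max_depth": list(range(1, 11)),
    "min_samples_split": list(range(1, 16)),
    "min_samples_leaf": list(range(1, 16)),
    "max_features": ["auto", "log"],
    "class_weight": ["balanced", None],
    "splitter": ["best", "random"],
}


def iter_combinations(grid):
    """Yield one dict per combination of parameter values (Cartesian product)."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


n_combinations = sum(1 for _ in iter_combinations(param_grid))
print("combinations to evaluate:", n_combinations)
```

With k-fold cross-validation each combination is fitted k times, so the number of model fits is k times the number of combinations; the number of folds used in the study is not stated in this section.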

Table 3. Hyper Parameters in Decision Tree and Random Forest

| Hyper parameter | Characteristics |
| --- | --- |
| criterion | The function to measure the quality of a split (entropy, gini) |
| max_depth | The maximum depth of the tree |
| min_samples_split | The minimum number of samples required to split an intermediate node |
| min_samples_leaf | The minimum number of samples required to be at a leaf node |
| max_features | The number of features to consider when looking for the best split (auto: , log: ) |
| class_weight | Whether to apply the weight of each class (apply: balanced, non-apply: None) |
| splitter | The strategy used to choose the split at each node (the best method of splitting node for all features: best, the best method of splitting node after randomly extracting the features: random) *Only used for Decision Tree |
| bootstrap | Whether bootstrap samples are used when building trees *Only used for Random Forest |

Table 4. Hyper Parameters in Decision Tree Model

| Hyper parameter | Random under-sampling | Random over-sampling | SMOTETomek sampling |
| --- | --- | --- | --- |
| criterion [entropy, gini] | gini | entropy | entropy |
| max_depth [1∼10] | 5 | None | None |
| min_samples_split [1∼15] | 12 | 2 | 2 |
| min_samples_leaf [1∼15] | 2 | 1 | 1 |
| max_features [auto, log] | auto | auto | auto |
| class_weight [balanced, None] | balanced | None | None |
| splitter [best, random] | best | best | best |

Table 5. Hyper Parameters in Random Forest Model

| Hyper parameter | Random under-sampling | Random over-sampling | SMOTETomek sampling |
| --- | --- | --- | --- |
| criterion [entropy, gini] | gini | gini | gini |
| max_depth [1∼10] | 9 | 9 | 9 |
| min_samples_split [1∼15] | 4 | 7 | 2 |
| min_samples_leaf [1∼15] | 1 | 1 | 1 |
| max_features [auto, log] | auto | auto | auto |
| class_weight [balanced, None] | balanced | balanced | balanced |
| bootstrap [True, False] | False | False | False |

3. Performance Evaluation of the Bridge Safety Grade Prediction Models

The safety grade categories to be predicted in this study are divided into three groups, grade A, grade B, and grades C and D, so the model is a multi-class classification model. Unlike the training data, the test data are not resampled, so their distribution across categories is imbalanced. Ordinary accuracy is computed as an overall average without regard to the distribution of each category, so it is not an appropriate metric of model performance when minority and majority categories are mixed, and other metrics should be used instead (He and Garcia, 2009). To select a model that predicts well not only the majority categories, grades A and B, but also the minority category, grades C and D, this study evaluated prediction performance using several metrics: the confusion matrix, balanced accuracy, recall, the ROC (Receiver Operating Characteristic) curve, and the AUC (Area Under the Curve).

3.1 Confusion Matrix

The confusion matrix shows whether the actual and predicted values agree and is the basic metric for evaluating the prediction performance of the bridge grade classification model. In Fig. 7, a case is a TP (True Positive) if both the actual and predicted values belong to the category of interest, and a TN (True Negative) if neither belongs to it. A case is an FN (False Negative) if the actual value belongs to the category of interest but the predicted value belongs to another category, and an FP (False Positive) if the actual value does not belong to the category of interest but the predicted value does. In other words, TP and TN are correct predictions that agree with the actual values, whereas FP and FN are incorrect predictions. In particular, an FN in which a bridge that is actually grade C or D is predicted upward as grade A or B fails to identify a bridge whose maintenance is relatively urgent and may lead to serious structural safety problems, so the number of such cases should be minimized.
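For a three-class problem, the TP, FN, FP, and TN counts are obtained one category at a time by treating that category as "positive" and the other two as "negative". The sketch below illustrates this with made-up labels; the function name and the example data are assumptions for illustration only.

```python
def one_vs_rest_counts(actual, predicted, positive):
    """Count TP, FN, FP, and TN for one category against all others."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1  # actual category missed, e.g. C+D predicted as A or B
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Hypothetical actual and predicted grades for eight bridges.
actual = ["A", "A", "B", "B", "B", "C+D", "C+D", "B"]
predicted = ["A", "B", "B", "B", "C+D", "C+D", "B", "B"]

cd_counts = one_vs_rest_counts(actual, predicted, positive="C+D")
print(cd_counts)
```

In this toy case one of the two C+D bridges is predicted as grade B, which is exactly the kind of FN that the text singles out as the most harmful error.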

The confusion matrices of the Decision Tree and the Random Forest in Fig. 8 show that the numbers of correct predictions, TP and TN, were larger for the Random Forest than for the Decision Tree for all grades. Among the sampling techniques, the models with random under-sampling had the fewest FNs, that is, cases in which a bridge that is actually grade C or D is predicted upward as grade A or B, for both the Decision Tree and the Random Forest. Because the prediction performance for grade C and D bridges is the most important from a maintenance standpoint, random under-sampling is judged to be effective for both algorithms. In the prediction of grade C and D bridges, random over-sampling and SMOTETomek sampling were unsuitable for the Decision Tree because FN exceeded TP, whereas for the Random Forest TP exceeded FN, although the level achieved with random under-sampling was not reached. The predictive power for grade A was generally better with the Random Forest regardless of the sampling technique, while that for grade B varied with the classification model and sampling technique, so no uniform conclusion could be drawn.

The confusion matrix thus makes it possible to identify which of the two algorithms performs better, but it is difficult to compare the performance of the models for each sampling technique against a quantitative, numerical criterion. To judge the prediction performance more clearly and quantitatively, additional metrics that can be expressed as numbers or graphs are analyzed in Section 3.2.

Fig. 7. Confusion Matrix
../../Resources/KSCE/Ksce.2023.43.3.0397/fig7.png
Fig. 8. Results of Confusion Matrix: (a) Decision Tree, (b) Random Forest
../../Resources/KSCE/Ksce.2023.43.3.0397/fig8.png

3.2 Accuracy, Recall of Grades C and D, ROC Curve, and AUC

Ordinary accuracy is the proportion of correct predictions among all predictions, given by Eq. (4); if the distribution of the test data across categories is not uniform, it may overestimate or underestimate predictive power. Accuracy is therefore not an appropriate metric when the test data are imbalanced, as in this study. For imbalanced test data, it is preferable to use the balanced accuracy, which is calculated as the arithmetic mean of recall and specificity as in Eq. (5). This reduces the influence of differences in category size and thus keeps the prediction performance for small categories from being distorted.

(4)
${Accuracy}=\dfrac{TP+TN}{TP+TN+FP+FN}$
(5)
${Balanced}\;\;{accuracy}=\dfrac{1}{2}\left(\dfrac{TP}{TP+FN}+\dfrac{TN}{TN+FP}\right)$
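The difference between Eqs. (4) and (5) can be illustrated with a minimal sketch. The toy labels below are hypothetical and are not taken from the bridge data; the helper names (`binary_counts`, `accuracy`, `balanced_accuracy`) are introduced only for illustration, and the counts are formed one-vs-rest for a single class of interest.

```python
def binary_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN, treating `positive` as the class of interest (one-vs-rest)."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p != positive:
            tn += 1
        elif t != positive and p == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Eq. (4): share of correct predictions among all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(tp, tn, fp, fn):
    """Eq. (5): arithmetic mean of recall and specificity."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (recall + specificity)


# Hypothetical imbalanced test set: 90 bridges outside grades C/D, 10 inside.
y_true = ["AB"] * 90 + ["CD"] * 10
# A degenerate classifier that never predicts the C/D class.
y_pred = ["AB"] * 100

counts = binary_counts(y_true, y_pred, positive="CD")
acc = accuracy(*counts)
bal = balanced_accuracy(*counts)
print(counts, acc, bal)
```

In this hypothetical case the ordinary accuracy is high even though no C/D bridge is detected, whereas the balanced accuracy falls to the chance level, which is the distortion described above.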

Precision, given by Eq. (6), is the proportion of samples predicted as a specific class that actually belong to that class, and recall, given by Eq. (7), is the proportion of samples actually belonging to a specific class that are predicted correctly (He and Garcia, 2009). The F1-score, given by Eq. (8), is the harmonic mean of precision and recall and is one of the representative evaluation indices for multi-class classification models (Grandini et al., 2020). Recall and precision usually trade off against each other: unless predictive performance is excellent for all classes, precision tends to rise when recall falls, and recall tends to rise when precision falls. Owing to the nature of the harmonic mean, the F1-score takes a low value when either recall or precision deteriorates. The evaluation index to be emphasized among precision, recall, and F1-score should therefore be selected according to the purpose of the model. In safety grade prediction, C- and D-grade bridges are the focus of maintenance, so predicting them correctly is of the utmost importance. Accordingly, among precision, recall, and F1-score, the recall of grades C and D was primarily considered in evaluating predictive performance.

(6)
${Precision} =\dfrac{TP}{TP+FP}$
(7)
${Recall}=\dfrac{TP}{TP+FN}$
(8)
${F}1\text{-}{score}= 2\left(\dfrac{{Precision}\cdot{Recall}}{{Precision}+{Recall}}\right)$
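As a small worked example of Eqs. (6) to (8), the sketch below evaluates the three indices for one class from hypothetical confusion-matrix counts (the numbers are illustrative only and do not come from Fig. 8). It also shows that the harmonic mean lies below the arithmetic mean when precision and recall differ.

```python
def precision(tp, fp):
    """Eq. (6): correct positive predictions among all positive predictions."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Eq. (7): correct positive predictions among all actual positives."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Eq. (8): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts for the C/D class: many false alarms, few misses.
tp, fp, fn = 80, 120, 20

p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
arithmetic_mean = (p + r) / 2
print(p, r, f1, arithmetic_mean)
```

With these counts recall is high while precision is low, so the F1-score is pulled toward the lower of the two values. A model tuned to miss few C- and D-grade bridges can therefore show a modest F1-score while still meeting the maintenance objective stated above.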

The ROC curve plots the TPR (True Positive Rate), which corresponds to the benefit of a classification, against the FPR (False Positive Rate), which corresponds to its cost, so that the predictive performance of a model can be judged visually. The FPR is the proportion of samples that do not actually belong to a specific class but are predicted to belong to it, and the TPR is the proportion of samples that actually belong to a specific class and are predicted correctly. The closer the ROC curve is to the point (0, 1), the larger the benefit obtained at a small cost, which indicates that the classification is close to perfect (He and Garcia, 2009). The AUC is the area under the ROC curve and is useful for comparing ROC curves numerically. An AUC closer to 1 indicates better performance: a classifier is conventionally regarded as excellent if the AUC is 0.8 or higher and as acceptable if it is 0.7 or higher, while an AUC of 0.5 or lower means that the classification is no better than chance (Hosmer and Lemeshow, 2000). The ROC curve and AUC are representative evaluation indices for classification models. Because they are computed from the cost and benefit of each individual class, they are less affected by the other classes and were judged suitable for analyzing imbalanced data, so they were adopted as evaluation indices in this study.
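The construction of an ROC curve and its AUC can be sketched as follows for a single class treated one-vs-rest. The labels and scores are hypothetical, and the functions `roc_points` and `auc` are illustrative helpers rather than the implementation used in this study; the area is approximated with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold.

    y_true holds 1 for the class of interest and 0 otherwise;
    scores holds the predicted probability of the class of interest.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical example: three positives and three negatives.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
area = auc(points)
print(points)
print(area)
```

For a multi-class model, one such curve can be drawn per grade and the AUC values averaged, which is consistent with the per-grade curves and averaged AUC reported in this section.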

The balanced accuracy, recall of grades C and D, and AUC of the decision tree and random forest models are given in Tables 6 and 7, and the ROC curves are shown in Fig. 9. Here, the balanced accuracy and AUC are values averaged over all grades. For the decision tree, random under-sampling outperformed random over-sampling and SMOTETomek sampling in every respect. In particular, random under-sampling gave an acceptable AUC above 0.7 and, above all, a recall of grades C and D of 78.7%, far higher than that of the other two sampling techniques (31.3% and 41.2%). For the random forest, the balanced accuracy was about 64 to 67% for all sampling techniques, better than that of the decision tree, and the AUC exceeded 0.8, so the model can be regarded as an excellent classifier. The recall of grades C and D, however, was highest with random under-sampling at 83.4%, compared with 63.5% and 71.1% for the other two sampling techniques. Compared with the 67.3% recall of grades C and D reported in a previous study that used a binary classification model for bridge safety grades (Chung et al., 2016), this is an improvement of 16.1%p, which demonstrates the merit of the classification technique applied in this study.

In the ROC curves of the decision tree with random under-sampling, the predictive performance for grade A and for grades C and D was relatively better than that for grade B. With random over-sampling and SMOTETomek sampling, however, the curves lay comparatively far from (0, 1) regardless of grade, indicating low performance; this tendency is also reflected in the relatively small AUC values in Table 6. In contrast, the ROC curves of the random forest were better than those of the decision tree for all sampling techniques, which is confirmed by the AUC values in Table 7. The tendency for the predictive performance for grade B to fall below that for grade A and for grades C and D was nevertheless similar to that observed for the decision tree with random under-sampling.

Overall, comparing the decision tree and the random forest on the basis of these results, the random forest was superior in balanced accuracy, recall of grades C and D, AUC, and ROC curves. Among the sampling techniques applied to the random forest, random under-sampling stood out for its markedly higher recall of grades C and D. In summary, of the two classification models applied to bridge safety grade prediction, the random forest model with random under-sampling can be recommended because of its strong ability to identify C- and D-grade bridges.
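The idea behind random under-sampling, the technique recommended above, can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the procedure used in this study: the class counts are hypothetical, `random_under_sample` is an illustrative helper, and every class is reduced to the size of the smallest class. In practice a dedicated library class such as `imblearn.under_sampling.RandomUnderSampler` would typically be applied to the training data only.

```python
import random
from collections import Counter, defaultdict


def random_under_sample(X, y, seed=0):
    """Randomly discard samples so that every class has as many samples as the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    n_min = min(len(indices) for indices in by_class.values())
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, n_min))  # sampling without replacement
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical imbalanced training labels; X simply stores the row index.
y = ["A"] * 50 + ["B"] * 40 + ["CD"] * 10
X = list(range(len(y)))

X_res, y_res = random_under_sample(X, y)
class_counts = Counter(y_res)
print(class_counts)
```

The test data are left untouched in this sketch, so evaluation indices such as balanced accuracy are still computed on the original, imbalanced class distribution.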

Fig. 9. Comparison of ROC Curves: (a) Decision Tree, (b) Random Forest
Table 6. Evaluation of Predictive Performance in Decision Tree

| Evaluation index | Random under-sampling | Random over-sampling | SMOTETomek sampling |
|---|---|---|---|
| Balanced accuracy (%) | 61.1 | 56.4 | 56.8 |
| Recall of C, D grade (%) | 78.7 | 31.3 | 41.2 |
| AUC | 0.763 | 0.675 | 0.676 |

Table 7. Evaluation of Predictive Performance in Random Forest

| Evaluation index | Random under-sampling | Random over-sampling | SMOTETomek sampling |
|---|---|---|---|
| Balanced accuracy (%) | 67.0 | 64.7 | 67.0 |
| Recall of C, D grade (%) | 83.4 | 63.5 | 71.1 |
| AUC | 0.823 | 0.834 | 0.834 |

3.3 Practical Application

As shown above, for the 8,850 bridges on national highways, applying random under-sampling to the random forest model with the parameters in Tables 4 and 5 proved effective. The optimal classification model and sampling technique may vary with the distribution and nature of the data. However, if bridges on expressways and local roads are assumed to have a data distribution and characteristics similar to those of bridges on national highways, the same technique can be applied to analyze them.

When the proposed technique is used to predict the condition of bridges that have not been inspected or whose inspection is overdue, the expected safety grade is obtained by entering the bridge data together with the year for which the safety grade is to be estimated. In particular, because the proposed technique predicts heavily deteriorated C- and D-grade bridges well, it is expected to be useful for estimating the appropriate timing of repair and strengthening and for calculating maintenance budgets.

4. Factors Influencing the Safety Grade of Bridges

Because the classification models were built with decision-tree-based algorithms, the variable importance, i.e., the degree to which each variable affects classification performance, can be obtained from the decrease in impurity computed while the model is being built. Permutation feature importance, which is also applicable to other algorithms, can be calculated as well. Permutation feature importance is obtained from a trained model by randomly shuffling the values of one variable at a time, which removes the information carried by that variable, and identifying the variables whose shuffling degrades predictive performance the most (Scikit-learn developers, 2007-2022).
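The permutation procedure can be sketched as follows. This is a simplified illustration rather than the implementation used in this study (scikit-learn provides `sklearn.inspection.permutation_importance` for fitted estimators): the "model" is a hypothetical hand-written rule that depends only on the first feature, the feature values are invented, and accuracy is used as the score for brevity.

```python
import random


def accuracy(model, X, y):
    """Share of rows for which the model output equals the true label."""
    return sum(model(row) == t for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical stand-in for a trained model: uses only feature 0 (service period, years).
def model(row):
    return "CD" if row[0] >= 30 else "AB"


# Hypothetical rows: [service period (years), bridge width (m)]
X = [[5, 12.0], [10, 30.0], [35, 18.0], [40, 25.0], [20, 9.0], [45, 14.0]]
y = [model(row) for row in X]

imp = permutation_importance(model, X, y)
print(imp)
```

Shuffling the second column cannot change the output of this rule, so its importance is exactly zero, while shuffling the first column lowers the accuracy. Ranking variables by this drop gives an ordering analogous to the permutation column of Table 8.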

In this study, variable importance and permutation feature importance were used to derive the main factors affecting the model, and these were compared with actual trends to analyze the relationship between the main variables and the safety grade. For the random forest model with random under-sampling, which was found to be optimal in this study, the variable importance and permutation feature importance are listed in Table 8. The important variables common to both measures are service period, bridge length, average daily traffic, and facility class.

The following paragraphs examine, in light of actual conditions, why service period and average daily traffic, two representative variables among those in Table 8, are closely related to the safety grade of bridges.

Service period is a time-related factor directly linked to the aging of a bridge and strongly affects its safety grade. As more time passes after completion, defects develop and durability and safety decline through continued use, so the bridge deteriorates. In Fig. 10, the proportion of C- and D-grade bridges was highest, at 25.6%, among bridges completed before 1991 and lowest, at 1.2%, among bridges completed within the last 10 years. In addition, the more recently a bridge was completed and the shorter its service period, the less it had deteriorated and the fewer defects it had, so the proportion of A-grade bridges increased. Bridges of grade A and of grades C and D thus showed a clear trend with service period. For B-grade bridges, in contrast, the relationship with service period was not clear, which is presumed to be one of the reasons for the reduced predictive power for grade B seen in Fig. 9.

Traffic volume affects safety and durability through fatigue of the bridge caused by vehicle loads. Notably, as shown in Fig. 11, when the average daily traffic was 1,000 vehicles or fewer or 10,000 vehicles or more, the proportion of C- and D-grade bridges tended to be somewhat larger and the proportion of A-grade bridges somewhat smaller. In particular, when traffic is heavy at 10,000 vehicles or more per day, fatigue accumulates under repeated vehicle loads, so serviceability is likely to decline through cracking and deflection and defects such as damage to the road surface are also likely to occur, which appears to explain this result. In short, the safety grade tended to be lower when traffic volume was either very high or very low.

Fig. 10. Safety Grade According to Completion Year of Bridges
Fig. 11. Safety Grade According to Average Daily Traffic
Table 8. Variable Importance and Permutation Feature Importance of the Random Forest Using Random Under-sampling

| Rank | Variable importance | Permutation feature importance |
|---|---|---|
| 1 | Service period | Service period |
| 2 | Bridge length | Bridge length |
| 3 | Average daily traffic | Average daily traffic |
| 4 | Facility class | Facility class |
| 5 | Bridge width | Separation of northbound and southbound lanes |

5. Conclusions

In this study, multi-class classification models were developed to predict the safety grade of bridges using two machine learning algorithms, the decision tree and the random forest. The variables best suited to model construction were derived from the collected bridge data through variable addition, removal, and reduction and a multicollinearity check. In evaluating the predictive performance of the developed models, evaluation indices suited to class-imbalanced data were used instead of ordinary indices to judge whether the models had acceptable predictive performance. In addition, to mitigate the imbalance of the training data, random under-sampling, random over-sampling, and SMOTETomek sampling were each applied and the results were compared. The main conclusions of this study are as follows.

(1) Because the bridge safety grade classification models were developed with decision-tree-based algorithms, the variables that strongly affect the results could be identified during model construction. Service period, bridge length, average daily traffic, and facility class were identified as the main factors common to the decision tree and the random forest. In particular, service period, which is directly related to bridge aging, had the greatest influence on the model. The main variables identified through variable importance were also expected to have a large effect on the safety grade when examined from the standpoint of the expected behavior of actual bridges.

(2) Because evaluating a multi-class classification model with ordinary accuracy can produce distorted results, the models were evaluated with a range of confusion-matrix-based indices suited to such models, namely balanced accuracy, recall, the ROC curve, and AUC. As a result, the multi-class classification model based on the random forest outperformed the one based on the decision tree across the evaluation indices overall. The random forest was therefore judged to be the more suitable algorithm for predicting the safety grade of bridges. With respect to sampling, random under-sampling generally showed good predictive power in both classification models and, in particular, a markedly higher recall for grades C and D, which are relatively heavily deteriorated and thus important for maintenance. In conclusion, a random forest model with random under-sampling is considered preferable for predicting the safety grade of bridges.

(3) The recall of grades C and D, i.e., the probability that a C- or D-grade bridge is classified as grade C or D, was 83.4% for the random forest model with random under-sampling, an improvement of 16.1%p over the 67.3% obtained with an existing binary classification model. This shows that, compared with a binary classification model that can distinguish only two classes, the multi-class classification model applied in this study not only has the advantage of considering a wider range of bridge safety grades but also predicts bridges of the critical grades well.

(4) Applying the proposed technique to the data of bridges that have not been inspected or whose inspection is overdue makes it possible to predict their safety grade at present or at a specified time. In particular, because the proposed technique predicts C- and D-grade bridges well, it is expected to be useful for estimating the appropriate timing of repair and strengthening and for calculating maintenance budgets. Although this study focused on bridges on national highways, the approach is expected to be extendable to the safety grade analysis of bridges on expressways and local roads.

References

1. Bektas, B. A., Carriquiry, A. and Smadi, O. (2013). “Using classification trees for predicting national bridge inventory condition ratings.” Journal of Infrastructure Systems, Vol. 19, No. 4, pp. 425-433, https://doi.org/10.1061/(ASCE)IS.1943-555X.0000143.
2. Chung, S. H., Lim, S. R. and Chi, S. H. (2016). “Developing an estimation model for safety rating of road bridges using rule-based classification method.” Journal of KIBIM, Vol. 6, No. 2, pp. 29-38, https://doi.org/10.13161/kibim.2016.6.2.029 (in Korean).
3. Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carré, G., Marquéz, J. R. G., Gruber, B., Lafourcade, B., Leitão, P. J., Münkemüller, T., McClean, C., Osborne, P. E., Reineking, B., Schröder, B., Skidmore, A. K., Zurell, D. and Lautenbach, S. (2013). “Collinearity: A review of methods to deal with it and a simulation study evaluating their performance.” Ecography, Vol. 36, No. 1, pp. 27-46, https://doi.org/10.1111/j.1600-0587.2012.07348.x.
4. Facility Management System(FMS) (2021). https://www.fms.or.kr/com/mainForm.do (Accessed: September 12, 2021).
5. Géron, A. (2019). Hands-on machine learning with Scikit-learn, Keras, and TensorFlow, 2nd Ed., O'Reilly Media, Inc.
6. Grandini, M., Bagli, E. and Visani, G. (2020). Metrics for multi-class classification: An overview, arXiv:2008.05756, Available at: https://arxiv.org/abs/2008.05756 (Accessed: July 25, 2022).
7. He, H. and Garcia, E. A. (2009). “Learning from imbalanced data.” IEEE Transactions on Knowledge and Data Engineering, Vol. 21, No. 9, pp. 1263-1284, https://doi.org/10.1109/TKDE.2008.239.
8. Hosmer, D. W. and Lemeshow, S. (2000). Applied logistic regression, 2nd Ed., John Wiley & Sons, Inc.
9. Hur, Y. K., Lee, H. I., Shin, J. Y. and Park, C. H. (2010). “A research for the determinant factors of safety ratings in road-bridge.” Journal of the Korea Institute for Structural Maintenance and Inspection, Vol. 14, No. 6, pp. 229-237, https://doi.org/10.11112/jksmi.2010.14.6.229 (in Korean).
10. Kang, B. H. (2016). “Suggestions for improving the competitiveness of safety inspection industry.” KSCE Magazine, Vol. 64, No. 10, pp. 12-15 (in Korean).
11. Kang, S. H., Choi, S. I., Kim, H. R. and Lee, J. S. (2016). “A study on performance evaluation of infrastructure safety and maintenance.” Korean Journal of Construction Engineering and Management, Vol. 17, No. 2, pp. 80-89, https://doi.org/10.6106/KJCEM.2016.17.2.080 (in Korean).
12. Kazemitabar, S. J., Amini, A. A., Bloniarz, A. and Talwalkar, A. (2017). “Variable importance using decision trees.” Proceedings of the 31st International Conference on Neural Information Processing Systems(NIPS 2017), pp. 425-434.
13. Kim, S. J. and Yoon, M. O. (2018). “A study on the improvement program of bridge safety management through public-private governance.” Journal of the Korean Society of Hazard Mitigation, Vol. 18, No. 1, pp. 145-156, http://doi.org/10.9798/KOSHAM.2018.18.1.145 (in Korean).
14. Korea Concrete Institute(KCI) (2009). Standard specification for concrete, KCI (in Korean).
15. Lee, H. H., Kyung, K. S. and Jeon, J. C. (2010). “Fatigue life estimation method considering traffic properties for steel highway girder bridge.” Journal of Korean Society of Steel Construction, Vol. 22, No. 3, pp. 209-218 (in Korean).
16. Lee, H. H., Shin, B. G., Lee, Y. I. and Kim, Y. M. (2019a). “Suggestion of priority decision method for performance evaluation based on risk index for small and medium sized bridges.” Journal of the Korea Institute for Structural Maintenance and Inspection, Vol. 23, No. 6, pp. 70-76, https://doi.org/10.11112/jksmi.2019.23.6.70 (in Korean).
17. Lee, I. K. and Kim, D. H. (2015). “Highway bridge inspection period based on risk assessment.” Journal of the Korea Institute for Structural Maintenance and Inspection, Vol. 19, No. 3, pp. 64-72, https://doi.org/10.11112/jksmi.2015.19.3.064 (in Korean).
18. Lee, J. H., Lee, K. Y., Ahn, S. M. and Kong, J. S. (2018). “Proposal of maintenance scenario of feasibility analysis of bridge inspection using Bayesian approach.” Journal of the Korean Society of Civil Engineers, Vol. 38, No. 4, pp. 505-516, http://doi.org/10.12652/Ksce.2018.38.4.0505 (in Korean).
19. Lee, K. N., Lim, J. T., Bok, K. S. and Yoo, J. S. (2019b). “Handling method of imbalance data for machine learning: Focused on sampling.” The Journal of the Korea Contents Association, Vol. 19, No. 11, pp. 567-577, https://doi.org/10.5392/JKCA.2019.19.11.567 (in Korean).
20. Martinez, P., Mohamed, E., Mohsen, O. and Mohamed, Y. (2020). “Comparative study of data mining models for prediction of bridge future conditions.” Journal of Performance of Constructed Facilities, Vol. 34, No. 1, 04019108, https://doi.org/10.1061/(ASCE)CF.1943-5509.0001395.
21. Ministry of Land, Infrastructure and Transport(MOLIT) (2021a). Guidelines for safety and maintenance of facilities, MOLIT (in Korean).
22. Ministry of Land, Infrastructure and Transport(MOLIT) (2021b). National bridge standard data, MOLIT, Available at: https://www.data.go.kr/data/15081953/fileData.do (Accessed: September 12, 2021) (in Korean).
23. Ministry of Land, Infrastructure and Transport(MOLIT) (2021c). Special act on the safety control and maintenance of establishments, MOLIT (in Korean).
24. Ministry of Land, Infrastructure and Transport(MOLIT) (2021d). Yearbook of road bridge and tunnel statistics, MOLIT, Available at: https://bti.kict.re.kr/bti/publicMain/main.do (Accessed: March 8, 2022) (in Korean).
25. Nguyen, T. T. and Dinh, K. (2019). “Prediction of bridge deck condition rating based on artificial neural networks.” Journal of Science and Technology in Civil Engineering, Vol. 13, No. 3, pp. 15-25, https://doi.org/10.31814/stce.nuce2019-13(3)-02.
26. Oh, S. T., Lee, D. J. and Lee, J. H. (2010). “A condition rating method of bridges using an artificial neural network model.” Journal of the Korean Society for Railway, Vol. 13, No. 1, pp. 71-77 (in Korean).
27. Provost, F. and Fawcett, T. (2013). Data science for business: What you need to know about data mining and data-analytic thinking, O'Reilly Media, Inc.
28. Ratner, B. (2009). “The correlation coefficient: Its values range between +1/-1, or do they?” Journal of Targeting, Measurement and Analysis for Marketing, Vol. 17, pp. 139-142, https://doi.org/10.1057/jt.2009.5.
29. Scikit-learn developers (2007-2022). https://scikit-learn.org/.
30. Truică, C. O. and Leordeanu, C. A. (2017). “Classification of an imbalanced data set using decision tree algorithms.” U.P.B. Scientific Bulletin, Series C: Electrical Engineering and Computer Science, Vol. 79, Iss. 4, pp. 69-84.
31. Yonhapnews (2023). Most of the bridges in the first new town have passed more than 30 years...Citizens are worried about deterioration. https://www.yna.co.kr/view/AKR20230406092900061 (Accessed: April 12, 2023) (in Korean).