Journal of the Korean Society of Civil Engineers

ISO Journal Title: KSCE J. Civ. Environ. Eng. Res.

Open Access, Bi-monthly

[Surveying and Geo-Spatial Information Engineering]

KSCE Journal of Civil and Environmental Engineering Research

Korean Society of Civil Engineers, Vol. 43, No. 6, pp. 883-896

ISSN (print): 1015-6348

ISSN (online): 2287-934X

Received: 27 July 2023; Revised: 19 August 2023; Accepted: 19 August 2023

DOI: https://doi.org/10.12652/Ksce.2023.43.6.0883

Detection of Forest Fire Damage Areas over Complex Land Cover Using a k-Nearest Neighbors Classifier

Mapping Burned Forests Using a k-Nearest Neighbors Classifier in Complex Land Cover

Hanna Lee (이한나) ¹, Konghyun Yun (윤공현) ², Gihong Kim (김기홍) ³†

¹ Life Member · Researcher, Disaster Prevention Research Institute, Gangneung-Wonju National University (leehn77@hanmail.net)
² Life Member · Researcher, Engineering Research Institute, Yonsei University (ykh1207@yonsei.ac.kr)
³ Life Member · Corresponding Author · Professor, Department of Civil and Environmental Engineering, Gangneung-Wonju National University (ghkim@gwnu.ac.kr)

† Corresponding Author

License: This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

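Abstract (translated from Korean)

In Korea, where areas of human activity are spread throughout the mountains, forest fires frequently threaten residential areas and various facilities. The extent of damage therefore needs to be identified quickly after a fire to plan countermeasures and recovery, and remote sensing can be a useful tool in such cases. In this study, we conducted experiments to detect the extent of damage by applying the k-nearest neighbor (kNN) algorithm to the area affected by the Goseong-Sokcho forest fire of April 2019. Considering that the area mixes forest with land surfaces containing various artificial features, we used Sentinel-2 multispectral instrument (MSI) data, which provide adequate spatial and temporal resolution. Six Sentinel-2 MSI bands, the normalized difference vegetation index (NDVI), and the normalized burn ratio (NBR) were used as classification features. The kNN classifier was trained using 2,000 points randomly sampled from the fire-damaged and undamaged areas. To improve classification performance, outliers were removed from the data and a forest type map was used in combination. Applying various numbers of neighbors and feature combinations, we ran experiments using post-fire data and experiments using the difference between pre- and post-fire data. Using the pre- and post-fire difference gave better classification results, but the extent of the damaged area could also be identified using post-fire data alone.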

ABSTRACT

As human activities in Korea are spread throughout the mountains, forest fires often affect residential areas, infrastructure, and other facilities. Hence, it is necessary to detect fire-damaged areas quickly to enable support and recovery. Remote sensing is the most efficient tool for this purpose. Fire damage detection experiments were conducted on the east coast of Korea. Because this area comprises a mixture of forest and artificial land cover, low-resolution data are not suitable. In this study, we used Sentinel-2 multispectral instrument (MSI) data, which provide adequate temporal and spatial resolution, together with the k-nearest neighbor (kNN) algorithm. Six Sentinel-2 MSI bands and two indices, the normalized difference vegetation index (NDVI) and the normalized burn ratio (NBR), were used as features for kNN classification. The kNN classifier was trained using 2,000 randomly selected samples from the fire-damaged and undamaged areas. Outliers were removed and a forest type map was used to improve classification performance. Experiments with various numbers of neighbors and feature combinations were conducted using bi-temporal and uni-temporal approaches. The bi-temporal classification performed better than the uni-temporal classification. However, the uni-temporal classification was able to detect severely damaged areas.

Key words

Forest fire, Damage detection, k-Nearest neighbor, Classification, Sentinel-2

1. Introduction
2. Materials and Methods
2.1 Study Area
2.2 Multispectral Satellite Data and Indices
2.3 Forest Type Map
2.4 Reference Map
2.5 Methods
2.5.1 Spectral Indices and Differenced Images
2.5.2 Preprocessing
2.5.3 Training Dataset
3. Results
3.1 Number of Neighbors for kNN Classifier
3.2 Feature Combination
3.3 Application Dates
4. Discussion
5. Conclusions

1. Introduction

Climate change is a global problem, and the world must be unified in its response. At the same time, a regional system designed to suit local characteristics is needed to respond to disasters caused by climate change. The frequency and intensity of forest fires are increasing owing to climate change (Dennison et al., 2014). This study details a technique that uses satellite data to quickly and easily map forest fire damage in an area where various land cover types are mixed with forests, that is, an area with “complex land cover.”

Spring in Korea, from March to May, is very dry. Every spring, strong winds blow in the east coast region, where a large mountain range is located, causing numerous forest fires. There were 2,211 forest fires in 2019 and 1,619 in 2020, of which more than 50% occurred in spring (National Fire Agency, 2021). Korea has few plains; two-thirds of its land is mountainous, and this is where most forests are located. For this reason, the term “mountain fire” is used synonymously with “forest fire” in Korea. Moreover, due to the high population density, human activities are conducted throughout the mountain regions. Forest fires are therefore dangerous in Korea, and any damage they cause needs to be investigated and remediated immediately. Because these regions contain various types of land cover in addition to forests, detecting fire-damaged areas is complicated.

In Korea, damage from forest fires was measured using field surveys until research on forest fire damage detection using remote sensing was introduced in the mid-2000s. At that time, Landsat Thematic Mapper (TM) data were generally used for research in Korea. Studies focused on the differences in spectral indices pre- and post-fire and the digital numbers of each band immediately after a forest fire (Choi et al., 2006; Won et al., 2007).

Open access and suitable spectral resolution are the major advantages of Sentinel-2 and Landsat-8; therefore, they have been commonly used in recent forest fire damage detection (Hawbaker et al., 2017; Lee et al., 2017; Roy et al., 2019; Bar et al., 2020; Knopp et al., 2020; Sim et al., 2020). PlanetScope, which has high temporal and spatial resolution, and Korea Multi-Purpose Satellite-3 (KOMPSAT-3) were also employed for forest fire damage detection in Korea (Chung and Kim, 2020; Won et al., 2019). With the development of data processing technology, many researchers have adopted machine learning or deep learning methods for forest fire damage detection (Mithal et al., 2018; Pinto et al., 2020). Techniques such as decision tree, random forest (RF), logistic regression (LR), gradient boosting, support vector machine (SVM), and convolutional neural networks have been tested and validated (Hawbaker et al., 2017; Weaver et al., 2018; Roy et al., 2019; Bar et al., 2020; Knopp et al., 2020; Sim et al., 2020). Most recent studies have focused on determining the burned region over a large area. Burned area mapping in large mountainous areas (Fornacca et al., 2018; Bar et al., 2020) or continental-scale areas (Hawbaker et al., 2017; Mithal et al., 2018; Roteta et al., 2019; Roy et al., 2019), or studies conducted on several fire spots across the globe (Knopp et al., 2020; Pinto et al., 2020), differ from mapping in an area composed of small and diverse patches of land cover, as in our case. As an exception, Sim et al. (2020) evaluated the performance of RF, LR, and SVM models for wildfires that occurred on the east coast of Korea and found that the RF model performed the best; however, its accuracy was poor in lightly damaged areas.

In this study, the k-nearest neighbors (kNN) algorithm was applied to Sentinel-2 multispectral instrument (MSI) data to detect fire-damaged areas on the east coast of Korea. Owing to the aforementioned characteristics of mountainous regions in Korea, high temporal resolution and adequate spatial resolution are required for satellite data. A low spatial resolution may result in poor accuracy and impractical classification results. The Sentinel-2 data satisfied the requirements in terms of both spatial and temporal resolution. These data have a temporal resolution of 2-3 days in the study area and a spatial resolution of 10 m (visible and near-infrared bands) or 20 m (short-wave infrared (SWIR) bands). We used six Sentinel-2 MSI bands (bands 2, 3, 4, 8, 11, and 12) and two spectral indices calculated from them, the normalized difference vegetation index (NDVI) and the normalized burn ratio (NBR). The kNN algorithm is a simple machine learning classification technique that is used in various fields (Nigsch et al., 2006; Kara et al., 2017). The main reason for the popularity of kNN is that it makes no assumptions about the distribution of the underlying data (Fix and Hodges, 1989). This is a considerable advantage in this study because we used spectral reflectance and indices with various distributions.

2. Materials and Methods

The kNN algorithm is a classification method that assigns unclassified samples to the same category as the nearest classified samples (Cover and Hart, 1967). Areas affected by forest fires are expected to show similar spectral characteristics; therefore, burned area mapping is a field well suited for the application of the kNN algorithm. Assume that a kNN classifier uses n features to classify the degree of damage to each pixel in an image. A feature is a distinct trait that is used to describe each pixel. First, the classifier learns the positions of the previously classified pixels (training dataset) in an n-dimensional space consisting of n axes corresponding to n features. When the coordinates of an unclassified pixel are determined in this space, the classifier finds the k preclassified pixels nearest to the unclassified pixel. The unclassified pixel is assigned to the category that contains the most of these k preclassified pixels. In this study, all training samples were weighted uniformly and the Euclidean distance metric was used.
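The voting rule described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function name `knn_classify`, the two-feature toy pixels, and the tie-breaking behavior (the label encountered first among the nearest neighbors wins) are assumptions made for the example.

```python
import math
from collections import Counter

def knn_classify(train_features, train_labels, query, k=9):
    """Assign `query` the majority label among its k nearest training samples."""
    # Euclidean distance from the query to every training sample.
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    ]
    # Keep the k closest samples; each one casts a single vote (uniform weights).
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training pixels described by two scaled features.
train_features = [(0.1, 0.1), (0.2, 0.1), (0.1, 0.2),
                  (0.8, 0.9), (0.9, 0.8), (0.9, 0.9)]
train_labels = ["DL0", "DL0", "DL0", "DL3", "DL3", "DL3"]

print(knn_classify(train_features, train_labels, (0.85, 0.85), k=3))
```

A brute-force search like this is adequate for a 2,000-sample training set; a tree-based neighbor index would only matter for much larger training sets.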

We conducted numerous classification experiments, as shown in Fig. 1. While comparing the performance of the bi- and uni-temporal approaches, we observed the performances according to the number of neighbors applied to the kNN classifier and the feature combination. In the case of the uni-temporal approach, we also tested how the classifier, which was trained on a dataset on a specific date, performed on other date images. Each experiment was performed using the process illustrated in Fig. 2.

Fig. 1. Flowchart of Experiments. The Date of Fire was April 4. Differenced Data (April 3-8) was used for Bi-temporal Experiments

Fig. 2. Experimental Procedure

2.1 Study Area

The study area is a 10.5 km × 11.7 km region (longitude 128.48 °E-128.60 °E, latitude 38.17 °N-38.27 °N) on the east coast of South Korea (Fig. 3(a)). The elevation ranges from 10 m to 670 m above sea level. According to the forest type map (Section 2.3), the forest in this area comprises 47% pine, 18% deciduous broad-leaved trees, and 14% coniferous mixed trees, in addition to other types. Artificial surfaces are also present among the forested areas. Fig. 3(b) shows the complex land cover of the study area. Approximately 49% of the area is forested, and cultivated land and rural villages account for a large proportion. Resorts, golf courses, and other facilities are mixed with the forests.

At approximately 7 pm on April 4, 2019, a fire broke out on a mountain in this area and spread rapidly due to strong winds, threatening the city. It was extinguished at approximately 8 am the next day. The damaged area was estimated to be approximately 13 km² (Won et al., 2019), and more than 900 people lost their homes.

Fig. 3. Study Area. On the Right is the Sentinel-2 True Color Image Sensed on April 15, 2019: (a) Location of Study Area, (b) Various Land Cover of the Study Area

2.2 Multispectral Satellite Data and Indices

Sentinel-2 MSI level-2A data covering the study area, collected on April 3, 5, 8, 15, and 20, 2019, were downloaded from the Copernicus Open Access Hub. These are all the April acquisitions available for the study area, excluding those with clouds. The main output of Sentinel-2 level-2A processing is an orthoimage of Bottom-Of-Atmosphere (BOA) corrected reflectance (European Space Agency, 2015).

Among the 13 spectral bands of the Sentinel-2 product, three visible bands (bands 2, 3, and 4), band 8 (near-infrared, NIR), band 11 (short-wave infrared 1, SWIR1), and band 12 (short-wave infrared 2, SWIR2) were used as features (Table 1) for the kNN classifier. The Sentinel-2 MSI level-2A product contains 10 m resolution data for bands 2, 3, 4, and 8, while the finest resolution of bands 11 and 12 is 20 m. Band 11 and 12 data were resampled to a 10 m resolution in this study.

Table 1. Features for kNN Classifier

Approach	Feature name	Data source	Spatial resolution
Uni-temporal	Red	Sentinel-2 MSI band 4	10 m
Uni-temporal	Green	Sentinel-2 MSI band 3	10 m
Uni-temporal	Blue	Sentinel-2 MSI band 2	10 m
Uni-temporal	NIR	Sentinel-2 MSI band 8	10 m
Uni-temporal	SWIR1	Sentinel-2 MSI band 11	Resampled to 10 m
Uni-temporal	SWIR2	Sentinel-2 MSI band 12	Resampled to 10 m
Uni-temporal	NDVI	Calculated from bands 4 and 8	10 m
Uni-temporal	NBR	Calculated from bands 12 and 8	10 m
Bi-temporal	dRed	$R_{\text{April 3}} - R_{\text{April 8}}$	10 m
Bi-temporal	dGreen	$G_{\text{April 3}} - G_{\text{April 8}}$	10 m
Bi-temporal	dBlue	$B_{\text{April 3}} - B_{\text{April 8}}$	10 m
Bi-temporal	dNIR	$NIR_{\text{April 3}} - NIR_{\text{April 8}}$	10 m
Bi-temporal	dSWIR1	$SWIR1_{\text{April 3}} - SWIR1_{\text{April 8}}$	10 m
Bi-temporal	dSWIR2	$SWIR2_{\text{April 3}} - SWIR2_{\text{April 8}}$	10 m
Bi-temporal	dNDVI	$NDVI_{\text{April 3}} - NDVI_{\text{April 8}}$	10 m
Bi-temporal	dNBR	$NBR_{\text{April 3}} - NBR_{\text{April 8}}$	10 m

2.3 Forest Type Map

During the spring farming season, land cover often changes significantly over a short period, especially in areas with rice paddies and dry fields. For the bi-temporal approach, arable land far from forest fires can frequently be classified as damaged. To prevent this error, a forest type map (Fig. 4) was used to exclude non-forest areas from the damaged area. The forest type map, which was produced by the National Institute of Forest Science, is updated approximately every ten years, but land cover in Korea changes on a much shorter time scale. Therefore, the forest type map alone is insufficient, and the classifier must learn various land cover types to accurately map burned areas.
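Excluding non-forest pixels from the damaged area can be pictured as a simple masking step. The sketch below is illustrative only: it assumes the forest type map has already been rasterized to a boolean grid aligned with the classified image, and the name `mask_non_forest` and the tiny 2 × 2 grids are hypothetical.

```python
def mask_non_forest(classified, is_forest, undamaged="DL0"):
    """Reset damage labels to `undamaged` wherever the forest mask is False."""
    return [
        [label if forest else undamaged
         for label, forest in zip(class_row, mask_row)]
        for class_row, mask_row in zip(classified, is_forest)
    ]

# Hypothetical 2 x 2 classification result and forest mask.
classified = [["DL0", "DL2"],
              ["DL1", "DL3"]]
is_forest = [[True, False],
             [True, True]]

masked = mask_non_forest(classified, is_forest)
print(masked)
```

Because the mask only removes labels, it cannot recover burned forest that the map wrongly records as non-forest, which is one reason the map alone is insufficient.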

Fig. 4. Forest Type Map (Colored by Tree Species)

2.4 Reference Map

The burn severity map, derived from KOMPSAT-3 2 m resolution multispectral bands (Won et al., 2019), was used as a reference (Fig. 5). Won et al. (2019), with the support of the National Institute of Forest Science and the Korea Forest Service, investigated the 2019 east coast forest fires to map the fire-damaged area and burn severity. This reference map classifies damage grades as extreme, high, or low. ‘Extreme’ means that the tree canopies are completely burned, and ‘high’ means that more than 60% of the canopy has withered due to heat. ‘Low’ refers to a surface fire in which the canopies mostly survived. In this study, for convenience, we renamed these damage levels as DL3, DL2, and DL1, respectively. The undamaged area corresponds to DL0.

Fig. 5. Reference Map (Won et al., 2019)

2.5 Methods

The experimental process is shown in Fig. 2 and details are described below.

2.5.1 Spectral Indices and Differenced Images

For April 3rd, 5th, 8th, 15th, and 20th, a dataset consisting of eight features (six band reflectances and two indices; Table 1) was prepared for uni-temporal classification. NDVI and NBR were calculated from the Sentinel-2 MSI band reflectance as follows:

$NDVI=\dfrac{\text{band 8}-\text{band 4}}{\text{band 8}+\text{band 4}},\quad NBR=\dfrac{\text{band 8}-\text{band 12}}{\text{band 8}+\text{band 12}}$  (1)

Next, six differenced reflectances and two differenced indices (Table 1) were prepared for bi-temporal classification.
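As a worked illustration of Eq. (1) and the pre-minus-post differencing in Table 1, the following sketch computes dNDVI and dNBR for a single pixel. The reflectance values are invented for the example, and returning 0.0 when the denominator is zero is an assumption; the paper does not say how such pixels were handled.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b); 0.0 if the denominator is zero (assumption)."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0

def ndvi(nir, red):
    # Band 8 (NIR) against band 4 (Red), as in Eq. (1).
    return normalized_difference(nir, red)

def nbr(nir, swir2):
    # Band 8 (NIR) against band 12 (SWIR2), as in Eq. (1).
    return normalized_difference(nir, swir2)

# Hypothetical reflectances of one pixel before and after the fire.
pre = {"red": 0.04, "nir": 0.30, "swir2": 0.10}
post = {"red": 0.08, "nir": 0.12, "swir2": 0.20}

# Differenced indices: pre-fire value minus post-fire value.
d_ndvi = ndvi(pre["nir"], pre["red"]) - ndvi(post["nir"], post["red"])
d_nbr = nbr(pre["nir"], pre["swir2"]) - nbr(post["nir"], post["swir2"])
print(round(d_ndvi, 4), round(d_nbr, 4))
```

Because both indices are ratios, the result is the same whether the inputs are reflectances or the integer values scaled by 10,000.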

2.5.2 Preprocessing

As discussed in the introduction, the data used for the kNN algorithm do not require any mathematical assumptions. However, because kNN depends on the distance between samples, it is affected by the feature scales. A feature with a wide range dominates the classification results.

Level-2A data are derived from the corresponding Level-1C products. Level-1C products provide TOA-normalized reflectances in which the physical values range from 10⁻⁴ to 1, but values higher than 1 can be observed in some cases due to specific angular reflectivity effects (Gascon et al., 2017). Each pixel value of Level-1C and Level-2A is stored as an integer equal to the reflectance multiplied by 10,000. In this study, this integer was regarded as the reflectance. As a result, the reflectance ranges from 1 to 10,000, but may have a value higher than 10,000 in some cases.

In our dataset, the reflectance of each band was mostly distributed below 5,000, but it reached approximately 20,000 in a few pixels, which are outliers. NDVI and NBR ranged from -1 to 1. In the case of differenced reflectance, the range extends to approximately 25,000. Fig. 6(a) shows the distributions of reflectance of the “Red” and “NIR” features. In particular, long tails on both sides make the distribution of the differenced reflectance very sharp. The wide range of reflectance is due to the diversity of land cover in the study area. Most outliers appear on artificial structures such as resorts, buildings, barns, and markets (Fig. 7). Outliers also appear mainly on artificial surfaces in single-date images.

Outliers should be removed to improve classification performance. Outliers do not have a mathematical definition and may be defined in different ways for various models (Klebanov, 2016). In this study, the upper 0.05% of the single-date reflectance distribution and 0.05% on both sides (0.1% in total) of the differenced reflectance distributions were defined as outliers and removed. No outliers were removed from the indices. These outlier boundaries were determined by analyzing the histograms and images (Figs. 7 and 8). The outlier-removed reflectance data and indices were normalized to the range of 0 to 1 using a minimum-maximum scaler. The histograms of the preprocessed features are shown in Fig. 6(b).
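The trimming and scaling steps can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names `trim_bounds` and `preprocess` are hypothetical, the cut points are taken by rank in the sorted values, and removed pixels are simply dropped from the list, whereas the paper does not specify how removed pixels were treated in the imagery.

```python
def trim_bounds(values, lower_frac=0.0, upper_frac=0.0005):
    """Return the (low, high) values that bound the retained part of the data."""
    ordered = sorted(values)
    n = len(ordered)
    n_low = int(n * lower_frac)    # number of samples cut from the lower tail
    n_high = int(n * upper_frac)   # number of samples cut from the upper tail
    return ordered[n_low], ordered[n - 1 - n_high]

def preprocess(values, lower_frac=0.0, upper_frac=0.0005):
    """Drop outliers beyond the trim bounds, then min-max scale to [0, 1]."""
    low, high = trim_bounds(values, lower_frac, upper_frac)
    kept = [v for v in values if low <= v <= high]
    span = max(kept) - min(kept)
    return [(v - min(kept)) / span if span else 0.0 for v in kept] if len(kept) < 2 else \
        [(v - low) / span if span else 0.0 for v in kept]

# Hypothetical single-date band: 10,000 ordinary pixels plus 5 bright outliers.
band = list(range(1, 10001)) + [20000] * 5
scaled = preprocess(band)                      # upper 0.05% removed
print(len(scaled), min(scaled), max(scaled))
```

For differenced reflectance, the same helper would be called with `lower_frac=0.0005, upper_frac=0.0005` to trim both tails. Without trimming, the five bright pixels would compress all ordinary values into the lower half of the scaled range.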

Fig. 6. Histograms of “Red” and “NIR” Features. The Left Two Show the Distributions of Single Date Data (April 8) and the Right Two Show the Distributions of Differenced Data (April 3-April 8): (a) Raw Data, (b) Preprocessed (Outliers Removed and Normalized) Data

Fig. 7. Outliers on a Differenced Image (April 3-April 8, Band 4). Red Pixels are the Outliers which are Defined as 0.1% at Both Ends of Tails of the Distribution (0.05% each Side)

Fig. 8. Outlier Boundaries on Histograms. A Histogram on a Logarithmic Scale (Lower) Highlights the Outliers that are Hardly Recognizable in the Linear Scale Histogram (Upper): (a) Band 4 (Red) Reflectance, (b) Differenced band 4 Reflectance

2.5.3 Training Dataset

A total of 2,000 sample points, 1,000 from fire-damaged areas, and 1,000 from undamaged areas, were randomly extracted to build a training dataset (Fig. 9). A uni-temporal training dataset was constructed by extracting the feature (Table 1) values corresponding to the coordinates of these sample points from the April 8 image. These feature values are stored as attribute information for the corresponding fields of the points. The bi-temporal training dataset was extracted from the differenced images, which were obtained by subtracting the image pixel values collected on April 8th from the image pixel values collected on April 3rd. In addition to the 16 feature fields, the attribute table also has two more fields for damage level and forest type map.
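A balanced random draw of this kind might look like the sketch below. It is only an illustration: the function name `sample_training_points`, the `(pixel_id, damage_level)` representation, and the fixed seed are assumptions, and the paper does not state how the 1,000 damaged samples were distributed among DL1, DL2, and DL3.

```python
import random

def sample_training_points(pixels, n_per_group=1000, seed=0):
    """Draw equal-sized random samples from damaged and undamaged pixels.

    `pixels` is a list of (pixel_id, damage_level) pairs; level 0 means DL0.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    damaged = [p for p in pixels if p[1] > 0]
    undamaged = [p for p in pixels if p[1] == 0]
    return rng.sample(damaged, n_per_group) + rng.sample(undamaged, n_per_group)

# Hypothetical reference pixels: 50 undamaged and 50 damaged (levels 1 to 3).
pixels = [(i, 0) for i in range(50)] + [(50 + i, 1 + i % 3) for i in range(50)]
training = sample_training_points(pixels, n_per_group=10)
print(len(training))
```

Sampling without replacement, as `random.Random.sample` does, keeps each training point unique.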

Fig. 9. Training Dataset

3. Results

Fig. 10 shows the classification result of bi-temporal classification in which all eight features (Table 1) were used, and nine neighbors were applied to the kNN classifier. It can be seen that a number of isolated pixels were classified as DL1 or DL2 in the undamaged area (Fig. 10[A]). These commission errors contribute significantly to the deterioration of classification performance. Although the majority of commission errors occurring in non-forest areas have been eliminated using the forest type map (section 2.3), some are still found, as shown in Fig. 10[A]. This is because the forest type map has a 10-year update cycle and does not reflect changes of land cover within that 10-year period. A large number of these errors can be removed later using image processing techniques, which are not covered in this study.

The less damaged the area, the lower the match rate between the classified result and the reference image. This is clearly shown in the error matrices (Tables 2 and 3). The producer's and user's accuracies of DL1 and DL2 were lower than those of DL3. While overall accuracy is the most representative measure that summarizes classification performance, it is also a very ambiguous measure that does not reveal any internal (per-class) information (Alberg et al., 2004; Story and Congalton, 1986). In this study, the overall accuracy was not considered because the classification accuracy of DL0, which occupies a large proportion of the study area, determines the overall accuracy. Instead, classification performance was evaluated based on Cohen's kappa coefficient κ, which measures the agreement between two classifiers (Cohen, 1960; Landis and Koch, 1977; Nichols et al., 2010).
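For readers unfamiliar with the statistic, the standard definition of Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), can be computed from an error matrix as sketched below. The function name and the 2 × 2 toy matrix are hypothetical, and this is the textbook unweighted formula; the paper does not describe the software or settings used to obtain its kappa values.

```python
def cohens_kappa(matrix):
    """Unweighted Cohen's kappa for a square error matrix.

    Rows are the classified labels and columns are the reference labels.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement p_o: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n_classes)) / total
    # Chance agreement p_e: sum over classes of row share times column share.
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(row[j] for row in matrix) for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (observed - expected) / (1 - expected)

# Toy example: 35 of 50 samples agree, and chance agreement is 0.5.
toy = [[20, 5],
       [10, 15]]
kappa = cohens_kappa(toy)
print(round(kappa, 3))
```

In the toy case the overall accuracy is 0.70, yet kappa is only 0.40, which shows how kappa discounts agreement that a dominant class would produce by chance.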

Fig. 10. Comparison of the Reference and the Result Classified Through Bi-temporal, 8-feature, 9-neighbor Classification. DL0 (Undamaged) was Set to be Transparent

Table 2. Error Matrix of Bi-temporal, 8-feature, and 9-neighbor Classification

Classified \ Reference (burn severity)	DL0	DL1	DL2	DL3	Sum	User's accuracy
DL0	1,067,267	10,316	12,658	6,568	1,096,809	0.9731
DL1	6,424	5,329	4,635	1,944	18,332	0.2907
DL2	9,066	4,857	17,901	10,223	42,047	0.4257
DL3	5,860	1,026	9,658	40,887	57,431	0.7119
Sum	1,088,617	21,528	44,852	59,622	1,214,619
Producer's accuracy	0.9804	0.2475	0.3991	0.6858
Overall accuracy	0.9315
Cohen's Kappa	0.6224
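The overall, user's, and producer's accuracies in Table 2 follow directly from the pixel counts, as the short sketch below shows. The counts are copied from Table 2; the variable names are illustrative.

```python
labels = ["DL0", "DL1", "DL2", "DL3"]

# Table 2 counts: rows are classified labels, columns are reference labels.
matrix = [
    [1067267, 10316, 12658, 6568],
    [6424, 5329, 4635, 1944],
    [9066, 4857, 17901, 10223],
    [5860, 1026, 9658, 40887],
]

total = sum(sum(row) for row in matrix)
n = len(labels)

# Overall accuracy: diagonal count divided by all pixels.
overall = sum(matrix[i][i] for i in range(n)) / total

# User's accuracy: diagonal divided by the row (classified) total.
users = [matrix[i][i] / sum(matrix[i]) for i in range(n)]

# Producer's accuracy: diagonal divided by the column (reference) total.
producers = [matrix[i][i] / sum(row[i] for row in matrix) for i in range(n)]

for name, u, p in zip(labels, users, producers):
    print(f"{name}: user's {u:.4f}, producer's {p:.4f}")
print(f"Overall accuracy: {overall:.4f}")
```

The DL0 row and column dominate the total, which is why the overall accuracy stays above 0.93 even though fewer than 30% of DL1 pixels are classified correctly.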

Table 3. Error Matrix of Uni-temporal, 8-feature, and 9-neighbor Classification

Classified \ Reference (burn severity)	DL0	DL1	DL2	DL3	Sum	User's accuracy
DL0	1,062,369	12,532	16,386	7,765	1,099,052	0.9666
DL1	4,334	2,376	2,907	1,811	11,428	0.2079
DL2	17,557	4,820	16,433	9,019	47,829	0.3436
DL3	5,905	1,810	9,189	41,215	58,119	0.7091
Sum	1,090,165	21,538	44,915	59,810	1,216,428
Producer's accuracy	0.9745	0.1103	0.3659	0.6891
Overall accuracy	0.9227
Cohen's Kappa	0.5795

3.1 Number of Neighbors for kNN Classifier

Various numbers of neighbors (N) were applied to the kNN classifier to examine the variation in performance. Odd values of N were chosen to avoid ties. Performance improved as N increased. However, when N exceeds 5, the rate of increase in Cohen's kappa drops significantly, and when N exceeds 9, the increase becomes smaller than 0.01 (Fig. 11).

Fig. 11. 8-feature Classification Performance according to Number of Neighbors

3.2 Feature Combination

The feature combinations are listed in the table at the bottom of Fig. 12. Fig. 12 shows the performance of the 9-neighbor classification for each feature combination.

The bi-temporal approach showed the best performance when all eight features were used (8f in Fig. 12). There was no noticeable change in performance when one of the three visible spectral features was removed (7f-b-d). Even when R, G, and B were all removed (5f), there was only a slight deterioration in performance. In contrast, removing SWIR1 (7f-a) degraded the performance. For the uni-temporal approach, removing all three visible bands resulted in the best performance (5f).

Fig. 12. 9-neighbor Classification Performance according to Feature Combinations

3.3 Application Dates

A classifier trained with the April 8th dataset was applied to the datasets from April 5th, 8th, 15th, and 20th for classification. As expected, the classification performance on April 8th was the best, but the classifications for the other dates also identified the rough extent of the damaged area well (Fig. 13). In Korea, vegetation grows rapidly in April, and areas that were classified as DL1 or DL2 on April 5th because vegetation had not yet sprouted were later excluded from the damaged area. Fig. 14 shows the classification performance quantitatively. The DL1 and DL2 regions have low producer's accuracies. In the case of DL3, unlike the other groups, the producer's accuracy was highest on April 5th, immediately after the fire, and then gradually decreased.

Fig. 13. Result of Uni-temporal 8-feature 9-neighbor Classification. The Classifier was Trained with the April 8th Dataset and Applied on Datasets Collected on April 5th, 8th, 15th and 20th. DL0 (Undamaged) was Set to be Transparent

Fig. 14. Performance of 8-feature 9-neighbor Classification according to Application Date. The Classifier was Trained with the April 8th Dataset

4. Discussion

The kNN classifier was applied to an area with complex land cover and was able to map fire-damaged areas. Depending on the classification conditions, the kappa values ranged from 0.58 to 0.62 in the bi-temporal classification and 0.54 to 0.58 in the uni-temporal classification (Figs. 11 and 12). The performance of the uni-temporal classifier deteriorated when applied to other images not included in the training data (Fig. 14), but it was able to identify the boundary of the damaged area in the classified image (Fig. 13).

In a previous study (Sim et al., 2020), the user’s and producer’s accuracies for RF, LR, and SVM were 85-95% in undamaged areas and 65-85% in extremely damaged areas, while the range in accuracy for lightly damaged areas, from 15-60%, was much larger. Our results were not significantly different with 97-98% in undamaged areas, 68-71% in extremely damaged areas, and 24-29% in lightly damaged areas (Table 2). Slightly different study conditions complicate a direct comparison, but the kNN classifier seems to have performance similar to RF, LR, and SVM. Poor classification accuracy in lightly damaged areas was a common problem in both studies because of the difficulty distinguishing between lightly damaged and undamaged areas. However, lightly damaged areas are rare and it may be the case that a sufficient number of high quality samples were not included in the training dataset to allow for a full characterization of this class. A possible solution is to consider using a 2-class classification scheme that only distinguishes between damaged/undamaged areas, or to divide the damaged area into two classes instead of three. The high accuracy of the kNN classifier for undamaged areas is presumably because relatively more samples were allocated to the DL0 area than to DL1, DL2, and DL3 areas. This imbalance was intended to account for the large amount of variation present in multiple land cover types. For an accurate comparison of different approaches, more experimental data will be needed.

As expected, the bi-temporal classification showed better performance than the uni-temporal classification (Figs. 11 and 12). However, with uni-temporal classification, it is possible to determine the outline of the damaged areas of DL2 and DL3. DL1, which suffered minor fire damage, was roughly detected. There is no disagreement that a bi-temporal approach incurs more costs than a uni-temporal approach. In some early studies, the performance advantage of the bi-temporal approach did not justify these costs; thus, a uni-temporal approach was suggested (Weber et al., 2008). Now that data accessibility and data processing technology are more advanced, the difference in cost is greatly reduced, and hence, it is less burdensome to choose a bi-temporal approach for a small improvement. In the case of Korean forest fires, there is a high probability that satellite data from immediately before and after the fire are available, and the cost of data processing is relatively low as the burned areas are not large. Therefore, if sufficient satellite data are available, there is no reason to avoid the bi-temporal approach. However, if it is difficult to collect data due to weather conditions, a uni-temporal approach can also provide useful information about forest fires.

As a result of the experiment on the number of neighbors for the kNN classifier, it was observed that the improvement rate decreased when the number of neighbors exceeded 9 (Fig. 11). The optimal number of neighbors according to the number of samples and the number of classes for a region with specific geographical and social characteristics seems to be another interesting area of research.

According to the experiments on feature combinations (Fig. 12), the use of Red, Green, and Blue features slightly reduced the uni-temporal classification performance. In the bi-temporal classification, these features help a little, and there is no big loss even if all are excluded. In contrast, when SWIR1 was excluded, the classification performance was significantly reduced.

The kNN algorithm is a machine learning technique that has been used for a long time. Compared to newer machine learning techniques, it has the advantage of being easy to understand and simple to apply, and it is worth examining how it performs in detecting areas affected by forest fires within complex land cover. Although experiments in which the classifier was applied to imagery acquired on various dates (test IDs P805 and P815 in Fig. 1) were included, the training and application of the classifier were performed within the same area in our experiment, and it is highly likely that performance will deteriorate when it is applied to other regions. In future studies, the generality of the classifier can be improved by including multiple sites and the above discussion regarding feature combinations, appropriate number of neighbors, and outliers may be helpful in guiding the experimental setup.

5. Conclusions

In this study, experiments were conducted under the assumption that the area damaged by forest fire in a complex area comprising a mixture of forest and artificial land cover could be quickly identified using the k-nearest neighbors (kNN) algorithm. The two requirements for the kNN algorithm are the number of neighbors and the feature combination for the classifier. A combination of nine neighbors and eight features (Red, Green, Blue, NIR, SWIR1, SWIR2, and two spectral indices, NDVI and NBR) showed appropriate performance. Although this combination did not always exhibit the best performance, it ensured consistently good performance throughout all our experiments. When the 9-neighbor 8-feature kNN classifier was applied over the images of different days following the forest fire, the Cohen’s kappas of the classification results ranged between 0.52 and 0.64. Although this quantitative figure seems insufficient, the resulting classification images sufficiently recognized the boundaries of the fire-damaged area.

Forest fires are a persistent phenomenon on the east coast of Korea. This study demonstrated the possibility of establishing a stable automated system that can quickly detect the boundaries of fire-damaged areas by applying a prepared kNN classifier when a forest fire occurs in the future. Furthermore, owing to the removal of outliers and the use of a forest type map, this classification system is expected to be effective even in complex areas with artificial land cover. Information on the boundaries of burned areas, that is, the burned area map, can be used as a basic reference for on-the-spot investigations and countermeasures immediately after a forest fire, and can be provided to residents or related parties in the surrounding area who need information.

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2021R1A6A1A03044326) and the Ministry of Science and ICT (No. RS-2023-00252622).

References

Alberg, A. J., Park, J. W., Hager, B. W., Brock, M. V. and Diener-West, M. (2004). “The use of ‘overall accuracy’ to evaluate the validity of screening or diagnostic tests.” Journal of General Internal Medicine, Springer, Vol. 19, No. 5, pp. 460-465, https://doi.org/10.1111/j.1525-1497.2004.30091.x.

Bar, S., Parida, B. R. and Pandey, A. C. (2020). “Landsat-8 and Sentinel-2 based Forest fire burn area mapping using machine learning algorithms on GEE cloud platform over Uttarakhand, Western Himalaya.” Remote Sensing Applications: Society and Environment, Elsevier, Vol. 18, 100324, https://doi.org/10.1016/j.rsase.2020.100324.

Choi, S. P., Kim, D. H. and Lee, S. K. (2006). “The abstraction of forest fire damage area using factor analysis from the satellite image data.” Journal of Korean Society for Geospatial Information System, KSIS, Vol. 14, No. 1, pp. 13-19.

Chung, M. and Kim, Y. (2020). “Analysis on Topographic Normalization Methods for 2019 Gangneung-East Sea Wildfire Area Using PlanetScope Imagery.” Korean Journal of Remote Sensing, KSRS, Vol. 36, No. 2_1, pp. 179-197, https://doi.org/10.7780/KJRS.2020.36.2.1.7.

Cohen, J. (1960). “A coefficient of agreement for nominal scales.” Educational and Psychological Measurement, Sage, Vol. 20, No. 1, pp. 37-46, https://doi.org/10.1177/001316446002000104.

Cover, T. and Hart, P. (1967). “Nearest neighbor pattern classification.” IEEE Transactions on Information Theory, IEEE, Vol. 13, No. 1, pp. 21-27, https://doi.org/10.1109/TIT.1967.1053964.

Dennison, P. E., Brewer, S. C., Arnold, J. D. and Moritz, M. A. (2014). “Large wildfire trends in the western United States, 1984-2011.” Geophysical Research Letters, AGU, Vol. 41, No. 8, pp. 2928-2933, https://doi.org/10.1002/2014GL059576.

European Space Agency (2015). Sentinel-2 User Handbook. European Space Agency.

Fix, E. and Hodges, J. L. (1989). “Discriminatory analysis. Nonparametric discrimination: Consistency properties.” International Statistical Review / Revue Internationale de Statistique, ISI, Vol. 57, No. 3, pp. 238-247, https://doi.org/10.2307/1403797.

Fornacca, D., Ren, G. and Xiao, W. (2018). “Evaluating the best spectral indices for the detection of burn scars at several post-fire dates in a mountainous region of Northwest Yunnan, China.” Remote Sensing, MDPI, Vol. 10, No. 8, 1196, https://doi.org/10.3390/rs10081196.

Gascon, F., Bouzinac, C., Thépaut, O., Jung, M., Francesconi, B., Louis, J., Lonjou, V., Lafrance, B., Massera, S., Gaudel-Vacaresse, A., Languille, F., Alhammoud, B., Viallefont, F., Pflug, B., Bieniarz, J., Clerc, S., Pessiot, L., Trémas, T., Cadau, E., De Bonis, R., Isola, C., Martimort, P. and Fernandez, V. (2017). “Copernicus Sentinel-2A calibration and products validation status.” Remote Sensing, MDPI, Vol. 9, No. 6, 584, https://doi.org/10.3390/rs9060584.

Hawbaker, T. J., Vanderhoof, M. K., Beal, Y.-J., Takacs, J. D., Schmidt, G. L., Falgout, J. T., Williams, B., Fairaux, N. M., Caldwell, M. K., Picotte, J. J., Howard, S. M., Stitt, S. and Dwyer, J. L. (2017). “Mapping burned areas using dense time-series of Landsat data.” Remote Sensing of Environment, Elsevier, Vol. 198, pp. 504-522, https://doi.org/10.1016/j.rse.2017.06.027.

Kara, L. Z., Laksaci, A., Rachdi, M. and Vieu, P. (2017). “Data-driven kNN estimation in nonparametric functional data analysis.” Journal of Multivariate Analysis, Elsevier, Vol. 153, pp. 176-188, https://doi.org/10.1016/j.jmva.2016.09.016.

Klebanov, L. B. (2016). “Big outliers versus heavy tails: What to use?” ArXiv:1611.05410 [Math, Stat], ArXiv, http://arxiv.org/abs/1611.05410.

Knopp, L., Wieland, M., Rättich, M. and Martinis, S. (2020). “A Deep learning approach for burned area segmentation with Sentinel-2 data.” Remote Sensing, MDPI, Vol. 12, No. 15, 2422, https://doi.org/10.3390/rs12152422.

Landis, J. R. and Koch, G. G. (1977). “The measurement of observer agreement for categorical data.” Biometrics, International Biometric Society, Vol. 33, No. 1, pp. 159-174, https://doi.org/10.2307/2529310.

Lee, S. J., Kim, K. J., Kim, Y. H., Kim, J. W. and Lee, Y. W. (2017). “Development of FBI(Fire Burn Index) for Sentinel-2 images and an experiment for detection of burned areas in Korea.” Journal of the Association of Korean Photo-Geographers, The Association of Korean Photo-Geographers, Vol. 27, No. 4, pp. 187-202, https://doi.org/10.35149/JAKPG.2017.27.4.012 (in Korean).

Mithal, V., Nayak, G., Khandelwal, A., Kumar, V., Nemani, R. and Oza, N. (2018). “Mapping burned areas in tropical forests using a novel machine learning framework.” Remote Sensing, MDPI, Vol. 10, No. 1, 69, https://doi.org/10.3390/rs10010069.

National Fire Agency (2021). 2020 Fire Statistical Yearbook, https://www.nfds.go.kr/bbs/selectBbsDetail.do?bbs=B21&bbs_no=7948&pageNo=1 (in Korean).

Nichols, T. R., Wisner, P. M., Cripe, G. and Gulabchand, L. (2010). “Putting the kappa statistic to use.” The Quality Assurance Journal, Vol. 13, Nos. 3-4, pp. 57-61, https://doi.org/10.1002/qaj.481.

Nigsch, F., Bender, A., van Buuren, B., Tissen, J., Nigsch, E. and Mitchell, J. B. O. (2006). “Melting point prediction employing k-nearest neighbor algorithms and genetic parameter optimization.” Journal of Chemical Information and Modeling, Vol. 46, No. 6, pp. 2412-2422, https://doi.org/10.1021/ci060149f.

Pinto, M. M., Libonati, R., Trigo, R. M., Trigo, I. F. and DaCamara, C. C. (2020). “A deep learning approach for mapping and dating burned areas using temporal sequences of satellite images.” ISPRS Journal of Photogrammetry and Remote Sensing, Elsevier, Vol. 160, pp. 260-274, https://doi.org/10.1016/j.isprsjprs.2019.12.014.

Roteta, E., Bastarrika, A., Padilla, M., Storm, T. and Chuvieco, E. (2019). “Development of a Sentinel-2 burned area algorithm: Generation of a small fire database for sub-Saharan Africa.” Remote Sensing of Environment, Elsevier, Vol. 222, pp. 1-17, https://doi.org/10.1016/j.rse.2018.12.011.

Roy, D. P., Huang, H., Boschetti, L., Giglio, L., Yan, L., Zhang, H. H. and Li, Z. (2019). “Landsat-8 and Sentinel-2 burned area mapping—A combined sensor multi-temporal change detection approach.” Remote Sensing of Environment, Elsevier, Vol. 231, 111254, https://doi.org/10.1016/j.rse.2019.111254.

Sim, S., Kim, W., Lee, J., Kang, Y., Im, J., Kwon, C. and Kim, S. (2020). “Wildfire severity mapping using sentinel satellite data based on machine learning approaches.” Korean Journal of Remote Sensing, KSRS, Vol. 36, No. 5_3, pp. 1109-1123, https://doi.org/10.7780/KJRS.2020.36.5.3.9 (in Korean).

Story, M. and Congalton, R. G. (1986). “Accuracy assessment: A user’s perspective.” Photogrammetric Engineering and Remote Sensing, American Society for Photogrammetry and Remote Sensing, Vol. 52, No. 3, pp. 397-399.

Weaver, J., Moore, B., Reith, A., McKee, J. and Lunga, D. (2018). “A comparison of machine learning techniques to extract human settlements from high resolution imagery.” Proceedings of IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, IEEE, Valencia, Spain, pp. 6412-6415, https://doi.org/10.1109/IGARSS.2018.8518528.

Weber, K. T., Seefeldt, S., Moffet, C. and Norton, J. (2008). “Comparing fire severity models from post-fire and pre/post-fire differenced imagery.” GIScience & Remote Sensing, Taylor & Francis, Vol. 45, No. 4, pp. 392-405, https://doi.org/10.2747/1548-1603.45.4.392.

Won, M., Jang, K., Yoon, S. and Lee, H. (2019). “Change detection of damaged area and burn severity due to heat damage from gangwon large fire area in 2019.” Korean Journal of Remote Sensing, KSRS, Vol. 35, No. 6_2, pp. 1083-1093, https://doi.org/10.7780/KJRS.2019.35.6.2.5 (in Korean).

Won, M. S., Koo, K. S. and Lee, M. B. (2007). “An quantitative analysis of severity classification and burn severity for the large forest fire areas using normalized burn ratio of landsat imagery.” Journal of the Korean Association of Geographic Information Studies, KAGIS, Vol. 10, No. 3, pp. 80-92 (in Korean).

Key words: Forest fire, Damage detection, k-Nearest neighbor, Classification, Sentinel-2
