Applicability Evaluation of a DIM Mesh-Based Automated True Orthoimage Generation
Method for Various Types of Aerial Imagery
Song, YeongSun1
Yun, Konghyun2†
1 Member ․ Professor, Department of Geospatial Big Data, Inha Technical College (point196@inhatc.ac.kr)
2† Member ․ Corresponding Author ․ Senior Researcher, Gangneung Industry-Academic Cooperation
Foundation, Kangwon National University (khyun1010@gmail.com)
Copyright 2026 by the Korean Society of Civil Engineers
Keywords
True orthoimage, Dense Image Matching, DIM mesh structure, Digital Building Model, Aerial photogrammetry
1. Introduction
As the demand for geospatial information applications—including digital twins, smart
cities, and national territory monitoring—continues to expand, the importance of image-based
geospatial products that support the construction and updating of high-precision 3D
spatial data is increasingly recognized. Orthoimages offer high intuitiveness and
superior readability compared to vector-based digital maps, and are thus widely used
across various public and private services. However, conventional orthoimages fail
to adequately eliminate the relief displacement caused by vertical structures such
as buildings.
A true orthoimage corrects not only terrain-induced relief displacement
but also that caused by artificial structures such as buildings, thereby overcoming
the limitations of conventional orthoimages and providing an advanced image product
(Amhar et al., 1998). Owing to these characteristics, true orthoimages hold high potential for applications
including precise urban spatial mapping, 3D-based administrative tasks, digital twin
updates, and disaster monitoring. The production of true orthoimages fundamentally
depends on the construction of 3D precision digital elevation data incorporating Digital
Building Models (DBM) and on the processing of occlusion areas. However, the conventional
3D stereo-plotting approach relies heavily on expensive equipment and skilled operators,
and requires substantial time for building model construction. Although this approach
is viable at the pilot project level, it presents significant constraints for expansion
into a continuous nationwide production system (NGII, 2019).
In recent years, the automated generation of digital surface models and meshes from
high-overlap aerial imagery using Dense Image Matching (DIM) technology has attracted
attention as an alternative approach to true orthoimage production. The DIM mesh-based
method offers advantages for implementing an automated production system by minimizing
the separate workflow required for building model construction. Nevertheless, technical
and institutional challenges remain, including the accurate representation of building
boundaries, the correction of occlusion areas, and the need to redefine quality inspection
criteria.
Regarding occlusion detection, various methods have been proposed, including Z-buffer-based,
height-based, and angle-based approaches. Habib et al. (2007, 2018) refined the procedure for generating true orthoimages from frame aerial imagery and
LiDAR data. These studies are significant in that they established the geometric foundations
for true orthoimage production.
With respect to digital surface model generation, airborne laser scanning has long
served as the primary data source; however, advances in high-overlap aerial imaging
and computer vision algorithms have brought DIM-based surface reconstruction to prominence
as a key alternative. Haala and Rothermel (2012), Haala (2013), and Haala et al. (2016) presented the potential of high-density image matching for generating high-precision
digital elevation models and outlined the direction of algorithmic development. In
this process, the SIFT feature extraction algorithm proposed by Lowe (2004), the Semi-Global Matching (SGM) technique by Hirschmüller (2008), the Structure-from-Motion (SfM)-based 3D reconstruction technology of Westoby et al. (2012), and the SURE software developed by Rothermel et al. (2012) have served as core enabling technologies.
Meanwhile, the sawtooth effect at building boundaries and boundary instability have
been identified as key problems in DIM-based true orthoimage generation. This occurs
because the surface generated during the image matching process does not sufficiently
preserve the geometric discontinuities at real building edges, necessitating the refinement
of post-processing and quality inspection frameworks.
Research on deep learning-based true orthoimage generation and correction has also
been actively pursued. Shin et al. (2021) proposed a generative model-based approach for generating true orthoimages from airborne
LiDAR data. Bittner et al. (2017) proposed a framework utilizing convolutional neural networks (CNN) to extract buildings
from orthoimages and DIM point clouds. More recently, novel 3D scene representation
paradigms have emerged in the computer vision community, including Neural Radiance
Fields (NeRF) (Mildenhall et al., 2020) and 3D Gaussian Splatting (Kerbl et al., 2023). These approaches have demonstrated impressive photorealistic reconstruction capabilities;
however, they generally require dense view sampling, are computationally intensive
at training, and have not yet been validated for the metric accuracy and georeferenced
product quality required by national mapping workflows. In contrast, DIM mesh-based
pipelines built on aerial photogrammetric imagery offer mature aerial triangulation,
established quality control, and the geometric consistency required for orthorectified
deliverables. Nevertheless, research that comprehensively evaluates the applicability
of DIM mesh-based automated approaches in multi-sensor environments for nationwide
production remains relatively limited.
In response, this study aims to empirically evaluate the applicability of a DIM mesh-based
automated true orthoimage generation method, and to examine its technical validity
through comparison with the conventional 3D stereo-plotting approach. Furthermore,
this study analyzes the processing characteristics of different software packages
and error correction procedures and proposes refined quality inspection items suited
to DIM mesh-based products.
2. True Orthoimage Generation Technology
2.1 True Orthoimage Production Workflow
True orthoimage generation differs from conventional orthoimage production in that
it requires two additional processes: the construction of 3D precision digital elevation
data and the processing of occlusion areas. First, surface data incorporating structures
such as buildings must be constructed and used for image orthorectification. Subsequently,
occlusion areas arising from central projection must be appropriately removed or replaced.
Fig. 1. True Orthoimage Production Workflow
Occlusion areas manifest as regions where the rear faces of structures are obscured
due to the imaging geometry, and if these areas are not properly handled, double-mapping
artifacts occur. Therefore, the detection and correction of occlusion areas in true
orthoimage production directly affect the visual completeness and interpretability
of the final product.
Fig. 2. Occlusion Area and Double Mapping
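The Z-buffer idea mentioned in the introduction can be illustrated with a minimal sketch on a one-dimensional DSM profile. Each DSM cell is projected into the image by central projection, and when several cells fall into the same image column only the one nearest to the camera is kept as visible. The function name `zbuffer_occlusion`, the camera parameters, and the toy profile are illustrative assumptions and do not reproduce the software used in this study.

```python
import math


def zbuffer_occlusion(xs, zs, cam_x, cam_z, focal_px):
    """Flag DSM cells that lose the Z-buffer test along a 1D profile.

    xs, zs       : ground coordinate and height of each DSM cell (m)
    cam_x, cam_z : perspective centre of a nadir-looking camera (m)
    focal_px     : focal length expressed in pixels (illustrative value)
    """
    nearest = {}  # image column -> (distance to camera, DSM cell index)
    for i, (x, z) in enumerate(zip(xs, zs)):
        # Central projection of the DSM cell onto the image line.
        col = math.floor(focal_px * (x - cam_x) / (cam_z - z))
        dist = math.hypot(x - cam_x, cam_z - z)
        if col not in nearest or dist < nearest[col][0]:
            nearest[col] = (dist, i)
    visible = {i for _, i in nearest.values()}
    return [i not in visible for i in range(len(xs))]


# Toy profile: flat ground with a 20 m building between x = 20 m and x = 22 m.
xs = list(range(15, 31))
zs = [20.0 if 20 <= x <= 22 else 0.0 for x in xs]
occluded = zbuffer_occlusion(xs, zs, cam_x=0.0, cam_z=100.0, focal_px=100.0)
occluded_x = [x for x, hidden in zip(xs, occluded) if hidden]
```

In this toy case the ground cells at x = 25 to 27 m share image columns with the roof and are flagged, so they would be filled from another image rather than mapped twice. Ground cells hidden only by the building wall (x = 23 and 24 m here) are not caught by this basic variant, because a 2.5D surface model carries no wall samples.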
2.2 Characteristics of Digital Building Model Generation Methods
2.2.1 3D Stereo-Plotting Method
The conventional stereo-plotting method assumes 3D depiction; however, inconsistencies
in results may arise due to differences in interpretation and representation among
operators during the plotting process. For example, when two or more buildings with
different heights are situated adjacent to one another, the conventional method—which
focuses on plotting building outlines—results in discontinuous height values at the
corners of individual buildings. This can produce logically inconsistent linear structures
in the 3D plotted data rather than accurately representing building rooftop surfaces,
and such outputs are unsuitable for true orthoimage production.
For true orthoimage production, therefore, it is necessary to apply the 3D precision
stereo-plotting method, which defines objects based on building rooftop surfaces.
Under this method, the number of points and lines to be depicted increases in accordance
with the Level of Detail (LoD) criteria, and working time and cost increase proportionally.
Furthermore, a higher degree of image overlap than that required for conventional
orthoimage production must be secured. In the 3D stereo-plotting process, building
rooftop surfaces and individual corners must be precisely depicted, and surfaces sharing
the same height value must be represented as a single object unit. For facilities
with overlapping upper and lower structures, such as overpasses, object separation
based on structural hierarchy is required.
Fig. 3. Comparison of Conventional and 3D Precision Stereo-Plotting
2.2.2 3D Object Modeling Method
The 3D object modeling method does not require a stereo plotter, so general technicians
can model buildings without digital mapping equipment. Various commercial software
packages are currently available for 3D model
production, and their outputs are utilized for true orthoimage production and the
construction of 3D geospatial information. Representative software packages used in
Korea include PLW, CC-Modeler, GeoModeler, The Builder, and Citycapture.
2.2.3 DIM Mesh-Structure Method
Although Digital Surface Models (DSM) have traditionally been generated using airborne
laser scanning, advances in high-overlap aerial imaging and computer vision have made
it possible to generate DSMs using Dense Image Matching (DIM). Conventional photogrammetry-based
DSM generation involves aerial triangulation and bundle block adjustment, followed
by DSM construction from 3D point data. In contrast, the computer vision-based approach
involves feature extraction, image matching, automated aerial triangulation, bundle
block adjustment, and dense point cloud generation to construct the DSM. In this process,
SIFT for feature extraction, SfM for camera position and 3D structure reconstruction,
and SGM for dense image matching are primarily employed.
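As a hedged illustration of the dense matching step, the sketch below aggregates matching costs along a single path using the recurrence described by Hirschmüller (2008): small disparity changes are penalized by P1 and larger jumps by P2. A full SGM implementation sums such aggregations over several path directions; the function name, the toy cost values, and the penalty values are assumptions for illustration and are not taken from the software evaluated in this study.

```python
def aggregate_path(cost, p1, p2):
    """Aggregate matching costs along one path (single direction).

    cost[p][d] is the matching cost of pixel p (in path order) at disparity d.
    """
    n_disp = len(cost[0])
    agg = [list(cost[0])]  # the first pixel has no predecessor
    for p in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            candidates = [prev[d], prev_min + p2]   # same disparity, large jump
            if d > 0:
                candidates.append(prev[d - 1] + p1)  # disparity change of 1
            if d < n_disp - 1:
                candidates.append(prev[d + 1] + p1)
            # Subtracting prev_min keeps the aggregated values bounded.
            row.append(cost[p][d] + min(candidates) - prev_min)
        agg.append(row)
    return agg


# Toy costs: the middle pixel has a noisy raw minimum at disparity 2.
cost = [[0, 5, 5], [4, 5, 3], [0, 5, 5]]
agg = aggregate_path(cost, p1=2, p2=6)
raw_best = [row.index(min(row)) for row in cost]
smooth_best = [row.index(min(row)) for row in agg]
```

In this example the raw cost minimum of the middle pixel lies at a different disparity than its neighbours, while the aggregated cost selects the same disparity along the whole path, which is the smoothing behaviour that also tends to soften height discontinuities at building edges.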
2.3 Software Characteristics for True Orthoimage Generation
Various commercial software packages can be used for true orthoimage generation. This
study compared Bentley iTwin Capture Modeler, nFrame SURE, and Trimble Inpho. Trimble
Inpho provides precise orthorectification and editing capabilities well-suited to
conventional photogrammetry-based workflows, while iTwin Capture Modeler and nFrame
SURE offer strengths in automated matching and mesh generation from high-overlap imagery.
Differences among the software packages are observed in aerial triangulation integration,
mesh generation quality, boundary correction functionality, and post-processing convenience,
and these differences affect the production efficiency and editing workload of the
final true orthoimage.
Table 1. Comparative Analysis of Software for True Orthoimage Generation
| Item | iTwin Capture Modeler | nFrame SURE | Trimble Inpho | Agisoft Metashape | Pix4D mapper |
|---|---|---|---|---|---|
| AT Computation | ○ | × | ○ | × | × |
| External AT | × (xml parsing) | Trimble Inpho (.pri), Match-AT (*.xml), etc. | ○ | Inpho Project File (*.pri), BINGO (*.dat), Bundler (*.out), etc. | ○ |
| Orthophoto Editing | ○ | × | ○ | × | × |
| Supported Data Types | Aerial, Drone, Terrestrial, Video Imagery | Aerial, Drone, Terrestrial Imagery | Aerial, Drone Imagery | Aerial, Drone, Terrestrial, Laser Scan, Video, Satellite Imagery | Aerial, Drone, Terrestrial Imagery |
| Output Products | True orthoimage, Point cloud, 3D model, etc. | True orthoimage, Point cloud, 3D model, etc. | True orthoimage, Point cloud, 3D model, etc. | True orthoimage, Point cloud, 3D model, etc. | True orthoimage, Point cloud, 3D model, etc. |
3. Data Acquisition and Experimental Design
3.1 Data Acquisition and Aerial Triangulation
In this study, aerial imagery was acquired with 80 %×80 % overlap in a single-pass
configuration for true orthoimage production, securing sufficient image overlap for
DIM-based automated processing. The cameras employed were the second-generation aerial
cameras Falcon (378 images) and DMC2 (150 images), and the third-generation aerial
cameras DMC3 (98 images) and Osprey (120 images). For aerial triangulation, 49 ground
control points were distributed across the study area. The number of points used as
check points varied by camera depending on point identifiability in each sensor’s
imagery, ranging from 42 to 47 (see Table 2).
Fig. 4. The Study Area for Aerial Triangulation
Aerial triangulation was performed using Match-AT for imagery from each camera. Quality
verification showed that all cameras achieved planimetric RMSE values below 0.03 m
and vertical RMSE values below 0.04 m, confirming stable accuracy. In particular,
the DMC3 achieved the lowest planimetric RMSE of 0.006 m and a vertical RMSE of 0.014
m, verifying that the ground control and orientation results were sufficiently reliable.
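The RMSE figures reported for the check points can be related to individual residuals as in the following minimal sketch. It assumes that the planimetric error of a check point is the horizontal distance formed from the X and Y residuals, which is a common convention but is not stated explicitly in this study; the residual values are hypothetical and are not data from the experiment.

```python
import math


def checkpoint_rmse(residuals):
    """Return (planimetric RMSE, vertical RMSE) in metres.

    residuals: list of (dx, dy, dz) tuples, adjusted minus surveyed coordinates.
    """
    n = len(residuals)
    planimetric = math.sqrt(sum(dx * dx + dy * dy for dx, dy, _ in residuals) / n)
    vertical = math.sqrt(sum(dz * dz for _, _, dz in residuals) / n)
    return planimetric, vertical


# Hypothetical residuals for four check points (not data from this study).
sample = [(0.003, 0.004, 0.010), (-0.006, 0.008, -0.020),
          (0.000, 0.005, 0.015), (0.004, -0.003, -0.005)]
rmse_xy, rmse_z = checkpoint_rmse(sample)

# Levels quoted in the text: planimetric below 0.03 m, vertical below 0.04 m.
within_reported_levels = rmse_xy < 0.03 and rmse_z < 0.04
```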
Table 2. Quality Assessment of Aerial Triangulation
| Camera Type | No. of Check Points | Planimetric Max. Error (m) | Planimetric RMSE (m) | Vertical Max. Error (m) | Vertical RMSE (m) |
|---|---|---|---|---|---|
| Falcon | 47 | 0.044 | 0.020 | 0.068 | 0.023 |
| DMC2 | 42 | 0.021 | 0.010 | 0.061 | 0.024 |
| DMC3 | 42 | 0.013 | 0.006 | 0.053 | 0.014 |
| Osprey | 47 | 0.048 | 0.029 | 0.105 | 0.036 |
3.2 True Orthoimage Production for Comparison
For relative comparison, true orthoimages were produced using both the conventional
3D stereo-plotting method and the DIM mesh-structure method. In the conventional approach,
building model construction and orthorectification were performed using Geo3Di and
Trimble Inpho, while the automated approach utilized Bentley iTwin Capture Modeler
and nFrame SURE for image matching, mesh generation, and orthoimage generation.
Processing the full workflow for 378 Falcon images using iTwin Capture Modeler version
23 required approximately 24 hours, while nFrame SURE-based processing on a single-node
system required approximately 28 hours. These processing times indicate that the DIM
mesh-based approach is well suited to automated large-scale data processing.
True orthoimages generated using the DIM mesh-structure method tend to exhibit poorly
defined building boundary lines due to the interpolation inherent in the image matching
process. Consequently, boundary processing issues arise around tall buildings, in areas
with pronounced terrain undulation, and where vegetation is present.
Post-processing-based editing and correction of the initially generated true orthoimages
is feasible even without separately constructing 3D precision digital elevation data.
By comprehensively incorporating the orientation elements and positional information
utilized in the orthoimage production process, boundary errors can be corrected within
a relatively short time. Fig. 8 presents the correction results obtained by applying image editing to the identified
boundary errors.
Fig. 5. True Orthoimages Generated by 3D Stereo-Plotting: Falcon Imagery (Top Row) and DMC3 Imagery (Bottom Row)
Fig. 6. True Orthoimages Generated Using the DIM Mesh-Based Approach from DMC2 (First Row), Falcon (Second Row), DMC3 (Third Row), and Osprey (Fourth Row)
4. Results and Analysis
4.1 Assessment of Building Relief Displacement Positional Accuracy
To quantitatively evaluate the quality of the initial true orthoimages prior to editing,
planimetric positional errors were analyzed based on 10 check points established on
building rooftops. Visual quality was also examined qualitatively through overlay with
1:1,000 digital maps.
The results showed that Falcon and DMC2 achieved favorable accuracy levels with mean
errors of approximately 0.15 m, while Osprey recorded a mean error of 0.189 m and
DMC3 recorded 0.359 m. The elevated mean error for DMC3 is primarily attributed to
a single outlier of 1.267 m at one check point. This outlier corresponds to a tall
building located in a complex rooftop area where mesh smoothing during DIM matching
displaced the rooftop edge; excluding this point, the DMC3 mean error reduces to approximately
0.258 m, comparable to the Osprey result. With this caveat, the DIM mesh-based approach
was assessed to have secured the positional accuracy required for practical application.
It should be noted that this assessment is based on the rooftop check points used
for the DIM mesh-based products and does not constitute a strict head-to-head comparison
with the conventional 3D stereo-plotting approach, which would require identical check
points, the same target area, and the same evaluation procedure to be applied to both
methods. Such a controlled cross-method comparison is identified as a follow-up task.
To support the interpretation of the sensor-specific accuracy differences, an exploratory
statistical review was performed on the n = 10 check-point measurements per camera.
Using a Student’s t-distribution with nine degrees of freedom ($t_{0.025} = 2.262$),
the 95 % confidence intervals for the mean planar error were approximately [0.02,
0.29] m for Falcon, [0.03, 0.27] m for DMC2, [0.02, 0.36] m for Osprey, and [0.00,
0.73] m for DMC3. The DMC2, Falcon, and Osprey intervals overlap substantially, indicating
that their mean accuracies are not statistically distinguishable at this sample size.
The wider DMC3 interval is driven by a single outlier (1.267 m); when this point is
excluded, the DMC3 mean error reduces to approximately 0.258 m with a standard deviation
of 0.252 m, placing it on the same order of magnitude as the other sensors. These
results suggest that the DIM mesh-based pipeline is robust across the tested cameras,
while also indicating that the present sample size of ten is sufficient for technical
evaluation but limited for population-level inference; expanded validation is identified
as future work in Section 5.
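The confidence intervals above can be reproduced from the summary statistics in Table 3 with the short sketch below. It takes the tabulated mean and standard deviation for each camera, n = 10, and the critical value 2.262 quoted in the text. Clipping the lower bound at zero is an assumption made here, on the grounds that a planar error magnitude cannot be negative, and it is presumed to explain the 0.00 m bound reported for DMC3. The variable and function names are illustrative.

```python
import math

T_CRIT = 2.262  # two-sided 95 % critical value, Student's t, 9 degrees of freedom
N = 10          # rooftop check points per camera

# (mean error, standard deviation) in metres, taken from Table 3.
TABLE3 = {
    "Falcon": (0.152, 0.189),
    "DMC2": (0.148, 0.168),
    "DMC3": (0.359, 0.512),
    "Osprey": (0.189, 0.237),
}


def mean_ci(mean, sd, n=N, t_crit=T_CRIT):
    """Confidence interval for the mean, with the lower bound clipped at zero."""
    half_width = t_crit * sd / math.sqrt(n)
    return max(0.0, mean - half_width), mean + half_width


intervals = {cam: tuple(round(v, 2) for v in mean_ci(m, s))
             for cam, (m, s) in TABLE3.items()}

# DMC3 mean after dropping the single 1.267 m outlier from the ten points.
dmc3_mean_trimmed = round((N * 0.359 - 1.267) / (N - 1), 3)
```

The half-width scales with the standard deviation divided by the square root of n, so the single DMC3 outlier widens that interval far more than it shifts the mean.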
Fig. 7. Boundary Delineation Errors in True Orthoimages Generated Using the DIM Mesh-Based Approach
Fig. 8. Results of Boundary Editing for True Orthoimages Generated Using the DIM Mesh-Based Approach
Table 3. Measured Relief Displacement at Building Corners Using the DIM Mesh-Based Approach (iTwin Capture Modeler)
| Image Type | Sensor | Mean Error (m) | Maximum Error (m) | Standard Deviation (m) |
|---|---|---|---|---|
| 2G | Falcon | 0.152 | 0.394 | 0.189 |
| 2G | DMC2 | 0.148 | 0.271 | 0.168 |
| 3G | DMC3 | 0.359 | 1.267 | 0.512 |
| 3G | Osprey | 0.189 | 0.515 | 0.237 |
4.2 Visual Quality Comparison by Area Type
A comparison of the 3D stereo-plotting method and the DIM mesh-structure method across
three area types—industrial, apartment residential, and detached residential zones—revealed
that visual differences between the two methods were minimal in the industrial zone,
where building forms are relatively simple. In the apartment zone, the overall level
of relief displacement correction was also satisfactory. This comparison was conducted
using Falcon imagery (second-generation) and DMC3 imagery (third-generation); Trimble
Inpho was used for the 3D stereo-plotting method, and Bentley iTwin Capture Modeler
for the DIM mesh-structure method.
In the industrial zone, DIM mesh-based results demonstrated generally adequate correction
of building rooftop relief displacement. Minor differences in boundary representation
were observed in some buildings with irregular rooftop geometries or significant height
differences between adjacent structures. In the apartment zone, no significant difference
between the two methods was found in the level of relief displacement correction for
high-rise buildings.
In contrast, in the detached residential zone—where building forms are complex and
vegetation is present nearby—the DIM mesh-based products exhibited somewhat insufficient
representation of fine building boundary details, associated with the smoothing of
boundary discontinuities during automated matching. Quantitatively, this manifested
as residual planar offsets at building edges on the order of 0.2–0.4 m, exceeding
the 0.15 m mean error observed for simpler industrial-zone buildings, and is consistent
with the elevated DMC3 mean error reported in Section 4.1. Nevertheless, the overall
level of orthorectification and spatial interpretability was maintained at a level
comparable to the conventional method. In particular, errors in 3D precision digital
elevation data attributable to complex rooftop geometries and vegetation effects were
found to cause ambiguous building boundary delineation, which was more pronounced
in cases of greater building heights or larger elevation differences with adjacent
vegetation, but correctable through software post-processing functions. Overall, the
DIM mesh-structure method demonstrated variability in output completeness depending
on area type, while proving to be a practical alternative in terms of workflow automation.
Fig. 9. Comparison of True Orthoimages: 3D Stereo-Plotting (Left) vs. DIM Mesh-Based Approach (Right): (a),(b) Industrial Zone; (c),(d) Apartment Zone; (e),(f) Detached Residential Zone
4.3 Comparison of Error Correction Methods
The methods for correcting errors in the initial true orthoimages generated using
the DIM mesh-based approach can be broadly categorized into two types: editing the
3D precision digital elevation data, and directly editing the generated true orthoimages.
The 3D precision digital elevation data editing approach involves correcting the surface
model itself using auxiliary data such as boundary line shapefiles and LiDAR, and
then regenerating the true orthoimage. This approach offers the advantage of concurrently
securing a corrected 3D model and ensuring high structural consistency; however, it
requires additional data and expert personnel and entails a substantial re-processing
burden. Positional accuracy verification can be performed based on the corrected surface
model. Representative software packages supporting this approach include nFrame SURE,
Pix4D Mapper, and Agisoft Metashape.
The direct true orthoimage editing approach involves correcting the initial output
using image editing functions, requiring fewer additional data inputs and offering
greater operational flexibility. It was found to be comparatively more efficient in
large-scale operational systems and more advantageous for rapid product correction.
In this study, the Retouching function of Bentley iTwin Capture Modeler was used to
perform editing tasks including the removal of moving objects, the refinement of building
rooftop boundary lines, and the filling of reflective or unmapped areas. In the detached
residential zone, the editing process required approximately 5 to 10 minutes per building,
substantially lower than the approximately 17 person-days per 1:5,000 map sheet required
by the 3D stereo-plotting method. To enable a like-for-like comparison, all productivity
figures were converted to a common per-sheet basis. The 24- to 28-hour automated processing
time corresponds to approximately one day of machine time per coverage block, the
bulk of which proceeds unattended. Assuming selective editing of approximately 100
priority structures (high-rise, landmark, and publicly significant buildings) per
1:5,000 sheet at the observed 5-10 min per building, the resulting human effort amounts
to roughly 1.5 to 2 person-days per sheet. The DIM mesh-based approach therefore reduces
the per-sheet human effort to approximately 3-4 person-days when accounting for both
processing supervision and selective editing, compared with 17 person-days for the
3D stereo-plotting method, representing a productivity improvement of roughly four-
to fivefold per sheet. It is therefore considered practically reasonable to apply
selective editing focused on buildings with high public significance, landmark buildings,
and high-rise structures.
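The per-sheet conversion can be retraced with a few lines of arithmetic. The sketch below assumes an 8-hour working day, which is not stated in the study, and uses the building count, editing times, and person-day totals quoted in the text; the names are illustrative.

```python
HOURS_PER_DAY = 8.0  # assumed working day; not stated in the study


def editing_person_days(n_buildings, minutes_per_building):
    """Manual editing effort per map sheet, in person-days."""
    return n_buildings * minutes_per_building / 60.0 / HOURS_PER_DAY


edit_low = editing_person_days(100, 5)    # about 1.0 person-day
edit_high = editing_person_days(100, 10)  # about 2.1 person-days

STEREO_PLOTTING_DAYS = 17.0  # per 1:5,000 sheet, from the text
DIM_TOTAL_DAYS = (3.0, 4.0)  # supervision plus selective editing, from the text

gain_high = STEREO_PLOTTING_DAYS / DIM_TOTAL_DAYS[0]  # about 5.7
gain_low = STEREO_PLOTTING_DAYS / DIM_TOTAL_DAYS[1]   # 4.25
```

Under this assumption the editing effort alone comes to roughly 1 to 2 person-days per sheet, of the same order as the figure quoted above, and the 3 to 4 person-day total corresponds to a ratio of about 4.3 to 5.7 relative to stereo-plotting.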
Fig. 10. Comparison of Error Correction Workflows: (Top) 3D Precision Digital Elevation Data Editing Approach; (Bottom) True Orthoimage Direct Editing Approach
4.4 Redefinition of Quality Inspection Items
The DIM mesh-structure method is based on automated matching and surface reconstruction
using high-overlap imagery, making it difficult to directly apply the existing quality
inspection framework. Items that were critical for conventional orthoimages—such as
the use of nadir imagery, image connectivity, color and brightness consistency, and
digital elevation model inspection—may be of relatively lower importance in the DIM-based
automated environment.
Conversely, the conformance of building boundary lines and the adequacy of occlusion
area processing emerge as core inspection items governing the quality of DIM mesh-based
products. The quality inspection framework should therefore be reconstructed not through
simple maintenance of existing items, but through selective reduction and the addition
of new items reflecting differences in the generation method. Beyond the qualitative
inclusion/exclusion judgments summarized in Table 4, the redefined inspection items also call for quantitative tolerance thresholds.
Drawing on the rooftop accuracy results obtained in this study (mean planar errors
of 0.148–0.359 m at the building scale), preliminary tolerance candidates for two
newly emphasized inspection items can be considered: (i) for boundary smoothing arising
during DIM mesh matching, a planar offset tolerance on the order of one to two times
the ground sampling distance for the relevant scale (e.g., approximately 0.10–0.25
m for 1:1,000 to 1:5,000 mapping) is plausible; and (ii) for occlusion area handling,
the proportion of unfilled or doubly mapped pixels relative to the total building
footprint area provides a measurable indicator. The exact threshold values must be
calibrated against larger samples and against the specific application requirements
of each product, and their formal specification is identified as a follow-up study
task.
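The two candidate indicators could be computed along the following lines. This is a minimal sketch under stated assumptions: the three masks are hypothetical 0/1 rasters on a common pixel grid, cropped to the neighbourhood of one building, and the tolerance factor of 2 corresponds to the upper end of the one-to-two-GSD range discussed above. The function names and the example values are illustrative, not a specification.

```python
def occlusion_defect_ratio(footprint, unfilled, double_mapped):
    """Unfilled or doubly mapped pixels relative to the footprint pixel count.

    Each argument is a 2D list of 0/1 flags on the same pixel grid.
    """
    footprint_px = sum(sum(1 for f in row if f) for row in footprint)
    defect_px = sum(
        1
        for u_row, d_row in zip(unfilled, double_mapped)
        for u, d in zip(u_row, d_row)
        if u or d
    )
    return defect_px / footprint_px if footprint_px else 0.0


def within_boundary_tolerance(offset_m, gsd_m, factor=2.0):
    """True if a measured boundary offset does not exceed factor times the GSD."""
    return offset_m <= factor * gsd_m


# Hypothetical 3 x 4 window around one building.
footprint = [[0, 1, 1, 1], [0, 1, 1, 1], [0, 0, 0, 0]]
unfilled = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
double_mapped = [[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
ratio = occlusion_defect_ratio(footprint, unfilled, double_mapped)
```

Normalizing by the footprint area keeps the indicator comparable between small and large buildings; the acceptable ratio itself would still have to be calibrated as described above.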
Table 4. Quality Inspection Item Analysis for DIM Mesh-Based True Orthoimages
| Data Type | Quality Item | Measurement Content | Quality Element | DIM Mesh Required |
|---|---|---|---|---|
| Orthoimage | Image Connectivity | Seamline | Logical Consistency - Topology | × |
| Orthoimage | Image Connectivity | Mosaic Processing | Logical Consistency - Topology | × |
| Orthoimage | Image Connectivity | Adjacent Disconnection | Logical Consistency - Topology | × |
| Orthoimage | Center Image Usage | Building Overlap | Logical Consistency - Topology | ○ |
| Orthoimage | Center Image Usage | Building Tilt | Logical Consistency - Topology | × |
| Orthoimage | 3D Structures and Feature Shape | 3D Structure Distortion | Logical Consistency - Topology | ○ |
| Orthoimage | 3D Structures and Feature Shape | Feature Distortion | Logical Consistency - Topology | ○ |
| Orthoimage | Color/Brightness Consistency | Color/Brightness Match | Logical Consistency - Concept | × |
| Orthoimage | Image Correction | Adjacent Area Break | Logical Consistency - Topology | × |
| Orthoimage | Spatial Resolution | Pixel GSD | Logical Consistency - Concept | ○ |
| Digital Elevation Model | Digital Elevation Model | Break / Step Error | Positional Accuracy - Relative | × |
| Digital Elevation Model | Digital Elevation Model | Terrain Correction Status | Temporal Accuracy - Temporal Measurement | × |
Data Type, Quality Item, and Measurement Content together form the Quality Inspection Items.
○ = Inspection Required, × = Not Required
5. Conclusions
This study evaluated the applicability of a DIM mesh-based automated true orthoimage
generation method and examined its technical validity through comparison with the
conventional 3D stereo-plotting approach. First, the DIM mesh-based method enables
automation of most processes and achieved mean planimetric accuracy levels of 0.148–0.359
m for second- and third-generation aerial cameras, indicating positional accuracy
comparable to the conventional approach from a practical application perspective.
Second, the visual quality comparison by area type confirmed that the industrial and
apartment residential zones exhibited quality comparable to the conventional approach,
while the need for additional boundary correction was identified in complex boundary
areas such as detached residential zones. This suggests that a post-processing strategy
centered on boundary refinement is critical for improving DIM mesh-based product quality.
Third, the comparison of error correction procedures found that the direct true orthoimage
editing approach was relatively more advantageous than the 3D precision digital elevation
data editing approach in terms of operational efficiency and field applicability.
Fourth, a transition to a quality inspection framework suited to the DIM mesh-based
production method is necessary, requiring partial reduction of existing inspection
items and the incorporation of building boundary conformance and occlusion area processing
adequacy as new items.
In addition, true orthoimages have significant potential as training data for the
future integration of AI-based image interpretation technologies. Generated from original
aerial imagery, true orthoimages represent terrain and surface features with reduced
geometric distortion, thereby providing reliable datasets for the training and validation
of AI-based geospatial analysis models, such as object detection, semantic segmentation,
and change detection. In this respect, true orthoimages extend beyond their conventional
role as orthorectified end products and can serve as fundamental data resources for
intelligent national geospatial information infrastructures and automated geospatial
data processing systems.
Nevertheless, this study was conducted on a limited pilot area with a restricted combination
of sensors, and the building rooftop accuracy assessment was based on 10 check points
per camera, which is sufficient for technical evaluation but limited for statistical
generalization. For broader generalization of the findings, follow-up validation incorporating
diverse terrain conditions, building densities, seasonal and illumination conditions,
and a larger and more representative check-point sample is necessary. In addition,
sensitivity analyses examining the effects of sensor characteristics, image overlap,
ground control point configuration, and post-processing methods on product quality
are required as future research tasks. Building on these foundations, the formulation
of nationwide quality criteria and operational guidelines, including quantitative
tolerance thresholds for boundary representation and occlusion handling, will be pursued
in subsequent studies.
Acknowledgments
This research was supported by the Basic Science Research Program through the National
Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2025-25399776).
References
Amhar, F., Jansa, J., and Ries, C. (1998). The generation of true orthophotos using a
3D building model in conjunction with a conventional DTM, International Archives of
Photogrammetry and Remote Sensing, 32(Part 4), 16-22.

Bittner, K., Cui, S., and Reinartz, P. (2017). Building extraction from remote sensing
data using fully convolutional networks, International Archives of the Photogrammetry,
Remote Sensing and Spatial Information Sciences, XLII-1/W1, 481-486.

Haala, N. (2013). The landscape of dense image matching algorithms, Photogrammetric
Week '13, 271-284.

Haala, N. and Rothermel, M. (2012). Dense multi-stereo matching for high quality digital
elevation models, Photogrammetrie, Fernerkundung, Geoinformation, 2012(4), 331-343.

Haala, N., Rothermel, M., and Cavegn, S. (2016). High density aerial image matching: State-of-the-art
and future prospects, Photogrammetric Week '15, 625-630.

Habib, A. F., Kim, E. M., and Kim, C. J. (2007). New methodologies for true orthophoto
generation, Photogrammetric Engineering and Remote Sensing, 73(1), 25-36.

Habib, A., Xiong, W., He, F., Yang, H. L., and Crawford, M. (2018). True orthophoto
generation from aerial frame images and LiDAR data: An update, Remote Sensing, 10(4).

Hirschmüller, H. (2008). Stereo processing by semi-global matching and mutual information,
IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 328-341.

Kerbl, B., Kopanas, G., Leimkühler, T., and Drettakis, G. (2023). 3D Gaussian splatting
for real-time radiance field rendering, ACM Transactions on Graphics, 42(4).

Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints, International
Journal of Computer Vision, 60(2), 91-110.

Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., and
Ng, R. (2020). NeRF: Representing scenes as neural radiance fields for view synthesis,
, 405-421.

National Geographic Information Institute (NGII) (2019). Research Report on the Advancement
of Standard Workflows for True Orthoimage Production.

Rothermel, M., Wenzel, K., Fritsch, D., and Haala, N. (2012). SURE: Photogrammetric
surface reconstruction from imagery, , 1-9, Berlin.

Shin, Y. H., Kim, T., and Eo, Y. D. (2021). True orthoimage generation using airborne
LiDAR data with generative adversarial network-based deep learning, Sensors, 21(4).

Westoby, M. J., Brasington, J., Glasser, N. F., Hambrey, M. J., and Reynolds, J. M.
(2012). Structure-from-Motion photogrammetry: A low-cost, effective tool for geoscience
applications, Geomorphology, 179, 300-314.
