diff --git a/PROJ-DTRK/image/separator.gif b/PROJ-DTRK/image/separator.gif
deleted file mode 100644
index 4d41cdc..0000000
Binary files a/PROJ-DTRK/image/separator.gif and /dev/null differ
diff --git a/PROJ-DTRK/index.html b/PROJ-DTRK/index.html
index c69935a..03f7ed8 100644
--- a/PROJ-DTRK/index.html
+++ b/PROJ-DTRK/index.html
@@ -355,10 +355,6 @@
In addition to the constant updates of the models, changes in the reference frame can also lead to large-scale corrections of the land register models. These global corrections are made even more complex by the federal laws that impose a high degree of correctness and accuracy.
In the context of the introduction of the new reference frame DM.flex [2] for the Swiss land register, being able to assess the applied changes on the geographical models appears as an important aspect. Indeed, changing the reference frame of the land register models is a long and complex technical process that can be error-prone. We also show in this research project how the difference detection algorithm can help to assess and verify the performed corrections.
-In this research project, the difference detection algorithm implemented in the STDL 4D framework is applied to INTERLIS data containing the official land register models of different Swiss Cantons. As introduced, two main directions are considered for the difference detection algorithm:
Through the first direction, the difference detection algorithm is presented. Considering the difference models it computes, it is shown how such models are able to extract the information lying between the compared models, emphasising their ability to represent, and then to verify, the evolution of the land register models.
The second direction focuses on demonstrating that difference models are a helpful representation of the large-scale corrections that can be applied to the land register during a reference frame modification, and on how they can be used as a tool to assess the modifications and to help fulfil the complex task of verifying the corrected models.
-For the first research direction, the land register models of the Canton of Thurgau are considered. They are selected so as to have a small temporal distance between them, allowing to focus on a small number of well-defined differences:
This first section focuses on short-term differences to show how difference models work and how they are able to represent the modifications extracted out of the two compared models. The following images give an illustration of the considered dataset, the land register models of the Canton of Thurgau:
One can see how difference models can be used to track down modifications brought to the land register in a simple manner, while keeping the information of the unchanged elements between the two compared models. This demonstrates that the information that exists between models can be extracted and represented for further users or automated processes. In addition, such difference models can be computed at any scale, from small areas up to whole countries.
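To make the idea concrete, the following minimal sketch contrasts two register snapshots by their geometries; the file names and the WKB-comparison shortcut are illustrative assumptions, not the actual implementation of the STDL 4D framework:

```python
# Minimal sketch of a geometry-based difference model between two land
# register snapshots. File paths and the idea of comparing WKB geometry
# representations are illustrative, not the framework's actual logic.
import geopandas as gpd

old = gpd.read_file("register_t0.gpkg")
new = gpd.read_file("register_t1.gpkg")

# Identify identical geometries through their binary representation.
old_keys = set(old.geometry.apply(lambda g: g.wkb))
new_keys = set(new.geometry.apply(lambda g: g.wkb))

unchanged = old_keys & new_keys
removed = old_keys - new_keys  # present at t0 only
added = new_keys - old_keys    # present at t1 only

print(f"unchanged: {len(unchanged)}, removed: {len(removed)}, added: {len(added)}")
```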
-In the previous section, the difference models were computed using two models separated by only a few days, containing only a small number of clear and simple modifications. This section focuses on detecting differences on larger models, separated by several years. In this case, the land register of the Canton of Geneva is considered:
Typically, the swimming pool register is updated either by taking building/demolition permits into account, or by manually checking its multiple records (4000+ to date) against aerial images, which is quite a long and tedious task. Exploring the opportunity of leveraging Machine Learning to help domain experts in such otherwise tedious tasks was one of the main motivations behind this study. As such, no prior requirements/expectations were set by the recipients.
The study was autonomously conducted by the STDL team, using Open Source software and Open Data published by the Canton of Geneva. Domain experts were asked for feedback only at a later stage. In the following, details are provided regarding the various steps we followed. We refer the reader to this page for a thorough description of the generic STDL Object Detection Framework.
-Several steps are required to set the stage for object detection and eventually reach the goal of obtaining - ideally - even more than decent results. Despite the linear presentation that the reader will find below, multiple back-and-forths are actually required, especially through steps 2-4.
As the ground-truth data we used turned out not to be 100% accurate, the responsibility for mismatching predictions has to be shared between ground-truth data and the predictive model, at least in some cases. In a more ideal setting, ground-truth data would be 100% accurate and differences between a given metric (precision, recall, \(F_1\) score) and 100% should be imputed to the model.
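For reference, the \(F_1\) score mentioned alongside precision \(P\) and recall \(R\) is their harmonic mean:

\[ F_1 = 2 \cdot \frac{P \cdot R}{P + R} \]

It therefore reaches 100% only when both precision and recall do.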
-All the predictions having a score \(\geq\) 5% obtained by our best model were exported to Shapefile and shared with the experts in charge of the cadastre of the Canton of Geneva, who carried out a thorough evaluation. By checking predictions against the swimming pool register as well as aerial images, it was empirically found that the threshold on the minimum score (= thr) should be set as high as 97%, in order not to have too many false positives to deal with. In spite of such a high threshold, 562 potentially new objects were detected (over 4652 objects which were known when this study started), of which:
The analysis reported in this document confirms the opportunity of using state-of-the-art Deep Learning approaches to assist experts in some of their tasks, in this case that of keeping the cadastre up to date. Not only was the opportunity explored and actually confirmed, but valuable results were also produced, leading to the detection of previously unknown objects. At the same time, our study also shows how essential domain expertise remains, despite the usage of such advanced methods.
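To illustrate the kind of score-based post-filtering described above, here is a minimal sketch; the column name "score" and the file names are assumptions, not the framework's actual output schema:

```python
# Hypothetical sketch of the post-filtering applied to the exported
# predictions: keep only detections whose score reaches the empirically
# chosen 97% threshold. Column and file names are assumptions.
import geopandas as gpd

THR_SCORE = 0.97

preds = gpd.read_file("predictions.shp")
kept = preds[preds["score"] >= THR_SCORE]
kept.to_file("predictions_thr97.shp")
print(f"{len(kept)} detections kept out of {len(preds)}")
```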
As a concluding remark, let us note that our predictive model may be further improved. In particular, it may be rendered less prone to false positives, for instance by:
diff --git a/PROJ-HETRES/images/F17A_all_VarImp.tif b/PROJ-HETRES/images/F17A_all_VarImp.tif
new file mode 100644
index 0000000..3e1988e
Binary files /dev/null and b/PROJ-HETRES/images/F17A_all_VarImp.tif differ
diff --git a/PROJ-HETRES/images/F17B_all_VarImp.tif b/PROJ-HETRES/images/F17B_all_VarImp.tif
new file mode 100644
index 0000000..ac12dd1
Binary files /dev/null and b/PROJ-HETRES/images/F17B_all_VarImp.tif differ
diff --git a/PROJ-HETRES/images/F18_sample_removal.png b/PROJ-HETRES/images/F18_sample_removal.png
new file mode 100644
index 0000000..16e6992
Binary files /dev/null and b/PROJ-HETRES/images/F18_sample_removal.png differ
diff --git a/PROJ-HETRES/images/F5_profiles.pdf b/PROJ-HETRES/images/F5_profiles.pdf
new file mode 100644
index 0000000..e21a52e
Binary files /dev/null and b/PROJ-HETRES/images/F5_profiles.pdf differ
diff --git a/PROJ-HETRES/index.html b/PROJ-HETRES/index.html
index 05f7667..307c1fa 100644
--- a/PROJ-HETRES/index.html
+++ b/PROJ-HETRES/index.html
@@ -607,9 +611,9 @@The statistical tests were performed on the original and filtered pixels.
-Two low pass filters were tested:
-- Gaussian with a sigma of 5;
-- Bilinear downsampling with scale factors of 1/3, 1/5 and 1/17, corresponding to resolutions of 9, 15 and 50 cm.
+Two low pass filters were tested:
+In the original and the filtered cases, the pixels for each GT tree were extracted from the images and sorted by class. Then, the corresponding NDVI was computed. Each pixel has 5 attributes corresponding to its value on the four bands (R, G, B, NIR) and its NDVI.
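The following sketch illustrates the two low-pass filters and the NDVI computation, assuming a (H, W, 4) array ordered R, G, B, NIR; the array itself is a placeholder:

```python
# Minimal sketch of the two tested low-pass filters and of the NDVI
# computation; the input image is a placeholder array.
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

img = np.random.rand(512, 512, 4).astype(np.float32)  # placeholder R, G, B, NIR image

# 1) Gaussian low-pass filter with sigma = 5, applied per band.
smoothed = np.stack([gaussian_filter(img[..., b], sigma=5) for b in range(4)], axis=-1)

# 2) Downsampling with a scale factor of 1/3 (likewise 1/5 or 1/17).
downsampled = zoom(img, (1 / 3, 1 / 3, 1), order=1)  # bilinear-like interpolation

def ndvi(a):
    """NDVI from the red (band 0) and NIR (band 3) channels."""
    r, nir = a[..., 0], a[..., 3]
    return (nir - r) / (nir + r + 1e-9)  # small epsilon avoids division by zero

print(ndvi(smoothed).shape, ndvi(downsampled).shape)
```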
First, the per-class boxplots of the attributes were drawn to see if the distinction between classes was possible on one or several bands or on the NDVI.
Then, a principal component analysis (PCA) was computed on the same values to see whether a linear combination of them allowed the distinction of the classes.
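A minimal sketch of this analysis, assuming the pixel attributes are gathered in a table with one row per pixel and a class column (an illustrative layout, not the project's actual data structure):

```python
# Sketch of the per-class analysis on the 5 pixel attributes
# (R, G, B, NIR, NDVI); the CSV layout is an assumption.
import pandas as pd
from sklearn.decomposition import PCA

pixels = pd.read_csv("gt_tree_pixels.csv")  # one row per pixel, plus a "class" column
features = pixels[["R", "G", "B", "NIR", "NDVI"]]

# Per-class boxplots of each attribute.
pixels.boxplot(column=["R", "G", "B", "NIR", "NDVI"], by="class")

# PCA to check whether a linear combination separates the classes.
pca = PCA(n_components=2)
components = pca.fit_transform(features)
print("explained variance ratios:", pca.explained_variance_ratio_)
```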
The results of the RF that are presented and discussed are: (1) the optimization and ablation study, (2) the ground truth analysis, (3) the predictions for the AOI and (4) the performance with downgraded data.
The introduction presents the background and the objectives of the project, and also introduces the input data and its specific features.
For machine learning, the quality of the training data has a strong influence on model performance. When labelling the training data, domain experts from the FSO selected data points that are more reliable and representative. These 348'474 tiles and their neighbors composed the training and testing datasets for the machine learning methodology.
-As suggested by domain experts, exploratory data analysis (EDA) is essential to understand the data statistics and to find potential internal patterns of class transformation. The EDA was implemented from three different perspectives: distribution, quantity and probability. Combining the three, we found that certain trends do exist in the transformation of both land cover and land use classes.
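As an illustration of the probability perspective, a class-transition matrix can be estimated empirically; the column names below are hypothetical, not the FSO dataset's actual schema:

```python
# Sketch of the "probability" perspective of the EDA: an empirical
# class-transition matrix between two survey periods. Column names
# are hypothetical.
import pandas as pd

df = pd.read_csv("tiles.csv")  # one row per tile

# Counts of transitions from the previous to the current land cover class...
counts = pd.crosstab(df["lc_previous"], df["lc_current"])

# ...normalized by row to obtain transition probabilities.
probas = counts.div(counts.sum(axis=1), axis=0)
print(probas.round(3))
```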
For the land cover, the main findings are:
@@ -707,10 +699,6 @@The distribution statistics, the quantity statistics and the probability matrices were shown to validate and complement each other during the exploratory analysis of the data.
-The developed method should be integrated into the OFS framework for change detection and classification of land use and land cover illustrated in Figure 12. The parts of interest for this project are highlighted in orange and will be presented in the following.
@@ -862,10 +850,6 @@
The Experiments section covers the results obtained when performing the planned simulations for the temporal-spatial module and the integration module.
The study demonstrates that the image level contains more information related to change detection than the temporal-spatial neighbors (FCN row in Table 5). However, a performance improvement is obtained from the temporal-spatial module when it is combined with image-level data, achieving 0.438 in the weighted metric in the end (FCN+RF and FCN+FCN).
Regarding the composition of the two modules, FCN proved to be the best choice for the temporal-spatial module, while RF and FCN have similar performance in the integration module. The choice of the integration module could be influenced by the data format of other potential modules. This will be further studied by the FSO team.
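The following hedged sketch shows what an RF-based integration module could look like, with the two modules' per-class probabilities concatenated as features; shapes and names are assumptions about the prototype, not its actual code:

```python
# Hedged sketch of an RF-based integration module: the per-class
# probabilities produced by the image-level module and by the
# temporal-spatial module are concatenated and fed to a random forest.
# Shapes and variable names are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

n, k = 1000, 4                            # tiles, classes (placeholders)
p_image = np.random.rand(n, k)            # image-level module output
p_neigh = np.random.rand(n, k)            # temporal-spatial module output
labels = np.random.randint(0, 2, size=n)  # 1 = change, 0 = no change

X = np.hstack([p_image, p_neigh])
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
change_scores = rf.predict_proba(X)[:, 1]
```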
-This project studied the potential of historical and spatial neighbor data in the change detection task for the fifth interpretation process of the areal statistics of the FSO. For the evaluation of this specific project, a weighted metric was defined by the FSO team. The temporal-spatial information proved not to be as powerful as the image-level information, which directly detects change within visual data. However, an efficient prototype was built, with a 6% improvement in the weighted metric when combining the temporal-spatial module and the image-level module. It was validated that integrating modules with different source information can help to enhance the final capacity of the entire workflow.
The next research step of the project would be to modify the current implementation of ConvRNN. If the numerical relationship is removed from the synthetic image data, ConvRNN should theoretically have a performance similar to FCN. Also, a CNN is worth trying, to validate whether the temporal pattern matters in this dataset. Besides, by changing the size of the synthetic images, we can figure out how the number of neighbour tiles impacts the model performance.
-The Statistical Office mandated the STDL to perform research on the possibility of automatically gathering the construction year by analysing the swisstopo [3] National Maps [4]. Indeed, the Swiss national maps are known for their excellence, their availability over any geographical area, and their temporal coverage. The national maps have been made with a rigorous and well-controlled methodology since the 1950s, and therefore they can be used as a reliable source of information to determine the buildings' construction year.
The STDL was then responsible for performing the research and developing a proof-of-concept providing the Statistical Office with all the information needed to take the right decision on considering national maps as a reliable way of assigning a construction year to the buildings lacking this information.
-Extracting the construction date out of the national maps is a real challenge: as the national maps are a heavy dataset, they are not easy to consider as a whole. In addition, the Statistical Office needs a demonstration that it can be done reliably and within a reasonable amount of time, to limit the cost of such a process. It is also subject to strict tolerances on the efficiency of the automated extraction of the construction years. A goal of at least 80% overall success was thus given as a constraint to the STDL.
As a result, the research specifications for the STDL were:
@@ -449,10 +445,6 @@In this research project, two datasets were considered: the building register itself and the national maps. As both datasets are heavy and complex, considering them entirely for such a research project would have been too complicated and unnecessary. It was then decided to focus on four areas selected for their representativeness of the Swiss landscape:
One can see that a large portion of the 20th century can be covered using the maps, with a very good temporal resolution of around five to six years between successive maps.
-In this research project, the main focus was put on the national maps to extract the construction year of buildings, as the maps are a source on which we can rely and against which we can assess the results. The only drawback of the maps is their limited temporal coverage, as they only start to be available in the 1950s.
This is the reason why another, experimental approach was also added to address the cases of buildings built before the 1950s. This secondary approach focused on a statistical methodology to verify to what extent it could be possible to assign a construction date even when no maps are available.
@@ -601,10 +589,6 @@In order to detect the construction year of buildings, we need to be able to track them down on the maps across the temporal coverage. The RBD provides the reference list of buildings, each coming with a federal identifier (EGID) and a position. This position can then be used to track the building down on the maps, looking for its appearance or morphological changes.
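The following sketch illustrates the tracking idea only; the building_present test and the file naming are hypothetical simplifications of the actual detection logic:

```python
# Illustrative sketch: for each building position from the RBD, probe
# the rasterized national maps in chronological order and report the
# first year where a building is present. The presence test below is
# a naive placeholder, not the actual detection logic.
import rasterio

MAP_YEARS = [1956, 1962, 1968, 1975]  # example temporal coverage

def building_present(path, x, y):
    """Naive test: sample the map raster at the EGID position."""
    with rasterio.open(path) as src:
        value = next(src.sample([(x, y)]))[0]
    return value < 128  # assumption: dark pixels encode built structures

def first_appearance(x, y):
    for year in MAP_YEARS:
        if building_present(f"map_{year}.tif", x, y):
            return year  # construction happened before or around this year
    return None
```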
As the maps and the research areas are already selected, this research approach can be summarised in the following way:
@@ -888,10 +872,6 @@As the availability of the topographic/national maps does not cover all the buildings' years of construction in the registry, an add-on was developed to infer this information whenever such an extrapolation was needed. Usually, the maps' availability reaches back to the 1950s, whilst in some cities the earliest year of construction can be in the order of the 12th century. The core of this statistical model is based on the Concentric Zones Model (Park and Burgess, 1925) [6], extended with the idea of the growth of the city from a centre (Central Business District, CBD) towards all the outer areas. The concept behind this statistical approach can be seen below using the example of a crop of the city of Basel:
This add-on allows extrapolating the predictions beyond the range of the topographical maps. Its predictions are limited, but the accuracy reached can be considered reasonable, given the considerable lack of information in this prediction range. Neither the dates in the RBD nor the topographic maps can be fully trusted; ergo, an error of 15.6 years for the older buildings is acceptable, especially considering the relatively small spread of the error distribution. If a suggestion for improvement were to be given, a method for smoothing the intYEARpolator predictions could be interesting. This would possibly shift the error distribution closer to a Gaussian with mean zero. The danger found when searching for such an approach is that the year of construction of buildings does not seem to present a smooth surface, despite the spatial dependence. Hence, if this were to be considered, a balance between smoothing and variability would need to be found.
We also demonstrated a completely different perspective on how the spatial and temporal dimensions can be joined, as the random variable predicted through the spatial methodology was actually time. Therefore, a strong demonstration of the importance of time in spatially related models and approaches was also given. The code for the intYEARpolator was developed in Python and it runs smoothly even with this quite big amount of data. The singular case where it can be quite time-demanding is that of a high proportion of prediction points (missing values). It should also be reproducible over the whole of Switzerland with no need for modification. A conditional element is the use of concentric zones, which can be excluded in case of a totally different pattern of city growth.
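As a rough illustration of the concentric-zones idea, a missing construction year could be estimated from the known years of the ring the building falls in; the ring width, the CBD coordinates and the median estimator below are assumptions, not the intYEARpolator's actual parameters:

```python
# Minimal sketch of the concentric-zones idea: buildings are binned
# into rings around the CBD and a missing construction year is
# estimated from the known years of its ring. All parameters are
# illustrative assumptions.
import numpy as np

def ring_index(x, y, cbd=(0.0, 0.0), ring_width=500.0):
    """Ring number of a building, from its distance to the CBD (in m)."""
    return int(np.hypot(x - cbd[0], y - cbd[1]) // ring_width)

def estimate_year(x, y, known):
    """known: list of ((x, y), year) pairs for buildings with a known year."""
    ring = ring_index(x, y)
    years = [yr for (bx, by), yr in known if ring_index(bx, by) == ring]
    return float(np.median(years)) if years else None
```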
-The source code of the proof-of-concept for national maps can be found here:
Switzerland's direct payment system is the basis for sustainable, market-oriented agriculture. The federal government supports local farms through various types of contributions, enabling farming families to earn an adequate income (cf. Art. 104 BV).
@@ -456,10 +452,6 @@Silage bale stacks are clearly visible on the newest 2019 layer of the 10 cm SWISSIMAGE orthophoto provided by swisstopo. A few hundred of these stacks were manually digitized as vector polygons with QGIS in a semi-automatic approach.
@@ -527,10 +519,6 @@
@@ -615,10 +603,6 @@
The contact person at the agricultural office, Mr. T. Froehlich describes the detections as very accurate with a very low percentage of wrong detections. As a GIS product the detections layer can be used in the standard workflow in order to cross-check base datasets or to perform updates and corrections.
@@ -626,10 +610,6 @@Most farmers adhere to the policies, and false declarations of areas, followed by sanctions, are extremely rare. Silage bales are therefore not the first priority when monitoring the advancements and updates concerning the LN layer. Nevertheless, these new detections allow the end users at the agricultural office to direct their eyes more quickly at relevant hotspots and spare them some aspects of the long and tedious manual search that was performed in the past.
Silage bales are by far not the only objects limiting the extent of the cultivable subsidized land. A much larger area is consumed by farm yards – heterogeneous spaces around the central farm buildings. Monitoring the growth of these spaces into the LN layer would greatly diminish the manual workload at the agricultural office. As these spaces might also be detectable by a similar approach, this project will continue to investigate the potential of the STDL Object Detection Framework in this direction.
-Abstract: The Canton of Thurgau entrusted the STDL with the task of producing swimming pool detections over the cantonal area. Of specific interest was leveraging the ground truth annotation data from the Canton of Geneva to generate a predictive model for Thurgau while using the publicly available SWISSIMAGE aerial imagery datasets provided by swisstopo. The STDL object detection framework produced highly accurate predictions of swimming pools in Thurgau and thereby proved transferability from one canton to another without having to manually redigitize annotations. These promising detections showcase the highly useful potential of this approach by greatly reducing the need for repetitive manual labour.
-Until February 2021, the Swiss Territorial Data Lab developed an approach based on the Mask RCNN Deep Learning algorithm for the detection of objects on aerial images, with swimming pools serving as a demonstration object. The official cadastres of the Canton of Thurgau include – among many other objects – the registration of larger private swimming pools that are permanently anchored in the ground.
The challenge is to keep the cadastre up to date on a regular basis, which is usually done manually by surveying or by verification against aerial imagery. Because the Canton of Thurgau (unlike the Canton of Geneva) does not maintain its own specific register of swimming pools, this study primarily serves as a technology demonstration.
A secondary goal encompasses detailed knowledge transfer from the data scientist team at the STDL to the cantonal authorities such as providing insight and interpretation guidance into the performance metrics and raising awareness for the prerequisites of the detector framework.
@@ -523,10 +515,6 @@
After training, the entire “Prediction AoI” as well as the ground truth datasets were presented tile by tile to the final model for prediction generation. From a minimum confidence threshold up to 100%, the model produces a segmentation mask for each swimming pool detection, delimiting its proposed outer boundary. This boundary can be vectorized and transformed back from image space into map coordinates during post-processing. Through this process we can accumulate a consistent GIS-compatible vector layer for visualization, counting and further analysis.
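A minimal sketch of this vectorization step, using a synthetic mask and an LV95-like affine transform as placeholders:

```python
# Sketch of the post-processing step described above: vectorize a binary
# segmentation mask and map the pixel-space polygons back into map
# coordinates through the tile's affine transform. The mask, the origin
# and the 10 cm pixel size are placeholder values.
import numpy as np
from rasterio import features
from rasterio.transform import from_origin
from shapely.geometry import shape

mask = np.zeros((256, 256), dtype="uint8")
mask[50:80, 60:100] = 1  # fake swimming pool detection

transform = from_origin(2700000.0, 1260000.0, 0.1, 0.1)  # LV95-like origin, 10 cm pixels

polygons = [
    shape(geom)
    for geom, value in features.shapes(mask, transform=transform)
    if value == 1
]
print(polygons[0].bounds)
```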
In the case of the ground truth data, the resulting vector layer can be intersected with the original input data (especially the “Test Subset”) to obtain unbiased model performance metrics. In the case of a well-performing model, the resulting vector layer can then be intersected with the “Prediction AoI”-derived Thurgau dataset to identify missing or surplus swimming pools in the cadastre.
@@ -579,10 +567,6 @@
In the city of Frauenfeld, a sample district was chosen for manual evaluation by a STDL data scientist. Even though this task should ideally be performed by a local expert, this analysis does provide some insight into the potential errors currently existing within the cadastre as well as into the object detection quality. Within the sampled area, a total of 99 identifiable swimming pool objects were found to be present.
@@ -596,10 +580,6 @@We can conclude that the use of annotation data gathered in another canton of Switzerland allows for highly accurate predictions in Thurgau using the freely and publicly available SWISSIMAGE dataset. We demonstrate that such a transferable approach can therefore be applied within a relatively short time span to other cantons without the effort of manually digitizing objects in a new area. This is supported by the assumption that SWISSIMAGE is of the same consistent radiometric and spatial quality over the whole country as we see in Thurgau.
Manual evaluation will stay paramount before authorities, for example, take legal action or perform updates and changes to the cadastre. Nevertheless, a great amount of workload reduction can be achieved by redirecting the eyes of the experts to the detected or undetected areas that are worth looking at.
-Abstract: Trees are essential assets, in urban contexts among others. For several years, the Canton of Geneva has maintained a digital inventory of isolated (or "urban") trees. This project aimed at designing a methodology to automatically update Geneva's tree inventory, using high-density LiDAR data and off-the-shelf software. Eventually, only the sub-task of detecting and geolocating trees was explored. Comparisons against ground truth data show that the task can be more or less tricky depending on how sparse or dense trees are. In mixed contexts, we managed to reach an accuracy of around 60%, which unfortunately is not high enough to foresee a fully unsupervised process. Still, as discussed in the concluding section, there may be room for improvement.
-Human societies benefit from the presence of trees in cities and their surroundings. More specifically, as far as urban contexts are concerned, trees deliver many ecosystem services such as:
@@ -879,10 +875,6 @@We refer the reader to the official documentation for further information.
-As already stated, in spite of the thorough and ambitious objectives of this project (cf. here), only the sub-task of detecting and geolocating trees was explored.
Figure 3.1 shows some of the tree detection trials we performed, using Terrascan and DFT. Each trial corresponds to a different set of parameters and is represented either by gray dots or colored diamonds in a precision-recall plot (see the image caption for further details).
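For reference, each trial's point in such a precision-recall plot can be derived from the counts of matched detections; the counts below are placeholders, not the actual trial results:

```python
# Sketch of how each trial's point in the precision-recall plot is
# obtained from counts of true positives (tp), false positives (fp)
# and false negatives (fn); the counts are placeholders.
import matplotlib.pyplot as plt

def precision_recall(tp, fp, fn):
    return tp / (tp + fp), tp / (tp + fn)

# One (tp, fp, fn) triple per parameter set, e.g. from three trials.
trials = [(610, 350, 420), (540, 210, 490), (660, 480, 370)]
points = [precision_recall(*t) for t in trials]

plt.scatter([p for p, r in points], [r for p, r in points])
plt.xlabel("precision")
plt.ylabel("recall")
plt.show()
```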
@@ -1697,10 +1685,6 @@
Despite all the efforts documented here above, the results we obtained are not as satisfactory as expected. Indeed, the metrics we managed to attain, all sectors combined, indicate that tree detections are neither reliable (low precision) nor exhaustive (low recall). Still, we think that the results may be improved by further developing some ideas, which we sketch in the following.
The work documented here was the object of a Forum SITG which took place online on March 29, 2022. Videos and presentation materials can be found here.
-This project was made possible thanks to a tight collaboration between the STDL team and some experts of the Canton of Neuchâtel (NE), the Canton of Geneva (GE), the Conservatoire et Jardin botaniques de la Ville de Genève (CJBG) and the University of Geneva (UNIGE). The STDL team acknowledges key contributions from Marc Riedo (SITN, NE), Bertrand Favre (OCAN, GE), Nicolas Wyler (CJBG) and Gregory Giuliani (UNIGE). We also wish to warmly thank Matthew Parkan for developing, maintaining and advising us on the Digital Forestry Toolbox.
diff --git a/PROJ-TREEDET/resources/LAS_Preprocess_2021_cleaning.fmw b/PROJ-TREEDET/resources/LAS_Preprocess_2021_cleaning.fmw
index fc604df..429187a 100644
--- a/PROJ-TREEDET/resources/LAS_Preprocess_2021_cleaning.fmw
+++ b/PROJ-TREEDET/resources/LAS_Preprocess_2021_cleaning.fmw
@@ -1,2479 +1,2479 @@
-#!
-#!