How do you deal with conflation of datasets?
– Derek, GIS technician at a gas utility
Questions answered by Evan Bollard, GlobalPOS
For the purposes of this column I will try to answer this from a GNSS perspective rather than a GIS one.
Data conflation in a GIS sense is handled in many different ways, and a large amount of research has gone into combining datasets of varying quality and levels of information. I would like to address it in connection with GNSS and field data collection. With the advances in portable hardware, it is now practical to take very large datasets into the field. The limitations once imposed by low processor speeds and short battery life have largely been removed. This means that data conflation can be addressed at the point of data collection, as one method of producing merged datasets.
Field data collection is quite labour intensive anyway, so it makes sense for the data collected to act as a real-time update of the formal existing database. Details of observational uncertainty can be stored automatically, based on information entered by the operator and on the type of instrument being used. For example, using cloud-based connections it is possible to directly update the main database held on the server at head office. This allows the data to be corrected where new information or higher accuracy is available at the point of observation. This can be preferable to collecting a new dataset that later has to be merged with the existing dataset or datasets, which is itself labour intensive and costly.
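As a minimal sketch of how uncertainty details might be attached automatically at the point of observation, the Python below tags each observation with a nominal uncertainty looked up from the instrument type. The instrument names, the uncertainty values, and the `Observation` and `tag_observation` names are all assumptions made for illustration; real figures would come from the equipment specifications and the organisation's own standards.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical nominal horizontal uncertainties in metres, for illustration only.
NOMINAL_UNCERTAINTY_M = {
    "rtk_gnss": 0.03,
    "dgnss": 0.5,
    "handheld_gnss": 5.0,
}


@dataclass(frozen=True)
class Observation:
    feature_id: str
    easting: float
    northing: float
    uncertainty_m: float
    instrument: str
    operator: str
    observed_at: datetime


def tag_observation(
    feature_id: str,
    easting: float,
    northing: float,
    instrument: str,
    operator: str,
    observed_at: Optional[datetime] = None,
) -> Observation:
    """Build an observation record with its uncertainty filled in automatically."""
    if instrument not in NOMINAL_UNCERTAINTY_M:
        # Refuse to store a position whose quality cannot be stated.
        raise ValueError(f"unknown instrument type: {instrument!r}")
    return Observation(
        feature_id=feature_id,
        easting=easting,
        northing=northing,
        uncertainty_m=NOMINAL_UNCERTAINTY_M[instrument],
        instrument=instrument,
        operator=operator,
        observed_at=observed_at or datetime.now(timezone.utc),
    )
```

Rejecting unknown instrument types, rather than storing a default value, is a design choice in this sketch: it keeps untagged coordinates out of the main database.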
One advantage of this field data collection methodology is the ability to recognise where additional field information is required. The operator can immediately determine whether further data needs to be collected, either to improve positional accuracy or to identify or describe features and their condition. This can significantly reduce the costs that would otherwise be incurred by revisiting sites to revise data after discrepancies were found in the office while trying to resolve a conflation issue.
If the correct structure is implemented within the database, it is also possible to ensure that new, less accurate data does not overwrite older but more accurate data as collection occurs. The concept is that there is one dataset, not several, and that it can be updated in real time. This is not necessarily easy, because of structural and operational requirements, but it is one way to deal with data conflation.
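The overwrite rule described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the single dataset is modelled as an in-memory dictionary keyed by feature identifier, each position carries an `uncertainty_m` value in metres, and the names `Position` and `apply_update` are hypothetical. A production database would enforce the same rule through its own schema and update logic.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Position:
    easting: float
    northing: float
    uncertainty_m: float  # smaller means more accurate


def apply_update(store: Dict[str, Position], feature_id: str, new: Position) -> bool:
    """Write `new` into the single dataset unless it is worse than what is held.

    Returns True if the stored position was created or replaced.
    """
    current = store.get(feature_id)
    if current is not None and new.uncertainty_m > current.uncertainty_m:
        # New data is less accurate than the existing record: keep the old one.
        return False
    store[feature_id] = new
    return True


# Usage: a survey-grade position survives a later handheld observation.
dataset: Dict[str, Position] = {}
apply_update(dataset, "valve-17", Position(500000.00, 6000000.00, 0.03))
accepted = apply_update(dataset, "valve-17", Position(500002.10, 5999998.70, 5.0))
```

In this sketch a new observation of equal uncertainty replaces the old one, on the assumption that the more recent visit reflects the current state of the asset; an organisation could equally choose to keep the original.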
This brings us to positional uncertainty in the coordinate results collected by GNSS. In the past, various methods have been used to specify the accuracy of surveys and coordinates. Surveyors typically used a class and order structure to indicate the quality of results obtained from observations in control surveys. Positional uncertainty is now being promoted for more widespread use, especially in GIS, so that coordinates are at a minimum tagged with a quality indicator that may allow relative positioning between disparate datasets to be performed more reasonably.
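To illustrate how such a quality tag could support relative positioning between datasets, the Python sketch below combines the uncertainties of two points in quadrature. It assumes that the errors in the two datasets are independent and that both uncertainties are expressed in metres at the same confidence level; the function names are hypothetical and this is not a substitute for a formal survey standard.

```python
import math


def relative_uncertainty(pu_a_m: float, pu_b_m: float) -> float:
    """Combined uncertainty of the vector between two independently positioned points."""
    if pu_a_m < 0 or pu_b_m < 0:
        raise ValueError("uncertainties must be non-negative")
    # Root-sum-of-squares, valid only if the two errors are independent.
    return math.hypot(pu_a_m, pu_b_m)


def offset_is_significant(distance_m: float, pu_a_m: float, pu_b_m: float) -> bool:
    """True if two positions differ by more than their combined uncertainty."""
    return distance_m > relative_uncertainty(pu_a_m, pu_b_m)


# Usage: a 2 m mismatch between a survey-grade point and a handheld GNSS point
# is within the combined uncertainty, so it may be the same feature.
same_feature_possible = not offset_is_significant(2.0, 0.03, 5.0)
```

A check like this gives a first indication of whether two records from different datasets could describe the same feature, or whether the discrepancy is large enough to warrant another look in the field.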
As part of the overall progress towards improving datasets and minimising conflation in a GNSS sense, considerable effort has gone into developing an Australian Strategic Plan for GNSS, published by the Australian Spatial Consortium and available on the CRCSI website (www.crcsi.com.au). It outlines a more unified approach to the progress of GNSS use within Australia, with a possible direction to follow. Although it was published in 2012, it is probably worth another look, either to follow this path further or to adapt its course, so that forward momentum is maintained and opportunities are not lost.