How did our artificial “brain” work?
To put it simply, our task was to “teach” a computer programme to recognise similar formations in other images, based on existing satellite and aerial (orthophoto) images of dump sites. This approach, called supervised machine learning, first required data on the geographic locations of known dump sites, together with images of those locations. We also needed the most recent satellite images of the Lim River watercourse in Serbia, to which we would apply the trained model in an attempt to discover the locations of illegal dump sites.
For the purpose of this experiment, we downloaded data on 150 legal and 3350 illegal dump sites in Serbia from the Environmental Protection Agency (SEPA). Thanks to the Ministry of Environmental Protection, we also received images from the Republic Geodetic Authority covering the period from 2007 through 2013, when the largest number of data points on dump sites was collected. From these, we selected almost 500 high-quality images of confirmed dump sites, which we used to train the artificial intelligence model.
When the model was ready, we tried to use it to identify dump sites on suitable satellite images of the Lim River watercourse in Serbia. At the time of the experiment, the most recent cloudless images were those taken by the Pléiades satellites in the second half of 2019, at a resolution of 50 cm (one pixel in the image represents 50 cm on the ground).
Our model detected 1443 areas on these images that could be dump sites. Experts engaged by ESA visually verified these areas and narrowed them down to 61 potential dump site locations. That is a significant increase compared to the 23 that SEPA recorded 7-8 years earlier. We shared these data with officials of the Ministry of Environmental Protection and the Municipality of Priboj, the biggest settlement in the Serbian part of the Lim River basin, so that the locations can be checked in the field in the coming period as part of broader efforts to resolve pollution problems in the basin.
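To put these figures in proportion, the short calculation below works out what share of the model's detections survived the experts' visual check. It uses only the numbers quoted above; the variable names are our own, and the result should not be read as a formal accuracy score, since the 61 locations still await confirmation in the field.

```python
# Figures quoted in the text above.
candidates_flagged = 1443   # areas the model marked as possible dump sites
retained_by_experts = 61    # areas kept after visual verification by experts
previously_recorded = 23    # dump sites SEPA had recorded for the same basin

# Share of the model's detections that survived expert review.
retention_rate = retained_by_experts / candidates_flagged

# Detections the experts set aside.
rejected = candidates_flagged - retained_by_experts

print(f"retained by experts: {retention_rate:.1%}")        # 4.2%
print(f"set aside by experts: {rejected}")                 # 1382
print(f"retained vs SEPA records: "
      f"{retained_by_experts / previously_recorded:.2f}x")  # 2.65x
```

In other words, roughly one detection in twenty-four was considered plausible on closer inspection.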
What have we learned from the experiment?
Since our AI model did not achieve a high level of accuracy in automatic dump site detection, we were unable to confirm our initial assumption. As a reminder, the assumption was that AI could be applied to satellite imagery for fast, automated mapping of dump sites.
What was missing? Supervised machine learning models require a large amount of input data. We had only a very small number of confirmed dump sites in the Lim River basin at our disposal (just 23 in the SEPA data set), so we had to train the model on examples of confirmed dump sites from other parts of Serbia. However, it turned out that even more than 400 additional images of confirmed dump sites were not a sufficiently large sample. The reason is that, across a large geographical area, dump sites are relatively rare and small. In addition, they look similar to other features such as bare land, piles of bulk material on construction sites or certain types of goods in warehouses.
To increase the accuracy of the model, it would have to be trained on a larger number of images of confirmed dump sites, using images of much higher resolution: at least 30 cm, and preferably 20 cm or finer. The accuracy could also be improved by applying additional techniques. For example, based on the assumption that illegal dump sites appear near transport corridors, we could take data on the network of streets and local and regional roads in Serbia, and exclude from the search all areas that are, for instance, more than a few tens of metres from any road.
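The road-proximity idea can be illustrated with a minimal sketch. This is not part of the experiment, only one way such a filter might look. It assumes that candidate locations and road lines are given in a projected coordinate system measured in metres, and the 50 m threshold, the function names and the sample coordinates are all hypothetical.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b (all in metres)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def near_any_road(point, roads, max_distance_m):
    """True if point lies within max_distance_m of any road polyline."""
    for road in roads:
        for a, b in zip(road, road[1:]):
            if distance_to_segment(point, a, b) <= max_distance_m:
                return True
    return False

def filter_candidates(candidates, roads, max_distance_m=50.0):
    """Keep only the candidates that lie close enough to a road."""
    return [c for c in candidates if near_any_road(c, roads, max_distance_m)]

# Hypothetical data: one road running east and then north, three candidates.
roads = [[(0.0, 0.0), (1000.0, 0.0), (1000.0, 500.0)]]
candidates = [(200.0, 30.0), (500.0, 400.0), (1040.0, 250.0)]

kept = filter_candidates(candidates, roads)
print(kept)  # [(200.0, 30.0), (1040.0, 250.0)]
```

The middle candidate lies 400 m from the nearest stretch of road and is dropped. A real road network would call for a spatial index or a GIS library rather than this brute-force loop.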
Although meeting these requirements is not impossible, it would require significantly more resources. Higher-resolution satellite or aerial imagery is still quite expensive and would demand far more storage space and computer processing power, so the cost-effectiveness of such an approach is open to question, at least for now.
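A rough calculation shows why finer resolution is so demanding: halving the pixel size quadruples the number of pixels needed to cover the same ground. The sketch below counts pixels only. Actual storage also depends on the number of spectral bands, bit depth and compression, none of which are covered here, and the function name is our own.

```python
def pixels_per_km2(resolution_cm):
    """Number of pixels needed to cover one square kilometre."""
    pixels_per_side = 100_000 / resolution_cm  # 1 km = 100,000 cm
    return pixels_per_side ** 2

baseline = pixels_per_km2(50)  # resolution used in the experiment

for res in (50, 30, 20):
    n = pixels_per_km2(res)
    print(f"{res} cm: {n / 1e6:.1f} million pixels per km2, "
          f"{n / baseline:.2f}x the 50 cm volume")
# 50 cm: 4.0 million pixels per km2, 1.00x the 50 cm volume
# 30 cm: 11.1 million pixels per km2, 2.78x the 50 cm volume
# 20 cm: 25.0 million pixels per km2, 6.25x the 50 cm volume
```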
The encouraging thing is that the trends are in our favour: in recent years we have witnessed very rapid development of space and cloud technologies, as well as of artificial intelligence. Satellite launches and increasingly powerful computing resources are becoming more accessible, so it is quite possible that, in the foreseeable future, conditions will allow us to repeat this experiment with a much more precise result.
We also hope that our example of the application of machine learning technologies to solve environmental problems will inspire others to explore, and thus contribute to faster improvement of the method and its use on a wider scale.
A detailed technical report on the experiments is available on the website of the European Space Agency.
This blog was originally published on Business Intelligence Review.