Food for thought when feeding machine learning algorithms for vegetation management

Machine learning and artificial intelligence have strong potential in vegetation management, but there is a lot that can go wrong.

My last blog post discussed the opportunity for transmission and distribution utilities to use artificial intelligence (AI) to improve the efficiency and accuracy of vegetation management (VM) programs. The end goal of analytics-based VM is the same as that of other predictive and condition-based maintenance programs, such as those for a gas-fired turbine, but the analytics are very different. Predictive maintenance on a gas turbine relies on the analysis of accurate time series data from the many sensors installed at the generation plant, all neatly structured and appended with rich metadata. The machine learning algorithm scrutinizes each stream of data for anomalies and alerts maintenance teams each time certain parameters are breached. Through a feedback loop, the algorithm improves over time as false positives are identified, which helps maximize asset availability.
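To make the turbine case concrete, here is a minimal sketch of that alert-and-feedback pattern. It is not any vendor's method: the function names, the vibration readings, the starting limit, and the 5% margin are all hypothetical, and a real system would use far richer models than a single threshold.

```python
def find_breaches(readings, limit):
    """Return the indices of readings that exceed the alert limit."""
    return [i for i, value in enumerate(readings) if value > limit]


def refine_limit(limit, false_alarm_values, margin=0.05):
    """Raise the limit just above the largest reading confirmed as a false alarm.

    Assumption: maintenance teams report which alerts were false positives.
    """
    if not false_alarm_values:
        return limit
    return max(limit, max(false_alarm_values) * (1 + margin))


# Hypothetical vibration readings from one sensor (arbitrary units).
vibration = [2.1, 2.3, 2.2, 4.8, 2.4, 3.1, 6.0]

first_pass = find_breaches(vibration, limit=3.0)          # alerts at 4.8, 3.1, 6.0
new_limit = refine_limit(3.0, false_alarm_values=[3.1])   # 3.1 was a false alarm
second_pass = find_breaches(vibration, limit=new_limit)   # 3.1 no longer alerts

print(first_pass, second_pass)  # [3, 5, 6] [3, 6]
```

The point is only that each data point has a well-defined parameter to test against, which is exactly what VM image data lacks.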

AI-based VM presents a unique problem

VM is different. It relies on video and still imaging of a utility right-of-way, along with field records, inspection reports, and a raft of data from external sources: microclimate, land use, weather, and more. These datasets are highly complex, and the data is also collected less frequently, with many months or years between data points. The decisions AI must make are also more complex for VM than for a generation asset. An AI algorithm monitoring a generation asset will have defined parameters for each data point. For example, to provide the insight that “vibration in a wind turbine’s nacelle indicates an outage will occur in the next 5 days,” the algorithm will start life with a set of parameters, which will, over time, be refined to improve the accuracy of outage prediction and strip out false positives. It is more difficult to define the parameters for predictive VM, where algorithms will rely on more subjective interpretation of image data.

The accuracy of VM AI will rely on the quality of its image captioning: the ability to accurately annotate each image with a description of what is occurring in it. The more detailed the caption, the better a utility can optimize its VM program. For example, the following captions provide increasing value to a utility:

This is a tree
This tree is an ash
This tree is an ash and it is 7 meters from a power line
This tree is an ash and given its predicted growth, this tree will touch a transmission line in 3 months’ time
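The jump from the third caption to the fourth is essentially arithmetic on top of recognition. The sketch below illustrates this under a loud assumption: growth toward the line is treated as a constant rate, which real trees do not follow. The `TreeObservation` fields and the example numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeObservation:
    species: Optional[str] = None
    clearance_m: Optional[float] = None         # distance to the line, in meters
    growth_m_per_month: Optional[float] = None  # assumed constant growth rate


def months_to_contact(obs: TreeObservation) -> Optional[float]:
    """Estimate months until contact, assuming straight-line growth."""
    if obs.clearance_m is None or obs.growth_m_per_month is None:
        return None
    if obs.growth_m_per_month <= 0:
        return None
    return obs.clearance_m / obs.growth_m_per_month


def caption(obs: TreeObservation) -> str:
    """Build the most detailed caption the available data supports."""
    if obs.species is None:
        return "This is a tree"
    article = "an" if obs.species[0].lower() in "aeiou" else "a"
    text = f"This tree is {article} {obs.species}"
    if obs.clearance_m is None:
        return text
    months = months_to_contact(obs)
    if months is None:
        return f"{text} and it is {obs.clearance_m:g} meters from a power line"
    return f"{text} and, at its predicted growth, it will touch a line in about {months:.0f} months"


print(caption(TreeObservation("ash", clearance_m=1.5, growth_m_per_month=0.5)))
```

Each extra field in the observation is something the image pipeline has to infer correctly before the more valuable caption becomes possible.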

Utilities must recognize and manage the issue of bias in learning data

The quality of image recognition, and of image captioning in particular, relies heavily on high-quality, unbiased training data. Bias is receiving a lot of attention in the field of AI because of the significant negative impact it can have on the outcome of AI projects. For example, Amazon’s AI recruitment tool drew wide attention in 2018 when it was reported to favor male candidates over female ones.

Of more relevance to VM and image captioning is the (frankly horrific) Norman AI built by the Massachusetts Institute of Technology (MIT). Researchers at MIT intentionally built the world’s first AI-powered psychopath to demonstrate the need for a high-quality, objective set of learning data for image captioning. The Norman AI (named after the fictional owner of the Bates Motel in the film Psycho) was fed images from the darkest parts of the web and was then given the famous Rorschach inkblot test. Norman’s results were then compared with those of a standard image-captioning AI trained on a general-purpose image dataset. What the regular AI recognized as “a close up of a vase with flowers,” Norman interpreted as “a man is shot dead.”

Herein lies the problem: AI can only know what it knows already. A VM algorithm must be able to identify different genera of trees, their likely growth rates given local conditions, what a diseased tree looks like, how that disease will affect the likelihood of the tree falling on or near a line, and so on. Much of this interpretation is subjective, giving rise to significant potential for biased inputs that affect analytic outcomes. Utilities must recognize these risks at the start of a program and closely manage them throughout.
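One simple, partial check a utility could run at the start of a program is whether its labeled training images cover the genera it cares about. The sketch below is illustrative only: the label list, the genus names, and the 10% cutoff are hypothetical, and a balanced label count does not by itself rule out subjective or biased labeling.

```python
from collections import Counter


def underrepresented(labels, min_share=0.10):
    """Return genera whose share of the labeled images falls below min_share."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sorted(genus for genus, n in counts.items() if n / total < min_share)


# Hypothetical training set: one label per captioned image.
training_labels = ["ash"] * 70 + ["oak"] * 25 + ["willow"] * 5

print(underrepresented(training_labels))  # ['willow']
```

A model trained on this set has seen very few willows, so its captions for willows deserve less trust until more examples are collected.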

Stuart Ravens is a principal research analyst contributing to Navigant Research’s Digital Transformation service. Ravens has been an analyst for more than 20 years. For the past 10 years, his work has focused on the use of technology by utilities. He has played a lead role in the delivery of custom research and advisory work for many utilities, IT vendors, service providers, and renewables specialists.