Last month at the Utility Analytics Summit, four data scientists provided examples and realistic perspective on how they’ve optimized model-building for their internal customers at Southern California Edison (SCE).
Two key aspects of model-building emphasized by the SCE data scientists were:
- The need for data scientists to ensure their internal customers trust the advanced analytics models being employed;
- The need for data scientists to educate their internal customers about the scope and range of ways of using these models.
Herodotus’ King and the Oracle
A story told by Herodotus, often called the father of history, came to mind after learning about the work of these four Southern California Edison data scientists, because of an interesting parallel between modern analytical models and the oracles consulted for insight in ancient times. Just as SCE’s data scientists described how their internal customers need to trust the advanced analytics models they rely upon, and to understand the limitations and scope of those models, Herodotus described how an ancient king needed to trust the oracle he was going to rely upon. The story also offers similar lessons about the importance of understanding the limitations and scope of the model or oracle being relied upon.
A king devised a clever test to ensure he had a trustworthy oracle: on a specific day, he sent emissaries out to test the major oracles of the ancient world. He instructed them to count 100 days from their day of departure and, on that 100th day, to each ask their respective oracle the same question: what precisely was the king doing on that day? On that day, secretly, the king prepared a strange food combination in a brass pot with a brass lid.
According to Herodotus, the Oracle at Delphi gave the following, correct, answer:
I can count the sands, and I can measure the ocean;
I have ears for the silent, and know what the dumb man meaneth;
Lo! on my sense there striketh the smell of a shell-covered tortoise,
Boiling now on a fire, with the flesh of a lamb, in a cauldron,
Brass is the vessel below, and brass the cover above it.
Over the years, the king relied upon the Oracle at Delphi, and he won many battles. He became increasingly confident, even as the oracle’s predictions grew more ambiguous. When the oracle told him he would “destroy an empire,” the king remained confident that he would continue to win battles.
In the end, after building a great kingdom, the king lost the portion of his empire where he had expanded too quickly. The lesson for the king was that he should have interpreted the nuances in the oracle’s messages more carefully, and asked the oracle better questions. The oracle was right that the king would destroy empires, but one of the empires he destroyed was not an enemy’s but one of his own.
Bridging the gap between analytics and business
The presentation by Southern California Edison at the Utility Analytics Summit last month was titled “Bridging the Gap Between Analytics and Business.” During the session, the first of the four presenters, Dr. Alejandro Komai, described a common theme across their recent successful applications of Machine Learning (ML), namely “to find insights by building models.” Applications have ranged from classic ones, such as using ML-based analysis of equipment failure data to build a better predictive failure model, to less straightforward applications that yielded a range of valuable insights.
“When building models, a key goal for us as data scientists,” stated Dr. Komai, “is to get decision-makers to ask the right questions of us.” Dr. Komai added that part of this process has involved educating internal customers about Machine Learning and related models, and, conversely, data scientists learning everything they can about the business processes and associated data. “Decision-makers’ awareness of the Machine Learning toolkit, and data scientists’ awareness of the full context for the business processes and data, leads to more value,” Dr. Komai added, “because it enables us to better prepare the data for the model.”
Komai described the above phase as “the first collaborative loop” at the top of the diagram below:
Dr. Komai and his colleagues discussed the importance of focusing on testable hypotheses when developing models. Business understanding must be translated into a deeper understanding of the underlying data. As a result, it is essential for data scientists to engage with business owners in enough detail to ensure that the resulting model’s output is sensible and actionable rather than abstract. The presenters demonstrated the importance of using analytics and collaboration with business owners to go beyond the “gut feel” or intuitive rules of the past.
Specific case studies were then presented by Dr. Eric X. Wang, who modeled failure mechanisms for cable and conductors, and Ms. Sophie Lellis-Petrie, who gave several examples of feature importance trees and related methods of categorical data analysis, including an example of deep analysis of text responses from customer satisfaction surveys. In these and other case study areas, a commonality was the presence of multiple kinds of data, such as continuous data (e.g. thickness of cable, or amount of remaining insulation) versus categorical data, as well as the importance of having models capture interactions resulting from simultaneous operation of multiple influencing factors.
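To make the idea of an interaction between a continuous factor and a categorical factor concrete, here is a minimal Python sketch. It is not SCE’s method or data: the records, the field names (`insulation_mm`, `environment`), the 3.0 mm threshold, and both helper functions are hypothetical, invented purely for illustration. It bins the continuous measurement, tabulates failure rates for each combination with the categorical factor, and checks whether the effect of thin insulation differs between settings.

```python
from collections import defaultdict

# Hypothetical, hand-made records: (insulation_mm, environment, failed).
# These values are invented for illustration and are not SCE data.
RECORDS = [
    (2.0, "dry", False), (2.1, "dry", False), (2.2, "dry", True), (2.4, "dry", False),
    (2.0, "wet", True), (2.1, "wet", True), (2.3, "wet", True), (2.4, "wet", False),
    (4.0, "dry", False), (4.2, "dry", False), (4.5, "dry", False), (4.8, "dry", False),
    (4.0, "wet", False), (4.1, "wet", True), (4.6, "wet", False), (4.9, "wet", False),
]


def failure_rates(records, threshold_mm=3.0):
    """Return {(thickness_bin, environment): failure rate}."""
    counts = defaultdict(lambda: [0, 0])  # [failures, total] per cell
    for insulation_mm, environment, failed in records:
        # Bin the continuous measurement so it can be crossed with the category.
        thickness_bin = "thin" if insulation_mm < threshold_mm else "thick"
        cell = counts[(thickness_bin, environment)]
        cell[0] += int(failed)
        cell[1] += 1
    return {key: failures / total for key, (failures, total) in counts.items()}


def interaction_effect(rates):
    """How much larger the thin-vs-thick effect is in wet than in dry settings."""
    effect_wet = rates[("thin", "wet")] - rates[("thick", "wet")]
    effect_dry = rates[("thin", "dry")] - rates[("thick", "dry")]
    return effect_wet - effect_dry


rates = failure_rates(RECORDS)
for key in sorted(rates):
    print(key, rates[key])
print("interaction:", interaction_effect(rates))
```

A nonzero interaction value means the two factors do not simply add up: in this toy data, thin insulation raises the failure rate more in wet settings than in dry ones. A model that considers each factor only in isolation would miss that pattern.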
The success in predicting failures in Dr. Wang’s cable study, and in identifying actionable factors to improve customer satisfaction in Ms. Lellis-Petrie’s survey study, demonstrated the role played by the SCE data science team’s efforts to bridge the gap between model-building and business units’ needs.
“The keys to success are showing our internal customers how the models we build for them perform, and communicating with the business unit personnel on a regular basis. We’ve found that by helping them look at the models from different perspectives, and by explaining to them the mathematical basis for surprising, unpredicted insights provided by the model, we’ve really been able to increase their acceptance. We’ve been finding huge opportunities where Data Science has been able to shine beyond mere classification problems. SCE is dedicated to using the best, most cutting-edge tools to improve our customers’ experience, and to uncover more insights for what lies ahead.”