
Large Language Models: Applications, Limitations and Potential Risks for Power Grids

Using Large Language Models for the Power Grid

Large Language Models (LLMs) have rapidly advanced in recent years, demonstrating remarkable capabilities in natural language processing and even code generation. This paper explores the potential applications of LLMs within power grid systems, aiming to enhance efficiency and automation. The ComEd Grid Analytics team is actively developing various LLM applications for business needs. While this looks promising, this paper also examines the limitations and potential risks associated with integrating LLMs into critical power grid infrastructure.

1. Introduction

The application of Large Language Models (LLMs) across various sectors has become a topic of significant interest. This paper focuses on applying LLMs to power grids, a critical infrastructure that ensures the smooth operation of numerous aspects of modern life. The importance of this topic cannot be overstated: the power grid is an essential foundation for functioning cities, industries and services. Integrating LLMs into power grids could significantly change the way we manage and distribute energy by offering enhanced efficiency, predictive capabilities and automation. The ComEd Grid Analytics team is developing LLM applications designed to speed up the processing and classification of outage messages and to support data democratization (i.e., making data more accessible; see Section 4 below). However, as with any technological advancement, applying LLMs to power grids also presents limitations and risks. Understanding these challenges is crucial to ensuring safe, reliable and efficient use of LLMs in this vital sector. This paper examines these aspects in depth, taking a more critical look at recent LLM hype and providing a comprehensive overview of the benefits and pitfalls of applying LLMs to power grids.

2. History and recent advancements of LLMs

Initial LLM development (1950s – 1990s) relied on rule-based systems, limited in their ability to handle the complexities of natural language. The 1990s saw a shift towards statistical language models, but computational power remained a constraint.

The transformer architecture was first introduced in a revolutionary paper by Vaswani et al. (2017). This novel neural network architecture is based on a mechanism called “self-attention” that helps the model understand relationships between elements in a sequence (like words in a sentence) much more effectively. Unlike Recurrent Neural Networks (RNNs), which process information sequentially, transformers rely on matrix multiplications at the core of self-attention, which allow calculations involving different elements of the sequence to run simultaneously. This capability significantly speeds up the training process, especially for long sequences. The architecture made a crucial impact not only on LLMs but also on many other fields of machine learning (Lin et al., 2022).
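The matrix-multiplication core of self-attention described above can be sketched in a few lines of NumPy. This is a minimal, single-head illustration with random weights, not a trained model; the shapes and values are chosen only to show that every token's output is computed from all tokens at once, with no sequential recurrence:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (Vaswani et al., 2017)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # each output mixes all tokens

rng = np.random.default_rng(42)
seq_len, d_model = 4, 8                               # 4 "tokens", 8-dim embeddings
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))

out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one contextualized vector per token
```

Because the attention scores for all token pairs come from one matrix product, the whole sequence is processed in parallel, which is precisely what makes transformer training faster than sequential RNN updates.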

Google’s BERT (2018) (Devlin et al., 2018) was a landmark model that demonstrated the power of a transformer’s attention mechanism for pre-training language models by employing a bidirectional approach. Deep learning and large datasets revolutionized LLMs with the advent of the Generative Pre-trained Transformer (GPT). OpenAI’s GPT-3 (Brown et al., 2020) scaled up the transformer concept dramatically, demonstrating even more impressive language generation capabilities. It became the largest LLM at the time, setting new performance standards. Other impressive LLMs like LLaMA (Touvron et al., 2023), Gemini (Anil et al., 2023), Claude (Anthropic, 2024), GPT-4 (Achiam et al., 2023), Grok (xAI, 2024), and others demonstrated the rapid progress and accessibility of this technology.

Organizations leverage LLMs for diverse tasks. They power chatbots and virtual assistants for customer support and open-ended conversations. In programming, LLMs aid in code generation and debugging. They excel at sentiment analysis, discerning emotions and opinions from text, which helps companies improve customer satisfaction. LLMs classify and cluster text to reveal trends, summarize lengthy materials, paraphrase content, and can serve as brainstorming tools for content generation. They can function as a tailored search engine, provide content recommendations and help identify relevant references. These days LLMs function effectively, as copilot tools, across many human and programming languages. However, it is important to remember that LLMs prioritize language mastery over factual accuracy, so responses must always be verified before being used as references.

3. Limitations and potential risks of using LLMs in power grids

3.1 Goals, challenges, and specifics of power grid

The core goals of utility power grids include ensuring reliable and resilient delivery of electricity, maintaining safety for the public and our workforce and providing quality customer support. It is a critical national infrastructure. Disruptions to electricity supply may cascade into vital sectors such as healthcare, communications, transportation and public services. Safety requirements are paramount, demanding high-quality engineering solutions to prevent accidents and protect critical infrastructure. To achieve reliability, grids must be designed to minimize outages and their impact. Improving resiliency requires quick restoration and resistance to disruptions in power delivery.

Power grids follow comprehensive safety measures like physical barriers to protect substations, redundancy of critical components and protocols for handling hazardous materials or high-voltage equipment. Alongside physical security, cybersecurity takes center stage ensuring robust protection against malicious attacks that could disrupt power delivery. Cybersecurity threats are evolving these days and have a potential for widespread damage through targeted attacks on the grid. Proactive cybersecurity strategies such as regular vulnerability assessments, penetration testing, intrusion detection systems, multi-factor authentication and access controls are widely used in the field.

In this regard, we must remember that while the deployment of LLMs comes with significant potential, their limitations and potential risks must be carefully considered. We will also examine the role LLMs can play in providing potential benefits for power grids.

3.2 LLM limitations

There is extensive discussion in the community and in the literature about the limitations of LLMs, including occasionally harmful output content and legal, ethical and educational challenges (Fabiano, 2023), (Bilan, 2024), (O’Neil & Connor, 2023), (Porsdam Mann et al., 2023), (Samuelson, 2023). In this section we focus only on the limitations that matter specifically for power grid applications.

  • Generative Stochasticity: LLMs generate text probabilistically, meaning there’s a degree of randomness in their responses. This can result in inconsistencies, making it difficult to ensure reliability and predictability in performance. The same question being asked twice could lead to different answers.
  • Challenges in Incorporating Domain-Specific Knowledge: LLMs, trained on text data, lack grounding in real-world experience and physical reality. This can lead to nonsensical or factually incorrect outputs and the generation of mis- and disinformation, especially when queried on topics outside their training domain or near the boundaries of their knowledge. Power grids rely on complex engineering principles and domain-specific data that LLMs struggle to fully grasp, especially when trained primarily on general text. Addressing this requires dedicated models that do not yet exist on the market.
  • Difficulty in Ensuring Reliability and Safety: The probabilistic nature of LLMs output conflicts with the strict reliability and safety standards required in a power grid. Unexpected outputs could lead to sub-optimal decisions or even risks to grid equipment.
  • Potential Regulatory Non-Compliance: There is a challenge of ensuring LLM applications align with existing and evolving regulations. As LLMs are integrated into various applications, ensuring compliance with data privacy, anti-discrimination, and other regulations becomes a complex issue.
  • Vulnerability to Adversarial Attacks: LLMs can be susceptible to specifically crafted inputs aimed at causing them to malfunction or generate incorrect outputs, posing security risks if used in critical systems.
  • Limited Reasoning, Math, and Analytical Abilities: LLMs are known to be error-prone in math, which is a serious problem if they are applied, even indirectly, to crucial calculations related to power flow, load balancing and fault analysis. LLMs are also known to make logical mistakes, which can seriously undermine their coding abilities or their capacity to write advanced Structured Query Language (SQL) queries.
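The generative stochasticity listed first can be made concrete with a toy softmax-with-temperature sampler. The logits and token names below are purely illustrative (not from a real model); the point is that repeated sampling from the same distribution can return different tokens, while greedy decoding is deterministic:

```python
import numpy as np

# Toy next-token distribution over a 4-token vocabulary.
# (Illustrative logits only -- not from a real model.)
logits = np.array([2.0, 1.5, 0.5, 0.1])
tokens = ["restore", "inspect", "replace", "ignore"]

def sample(logits, temperature, rng):
    # Softmax with temperature; higher temperature flattens the distribution.
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return tokens[rng.choice(len(tokens), p=probs)]

rng = np.random.default_rng(0)
# The same "prompt" sampled repeatedly can yield different tokens.
samples = [sample(logits, temperature=1.0, rng=rng) for _ in range(5)]
print(samples)

# Greedy decoding (the temperature -> 0 limit) always picks the argmax.
greedy = tokens[int(np.argmax(logits))]
print(greedy)  # always "restore"
```

This is why two identical queries to an LLM can produce different answers, a property at odds with the deterministic guarantees expected from grid-control software.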

Researchers from a few US engineering institutes investigated the application of advanced LLM models (Achiam et al., 2023), (Anil et al., 2023), (Anthropic, 2024) to the field of control engineering (Kevian et al., 2024) using a set of well-established control engineering benchmarks. The LLMs demonstrated surprising capabilities on certain control engineering tasks, sometimes outperforming traditional methods on specific benchmarks. However, LLMs struggled with tasks requiring complex mathematical reasoning or domain-specific knowledge. They could also generate incorrect solutions without providing a clear indication of failure.

Due to these limitations and potential risks, the application of LLMs in power grids requires extremely cautious development, extensive testing, and robust safeguards. Their use must be weighed against the inherent need for robust, traditional deterministically precise solutions to ensure the safe and reliable operation of the power grid.

4. Possible LLM applications

While the integration of LLMs into power grid infrastructure demands some care (see Section 3), there are still many potential applications where their language processing strengths could offer value in supporting, rather than directly controlling, grid operations. Let us list a few of them here:

  • Textual Information Processing: LLMs can process large volumes of text-based data including equipment manuals, technical reports and regulatory documents. They could aid in quickly extracting relevant information and identifying potential action items.
  • Document Summarization: Summarizing lengthy technical documentation or incident reports can deliver valuable information faster to grid operators and engineers. LLMs can provide initial summaries, allowing specialists to quickly focus on the most critical information.
  • Content Generation: LLMs can help generate summaries of meetings, reports and customer instructions. This application could save time and improve the accessibility of information for grid personnel.
  • Text Classification and Clustering: LLMs can automatically categorize incoming customer inquiries, trouble tickets or sensor reports. This could help prioritize issues and streamline response protocols within power companies.
  • New Hire Onboarding: Onboarding can be time-consuming for everyone. Traditionally, it involves IT tickets, registrations, and document review. Dedicated chatbots can streamline this by automating tasks and providing real-time support for new hires. LLMs can guide registrations, answer FAQs, and direct them to resources.
  • Customer Service: LLMs could significantly enhance chatbots and virtual assistants. By automating responses to routine inquiries, outage reports and troubleshooting steps, LLMs can streamline processes and potentially reduce workloads for support staff. LLMs can also help manage customer communications via social media and email by properly categorizing, identifying and escalating critical messages, ensuring timely responses to important customer concerns.
  • Security Threat Detection: Artificial Intelligence (AI) models can be used to proactively identify potential security threats to both personal and physical assets. By analyzing social media posts and emails, these models can detect cases of extreme violence or attempts to expose sensitive details about grid infrastructure. Early detection of such threats allows for a faster and more coordinated response, minimizing potential damage.
  • Core Coding Assistance: LLMs may help generate programming code snippets or even entire functions based on natural language descriptions or comments. They can generate repetitive code structures, help developers understand and refactor complex code, assist with debugging, and generate documentation. This speeds up development and creates a more efficient work environment for analytical and development teams.
  • Data Democratization: Modern LLMs can interpret human language by converting it into technical commands or code. This gives everyone, even people without SQL or programming experience, an opportunity to work with data and see trends easily and in a timely manner. These technical operations can form a chain of actions with a desirable final outcome, delivering business intelligence from data. For instance, a person can ask a question about key metrics (like SAIDI / SAIFI) in a certain region over a specific period of time in natural language and have the AI draw a chart or return a table with results. While certain LLM tools have already been developed to facilitate this work (LangChain, 2024), and first results look very promising, our tests show that some work still needs to be done to either fine-tune initial models or optimize prompts to achieve the desired accuracy.
  • Outage Analytics: The goal here is to use LLMs to automatically process calls, voicemails, trouble reports, log files and tickets from employees or systems. As a first target, LLMs can help classify input information into outage classes. Then, using these classified texts, one can extract specific items (such as outage location, issue description, actions taken and time) with the help of predefined prompts. Prompt optimization plays an important role here. While direct “manual” optimization is possible, programmatic tools have recently become available that should ease prompt engineering and require only a few labeled examples (Khattab et al., 2023). Such tools also look promising for multi-label text classification (D’Oosterlinck et al., 2024).
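The outage-analytics workflow above (classify a report into an outage class, then extract structured fields via a predefined prompt) can be sketched as a prompt-template builder. Everything here is a hypothetical illustration: the class names, the extracted fields, and the `call_llm` placeholder are assumptions, not ComEd's actual prompts or model endpoint, and any real output would have to be validated and reviewed by human experts (see Section 5):

```python
# Hypothetical prompt for classifying a trouble report into outage classes
# and extracting structured fields. Class names, field list, and the
# `call_llm` placeholder are illustrative assumptions only.
OUTAGE_CLASSES = ["wire down", "tree on line", "equipment failure",
                  "planned work", "no outage"]

PROMPT_TEMPLATE = """You are an assistant for a utility outage desk.
Classify the report into exactly one of: {classes}.
Then extract: location, issue description, actions taken, time.
Return JSON with keys: class, location, issue, actions, time.

Report:
{report}
"""

def build_prompt(report: str) -> str:
    # Fill the template with the class list and the raw report text.
    return PROMPT_TEMPLATE.format(classes=", ".join(OUTAGE_CLASSES),
                                  report=report.strip())

prompt = build_prompt(
    "Caller at 5th and Main reports a tree leaning on the line since 3pm; "
    "crew dispatched."
)
print(prompt)

# response = call_llm(prompt)   # model-dependent; output must be parsed,
# record = json.loads(response) # validated, and human-reviewed before use
```

Tools such as DSPy (Khattab et al., 2023) aim to optimize templates like this one programmatically from a few labeled examples, rather than by manual trial and error.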

5. Responsible development with AI

It is important to note that the use of LLMs in even these support-oriented roles requires strict safeguards. Output must always be carefully reviewed by human experts before actions are taken.

Efforts to establish frameworks for the responsible development and use of AI and LLMs are gaining significant traction. The recent presidential Executive Order on ensuring responsible development of Artificial Intelligence highlights principles focused on transparency, risk mitigation, and accountability in AI deployment: “Artificial Intelligence must be safe and secure. Meeting this goal requires robust, reliable, repeatable, and standardized evaluations of AI systems, as well as policies, institutions, and as appropriate, other mechanisms to test, understand, and mitigate risks from these systems before they are put to use” (The White House, 2023). Similarly, the European Parliament’s proposed AI Act aims to regulate AI based on risk levels, with emphasis on transparency and clear guidelines for high-risk applications (European Parliament, 2023). To support these initiatives, companies like Microsoft emphasize the concept of Responsible AI. Their framework provides comprehensive principles including fairness, safety, privacy, inclusiveness, and accountability, offering guidance for the ethical design and use of AI systems (Microsoft, 2024). These efforts demonstrate a growing international push to ensure AI and LLMs are developed and applied safely while maximizing their benefits and minimizing potential harms.

6. Conclusions

Large language models demonstrate impressive capabilities in language processing, but their application to power grids demands considerable care and caution. Their inherent limitations, including a lack of domain-specific knowledge, probabilistic output and vulnerability to manipulation, pose novel risks for the reliability and safety of critical infrastructure. While they have potential for support roles like customer service, information processing, code development and answering some immediate business questions, LLMs must always be considered as tools that augment, rather than replace, human expertise and rigorous engineering practices. Strict safeguards, thorough verification, and a focus on transparency are essential when considering LLM use in this domain. As this technology evolves, further research into developing LLMs better suited to the demands of power grids is needed to harness their benefits safely and responsibly.

Acknowledgement

We thank ComEd and Exelon employees Jaime Ortega and Olexander Chumak for encouraging our work, and for their support and helpful comments in the preparation of this paper. We thank the ComEd Customer Data Science team for useful feedback regarding customer applications. We also note that LLMs have been used for brainstorming, identifying and summarizing relevant papers, some style improvements, and text proofreading.


References


About the authors

Dmitry Bandurin holds a PhD in particle physics and has worked as a research scientist at Fermilab, CERN and other physics laboratories worldwide. He joined ComEd’s Grid Advanced Analytics team in 2023 as a Principal Data Scientist. His interests include applying machine learning to monitor electrical power system health, timely detection and prediction of power line failures, using modern large language models for data structuring, and developing optimization models, signal processing techniques and digital twins.

Norayr Matevosyan obtained his PhD in mathematics in 2003 from the Royal Institute of Technology, Stockholm, Sweden and has worked as a research scientist at multiple universities. Since 2015, he has been working as a data scientist at ComEd and is currently Sr. Manager of the Grid Advanced Analytics team. His current interests include developing third generation ML/AI methodologies for the grid, outage prediction and investment optimization models for assets as well as building digital twins, specialized advanced grid agents and federated learning systems.


ComEd, an Exelon Company, is a member of UAI: Utility Analytics Institute.