Christian Michael Franck, Chi-Ching Hsu, Yu Xiao, Pascal Bleuler, Gaetan Frusque, Mahir Muratovic, Tommaso Polonelli
One of the cornerstones of a reliable transmission and distribution (T&D) grid operation is fully functional components that can operate robustly and with a low outage rate under all specified operating conditions. Dependable maintenance strategies are thus indispensable and are applied by grid operators around the world. One of the present key challenges in many countries with a widely developed T&D grid system is aging components that reach their anticipated end of life. Asset management faces the question of whether the lifetime of components could be prolonged and the replacement could be delayed. For this, the health of the components needs to be assessed and is ideally continuously monitored. In addition to this, the currently ongoing transition of the entire energy system leads to a change and increase of stress on the T&D equipment. The integration of new renewable energy sources on all voltage levels leads to bidirectional power flows and increased variability. The higher demand for electric power increases not only average power-flow levels but also, in particular, peak flows. The result of this changed and increased stress on the equipment is accelerated aging of the components and the need for maintenance strategies to be adapted to this new situation.
Recent developments in low-cost and low-power data acquisition technology and machine learning (ML)-based algorithms, combined with rapidly increasing decentralized embedded computing power, offer the opportunity to develop improved and intelligent maintenance strategies based on continuous monitoring and a real-time health evaluation of the equipment. This paradigm changes the prospects of equipment maintenance, offering significantly reduced costs in asset management. The lifetime of properly dimensioned equipment is expected to be several decades, sometimes as many as 40–60 years. Many maintenance checks therefore only confirm the excellent condition of the component and are not necessary at this point in time. Thus, delaying not only the replacement but also a maintenance interval based on the health condition would be very welcome. Hence, the most commonly used maintenance concepts have changed from time-based checks to a new and more intelligent approach, oriented toward a just-in-time service.
This article aims to introduce the concepts of intelligent maintenance strategies, trends, and challenges of T&D equipment condition monitoring as well as the automatic estimation of equipment health.
One of the primary goals of every T&D system operator is the reliable performance of his or her network with a low interruption duration. This goal, in turn, requires fully functional and healthy components. During their lifetime after installation and commissioning, the components degrade and evolve from new to aged. Aging in this sense means a continuous deterioration of their operability or resilience against normal and abnormal voltage and current stress. This deterioration may lead to a failure of the equipment if it is not maintained early enough. The deterioration happens on all subcomponents with different rates, and the first one to fail will determine the prospective lifetime without maintenance. Proper asset maintenance is thus an essential task for all grid operators. Different operators use different maintenance strategies, which depend on many individual factors, such as eligible costs for maintenance and repair, required grid availability, and other historical factors. Two basic characteristics are used to classify different strategies, namely, whether the impact of a potential component failure is considered or not as well as whether the condition of the component is considered or not. The four most commonly used strategies are introduced in the following.
Maintenance strategies are exemplified using three different components with different degradation mechanisms and different lifetimes (see Figure 1). Component has the shortest lifetime and is the quickest to deteriorate. Component has the longest lifetime and deteriorates fast in the beginning, followed by a relatively stable period before it starts to deteriorate again and reaches its end of life. Component has a long lifetime and deteriorates slowly in the beginning, and faster with increasing age. The equipment’s condition is measured by a health indicator (HI): 100% is perfectly healthy, and 0% is nonoperational and indicates faulty equipment. As the fault is caused by the fastest-deteriorating subcomponent, it may be possible to replace or repair this subcomponent, perform maintenance, or replace other subcomponents and thus bring the overall equipment health back to a sufficiently high value so that the component can be used again in the grid.
Reactive maintenance, also referred to as corrective maintenance, is the simplest strategy to apply as the grid operator just waits until a component fails and inspects it after its failure. In the example presented in Figure 1, these would be the instances of T1, T3, and T5 for components , , and , respectively. The component is then inspected, and it is decided whether it can be repaired or has to be replaced. Obviously, this strategy leads to the regular outage of parts of the network and can force prolonged grid outage times. In addition to the cost of grid outage, this strategy necessitates having all components in stock to ensure a relatively timely replacement in case they cannot be repaired. Thus, and despite its simplicity, it is likely not the most cost-effective strategy. It can be applied only for noncritical assets or for those that are either inaccessible for maintenance (e.g., buried cables) or for which no condition-monitoring tools exist.
Preventive maintenance, also called calendar- or time-based maintenance, implies regularly scheduled inspection or replacement intervals based on recommendations from the manufacturer, experience by the network operator, or mandates from the regulator. This method is appropriate to catch all gradual and slow degradation processes such as wear and corrosion. The service intervals are set irrespective of the state of the equipment and need to be shorter than the time to failure from the visible start of the degradation. In the aforementioned example, regularly scheduled checks are performed at times T0, T2, and T4. The failure of component at instant T1 would not be prevented as its degradation is faster than expected from the maintenance schedule. The inspection of component at instant T4 would reveal a poor condition and trigger an overhaul or replacement before it would fail at T5. This maintenance would improve its condition and prolong its lifetime. However, the inspection of component at T2 would, in principle, not have been necessary. The failure of component would most likely not be detected either, as its condition at T2 was even better than that of component . More frequent service intervals prevent more failures but increase overall operational costs. The service intervals could change with equipment age, e.g., more frequent checks for older components, or with respect to the importance of the component in the overall system operation. With the aforementioned changing stress on the equipment, existing service intervals may have to be reconsidered and potentially adapted.
The idea of condition-based maintenance is to schedule maintenance depending on the condition of the equipment. If the condition is good, no maintenance is required. In this way, the disadvantages of both reactive and preventive maintenance can be avoided, and lower service costs, combined with high system availability, can be achieved. There is no prolonged outage as the equipment is either serviced or replaced before it fails, and there are no (or fewer) unnecessary inspections when the equipment is still in good condition. In the example shown in Figure 1, this would be the instances when the health reaches the predefined horizontal blue threshold line. Different threshold levels can be defined for several subcomponents simultaneously, and reaching any of these would trigger an inspection. The disadvantage of condition-based maintenance is that after reaching one of the thresholds, a quick inspection is required as the remaining time to failure is not known. In the aforementioned example, quick maintenance would only be needed for components and after going below the predefined HI threshold. The time to failure after going below the threshold is longer for component , and the inspection could be performed later, which makes its planning easier. Consideration of the so-called remaining useful lifetime (RUL) prediction is therefore the next step toward intelligent maintenance.
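The threshold logic described above can be sketched in a few lines of Python. This is only an illustration: the HI histories, the component names "fast" and "slow", and the 60% alarm level are hypothetical values, not data from Figure 1.

```python
from typing import Dict, List, Optional, Sequence


def first_threshold_crossing(hi: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first HI sample below the threshold, or None."""
    for i, value in enumerate(hi):
        if value < threshold:
            return i
    return None


# Hypothetical HI histories in percent (100 = healthy, 0 = nonoperational).
hi_histories: Dict[str, List[float]] = {
    "fast": [100, 90, 75, 55, 30, 0],
    "slow": [100, 98, 96, 93, 90, 86],
}
THRESHOLD = 60.0  # assumed alarm level, chosen only for this example

# An inspection is triggered at the first sample below the threshold.
alarms = {
    name: first_threshold_crossing(series, THRESHOLD)
    for name, series in hi_histories.items()
}
print(alarms)
```

Note that the function only reports that the threshold was crossed. It says nothing about how much time remains until failure, which is exactly the limitation the paragraph points out.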
Predictive maintenance, also called intelligent predictive maintenance, is the logical continuation of condition-based maintenance and uses advanced statistical methods to estimate the RUL of the equipment; it aims at minimum system outage with the lowest possible maintenance costs. Ideal maintenance schedules can be proactively chosen ahead of time, and decisions on prolonged equipment use can be safely made. Predictive maintenance has already found widespread application in manufacturing industries, but it is still in its infancy in the electric power sector. The type of maintenance strategy for gas-insulated switchgear (GIS) differs among voltage levels, but it is purely time based for ≈75% of the GISs and a combination of time- and condition-based maintenance for ≈25%. Only for the highest voltage level ≥ 700 kV are more than ≈55% of the GISs maintained following a purely condition-based maintenance strategy.
To apply a predictive maintenance strategy, a set of sensors is required that (continuously) monitors the equipment, and a methodology needs to be specified that determines the equipment’s health condition and the RUL based on the data measured by these sensors. Today, if sensors are applied to power system equipment, only very simple condition-monitoring sensors such as counters (for a number of operations including circuit breakers or surge arresters) or temperature and pressure sensors are used in combination with predefined thresholds at which an alarm is triggered. For more advanced sensors, e.g., partial discharge monitoring sensors, human judgment is required to interpret the measured data.
Today’s R&D efforts focus on the aspects of sensors as well as the evaluation and interpretation of the measured data. The ideal monitoring equipment is based on cost-effective and low-power sensors that can be applied easily and nonintrusively, combined with an energy-efficient data acquisition system that is connected to the cloud for continuous and autonomous measurement. Advancements in Internet of Things (IoT)-type applications offer multifaceted opportunities for power system equipment monitoring. The possible causes of equipment failure are numerous, and it is impossible to predict them in their entirety based on deterministic models and the underlying physical mechanisms of degradation processes. Intelligent ML methods are thus applied with the aim of detecting anomalies in the sensor data, so that the components’ degradation can be detected during operation and failures can be prevented. Both aspects are discussed in detail in the following sections. First, the equipment monitoring, sensor selection, and IoT connectivity are explained. Then, ML methods in the T&D field are presented.
Equipment condition monitoring is fundamental to achieving the predictive maintenance strategy introduced in the previous section. In the following section, possible sensors for different applications used to monitor T&D equipment components are introduced and discussed. Furthermore, IoT methods used to connect widely deployed sensors throughout the power network are presented.
To give a full-scale health condition monitoring of the commonly distributed components, including transformers, circuit breakers, overhead lines, and so on, the following sensors are considered: temperature and humidity sensors, microphones, accelerometers, current and voltage probes, images or video recordings as well as partial discharge sensors. Here we elaborate on the example information that the sensors can provide on the health condition of components.
Temperature
The equipment’s temperature and its distribution reflect the heating and cooling situation, which can be used to analyze the state of equipment. For example, in circuit breakers, heating occurs due to the contact resistance, and changes in temperature provide information on the state of the nominal contact system. The temperature can be measured at discrete locations with sensors, or over the entire visible surface with infrared images or videos.
Humidity
Moisture usually has a negative impact on the equipment’s insulation performance and is considered a terminal aging cause in solid and many liquid insulation systems.
Microphone
The acoustic signal can be used to monitor every component in the power system, and it may also be the most intuitive example for understanding the value of anomaly detection. Every reader of this article is subjected to everyday noise patterns when opening the bedroom window, running a car engine, hearing the fridge compressor, and so on. Intuitively, we recognize deviations of these noise patterns and may initiate inspection of the device. The same applies for T&D equipment: an abnormal noise pattern can indicate a poor health state.
Image and Video Stream
Human experts always start an equipment inspection with a visual check. Worn, loose, displaced, corroded, or deformed parts of the components can easily be detected. For continuous monitoring, recorded images can be analyzed with computer vision algorithms, and deviations from previous images can be detected. The main challenge is the limited view of a camera and that multiple cameras may need to be installed. A disadvantage is that the amount of recorded data is much larger than for all other types of sensors, but an advantage is that it is intrinsically nonintrusive.
Accelerometer
Vibration signals can be measured by accelerometers. Compared to microphones, an accelerometer shows better environmental adaptability as the vibration signal is mainly transferred through solid structures. Thus, in substation environments with many different components, vibration signals are more suitable than acoustic signals as the source of the signal can be clearly correlated to each sensor. In addition, very low-frequency vibrations, e.g., wind-induced overhead line oscillations, are easier to measure with accelerometers.
One possible application of an accelerometer on a vacuum circuit breaker (VCB) is to identify key moments during the operation when specific events happen, e.g., the time when the latch starts to move, and the closing time, which are shown in Figure 2. Based on the specific features of the vibration signal, the event time points t1 (latch reaction) and t2 (first contact touch) can be extracted, and thus, the operating time can be obtained.
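One simple way to extract such event times is amplitude thresholding, sketched below in Python. This is an assumption-laden illustration and not the signal processing method behind Figure 2: the synthetic signal, the 10 kHz sampling rate, and the two threshold levels are all hypothetical.

```python
from typing import List, Optional, Sequence


def first_exceedance(signal: Sequence[float], fs_hz: float, level: float) -> Optional[float]:
    """Return the time (s) of the first sample whose magnitude reaches `level`."""
    for i, sample in enumerate(signal):
        if abs(sample) >= level:
            return i / fs_hz
    return None


# Synthetic stand-in for a closing operation (not measured data):
# silence, then a weak burst (latch reaction), then a strong burst (contact touch).
FS_HZ = 10_000.0
signal: List[float] = []
for i in range(1000):
    sign = 1.0 if i % 2 == 0 else -1.0
    if i < 200:
        amplitude = 0.0
    elif i < 600:
        amplitude = 1.0
    else:
        amplitude = 5.0
    signal.append(sign * amplitude)

LATCH_LEVEL = 0.5    # assumed level that the latch reaction exceeds
CONTACT_LEVEL = 3.0  # assumed level that only the contact touch exceeds

t1 = first_exceedance(signal, FS_HZ, LATCH_LEVEL)    # latch reaction
t2 = first_exceedance(signal, FS_HZ, CONTACT_LEVEL)  # first contact touch
interval = t2 - t1                                   # time between the two events
print(t1, t2, interval)
```

Real vibration signals are noisy and oscillatory, so a practical implementation would more likely threshold a smoothed envelope or use dedicated onset detection rather than raw samples.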
To pursue a predictive maintenance strategy in an electric power system, the mass scale deployment of sensors is required. Consequently, the features of low cost, small size, low power consumption, and environment adaptability are required. Unfortunately, these requirements are typically in conflict with sensor performance parameters like bandwidth, signal-to-noise ratio, and frequency response, and so a careful assessment needs to be performed for sensor selection.
Figure 3 shows an example comparison among four different accelerometers. Sensor 1 is a piezo-based high-performance accelerometer with an analog output. It has a high bandwidth, low current consumption in on state, and medium measurement range, but it is comparatively expensive. Sensor 2 is microelectromechanical systems (MEMS) based and has a much higher measurement range at a similarly high bandwidth; it is less expensive but requires higher power during operation. Sensor 3 is also a MEMS-type accelerometer designed for a lower measurement range. It operates with a lower bandwidth, but it is also cheaper. Sensor 4 is an ultralow-power MEMS accelerometer designed for low sampling rate applications, and it is also much less expensive. This example comparison shows that accelerometers vary vastly in performance and price. Moreover, due to different working principles, the sensor size can be very different. One of the ongoing research tasks is to determine the main features needed for predictive maintenance, which then enables the choice of the appropriate sensors. For example, the vibration signal of a VCB during switching contains frequencies up to 10 kHz; thus, sensors 3 and 4 would not be suitable as the covered bandwidth is too small.
To underline this feature determination and sensor selection work, an example experimental comparison of a low-cost MEMS microphone and a high-performance microphone is presented (with the price difference of a factor of 200). Simultaneous measurements of corona noise from wetted, high-voltage overhead power lines are performed. The differences can be clearly seen from Figure 4. Due to its small size, the MEMS microphone is rather insensitive to the low-amplitude background noise, first phase [Figure 4(a)], and to low-frequency components [Figure 4(b) and (c)]. The latter can be better seen in the frequency spectrum shown in Figure 5. The frequency components of the acoustic signals agree with each other in the range from 100 to 1,000 Hz, but there is a significant difference in the lower frequency component below 50 Hz. This information is important for sensor selection: if the application is to determine the power frequency component, the low-cost sensor would not be suitable, but it would be suitable for measurements of the higher-frequency components.
In addition to selecting sensors at the best cost-to-required-performance ratio, the recorded data must also be evaluated. Traditionally, most of the sensors for T&D equipment supervision are galvanically connected to condition-monitoring units and operated using condition-based maintenance strategies. For future intelligent predictive maintenance schemes, sensors need to be deployed in mass scale, and for this, IoT-type wireless transmission has been demonstrated to be the most effective solution. However, different options exist to design the IoT platform with respect to data sampling, processing, and transmitting.
It is important to optimize the hardware and software co-design for low-power operation to prolong the battery’s lifetime, and with it, the sensor’s runtime. This optimization includes all aspects of data collection, processing, and transmission. The energy consumption of the microcontroller unit (MCU) on the platform strongly depends on the operating mode, i.e., whether it is in sleep, standby, or operating mode, and on the clock frequency during operation. The energy consumption of data transmission is strongly influenced by the selected wireless protocol, e.g., Bluetooth, Bluetooth Low Energy, Zigbee, Wi-Fi, long-range wide area network, or cellular-based communications. But the selection of the protocol also needs to consider the amount of data that needs to be transmitted (data rate) and the distance to the gateway. The data can be transmitted raw or preprocessed on the MCU before transmission. This choice, again, is part of the overall optimization as the processing itself requires energy as well; it is described in more detail in the following.
In the aforementioned scenario, energy harvesting has proven to be a solution that enables self-sustaining IoT nodes. Energy harvesting tries to acquire energy from external (ambient) sources such as thermal, kinetic, electromagnetic, or solar energy present in the vicinity of the node. The harvested power is typically very low, but even micro or milliwatts can be sufficient if the power consumption of the IoT sensor during its average lifetime is very low as well.
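The energy balance behind a self-sustaining node can be illustrated with a short duty-cycle calculation in Python. All numbers below (active power, sleep power, duty cycle, harvested power) are hypothetical and serve only to show the arithmetic.

```python
def average_power_mw(p_active_mw: float, p_sleep_mw: float, duty_cycle: float) -> float:
    """Average power of a node that is active for `duty_cycle` of the time."""
    return duty_cycle * p_active_mw + (1.0 - duty_cycle) * p_sleep_mw


def is_self_sustaining(p_average_mw: float, p_harvested_mw: float) -> bool:
    """A node can run indefinitely if harvesting covers its average consumption."""
    return p_harvested_mw >= p_average_mw


# Hypothetical node: 30 mW while sampling/transmitting, 0.01 mW asleep,
# active 0.1% of the time, with 0.1 mW harvested on average.
p_avg = average_power_mw(p_active_mw=30.0, p_sleep_mw=0.01, duty_cycle=0.001)
sustainable = is_self_sustaining(p_avg, p_harvested_mw=0.1)
print(p_avg, sustainable)
```

With these assumed values, the average consumption is about 0.04 mW, so a harvester delivering 0.1 mW would suffice. The same node running continuously would need 30 mW, which is why the duty cycle dominates the design.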
Figure 6 shows the typical structure of an IoT nodes-based online monitoring system, including widely distributed sensor nodes, wireless data transmission gateways, and local servers. In the system, the sensor nodes on the components to be monitored are equipped with MCUs and defined as “edge” (in the sense of being “far” away from the powerful computers and near the sensing element). In the most straightforward situation, the raw data from the edge is transmitted via the gateways to powerful servers for analysis, which could lead to inefficiencies and bandwidth limitations when scaling up the system. Moreover, wireless data transmission is, in general, more energy intensive than processing in loco. In turn, the concept of “edge computing” makes use of the (limited) computing resources available on the local MCUs to directly process the raw data. By that, the amount of transmitted and stored data can be significantly reduced, with a consequent increase in the system’s energy efficiency and the maximum number of supported devices.
The example of acoustic signal acquisition can illustrate the tradeoff between edge computing and raw data transmission in a quantitative way. If we acquire the signal with a sampling rate of 44 kHz, and each sample takes two bytes of storage space, then it takes 88 kB of transmission and storage per second for the raw data. However, in the case of performing edge computing and extracting the main frequency components, the amount of data can be reduced from 88 to approximately 2 kB.
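The arithmetic above, together with a very simple form of edge processing, can be sketched in Python as follows. The naive discrete Fourier transform, the test tones at 50 and 100 Hz, and the 1 kHz sampling rate of the test signal are assumptions for illustration; an MCU implementation would typically use an optimized FFT routine.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def raw_rate_bytes_per_s(fs_hz: int, bytes_per_sample: int) -> int:
    """Raw data volume generated per second of recording."""
    return fs_hz * bytes_per_sample


def dominant_frequencies(samples: Sequence[float], fs_hz: float, k: int) -> List[Tuple[float, float]]:
    """Return the k strongest (frequency in Hz, amplitude) pairs via a naive DFT."""
    n = len(samples)
    spectrum = []
    for b in range(1, n // 2):  # skip the DC bin
        coeff = sum(x * cmath.exp(-2j * math.pi * b * i / n) for i, x in enumerate(samples))
        spectrum.append((abs(coeff) * 2.0 / n, b * fs_hz / n))
    spectrum.sort(reverse=True)  # strongest amplitude first
    return [(freq, amp) for amp, freq in spectrum[:k]]


# Raw rate from the text: 44 kHz sampling, two bytes per sample.
raw_rate = raw_rate_bytes_per_s(44_000, 2)  # 88,000 bytes per second

# Hypothetical test signal: 50 Hz and 100 Hz tones, 200 samples at 1 kHz.
FS_TEST_HZ = 1000.0
test_signal = [
    math.sin(2 * math.pi * 50 * i / FS_TEST_HZ) + 0.5 * math.sin(2 * math.pi * 100 * i / FS_TEST_HZ)
    for i in range(200)
]
peaks = dominant_frequencies(test_signal, FS_TEST_HZ, k=2)
print(raw_rate, peaks)
```

Transmitting only a short list of such pairs instead of the waveform is what makes the reduction possible; how many components are kept is a design choice that trades bandwidth against diagnostic detail.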
As introduced in the previous section, the main step to go from condition-based to predictive maintenance strategies is the ability to predict the moment in the future when maintenance is required, thus, to estimate RUL of the monitored component. Sensors continuously collect data from the components, and with the recent advances in cost-effective sensors, data acquisition, and data transmission technology (see the previous sections), these datasets become huge and manifold. These datasets offer the opportunity to transition from a human-expert judgment of the equipment’s health to an automated judgment based on intelligent algorithms. In the following, we introduce the concept of ML in general, and in the context of example applications in the electric power industry as well as the concepts of HIs. A particular example of VCB drive monitoring is elaborated on in more detail.
ML describes methods that enable a computer to learn from data instead of being explicitly programmed. Even though popular ML methods such as artificial neural networks have been explored since the 1940s, only recent advancements in computational power have enabled the new push of ML in the last decade. Today, we see successful applications of ML, for example, in natural language processing, search engines, computer vision, online advertising, or general game playing like Chess and Go, to name only a few. For T&D system operators, however, equipment-condition monitoring is still mostly based on indicators extracted from signal processing or statistical methods, for which expert knowledge is usually required. Time-based preventive maintenance, in combination with simple deterministic alarm systems (condition based), remains the main maintenance strategy, and application of ML methods is still very limited.
The two main categories of supervised ML problems are regression and classification. Regression and classification map input data into continuous and discrete outputs, respectively. The applications of methods in these categories for intelligent monitoring of power grid components are shown in Figure 7.
Classification methods can be used to detect and classify faults. Alternatively, anomaly-detection algorithms could be used which, instead of learning all possible fault states, try to identify when a component starts to deviate from normal (healthy) behavior, indicating possible future failures. Autoencoder-based algorithms are a common ML approach for anomaly detection. Here, the algorithm tries to reconstruct the input data by learning from only the healthy data distribution. The difference between the original input data and the data reconstructed by the algorithm, the so-called reconstruction loss, is thus low for healthy data and increases when the input deviates from the healthy behavior, which flags an anomaly.
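The reconstruction-loss idea can be illustrated without a neural network. The Python sketch below swaps the autoencoder for a much simpler linear stand-in: it learns the main direction of two-dimensional healthy data, reconstructs each point by projecting it onto that direction, and flags points with a large reconstruction error. The data points and the threshold margin of three are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def fit_healthy_model(points: Sequence[Point]) -> Tuple[Point, Point]:
    """Learn the mean and the principal direction of 2D healthy data."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # angle of the major axis
    return (mx, my), (math.cos(theta), math.sin(theta))


def reconstruction_error(p: Point, mean: Point, direction: Point) -> float:
    """Distance between a point and its projection onto the learned direction."""
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    proj = dx * direction[0] + dy * direction[1]
    return math.hypot(dx - proj * direction[0], dy - proj * direction[1])


# Hypothetical healthy measurements of two correlated features.
healthy: List[Point] = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.1), (3.0, 5.9), (4.0, 8.1)]
mean, direction = fit_healthy_model(healthy)

# Assumed rule: alarm when the error exceeds three times the worst healthy error.
threshold = 3.0 * max(reconstruction_error(p, mean, direction) for p in healthy)


def is_anomalous(p: Point) -> bool:
    return reconstruction_error(p, mean, direction) > threshold


print(is_anomalous((5.0, 10.0)), is_anomalous((2.0, 0.0)))
```

A point that follows the healthy relationship between the two features is reconstructed well, whereas a point that breaks it is not. An autoencoder follows the same principle but can learn nonlinear relationships among many features.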
Besides classification problems, two very useful applications of regression methods in the electric power industry would be the derivation of HIs or the RUL. The HI is a score of the global machine’s state calculated by an ML algorithm based on the collected data, which should correlate to the machine’s degradation. In the example of circuit breakers, travel curve sensors are used to measure the motion of the breaker contacts, and several indicators could be derived from it, e.g., contact opening and closing time, speed, and acceleration. The indicators can be used for identifying component failures. Similarly, vibration signals during open and close operations are also used as an HI for the operating mechanism. HI construction can be formulated as a regression problem as the output score is continuous. In principle, regression problems are much more difficult than classification problems because the large, continuous solution space makes it difficult for ML algorithms to learn and generalize.
The other important regression problem is RUL prediction. RUL is the estimated lifetime for a machine until failure. With such a prediction for each component and its subcomponents in the electric power grid, optimal maintenance schedules could be arranged well in advance, which would save time and costs, while minimizing the system’s outage duration. The RUL prediction can also be formulated as a regression problem as RUL is a continuous output as well.
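As a minimal illustration of RUL prediction as regression, the Python sketch below fits a straight line to an HI history and extrapolates it to a failure level. Linear degradation is a strong simplifying assumption (the example components in Figure 1 degrade nonlinearly), and the HI values and time units are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_rul(times: Sequence[float], hi: Sequence[float], failure_level: float) -> Optional[float]:
    """Extrapolate the fitted HI trend to the failure level; None if HI is not decreasing."""
    slope, intercept = fit_line(times, hi)
    if slope >= 0.0:
        return None
    t_fail = (failure_level - intercept) / slope
    return max(0.0, t_fail - times[-1])


# Hypothetical HI history (percent) at yearly inspections.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
hi = [100.0, 95.0, 90.0, 85.0, 80.0]
rul = estimate_rul(times, hi, failure_level=0.0)
print(rul)
```

With these values the fitted trend loses five percentage points per year, so the extrapolated failure lies 16 years after the last inspection. ML-based RUL models replace the straight line with a learned degradation model and ideally also report the uncertainty of the estimate.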
We listed and described various potentially useful ML applications in the electric power industry in the previous section. However, there are also several challenges that need to be solved beforehand. First, to perform fault detection or fault classification, we need large labeled datasets with faults of different types. However, the components in the electric grid have a very low failure rate, and collecting a large dataset is difficult and time consuming. For instance, worldwide surveys confirm the high reliability of circuit breakers with only 0.3 major failures per 100 circuit breaker years. Also, the fault types can be manifold. There are many individual subcomponents in a single piece of equipment, such as circuit breakers or transformers, and each of them could fail. Moreover, the same subcomponent can degrade by more than one mechanism, and some faults are caused by the interaction among several subcomponents. Together, this makes it impossible to simulate each of the failure cases or even to collect enough experimental data. Even if sufficient fault data existed, labeling these measurements would be a tedious and error-prone process for humans, resulting in datasets that are not well labeled or labeled with inconsistent quality.
The second challenge is the domain gap. Even if we have the same machine and deploy the same sensor, the data collected might still be different due to different environmental operating conditions. For example, data might be different from summer to winter or from indoor to outdoor. Even if we monitor the three circuit breakers of a three-phase system that are installed side by side, the data collected from these three breakers could still be different due to differences in the manufacturing process. This makes ML methods very hard to generalize.
The third challenge is the general lack of data collection standardization in the electric power industry. As fault data are rare, one possible solution could be to combine different data sources. For example, combining data from similar types of equipment from different substations and utilities could be useful to increase the amount of collected data and improve the generalization ability of the ML models used. However, due to a lack of data collection standardization, historically, each utility uses its own database and data format, and thus, combining different datasets becomes very difficult, if not impossible. In principle, establishing a data collection standard in the electric power industry across the globe would facilitate the application of ML methods for condition monitoring.
The fourth challenge is the lack of ground-truth labels. As mentioned previously, supervised ML models need to have labels during the training phase. But in the real world, it is impossible to know the true health condition of a machine, so training a model that outputs an HI is difficult. One solution to this could be to use simulated data, for which we know the health condition, as a proxy for the real data when training ML algorithms, and then to validate them on real data.
In summary, these challenges make the application of ML data-driven approaches in the electric power industry very difficult, and thus, ML applications are still limited. However, advancements in sensor technology and IoT connectivity have stimulated interest in predictive maintenance for electric power grid components, and R&D activities in academia, manufacturers, and utilities have grown in recent years. One example research project is introduced in the following section. To the authors’ knowledge, this is the first fully published and publicly available measured dataset of a complete run-to-failure experiment on a VCB.
The INCITE project (Intelligent maintenance of gas circuit breakers) aims at developing advanced and robust nonintrusive technologies for predictive maintenance of components in the electric power system by combining novel signal processing and ML methods. Circuit breakers are used as an exemplary case study, but the developed approaches will be transferrable to other types of switchgear and equipment. It is a joint activity among academia, manufacturers, and utilities and financially supported by federal funding.
In the first step, a complete mechanical run-to-failure dataset was generated in the laboratory using a three-pole VCB. A commercial VCB was equipped with an accelerometer, two coil current sensors (one each for the opening and closing coils), a motor current measurement, and a contact separation measurement system to measure the opening/closing time during each operation. An accelerated life test for mechanical failures was carried out by performing opening and closing operations continuously and automatically every 3 min until the breaker would not open anymore. Overall, the VCB performed more than 25,000 close and open operations without current load within a time span of five months. This complete run-to-failure measurement enables us to monitor the evolution of the VCB’s condition and estimate the degradation over time. The terminal failure was caused by increased friction in the opening mechanism. This dataset is publicly available and can also be used by groups that do not have the possibility to perform experiments, or even long-term automated experiments.
One of the purposes of generating this run-to-failure dataset is to study algorithms that estimate an HI and RUL. ML algorithms are run on the dataset to study the VCB’s degradation. An example result is the successive and gradual increase in closing time. Figure 8(a) shows the closing time, i.e., the time from sending the trip signal until the two contacts touch each other, which is measured with the contact separation measurement on one of the three poles for the entire measurement sequence. Only one of the three poles is shown here for simplicity, but all three poles show a similar trend. The degradation trend can be clearly observed as the closing time increases with operation number. It increases from ≈60 ms, which is described in the operation manual as normal, to more than 70 ms toward the end of life.
It is well known that the closing time is a suitable HI for the VCB drive. In this context, it is interesting to observe that the breaker closing time can be deduced from two completely different measurements. Traditionally, the breaker is taken off the network and the closing time is measured directly at the two contacts via the contact resistance. However, because this contact separation measurement is intrusive, it can only be performed infrequently, at maintenance intervals when the breaker is separated from the network. An alternative approach is to attach a vibration sensor to the breaker drive and determine the closing time indirectly from the vibration signal (with signal processing methods like the one introduced in Figure 2). This method is nonintrusive: the breaker can remain connected to the network, and the closing time can be monitored continuously for every single operation. In our run-to-failure experiment, the breaker is not connected to the network, so both measurements can be applied at every operation.
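The signal processing method of Figure 2 is not reproduced here. As a minimal sketch of the general idea of reading an event time out of a vibration record, the snippet below computes a short-time RMS envelope and reports the onset of the strongest burst. It assumes that the record starts at the command signal and that the contact impact produces the strongest burst; the function name, window length, threshold fraction, and synthetic signal are all hypothetical.

```python
import math

def closing_time_from_vibration(signal, fs_hz, window=8, fraction=0.5):
    """Estimate the time (ms) from the start of the record (assumed to coincide
    with the command signal) to the onset of the strongest vibration burst."""
    # Short-time RMS envelope over non-overlapping windows.
    n_win = len(signal) // window
    env = [
        math.sqrt(sum(s * s for s in signal[i * window:(i + 1) * window]) / window)
        for i in range(n_win)
    ]
    # Locate the strongest window and set a threshold relative to it.
    peak_idx = max(range(n_win), key=env.__getitem__)
    threshold = fraction * env[peak_idx]
    # Walk back from the peak to the first window of that burst above the threshold.
    onset = peak_idx
    while onset > 0 and env[onset - 1] >= threshold:
        onset -= 1
    return 1000.0 * onset * window / fs_hz

# Synthetic record: 100 ms at 10 kHz with low-level background vibration,
# a weaker early event at 20 ms, and the strongest, decaying burst at 60 ms.
fs = 10_000
sig = [0.01 * math.sin(0.3 * n) for n in range(1000)]
for n in range(200, 260):
    sig[n] += 0.3 * math.sin(2.0 * n)
for n in range(600, 800):
    sig[n] += math.exp(-(n - 600) / 50) * math.sin(2.0 * n)

print(closing_time_from_vibration(sig, fs))
```

The time resolution of this sketch equals the window length divided by the sampling rate (0.8 ms here), so a practical implementation would need a finer onset estimate to resolve millisecond-level drifts.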
Figure 8(a) shows that the closing time increases with increasing operation number and is an indicator of the drive aging. The difference between the closing times determined by the two methods described previously is plotted in Figure 8(b) and confirms that the closing time can be measured with similar quality by both methods. In this specific example, we have not yet applied ML methods for the data analysis, but the result already highlights the potential for continuous, nonintrusive online condition monitoring.
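One simple way to quantify such an agreement between two measurement methods is to look at the mean and the spread of the paired differences. The sketch below does this for made-up values; the function name and the numbers are assumptions for illustration and do not reproduce the data behind Figure 8(b).

```python
from statistics import mean, stdev

def agreement(direct_ms, vibration_ms):
    """Return (bias, spread) of the paired differences vibration - direct, in ms."""
    diffs = [v - d for d, v in zip(direct_ms, vibration_ms)]
    return mean(diffs), stdev(diffs)

# Synthetic closing times in ms for four operations (illustrative only).
direct = [60.0, 61.0, 62.0, 63.0]
vibration = [60.5, 61.0, 62.5, 63.0]

bias, spread = agreement(direct, vibration)
print(bias, spread)
```

A bias close to zero and a spread that is small compared with the roughly 10 ms increase over the breaker's life would indicate that the indirect measurement tracks the degradation as well as the direct one.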
In the next step, a simple ML method is applied to the same dataset, and the algorithm learns without additional input of domain knowledge. Figure 9 depicts the results from a principal component analysis (PCA), which is one of the most common ML approaches for dimensionality reduction and is often used as a basis for clustering. In this analysis, a large set of features is extracted from the measured vibration signals, e.g., peak value, root mean square, latch time, time to reach peak, main frequency components, and so forth, and the PCA algorithm learns the linear combinations of these features that maximize the variance of the projected data points. In the resulting low-dimensional space, measurements that are similar to each other lie close together, so groups of similar measurements can be distinguished from other groups.
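The pipeline described above can be sketched in a few lines of Python with NumPy. The snippet extracts four of the named features (peak value, root mean square, time to reach peak, and dominant frequency) from each vibration record, standardizes them, and computes the principal components with a singular value decomposition. The synthetic records, the parameter values, and the function names are assumptions for illustration; the project's actual feature set is larger and is not reproduced here.

```python
import numpy as np

def extract_features(signal, fs_hz):
    """Return [peak, RMS, time to peak (s), dominant frequency (Hz)] for one record."""
    signal = np.asarray(signal, dtype=float)
    abs_sig = np.abs(signal)
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs_hz)
    return np.array([
        abs_sig.max(),
        np.sqrt(np.mean(signal ** 2)),
        abs_sig.argmax() / fs_hz,
        freqs[spectrum[1:].argmax() + 1],  # skip the DC bin
    ])

def pca(features, n_components=2):
    """Project standardized features onto the directions of largest variance."""
    std = features.std(axis=0)
    std[std == 0] = 1.0  # leave constant features unscaled
    x = (features - features.mean(axis=0)) / std
    _, s, vt = np.linalg.svd(x, full_matrices=False)
    scores = x @ vt[:n_components].T
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]

def synthetic_record(rng, delay_s, amplitude, freq_hz, fs_hz=10_000, n=2000):
    """Decaying sinusoidal burst starting at delay_s plus weak noise (made-up model)."""
    t = np.arange(n) / fs_hz
    burst = np.where(t >= delay_s, np.exp(-(t - delay_s) / 0.005), 0.0)
    clean = amplitude * burst * np.sin(2 * np.pi * freq_hz * (t - delay_s))
    return clean + 0.01 * rng.standard_normal(n)

rng = np.random.default_rng(0)
# "Early" records: tight timing; "late" records: delayed, weaker, more scattered.
early = [synthetic_record(rng, 0.060 + 0.0005 * rng.standard_normal(), 1.0, 800.0)
         for _ in range(50)]
late = [synthetic_record(rng, 0.070 + 0.002 * rng.standard_normal(), 0.7, 700.0)
        for _ in range(50)]

features = np.array([extract_features(r, 10_000) for r in early + late])
scores, explained = pca(features)
print(scores.shape, explained)
```

Standardizing the features before the decomposition keeps quantities with large numerical ranges, such as frequency in hertz, from dominating the principal components purely because of their units.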
Figure 9 shows the result of the evaluated measurements from the first 2,000 close operations and the last 2,000 before failure, reduced to two principal components (PC1 and PC2). A clear separation between two clusters can be seen and used for further analysis, e.g., for estimating an HI or RUL. We can also observe that the cluster of the first 2,000 operations is more compact than the cluster of the last 2,000, implying that the data from the first 2,000 operations are much more similar and stable. As the breaker degrades, the data points become less tightly clustered in the principal component space.
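Compactness and separation of two clusters in the PC1/PC2 plane can be expressed with simple distance measures. The following sketch uses the mean distance to the centroid as a dispersion measure and the centroid distance divided by the summed dispersions as a separation ratio. Both definitions and the example points are assumptions chosen for illustration, not quantities reported in the article.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of (PC1, PC2) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def dispersion(points):
    """Mean Euclidean distance of the points to their centroid (smaller = more compact)."""
    c = centroid(points)
    return sum(math.dist(p, c) for p in points) / len(points)

def separation_ratio(a, b):
    """Centroid distance divided by the summed dispersions (heuristic measure)."""
    return math.dist(centroid(a), centroid(b)) / (dispersion(a) + dispersion(b))

# Made-up points standing in for early (compact) and late (spread-out) operations.
first_ops = [(-2.1, 0.0), (-1.9, 0.0), (-2.0, 0.1), (-2.0, -0.1)]
last_ops = [(1.0, 0.0), (3.0, 0.0), (2.0, 1.0), (2.0, -1.0)]

print(dispersion(first_ops), dispersion(last_ops))
print(separation_ratio(first_ops, last_ops))
```

Tracking such a dispersion value over a sliding window of operations would be one possible way to turn the loss of compactness into a degradation indicator.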
The opportunities offered by low-power sensing technology, remote connectivity, and equipment health estimations are manifold, and we have witnessed the start of their use for intelligent monitoring of equipment in the electric power grid. Utilities aim to change to predictive maintenance strategies to reduce the costs of operation caused by maintenance, better plan the maintenance schedules based on the condition of the individual components, potentially increase the lifetime of healthy components beyond the initially planned scope, and adapt to the changes in equipment stress caused by the energy transition and the increasing integration of strongly fluctuating renewable energy sources. Significant R&D in this area can currently be seen, and the use of these technologies is expected to grow exponentially.
We presented a project that aims to develop algorithms that predict the HI and RUL of high-voltage circuit breakers. A publicly available run-to-failure dataset was measured on one VCB with more than 25,000 opening and closing operations. Further mechanical run-to-failure measurement series with gas circuit breakers are ongoing to explore the transferability of algorithms among different types of breakers. In addition, an experiment on circuit breakers interrupting a current load will be performed. The ultimate goal is to deploy IoT-based sensor platforms to circuit breakers in substations and test the methods under real-world conditions.
C.-C. Hsu, Aug. 3, 2022, “Run-to-Failure Vacuum Circuit Breaker Mechanical Test Dataset,” distributed by ETH Zurich, doi: 10.3929/ethz-b-000544221.
“Deep in thought: Applying artificial intelligence in the grid,” IEEE Power Energy Mag., vol. 20, no. 3, pp. 1–92, May/Jun. 2022.
J. Dalzochio et al., “Machine learning and reasoning for predictive maintenance in Industry 4.0: Current status and challenges,” Comput. Ind., vol. 123, Dec. 2020, Art. no. 103298, doi: 10.1016/j.compind.2020.103298.
E. Hossain, I. Khan, F. Un-Noor, S. S. Sikander, and M. S. H. Sunny, “Application of big data and machine learning in smart grid, and associated security concerns: A review,” IEEE Access, vol. 7, pp. 13,960–13,988, Jan. 2019, doi: 10.1109/ACCESS.2019.2894819.
The project is funded by the Swiss Federal Office of Energy Research program Energy Research and Cleantech.
Christian Michael Franck is with the Swiss Federal Institute of Technology, Zürich, 8092, Switzerland.
Chi-Ching Hsu is with the Swiss Federal Institute of Technology, Zürich, 8092, Switzerland.
Yu Xiao is with Xi’an Jiaotong University, Xi’an, 710049, China. He is currently an exchange researcher at the Swiss Federal Institute of Technology, Zürich, 8092, Switzerland.
Pascal Bleuler is with the Swiss Federal Institute of Technology, Zürich, 8092, Switzerland.
Gaetan Frusque is with the Swiss Federal Institute of Technology, Lausanne, 1001, Switzerland.
Mahir Muratovic is with the Swiss Federal Institute of Technology, Zürich, 8092, Switzerland.
Tommaso Polonelli is with the Swiss Federal Institute of Technology, Zürich, 8092, Switzerland.
Digital Object Identifier 10.1109/MPE.2022.3230968