Farid Katiraei, Samaneh Morovati, Shadi Chuangpishit, Seyyed Ali Ghorashi
The rapidly growing number of hyperscale data centers (DCs) with predominantly artificial intelligence (AI) loads, combined with a regulatory environment that promotes clean energy, points toward greater use of distributed energy resources (DERs). Key drivers are regulatory mandates to achieve carbon-free operation and reduce dependency on the electric grid. Although adding renewable resources in the form of DERs to DC infrastructure introduces significant design and operational challenges, it also brings opportunities to implement sophisticated control and automation that further capture DER value by enhancing grid reliability and resilience and, depending on the location of the DCs, by participating in demand response (DR) programs.
The use of gas or diesel engines for emergency backup power has increased DCs’ carbon footprints to a level closely comparable to the global carbon footprint of the airline industry. As a result, DC hyperscalers have been looking into increasing the penetration level of clean DERs at various sites. The DERs can take the form of photovoltaic (PV) systems, large-scale battery energy storage systems (BESSs), and green fuel cell technologies. It will become essential to determine the possibility and impact of participating in the ancillary service market or incorporating control platforms for peak load management. The ultimate approach can also include achieving independence from the utility and/or internal distribution bottlenecks as well as introducing alternative ways to manage redundancy and reliability in design. Control platforms such as virtual power plants (VPPs) can provide some of the expected features and capabilities for increasing the value of adding DERs to DCs.
The primary goal of this article is to outline the techno-economic challenges of utilizing clean and reliable DERs as a part of DCs’ medium-voltage (MV) and low-voltage (LV) power distribution systems. Because DER technologies and controllers are still evolving, and because integration and the use of optimization schemes are complex, comprehensive testing of the VPP platform for key applications such as DR and demand-side management is becoming even more important. This article also describes advanced power system topologies and complex control architectures in the context of DC-VPP for single and multiple DCs with one or more flexible market application models. The business case of DC-VPP is discussed, and methods of enhancing the implementation approaches are defined to quantify the benefits (including revenues) of traditional controllers versus more intelligent and robust controllers, such as AI-based schemes.
Historically, DCs have been considered large consumers of electricity, with more than 100 MW of capacity per site, requiring high reliability and fully redundant solutions. DCs typically have varying reliability through the electrical and mechanical infrastructure from the utility to the rack power supply. For example, at the utility level, a common requirement is to have more than two independent transmission sources. At the DC substation level, N+1 redundancy is nowadays being replaced by 2N. MV distribution is already aggregating on-site generation in the form of generator farms or large-scale batteries. At the LV distribution level, commonly used rack-level batteries are being pressure tested by the development of aggregated in-row battery backup solutions.
Reliable multiple utility infeeds are becoming increasingly scarce, and prime industrial/government locations are already congested and reaching power transmission and distribution (T&D) limits. A further problem, temporarily reduced reliability due to U.S. grid restructuring (including the weakening of the grid by the large penetration of renewables), is seldom taken into account when defining annual reliability and availability. New trends in the deployment of diesel gensets for power backup are also observed. DCs that previously counted on highly reliable power systems are reevaluating their strategy of adding on-site gensets due to recent weather-related events that brought down two independent sources. Hence, in some cases, partial backup power with gensets and batteries is being deployed as a solution for very critical clusters running generative AI (gen AI) loads. The carbon-neutral initiatives of many mega/hyperscalers, which supported and purchased power exclusively from the renewable sector, were recently replaced with real-zero initiatives. These approaches consider different/new design options that incorporate cleaner and environmentally sustainable resources.
Example initiatives are installing rooftop or ground-mount solar PV systems and pairing them with energy storage systems (ESSs). However, without exploring energy or capacity market participation use cases, the business case for a solar-plus-storage system will not be economically viable at the scales needed to manage DC loads. In fact, utilizing excess generation or managing DCs’ peak demand during normal (blue-sky) conditions may take priority over redesigning a large DC as a microgrid for reliability and resiliency. The key factor that will influence the size and source of backup power is a new trend in building modular DCs for edge computing. As illustrated in Figure 1, decarbonization, reliability, and load management (to achieve congestion relief) are the challenges facing hyperscale DCs and influencing the design of the next generation of DCs. They are discussed in more detail in this article.
Figure 1. Challenges for the next generation of DCs.
The growing demand for digitalization is the primary reason for building new DCs with a high level of security and reliability. New norms, such as 3D virtual work environments, and the accelerated use of gen AI digital platforms and cloud-based services are generating a surge in DC developments. In the past, IP traffic, server workload, and storage capacity were the key indicators for forecasting demand and its relationship with energy usage in DCs. However, as demonstrated in Data Center Handbook by Hwaiyu Geng, the way data are transferred, stored, and managed is changing explosively, driven by 5G networks, the Internet of Things, and AI, all of which also require highly reliable, secure, and resilient data processing and storage infrastructure. To address these challenges, new DC designs should have a flexible and reliable topology from the power supply perspective. With data growing exponentially, DCs are seeing significant increases in power consumption and carbon footprint. As shown in Figure 2, power consumption in the U.S. market alone is expected to roughly triple from 2023 to 2030, and some literature estimates about double that. Using sustainable and reliable energy resources, such as solar/wind resources paired with an ESS, to manage load resilience and reduce the carbon footprint should be a path considered by all DCs.
Figure 2. Hyperscaler power consumption growth by 2030 (https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/investing-in-the-rising-data-center-economy).
In spite of this year’s stagnation in the number of DCs deployed, new exponential growth supporting gen AI and applications such as the metaverse is expected, and the amount of energy used is therefore expected to double every four years. Hence, the carbon footprint of this industry is growing at an exponential rate. Currently, DCs account for as much as 3% of global carbon emissions, based on a report by Energy Digital. Incorporating new building materials, deploying highly utilized racks, carefully selecting a cooling strategy, and adopting clean energy backup practices, with clear sustainability targets for a DC’s modern electrical network, should be among the ultimate goals of a new design to reduce carbon footprints. Investing in grid-edge technologies deployed closer to the end user, such as automation tools, and/or deploying ESSs to provide a robust and flexible power supply are potential solutions for new DCs to incorporate in their topologies. However, managing the renewable resources so that they are ready and available to supply critical loads during unexpected events is a challenge ahead.
There is an old saying that a chain is no stronger than its weakest link. In the DC market, uptime availability describes how resilient the DC infrastructure is. DC uptime, defined as the guaranteed annual availability of a DC, depends on the core business goals and translates into measures of “nines.” Consequently, a lot of time and money is invested in improving redundancy, recovery processes, and certifications. As the “nines” uptime increases from two (99%), which reflects a downtime of 7 h and 22 min per month, to three (99.9%), four (99.99%), and five (99.999%), the downtime decreases. In general, five nines are considered a reasonably high reliability. With six nines (or 99.9999%), an average customer would experience about 2.6 s of downtime per month (or fewer than 32 s per year). Since milliseconds of latency can lead to millions of dollars in lost revenue, hyperscalers continue to develop new topologies and technologies to remain competitive.
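The link between the number of nines and the permitted downtime is plain arithmetic, and a short sketch makes the quoted figures easy to check. The function name `allowed_downtime_seconds` and the month convention (one-twelfth of a 365.25-day year) are assumptions made here for illustration; with a different month length, the minutes obtained for two nines shift slightly.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600      # assumption: average (Julian) year
SECONDS_PER_MONTH = SECONDS_PER_YEAR / 12  # assumption: average month


def allowed_downtime_seconds(nines: int, period_seconds: float) -> float:
    """Return the downtime permitted by an availability of `nines` nines.

    Two nines means 99% availability, three nines 99.9%, and so on.
    """
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    unavailability = 10.0 ** (-nines)
    return unavailability * period_seconds


if __name__ == "__main__":
    # Print the monthly and yearly downtime budget for two to six nines.
    for n in range(2, 7):
        per_month = allowed_downtime_seconds(n, SECONDS_PER_MONTH)
        per_year = allowed_downtime_seconds(n, SECONDS_PER_YEAR)
        print(f"{n} nines: {per_month:10.1f} s/month, {per_year:10.1f} s/year")
```

Under these assumptions, six nines gives roughly 2.6 s per month and just under 32 s per year, in line with the figures above.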
Preferences for the DC location can be challenged by grid congestion—a major obstacle to providing adequate secured supply to the critical load when grid overload occurs and transmission lines are not able to transfer enough power. Figure 3 presents the estimated number of DCs and locations in the top 10 congested PJM regions as a sample for the United States based on the PJM 2022 quarterly state of the market report. Each circle represents the number of DCs color-coded based on the congestion cost associated with the area, with the congestion cost increasing from yellow to red to purple. As seen, more than 90 DCs are in areas with transmission congestion. Due to capacity limits and the high cost of operation, the electricity supply may not be offered at a wholesale contract with a flat rate in such congested areas, which further adds to the cost of DC operation and the complexity with the application of time-of-day pricing and high surge charges for excessive demands.
Figure 3. The estimated number of DCs in PJM congested areas, sample locations (https://map.datacente.rs; the congestion map is from the PJM 2022 quarterly state of the market report).
A possible resolution can be achieved through the deployment of clean generation that is a part of the supply mix along with applying reliable load management scenarios to balance demand and supply locally and minimize the import from the grid in such congested areas during peak hours with very high electricity prices. In this way, a DC’s critical load will be supplied with the added clean generation to reduce dependency on the utility grids. However, control and monitoring technologies to achieve reliability and dependability plus resource aggregation should be precisely implemented to overcome this challenge.
Rapid growth in enterprise and Internet computing has increased server loads in DCs, creating load management challenges in energy consumption as well as in data transfer and storage. A study performed by McKinsey & Company estimates that U.S. DC demand will grow by 10% annually until 2030. Reducing energy costs and managing critical loads during grid failures are major concerns in the DC industry. Regarding the former concern, to ensure optimal performance, DCs should implement energy and environmental metering to benchmark and track performance. Moreover, avoiding a mission-critical disruption by deploying better load management through advanced control systems is a key factor for the future optimal operation of DCs.
DC load analysis to determine load characteristics (load type) and forecast load fluctuations (load profile) is another factor for the DC design and implementation of an optimized energy usage algorithm—enabled by a VPP. As examples, generic load characteristics of three major types of hyperscalers are summarized in Table 1, and the load breakdown percentage is shown in Figure 4. DC load types may vary in terms of size, distribution, and characteristics of critical and noncritical loads. The critical loads include IT devices, storage assets and network loads (servers), and some of the essential cooling systems associated with server rooms. The noncritical loads cover lighting, mechanical building supplies, and auxiliary loads that may include some of the noncritical cooling systems.
Figure 4. Hyperscaler load breakdown percentage. (Some data are extracted from “Demand Response Opportunities and Enabling Technologies for Data Centers: Findings From Field Studies,” Girish Ghatikar et al., Lawrence Berkeley National Laboratory, 2012.)
Table 1. Hyperscaler load characteristics (see the color coding in Figure 4).
The load profile varies based on the location, size, and type of activities. As an example, a hyperscaler’s DC for daily application subscribers might experience a flat load profile, as the data flow and application usage are consistent during the entire day. A fluctuating load profile can be attributed to high-performance computing or search engines, whose utilization varies significantly during the evenings, weekends, and holidays compared to other hours of the day. DCs with variable load profiles are capable of participating in DR programs during peak hours and of benefiting from offering some of the on-site generation resources (deployed for peak demand management) to capacity markets. Based on the load profiles in Figure 4, Hyperscaler 3 is a good candidate for implementing DR using VPP, while for Hyperscalers 1 and 2, the focus of VPP application selection should be predominantly on the reliability and decarbonization of the power supply.
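One simple way to separate flat from fluctuating profiles is the load factor, the ratio of average to peak load. The sketch below is only illustrative: the 0.85 cutoff, the function names, and the sample profiles are hypothetical and are not taken from Figure 4 or Table 1.

```python
from statistics import mean

# Hypothetical cutoff: profiles with a load factor below this value are
# treated as "fluctuating". The value is an assumption, not an article figure.
DR_LOAD_FACTOR_THRESHOLD = 0.85


def load_factor(profile_mw):
    """Average load divided by peak load (1.0 means a perfectly flat profile)."""
    if not profile_mw:
        raise ValueError("profile must contain at least one sample")
    peak = max(profile_mw)
    if peak <= 0:
        raise ValueError("peak load must be positive")
    return mean(profile_mw) / peak


def suggested_vpp_focus(profile_mw, threshold=DR_LOAD_FACTOR_THRESHOLD):
    """Map a load profile to the VPP focus suggested by the discussion above."""
    if load_factor(profile_mw) < threshold:
        return "demand response"
    return "reliability and decarbonization"


if __name__ == "__main__":
    flat_profile = [100.0] * 24                    # hypothetical hourly MW values
    variable_profile = [60.0] * 12 + [100.0] * 12  # hypothetical hourly MW values
    print(suggested_vpp_focus(flat_profile))
    print(suggested_vpp_focus(variable_profile))
```

A real screening would also need to account for seasonality, contractual constraints, and how much of the load is critical.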
The proposed solution to overcome the challenges discussed with the traditional design of DC is to increase the reliance on local power supply rather than implementing multiple utility infeeds with 100% redundancy. Electric utilities have started implementing similar approaches to avoid the high cost of T&D upgrades. The localized solutions are typically called non-wires alternatives (NWAs), which rely on the use of DERs to provide grid support functions—in contrast to implementing capital upgrade projects that are wired methods, such as increasing cable sizes and adding more substations and lines. The T&D deferral in an NWA approach is achieved through the deployment of renewable/clean DERs and pairing them with ESSs. The NWA approach, empowered by advanced monitoring, control, and automation systems (for instance, using a VPP), can achieve a high level of reliability, resilience, and dependability while minimizing carbon footprint when properly managed and coordinated with load/resource forecasting and daily operational need assessment.
Depending on the criticality of the loads, DC facilities need to have backup equipment and redundant design to face expected and unexpected grid events or internal equipment failure. Three main reliability targets for a DC design are N+1, 2N, or 2N+1 when accounting for distributed redundancy, all of which require the duplication of equipment (transformers, feeders, etc.) and a higher level of contracted grid capacity for redundancy (see Figure 5). In addition, to address long-term interruptions, standby backup power units may be utilized as alternative sources connected directly to the LV buses that supply critical load centers.
Figure 5. A schematic of conventional topologies used for DCs.
To meet the redundancy requirement, each DC site is normally supplied from two fully redundant utility sources, each sized to carry the entire peak DC load. For instance, if the peak load of a benchmark DC is 120 MW, then there are two infeeds (Source A and Source B) designated for supplying the DC load. As shown in Figure 5, in normal operation, half of the DC load is assigned to one source, and the other half is fed from the second source. The load transfer schemes implemented at the high-voltage (HV) or MV bus can facilitate full access to the second source upon the loss of either source or the loss of a main transformer (in a 2N design). However, having access to utility sources with 100% redundancy and enough reserve capacity will be a major challenge in congested areas that are also in high demand for building new DCs. This presents a good opportunity for using VPP-driven local DERs for load and reliability management.
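The loading implied by this arrangement can be written out for the 120-MW benchmark. The sketch below assumes an even split between the two sources in normal operation, as described above; the function and key names are illustrative only.

```python
def two_n_infeed_loading(peak_load_mw: float) -> dict:
    """Loading of two fully redundant infeeds, each sized for the entire peak load."""
    if peak_load_mw <= 0:
        raise ValueError("peak load must be positive")
    rating_per_source = peak_load_mw   # each source can carry the full peak
    normal_share = peak_load_mw / 2    # even split in normal operation
    return {
        "rating_per_source_mw": rating_per_source,
        "normal_load_per_source_mw": normal_share,
        "load_on_remaining_source_after_loss_mw": peak_load_mw,
        "normal_utilization": normal_share / rating_per_source,
        "total_reserved_capacity_mw": 2 * rating_per_source,
    }


if __name__ == "__main__":
    # Benchmark DC from the text: 120 MW peak load, Source A and Source B.
    for key, value in two_n_infeed_loading(120.0).items():
        print(f"{key}: {value}")
```

For the benchmark, each infeed carries 60 MW at peak in normal operation, so half of the 240 MW reserved from the utility sits idle as redundancy.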
To manage clean energy portfolio mandates, DC owners normally have contracts with renewable energy developers to offset their carbon footprint with financial support provided to large-scale wind and solar farms. The concept is called net-zero carbon, which is very different from real-zero carbon emission or a zero-carbon footprint. The reason is that DCs are normally deployed in areas that have access to a significant amount of fossil fuel-based backup power sources, which changes the perspective on how effective greenhouse gas emission offsets would be. Even though backup gensets are rarely utilized, the presence of those units and the need to operate them from time to time for maintenance add to the carbon footprint.
In a potential case of a widespread blackout, when a DC would lose both utility sources, there is typically a generation farm (gen farm), connected at the MV level, that acts as a backup power supply to take over the DC load. The generation type commonly used for a gen farm is the diesel generator. A counterargument offered in this article is that the traditional gen farms and 100% redundancy at the utility source can be reduced or replaced by a design that incorporates DERs and distributed control and automation at various levels of the power delivery chain in a DC.
Considering all discussed challenges, this article proposes VPP-embedded DCs as a novel approach for realizing NWA methods. The proposed solutions are based on emerging DER technologies, coordinated through VPP control and monitoring platforms to provide reliability and manage excess demands that cannot be served by local electric utilities. However, any discussions around the use of DERs immediately get challenged from two aspects.
The aim is to address the aforementioned challenges and provide some reference designs with quantitative analysis.
Typical DER technologies that can be considered for DCs are BESSs, hydrogen fuel cells, solar PV systems, and wind power plants. The BESSs are normally short-term energy resources economically viable for providing 2–8 h of reserve capacity. They are also paired with intermittent renewable resources (PV and wind) to enhance the availability and utilization factors. However, none of them can provide a 100% firm capacity for extended outages.
For any outage that requires multiple days of supply, a long-term energy storage (LTES) solution, which can be of a hydrogen-based fuel cell nature or a combination of BESSs with fuel cells, may be considered. Table 2 provides a comparative analysis of the feasibility of applying various clean DER technologies at DCs. Note that the quantitative values are excluded for simplicity. Industry norms, based on several publications by national laboratories and energy agencies, are used for the analysis. The analysis incorporates key factors such as the following:
Table 2. The feasibility of applying certain DERs at DCs.
Based on Table 2, an ESS is presently the most viable DER for DC application. However, due to the limitation of supply, a hybrid solution that utilizes ESSs with other clean resources would have a higher feasibility of supplying critical load over multiple days. For instance, recently, there have been extensive discussions around pairing fuel cells and ESSs to achieve a multiday supply duration, but this has implications on costs.
Traditional N+1 redundancy can be improved to a hybrid 2N scenario with DERs distributed at LV buses accessible through a transfer bus supplying multiple server racks. A generic schematic of the newly designed topology for DCs is illustrated in Figure 6.
Figure 6. Example topology and locations for adding DERs/ESSs.
The DER at this level can be an ESS with 2–4 h of load-serving capability. To achieve a zero-carbon footprint, the gen farm in the proposed approach is also replaced with an alternative gen farm that uses LTES technologies. The LTES is a rapidly emerging area that typically incorporates a combination of conventional ESSs with other renewable and clean power resources, such as green hydrogen and fuel cell systems.
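A first-pass energy rating for such an ESS follows from the load it must carry and the required duration. The sketch below is a rough sizing aid under stated assumptions: the 10-MW load is a hypothetical value, and `usable_fraction` is an assumed parameter standing in for depth-of-discharge and efficiency limits that are not quantified in this article.

```python
def ess_energy_mwh(load_mw: float, hours: float, usable_fraction: float = 1.0) -> float:
    """Nameplate energy needed to serve `load_mw` for `hours`.

    `usable_fraction` is the share of nameplate energy that can actually be
    delivered (an assumed parameter; 1.0 ignores all losses and limits).
    """
    if load_mw < 0 or hours < 0:
        raise ValueError("load and duration must be non-negative")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return load_mw * hours / usable_fraction


if __name__ == "__main__":
    hypothetical_load_mw = 10.0  # illustrative load on one transfer bus
    for duration_h in (2, 4):
        energy = ess_energy_mwh(hypothetical_load_mw, duration_h)
        print(f"{duration_h} h of supply: {energy:.0f} MWh")
```

Multiday outages scale the same expression linearly, which is why the text turns to LTES technologies for the gen farm.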
In the proposed approach, excess online generation from the DERs and alternative gen farms is available for grid support during congestion or widespread outage events, resulting in the elimination of gensets, reducing operational costs, and minimizing (or zeroing out) the carbon footprint while achieving the reliability targets.
Considering hybrid 2N redundancy by adding DERs and replacing gen farm technology with ESSs, a benchmark example is provided to illustrate the potential reduction in DC reserved capacity from utilities as a benefit of the proposed approach. The base case used for the benchmark represents a DC with four buildings, sized at 45 MW per building or a peak load of 180 MW. The DC topology and infrastructure deployment are designed with two infeed sources (Source A and Source B), each of them rated for 120 MW in normal operation and for 180 MW for short-term contingency (4 h). This is reflected in the transformer’s nominal rating of 120 MVA and a forced-cooled rating of 180 MVA.
Historical load analysis for this DC shows that the average load for each building is around 30 MW in most cases, or about 120 MW for the site. Hence, the utilization factor is about 67% of the deployed capacity and available infrastructure. In an alternative design, the difference between the peak and average load (about 33% of 180 MW, or 60 MW) can be used to size ESSs for the alternative gen farm. With a 60-MW ESS and 120 MW of contracted capacity from the grid, the peak load can be supplied within the proposed design. This approach would reduce the capacity reserved from the grid by about 80 MW. The transformer can also be resized to 80/120 MVA, which is a significant saving in infrastructure.
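The benchmark numbers can be reproduced with a few lines of arithmetic. The sketch below assumes, as in the example, that the grid contract is set to the site average load and that the ESS covers the remaining gap to the peak; the function and key names are illustrative, and the 80-MW reduction and 80/120-MVA transformer rating are not derived here.

```python
def benchmark_sizing(buildings: int, peak_per_building_mw: float,
                     average_per_building_mw: float) -> dict:
    """Site-level figures for the benchmark DC described in the text."""
    peak = buildings * peak_per_building_mw
    average = buildings * average_per_building_mw
    return {
        "peak_load_mw": peak,
        "average_load_mw": average,
        "utilization_factor": average / peak,
        "ess_power_mw": peak - average,  # ESS sized to the peak-to-average gap
        "grid_contracted_mw": average,   # assumption: contract equals average load
    }


def covers_peak(grid_mw: float, ess_mw: float, peak_mw: float) -> bool:
    """True if contracted grid capacity plus ESS power can supply the peak load."""
    return grid_mw + ess_mw >= peak_mw


if __name__ == "__main__":
    # Four buildings, 45 MW peak and about 30 MW average per building.
    result = benchmark_sizing(4, 45.0, 30.0)
    for key, value in result.items():
        print(f"{key}: {value}")
    print("peak covered:", covers_peak(result["grid_contracted_mw"],
                                       result["ess_power_mw"],
                                       result["peak_load_mw"]))
```

The check covers power only; how long the 60-MW ESS must sustain the peak determines its energy rating and is a separate sizing question.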
However, adding multiple DER assets to the DC in the new design requires real-time control and monitoring technologies to achieve the following:
In addition, several challenges related to the parallel operation of the DERs and ESSs with the DC power delivery system should be analyzed and evaluated to avoid any adverse impact on the primary business focus. Example challenges are related to the fault current contribution of DERs connected to the MV buses on the circuit breakers ratings when operating in parallel with the main grid and the potential adverse impact on the voltage regulation schemes due to the interaction with on-load tap changers of main power transformers.
One way to achieve the highest value and ensure maximum reliability is to utilize an ESS as a replacement for any secondary generation units that are historically used for maintenance and short-term transfer. Figure 7 shows an example schematic for incorporating an ESS at the transfer bus level that would have access to multiple rows of the DC server racks. The access is limited and shared by all the server lineups that are used only during emergencies or maintenance. In this case, all critical loads would have access to a green source.
Figure 7. A distributed ESS at the server rack level. CB: circuit breaker.
To properly manage the operation and achieve the highest value of DERs, the VPP concept is proposed to be embedded into DC controls. The VPP provides value-added functionalities when paired with existing control and automation schemes of a typical DC to ensure that DERs are available for internal use or properly offered into the market at the right time and prices to achieve the full benefits of the proposed approach for hyperscalers.
A VPP can generally be defined as a digital platform that assesses and controls the underutilized capacity of local generation and storage units associated with a legal entity to participate in wholesale electricity markets. The VPP operational system has been transitioning from its traditional form to the next generation, which includes AI-based controls along with forecasting and predictive algorithms and makes heavy use of edge computing, primarily targeting the management and optimization of large numbers of loads and resources. Dedicated VPP platforms may be implemented at various buildings within a DC or applied to multiple DC campuses within an electric utility operating region offering common pricing methods and market applications.
In an optimally designed VPP platform for a DC, a purpose-built intelligent controller economically manages the DERs, including ESSs, across the various power supply options to maintain reliability for both critical (such as IT equipment) and noncritical loads. In most cases, VPPs are cloud-based computation and communication systems to ensure the highest uptime.
A VPP platform applied across multiple DCs in a region is illustrated in Figure 8. Each DC could have a large percentage of inactive servers that are parts of the critical load demand calculation. Hence, real-time load assessment and forecasting tools can identify excess generation online to be offered into a market for providing additional revenue that can improve the return on investment of a VPP deployment.
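The idea of identifying excess online generation across several DCs can be illustrated with a minimal aggregation sketch. Everything in it is hypothetical: the `DataCenterState` fields, the reserve margin, and the sample values are assumptions for illustration and do not describe a particular VPP product or market rule.

```python
from dataclasses import dataclass


@dataclass
class DataCenterState:
    name: str
    online_der_mw: float         # DER capacity currently online (hypothetical input)
    reserved_for_load_mw: float  # capacity the load forecast says must be held back


def excess_mw(dc: DataCenterState, reserve_margin_mw: float = 0.0) -> float:
    """Capacity at one DC that is not needed for its own forecast load."""
    return max(0.0, dc.online_der_mw - dc.reserved_for_load_mw - reserve_margin_mw)


def aggregate_offer_mw(dcs, reserve_margin_mw: float = 0.0) -> float:
    """Total capacity the VPP could offer, summed over all DCs in the region."""
    return sum(excess_mw(dc, reserve_margin_mw) for dc in dcs)


if __name__ == "__main__":
    fleet = [
        DataCenterState("DC-A", online_der_mw=60.0, reserved_for_load_mw=40.0),
        DataCenterState("DC-B", online_der_mw=30.0, reserved_for_load_mw=35.0),
    ]
    print(aggregate_offer_mw(fleet))                         # no margin held back
    print(aggregate_offer_mw(fleet, reserve_margin_mw=5.0))  # 5 MW margin per DC
```

Clamping each site at zero keeps a shortfall at one DC from being netted against a surplus at another, which is a conservative choice when the sites cannot physically share power.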
Figure 8. Multiple DCs under a VPP platform.
Among potential applications for VPP-embedded DCs, the focus of this article is on certain primary and secondary use cases that can address the challenges highlighted in Figure 1. Figure 9 illustrates the primary and secondary use case categories, including congestion relief, load regulation, and backup power for primary use cases and market participation and resource aggregation for secondary use cases.
Figure 9. The VPP use case categories.
In this section, the primary use cases of VPP in the context of DCs are discussed in more detail.
In this section, the secondary use cases of VPP in the context of DCs are discussed in more detail.
This section details the VPP architecture and general components, including the VPP control system and communication network. The components of a VPP-embedded DC can be customized based on the size, load criticality, and potential market participation programs. The key building blocks of control and operation are
A generic VPP architecture is shown in Figure 10. A VPP control system can manage multiple interconnected VPP platforms for a building with a DC or for multiple DCs in an operating region. In the latter case, hyperscalers with multiple DCs in one region can get the benefit of resource aggregation, load management, and market participation.
Figure 10. A generic VPP architecture.
The VPP control platform includes the technical and nontechnical functions outlined in this section. Basic and advanced functions are the main two categories of control functions.
Table 3 describes the basic functions that deal with dispatching and direct active and reactive power controls for DERs, DR management, and market participation. Table 4 describes the advanced functions to further enhance VPP performance based on the applicable use cases for a DC. The advanced VPP functions complement DER core use cases in a DC to expand the services for grid support and stability improvement (such as frequency regulation and voltage control).
Table 3. VPP basic control functions.
Table 4. VPP advanced control functions.
This article examined the use case for a VPP-embedded DC as a new approach to manage high demand, provide congestion relief, and minimize carbon footprint by applying renewable and clean DERs (generation and storage). The key focus has been on assessing the applicable topologies that can best benefit from the presence of DERs. In addition, the operational challenges and potential impact of adding new centralized or distributed resources in a DC are discussed. It has been noted that coordinated control and monitoring are essential for ensuring superior reliability and that redundancy can be achieved for supplying critical loads, which would be the primary role of the VPP platform. Depending on the availability of the resources and DC load characteristics, there may be additional opportunities for DR and market participation to support the grid and realize new revenue streams for offsetting the cost of VPP.
H. Geng, Data Center Handbook. Hoboken, NJ, USA: Wiley, 2015.
D. Gmach, J. Rolia, L. Cherkasova, and A. Kemper, “Capacity management and demand prediction for next generation data centers,” in Proc. IEEE Int. Conf. Web Services (ICWS), Piscataway, NJ, USA: IEEE Press, 2007, pp. 43–50.
P. Tandukar, L. Bajracharya, T. M. Hansen, R. Fourney, U. Tamrakar, and R. Tonkoski, “Real-time operation of a data center as virtual power plant considering battery lifetime,” in Proc. Int. Symp. Power Electron., Elect. Drives, Automat. Motion (SPEEDAM), Piscataway, NJ, USA: IEEE Press, 2018, pp. 81–86, doi: 10.1109/SPEEDAM.2018.8445345.
M. Sheppy, C. Lobato, O. Van Geet, S. Pless, K. Donovan, and C. Powers, “Reducing data center loads for a large-scale, low-energy office building: NREL’s research support facility (Book),” National Renewable Energy Lab., Golden, CO, USA, No. NREL/BK-7A40-52785, 2011.
G. Ghatikar, V. Ganti, N. Matson, and M. A. Piette, “Demand response opportunities and enabling technologies for data centers: Findings from field studies,” Lawrence Berkeley National Laboratory, Berkeley, CA, USA, Tech. Rep. LBNL-5763E, 2012.
“Investing in the rising data center economy.” McKinsey. Accessed: Jan. 17, 2023. [Online]. Available: https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/investing-in-the-rising-data-center-economy
Farid Katiraei (fkatiraei@quanta-technology.com) is with Quanta Technology, Raleigh, NC 27607 USA.
Samaneh Morovati (smorovati@quanta-technology.com) is with Quanta Technology, Raleigh, NC 27607 USA.
Shadi Chuangpishit (schuangpishit@quanta-technology.com) is with Quanta Technology, Raleigh, NC 27607 USA.
Seyyed Ali Ghorashi (sghorashi@quanta-technology.com) is with Quanta Technology, Raleigh, NC 27607 USA.
Digital Object Identifier 10.1109/MELE.2023.3291228
2325-5897/23 © 2023 IEEE