**My dear reader, how are you? السلام عليكم**

He who lives in harmony with himself lives in harmony with the universe – Marcus Aurelius

In this post, I discuss the challenges and recent advancements in energy consumption predictive modelling for modern multicore computing platforms. I also share some research results and a list of publications for further reading.

#### Energy Consumption in ICT: A Big Challenge

Energy is now a first-class design constraint alongside performance in all computing settings, and a serious environmental concern. The energy consumption of Information and Communications Technology (ICT) systems and devices was reported to be about 5300 terawatt-hours (TWh) in 2019, which accounts for 20% of worldwide electricity demand. According to a worst-case estimate, ICT energy consumption in 2030 would equal half of worldwide energy utilization (Figure 1). Extrapolating this estimate leads to a very serious prospect: ICT energy consumption might equal global energy consumption by 2043.

Figure 1: *ICT Energy Usage Vs. Total Global Usage*

**Did you know?** *The energy transferred by the sun to the earth in 2 minutes is equal to the energy consumed by all of mankind in one complete year.*

#### Energy Consumption Measurement in modern computing platforms

Accurate measurement of the energy consumed during an application's execution is key to application-level energy minimization techniques. There are three popular approaches: (a) physical measurements using external power meters, (b) measurements using on-chip power sensors, and (c) energy predictive models.

The first approach is accurate but cannot provide a fine-grained, component-level decomposition of an application's energy consumption. This decomposition is essential for finding an energy-efficient configuration of the application.

For the second approach, there were no definitive research works establishing its accuracy until we recently published a detailed study [1] of the accuracy of on-chip sensors. We conducted experiments using highly optimized scientific applications on modern GPU and CPU platforms and found average errors of up to 73%, with the maximum error reaching 300%. This is an eye-opening discovery that points to a serious challenge for energy consumption measurement.

The third approach, energy predictive models, has developed considerably over the last few decades and has attracted great attention from the research community because it can provide a component-level decomposition of energy consumption. A large number of energy predictive models are linear and use performance monitoring counters as predictor variables; hundreds of research works build models this way. Our research group published a survey of energy predictive models in 2016 [2] in ACM Computing Surveys. The survey highlighted the popular energy predictive models and presented a case study demonstrating their poor prediction accuracy, with average prediction errors as high as 66%. Before going further, let us first define performance monitoring counters.

#### Performance Events

Performance events, or performance monitoring counters (PMCs), are special-purpose registers provided in modern microprocessors to store the counts of software and hardware activities. I use the acronym PMCs to refer both to software events, which are pure kernel-level counters such as page-faults and context-switches, and to hardware events, which are micro-architectural events originating from the processor and its performance monitoring unit, such as cache-misses and branch-instructions.

**Basic purpose:** PMCs were developed primarily to aid low-level performance analysis and tuning. Remarkably, while PMCs have not been used for performance modeling, over the years they have become the dominant predictor variables for energy predictive modeling.

#### Tools to collect PMCs

On Linux, the `perf` tool can be used to gather PMCs for CPUs. PAPI and Likwid provide access to PMCs on Intel and AMD microprocessors. Intel PCM reports PMCs for the core and uncore components of an Intel processor. For Nvidia GPUs, the CUDA Profiling Tools Interface (CUPTI) and nvprof can be used to obtain PMCs.
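As an illustration of the first option, here is a minimal Python sketch that runs a command under `perf stat` and reads the counts back. It assumes a Linux machine with `perf` installed and relies on the machine-readable output that `perf stat` produces with `-x,` (count in the first field, event name in the third), written to a file with `-o`. The helper names `parse_perf_csv` and `collect_pmcs` are my own for this example, not part of any tool.

```python
import os
import subprocess
import tempfile


def parse_perf_csv(text):
    """Parse `perf stat -x,` output into a dict {event_name: count or None}."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "# started on ..." header
        fields = line.split(",")
        if len(fields) < 3:
            continue  # not a counter line
        value, event = fields[0], fields[2]
        try:
            counts[event] = float(value)
        except ValueError:
            # perf prints "<not supported>" or "<not counted>" instead of a number
            counts[event] = None
    return counts


def collect_pmcs(command, events):
    """Run `command` (a list of arguments) under perf stat for the given events."""
    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "perf.csv")
        subprocess.run(
            ["perf", "stat", "-x,", "-o", out_path,
             "-e", ",".join(events), "--", *command],
            check=True,
        )
        with open(out_path) as handle:
            return parse_perf_csv(handle.read())
```

A call such as `collect_pmcs(["./my_app"], ["cache-misses", "page-faults"])` would then return one value per event, where `./my_app` stands for whatever application you want to profile.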

#### Causes of the inaccuracy of PMC-based energy predictive models

We published a paper in late 2017 in which we presented a cause of inaccuracy in PMC-based energy predictive models [3]. We discovered that the models do not incorporate the basic properties of the universal law of energy conservation, and that the parameters (PMCs) used as predictor variables in linear energy predictive models are not tested for any physical significance. We call this missing property the *additivity* of PMCs.

*Additivity* is based on the experimental observation that the energy consumption of a serial execution of two applications is the sum of the energy consumptions observed for the individual execution of each application. We define a compound application as a serial execution of a combination of two or more individual applications. The individual applications are termed base applications.

A linear energy predictive model is consistent if and only if its predictor variables are *additive*, in the sense that the vector of predictor variables for a compound application is the sum of the vectors for the individual execution of each base application. The additivity property is therefore based on a simple and intuitive rule: the value of a PMC for a compound application equals the sum of its values for the executions of the base applications constituting the compound application. We brand a PMC *non-additive* on a platform if there exists a compound application for which this calculated sum differs from the value observed for the compound application's execution on the platform by more than a tolerance of 5.0%.
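To see why additivity matters for a linear model, consider the small sketch below. It assumes a linear model with no intercept, and the coefficients and PMC values are made-up numbers chosen only for illustration, not measurements. When the compound PMC vector is the sum of the base vectors, the predicted energy of the compound application equals the sum of the predicted energies of the base applications; when one PMC is non-additive, the prediction no longer respects that rule.

```python
def predict_energy(coefficients, pmc_vector):
    """Linear model without intercept (an assumption): E = sum_k beta_k * x_k."""
    return sum(b * x for b, x in zip(coefficients, pmc_vector))


# Hypothetical coefficients (joules per event) and PMC vectors; not measured data.
beta = [2.0e-9, 5.0e-8]
pmcs_a = [4.0e9, 1.0e7]  # base application A
pmcs_b = [6.0e9, 3.0e7]  # base application B

# Additive case: the compound vector is the element-wise sum of the base vectors.
pmcs_ab_additive = [a + b for a, b in zip(pmcs_a, pmcs_b)]

# Non-additive case: the second PMC of the compound run is 40% above the sum.
pmcs_ab_non_additive = [pmcs_ab_additive[0], 1.4 * pmcs_ab_additive[1]]

energy_sum = predict_energy(beta, pmcs_a) + predict_energy(beta, pmcs_b)
energy_additive = predict_energy(beta, pmcs_ab_additive)
energy_non_additive = predict_energy(beta, pmcs_ab_non_additive)

print("sum of base predictions :", energy_sum)
print("compound, additive PMCs :", energy_additive)
print("compound, non-additive  :", energy_non_additive)
```

The first two printed values agree up to floating-point rounding, while the third is larger, even though the compound application does the same work as the two base applications run one after the other.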

#### Additivity test

We devised a test called the *additivity test* in order to check the suitability of PMCs to be used as parameters in energy predictive models. The test has two stages. A PMC must pass both stages to be pronounced additive for a given compound application on a given platform.

- In the first stage, we determine if the PMC is deterministic and reproducible.
- In the second stage, we examine how the value of the PMC for the compound application relates to its values for the base applications. First, we collect the values of the PMC for the base applications by executing them separately. Then, we execute the compound application and obtain its value of the PMC. Typically, the core computations of the compound application consist of the core computations of the base applications programmatically placed one after the other.
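The two stages can be sketched in Python as follows. This is only an illustration under my own assumptions: stage one is approximated by requiring a small coefficient of variation across repeated runs, and stage two measures the gap between the compound value and the sum of the base values as a percentage of that sum. The 5% threshold is the tolerance mentioned earlier; the function names and the exact error formula are hypothetical and may differ from those used in the papers.

```python
from statistics import mean, pstdev

TOLERANCE_PCT = 5.0  # tolerance quoted in the text


def is_reproducible(samples, max_cv_pct=TOLERANCE_PCT):
    """Stage 1 (assumed criterion): small coefficient of variation across runs."""
    avg = mean(samples)
    if avg == 0:
        return all(s == 0 for s in samples)
    return 100.0 * pstdev(samples) / abs(avg) <= max_cv_pct


def additivity_error_pct(base_values, compound_value):
    """Stage 2: gap between the compound value and the sum of base values, in %."""
    expected = sum(base_values)
    if expected == 0:
        return 0.0 if compound_value == 0 else float("inf")
    return 100.0 * abs(compound_value - expected) / expected


def passes_additivity_test(base_runs, compound_runs, tolerance_pct=TOLERANCE_PCT):
    """base_runs: one list of repeated samples per base application.
    compound_runs: list of repeated samples for the compound application."""
    if not all(is_reproducible(runs) for runs in base_runs + [compound_runs]):
        return False
    base_means = [mean(runs) for runs in base_runs]
    return additivity_error_pct(base_means, mean(compound_runs)) <= tolerance_pct
```

For example, `passes_additivity_test([[100, 101, 99], [200, 202, 198]], [301, 300, 299])` passes both stages, whereas a compound value of 360 for the same base applications fails the second stage.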

#### Improvements in the accuracy of energy predictive models using the additivity of PMCs

We recently published another paper that presents an experimental study of the impact of using additive PMCs in energy models [4]. We improved the accuracy of models built using three techniques: (1) linear regression (LR), (2) random forests (RF), and (3) neural networks (NN). We showed that the average prediction error of LR models drops from 31% to 18%. Similarly, the average prediction errors of the RF and NN models drop from 38% to 24% and from 30% to 24%, respectively.
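For readers who want to experiment, the sketch below shows one way to fit the simplest possible linear model and to compute an average prediction error. It is a toy under stated assumptions: a single PMC as predictor, a zero intercept, and the average of the absolute percentage errors as the metric. The models in the paper use several PMCs, and the function names here are my own.

```python
def fit_through_origin(pmc_values, energies):
    """Least-squares slope for energy = beta * pmc (one predictor, no intercept)."""
    numerator = sum(x * y for x, y in zip(pmc_values, energies))
    denominator = sum(x * x for x in pmc_values)
    return numerator / denominator


def average_prediction_error_pct(actual, predicted):
    """Mean of |predicted - actual| / actual over all data points, in percent."""
    errors = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Toy numbers for illustration only; these are not measurements.
train_pmc = [1.0, 2.0, 3.0]
train_energy = [2.1, 3.9, 6.0]
beta = fit_through_origin(train_pmc, train_energy)

test_pmc = [4.0, 5.0]
test_energy = [8.2, 9.7]
predictions = [beta * x for x in test_pmc]
print("average prediction error (%):",
      average_prediction_error_pct(test_energy, predictions))
```

Repeating such a fit once with all available PMCs and once with only those that pass the additivity test is the kind of comparison the study reports, although with real measurements and richer models.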

#### Number of Additive PMCs on Modern multicore platforms

In another recent research work, we show that the number of non-additive PMCs increases with the number of cores on a platform [5]. On a simple processor with one core, only a very few PMCs are non-additive, but the number increases as the number of cores and shared resources increases. We attribute the increase in the number of non-additive PMCs on modern platforms to their inherent complexities, such as severe resource contention and non-uniform memory access (NUMA).

To summarize, we found no PMC that is absolutely *additive* for a range of applications. Therefore, a generic energy predictive model based on low-level PMCs that can predict the energy consumption of any application is not possible on modern computing systems. However, we found that for a limited set of applications, or for application-specific models, the additivity test can greatly help in finding reliable parameters for energy models.

In our research work [5], we provided a mathematical validation of linear energy predictive models using additive parameters and extended the state-of-the-art mathematical models by incorporating properties derived from the fundamental law of energy conservation. Keeping the increasing complexity of modern multicore platforms in mind, we encouraged the research community in another paper to strive towards understanding how pre-multicore methods and algorithms perform in the multicore era [6].

#### Conclusions and future work

- Each parameter of a linear energy predictive model must be *additive*.
- The number of non-additive PMCs for an application execution rises with the increase in the number of cores.
- For energy predictive models, the parameters should be checked for their physical significance using the property of additivity.
- The accuracy of mainstream energy predictive models improves when they use predictor variables that abide by the principles of energy conservation of computing.

In our future work, we will explore generic high-level parameters for reliable and accurate energy predictive modelling that can be used across computing platforms.

References

[1] M. Fahad, A. Shahid, R. Reddy, and A. Lastovetsky, “A Comparative Study of Methods for Measurement of Energy of Computing”, Energies, MDPI, vol. 12, issue 11, 06/2019.

[2] O’Brien, K., I. Petri, R. Reddy, A. Lastovetsky, and R. Sakellariou, “A Survey of Power and Energy Predictive Models in HPC Systems and Applications”, ACM Computing Surveys, vol. 50, issue 3: ACM, 10/2017.

[3] A. Shahid, M. Fahad, R. Reddy, and A. Lastovetsky, “*Additivity*: A Selection Criterion for Performance Events for Reliable Energy Predictive Modeling”, Supercomputing Frontiers and Innovations, vol. 4, issue 4, 12/2017.

[4] A. Shahid, M. Fahad, R. Reddy, and A. Lastovetsky, “Improving the Accuracy of Energy Predictive Models for Multicore CPUs Using *Additivity* of Performance Monitoring Counters”, 15th International Conference on Parallel Computing Technologies, Almaty, Kazakhstan, August 19-23, 2019.

[5] A. Shahid, M. Fahad, R. R. Manumachu, and A. Lastovetsky, “Energy of Computing on Multicore CPUs: Predictive Models and Energy Conservation Law”, arXiv preprint arXiv:1907.02805, 2019.

[6] A. Lastovetsky, M. Fahad, H. Khaleghzadeh, S. Khokhriakov, R. R. Manumachu, A. Shahid, L. Szustak, and R. Wyrzykowski, “How Pre-multicore Methods and Algorithms Perform in Multicore Era”, High Performance Computing. ISC High Performance 2018. Lecture Notes in Computer Science, vol. 11203, Frankfurt, Springer Nature, pp. 527-539, 24-26 June, 2018, 2019.

I hope you find this post useful. If you find any errors or feel any need for improvement, let me know in your comments below.

Signing off for today. Stay tuned and I will see you next week! Happy learning.