Ranking the systems by workload
Today I’ll start with a trivial question: How busy are the BW systems compared to each other? This serves as preparation for the machine learning steps in the next section. I am already collecting many metrics, so there is plenty of data to evaluate. For this question I focus on the 26 metrics I consider most helpful:
IP_INFOPAKIDS, IP_RECORDS, DTP_REQUESTS, DTP_DATAPAKS, DTP_RECORDS, DTP_GB, Q_USERS, Q_QUERIES, Q_NAVSTEPS, PC_CHAINS, PC_RUNS, BTC_JOBS, DB_BLOCK_CHANGES, LOG_GB, SELECTS, INSERTS, UPDATES, DELETES, CALLS, DV_NUMROWS, DV_DSO_GB, DV_IC_GB, DV_WARM_GB, BIA_USED, AL_SUCCESS, USERA
These metrics should correlate well with the workload I want to measure: the higher the numbers, the higher the (useful) workload on the system. I omitted ambiguous metrics such as IP_SECONDS, because a high total runtime for InfoPackages could mean either:
- a lot of data was being loaded from the source systems
- little data was loaded from the source systems, but in a very inefficient way
I assume that no unnecessary batch jobs are scheduled and that users only execute useful queries. The actual values of the metrics are normalized by their overall average. For each system, the normalized averages of all 26 metrics are summed up, so a score of 26 means a system shows a perfectly average overall workload. This also means that all 26 metrics carry the same weight, which seems like a reasonable simplification.
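As a minimal sketch of this scoring in Python/pandas: the DataFrame df (one row per system and day, with a SID column) is my assumption about the data layout; only the metric names come from the list above.

```python
import pandas as pd

# The 26 metrics listed above, all weighted equally.
METRICS = [
    "IP_INFOPAKIDS", "IP_RECORDS", "DTP_REQUESTS", "DTP_DATAPAKS",
    "DTP_RECORDS", "DTP_GB", "Q_USERS", "Q_QUERIES", "Q_NAVSTEPS",
    "PC_CHAINS", "PC_RUNS", "BTC_JOBS", "DB_BLOCK_CHANGES", "LOG_GB",
    "SELECTS", "INSERTS", "UPDATES", "DELETES", "CALLS", "DV_NUMROWS",
    "DV_DSO_GB", "DV_IC_GB", "DV_WARM_GB", "BIA_USED", "AL_SUCCESS",
    "USERA",
]

def workload_scores(df: pd.DataFrame) -> pd.Series:
    """Sum of per-system metric averages, each normalized by the overall
    average of that metric. A score of 26.0 marks a perfectly average system."""
    per_system = df.groupby("SID")[METRICS].mean()  # average per SID
    normalized = per_system / df[METRICS].mean()    # 1.0 = overall average
    return normalized.sum(axis=1).sort_values(ascending=False)
```

Sorting the summed scores in descending order then yields a ranking like the table below.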
Finally I get this overview:
SID | Score
----|------
PO7 | 59.04
POA | 52.99
PO3 | 50.86
PO6 | 46.42
PH1 | 39.87
PO1 | 34.96
POC | 24.41
PH2 | 23.98
POB | 20.55
PO2 | 16.73
PO5 | 8.93
PO8 | 7.64
PO4 | 4.80
POD | 1.74
PO9 | 1.38
Table 1: Workload Scores
System POC is pretty much average; the highest workload is on PO7. The BWonHANA systems PH1 and PH2 are somewhere in the middle.
Picture 1: Bar charts of the workload scores
From my personal experience, I would classify the systems PO5, PO8, PO4, POD and PO9 as low-workload systems. The systems POC, PH2, POB and PO2 could perhaps be classified as medium-workload systems, since they show a similar overall workload. Now let’s see whether classification algorithms confirm my opinion.
Classifying systems by workload
My overall goal is to detect unusual BW workload, so a classification algorithm should help me with anomaly detection. I’ll start with standard clustering algorithms that simply take all available performance data, without knowing which SID the workload belongs to, and cluster the data intelligently. Ideally, each of the 15 SIDs will get its own cluster.
KMEANS
The first algorithm I used for clustering is KMEANS. This algorithm needs the number of desired clusters as an input, so I used the silhouette score to identify the ideal number of clusters. The best fit was achieved with 9 clusters and a silhouette score of 0.45:
SID | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
----|---|---|---|---|---|---|---|---|---
PH1 | 67 |
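As a hedged sketch of how this cluster search might look with scikit-learn, reusing the df and METRICS assumptions from the scoring sketch above (standardizing the features first is my own assumption; the post doesn’t say how the data was preprocessed):

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Standardize the metric columns so no single metric dominates the
# Euclidean distances KMEANS works with (my assumption, not from the post).
X = StandardScaler().fit_transform(df[METRICS])

best_k, best_score = 2, -1.0
for k in range(2, 16):  # at most one cluster per SID
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)
    if score > best_score:
        best_k, best_score = k, score

# Re-fit with the winning k and count observations per SID and cluster;
# this produces a table like the one above.
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
print(best_k, round(best_score, 2))
print(pd.crosstab(df["SID"], labels))
```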