You may have heard that Google has released some content for SAP. Sure enough, within a month I had a connection between the two beasts (my S/4HANA and Google BigQuery) up and running, and after some trial and error I have a few observations to share.
Terminology
For the SAP-minded who want to get familiar with Google jargon:
- BigQuery – a data warehousing suite of sorts. Conceptually, the closest SAP comparisons would be BW or the upcoming Data Warehouse Cloud
- Pipelines in Cloud Data Fusion – roughly a Transformation, DTP and Process Chain blended together (in BW-speak)
- Looker – a visualization tool, comparable to SAP Analytics Cloud
- SAP Accelerators in Google Cloud – a pre-delivered set of pipelines and visualizations that Google has come up with
As the release notes suggest, we have both back-end content (pipelines and related schemas) and front-end content (Looker visualizations). In this post I focus on the back end.
Observations
- The content is written to serve both ECC and S/4HANA, which, when you think about it, actually makes more sense for ECC customers. Those running S/4HANA can already do quite a lot of analytics directly in the system or with (embedded) BW. So with S/4HANA the data does not need to leave the premises in the first instance; that would be like putting a spaceship on a train.
- Because the Accelerator has to fit both ECC and S/4HANA, the content extracts “naked tables” (imagine how many of them are needed just to build up Material master data). Only after the fields are renamed and the data is stored in BigQuery does Google build a Material master-data table there using ETL steps. While this makes sense for some (mostly those dealing with ECC), it does not for others. Keeping Material as the example: S/4HANA already ships a bunch of standard views that culminate in the dimension view I_Material. That view is the result of joining 10-20 other tables, with all the fields neatly renamed, etc. In other words, SAP has already done quite a bit of the work for us. With the SAP Accelerator in Google Cloud a similar operation happens, but by running quite a number of pipelines (I haven’t yet found a scheduling tool that would sequence those pipelines by dependency).
- The good news is that the connector Google is using (SAP Table Reader) understands SQL views just as well as it understands tables. So instead of extracting 10 staging tables and joining them on the BigQuery side, I produced the Material master-data table by extracting everything I needed from a single S/4HANA view, IMATERIAL (the SQL view behind the CDS view I_Material). This worked like a charm.
- Delta. Ah, yes! At least in the standard Google-delivered content of the SAP Accelerator, I have not found anything with an “Update” operation; so far all I find are inserts only. So now I am trying to imagine how to reload those millions of historical records (e.g. the same ACDOCA table) every few hours for no apparent reason. Probably we would need to come up with something like “load me the moving year, but not all the rest”. Split the table into History and Current? Put a union on top? Here a smarter BigQuery developer would be handy to come up with a pseudo-delta approach.
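To make the “moving year” idea from the Delta observation a bit more concrete, here is a minimal sketch in plain Python. It only simulates the logic with in-memory lists; it does not call BigQuery or Cloud Data Fusion. The function names and the row fields `doc` and `posting_date` are hypothetical, and the sketch assumes that rows older than the window no longer change in the source, which would need checking for a real table such as ACDOCA.

```python
from datetime import date


def window_start(today: date, months: int = 12) -> date:
    """Return the first day of the month that lies `months` months before `today`."""
    total = today.year * 12 + (today.month - 1) - months
    return date(total // 12, total % 12 + 1, 1)


def refresh(history, current, extracted, cutoff):
    """One pseudo-delta run (hypothetical helper, not part of the Accelerator).

    history   -- rows older than the window; only ever appended to
    current   -- the window as it was loaded by the previous run
    extracted -- rows freshly read from the source for this run
    cutoff    -- start of the moving window
    """
    # Rows that have aged out of the window are moved to history once.
    aged_out = [row for row in current if row["posting_date"] < cutoff]
    new_history = history + aged_out
    # The window itself is replaced wholesale: inserts only, no updates.
    new_current = [row for row in extracted if row["posting_date"] >= cutoff]
    return new_history, new_current


def union_view(history, current):
    """What reports would read: History and Current stacked together."""
    return history + current
```

On the BigQuery side this would presumably map to two tables plus a view that unions them, so that each run reloads only the Current table. Whether the extraction itself can be restricted to the window depends on the connector and is not shown here.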
To sum it all up, my first impression is: the connector is good; the design delivered by Google in the form of the Accelerator is questionable (it probably works best for old-school ECC systems); and the absence of delta extraction will need some good engineering. Keep on discovering things yourselves: it’s not as hard as it seems.
Cheers,
Dmitry Kuznetsov