It would be great to get more details about the pros/cons of materializing hooks and potentially include some metadata (I know that would be basically the hub concept in DV), because I would consider creating a hook table once I have an integration challenge per hook (reconciliate 2+ datasets). I think full joins with COALESCE calculations to standardize keys are definitely not a good choice to combine datasets, the hook tables won't be difficult to create or maintain and would ease querying, but maybe I am missing something.
First question I would ask it do you need to implement hook tables? The only purpose they serve is as an anchor table which contains all business keys for business concepts (just like a DV hub) that you can LEFT JOIN to. However, as hook table only contains the (qualified) business keys you don't really need them. Unless you have a specific need, then just leave them out, it's one less thing to worry about.
The main design decision is around how to implement bags. View or materialised view or a physical table? If your data volumes are relatively small then a view should suffice, but as the volumes go up and join performance drops off then physicalising bags is the way to go. As always there are trade-offs and design choice to make, but that what the job is all about, right?
Hi Andrew. Sure, if you are linking bags from different hooks, then hooks are only optional since you can link the bags directly (the same as DV).
However, when you have a reconciliation challenge where multiple data sources provide data on the same business concept (which means multiple bags on the same hub), I would see the importance of having a central ID store to build a consolidated/conformed bag while avoiding full joins.
It would be great to get more details about the pros/cons of materializing hooks and potentially include some metadata (I know that would be basically the hub concept in DV), because I would consider creating a hook table once I have an integration challenge per hook (reconciliate 2+ datasets). I think full joins with COALESCE calculations to standardize keys are definitely not a good choice to combine datasets, the hook tables won't be difficult to create or maintain and would ease querying, but maybe I am missing something.
Hey Rodrigo,
First question I would ask it do you need to implement hook tables? The only purpose they serve is as an anchor table which contains all business keys for business concepts (just like a DV hub) that you can LEFT JOIN to. However, as hook table only contains the (qualified) business keys you don't really need them. Unless you have a specific need, then just leave them out, it's one less thing to worry about.
The main design decision is around how to implement bags. View or materialised view or a physical table? If your data volumes are relatively small then a view should suffice, but as the volumes go up and join performance drops off then physicalising bags is the way to go. As always there are trade-offs and design choice to make, but that what the job is all about, right?
Hi Andrew. Sure, if you are linking bags from different hooks, then hooks are only optional since you can link the bags directly (the same as DV).
However, when you have a reconciliation challenge where multiple data sources provide data on the same business concept (which means multiple bags on the same hub), I would see the importance of having a central ID store to build a consolidated/conformed bag while avoiding full joins.