Hello dear friends.
Today I want to share a life story about how DWH storage was arranged in Tele2 before the introduction of QCD (EDW).
I entered the Tele2 IT department in 2012 in the reporting systems department. At that time, DWH repository was already created in the company, on which many reporting processes and not only were already spinning.
A little about the technical stack that was used there at that time. For storage, the Oracle database was used with a capacity of 60-100 Tb T4-4 server with 1 TB of operative. Data from various sources was downloaded there. But the main ones were 4 Oracle billing bases, which were essentially a charging platform. And there was a department that was engaged in supporting these databases and providing services. The separation of these bases was by macro-regions. Reason: volumes are too large. That is, if a subscriber calls, say, from a Moscow SIM card, then the call cost is calculated in the corresponding billing.
The top-end hardware always went to billing databases, and resources were allocated to the rest of the systems according to the residual principle. Usually for DWH, the server always got a little weaker. Those. at billing there is a piece of iron T5-4, then at DWH - T4-4 in inheritance.
But these resources have always been enough to cover current tasks and curtail reporting. Data from billing was downloaded via DB-link. Classic ETL processes were configured when nightly data downloads took place with small transformations (for example, adding surrogate keys). ETL was of 2 types: full load for small volumes and incremental for large tables such as, for example, call details, charges, payments, etc. There was also such a large source as text files that download call information and Internet traffic from switches and base stations. Data is loaded as text files using oracle sql loader loaders. The increment to the base was usually 10-20 GB per day.
Partitioning tables, indexes, optimizing query plans, hinting in DWH had to be used constantly. There was not a day without hanging or long-playing sessions in which it was necessary to climb into the request plan.
The DWH storage structure in Tele2 prior to the introduction of EDW.
Also, one of DWH's main tasks was to generate monthly financial statements (ETFs). It was considered on the DWH server for 4 whole days due to the large volumes. To imagine what it is, I’ll say that this is an Oracle package of 5 thousand lines of PL / SQL code with complex ornate logic and all this is minimized in dynamics. And then the report is uploaded to FTP or to a network share in the form of CSV files. And all this without the use of boxed solutions. Those. hand written, optimized and automated functionality over the years.
But the DWH database was used not only to provide regular reporting but also as an operational storage. For example, it revolved around the process of providing different information to subscribers from a personal account on the Tele2 website.
It is also worth talking separately about the Oracle Application Express (APEX) system, which has a special place for reporting. APEX is an environment for the rapid development of WEB interfaces, either for reporting or for setting up a business process. On it was created, by hand, written functionality "Report upload", where users could create a report for themselves. Those. a person comes in, selects a set of fields for his report, if desired, he can pull the source as an excel file, and then he receives a report to the mail in the form of an archived csv file. And inside DWH, a huge number of PL / SQL procedures and functions are written which were essentially a built-in script generator for reports. Moreover, this tool was so popular within the company that over 8 years more than half a million reports with varying degrees of importance were generated on it.
APEX has also developed a lot of other interesting things. For example, hand written functionality for workflow and marketing automation system. In the first, the staff endorsed the documents. And secondly, the marketing department held various events for customers. For example, he performed mass SMS distribution to subscribers about new tariffs and services. And all this went through DWH and there was integration with the SMS channel.
Plus, a couple of reporting systems such as Crostal Reports and IBM Lotus connected to DWH via RPT files.
In the attached diagram above, you can see the old DWH repository structure and data flow for 2012. With the structure at present, it has nothing to do.
All this more or less successfully worked until the moment when the business realized that reporting was no longer enough and decided to implement QCD, BI-systems and BigData.
In general, there was a lot of interesting things. Perhaps I will dwell on this. See you soon.