XAI Today

Machine Learning, Data Mining and Analytics

What is Analytics Engineering?

Posted at — Oct 10, 2022

The well-known roles of the data team are Data Analyst/Scientist and Data Engineer. Yet in recent years there has been growing demand for data-driven decision making to be distributed throughout the organisation, and a traditional data team risks becoming a bottleneck because it cannot fully meet the growing need for insights and analytics.

With this increased demand has come a new role, the Analytics Engineer, which lets the typical data team better cover these evolving responsibilities and specialist technologies. This evolution has led to a paradigm shift in data processes from Extract-Transform-Load (ETL) to Extract-Load-Transform (ELT), which facilitates a more self-service oriented Business Intelligence (BI) operating model where data analysts and data scientists can be more embedded with domain teams.
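The difference between the two orderings can be sketched in a few lines. The example below is only an illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse, and the `raw_orders` table, its columns and the `daily_revenue` view are hypothetical names invented for this sketch. The point it shows is that in ELT the raw records are loaded untouched, and the transformation happens afterwards, inside the warehouse, in SQL.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from an operational system.
raw_orders = [
    ("o-1", "2022-10-01", "  Alice ", "19.99"),
    ("o-2", "2022-10-01", "bob", "5.00"),
    ("o-3", "2022-10-02", "Alice", "12.50"),
]

# An in-memory SQLite database stands in for a data warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Extract + Load: store the records as they are, every field still text.
conn.execute(
    "CREATE TABLE raw_orders "
    "(order_id TEXT, order_date TEXT, customer TEXT, amount TEXT)"
)
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", raw_orders)

# Transform: performed later, inside the warehouse, as a SQL view that
# analysts can query directly.
conn.execute(
    """
    CREATE VIEW daily_revenue AS
    SELECT order_date,
           COUNT(*) AS n_orders,
           SUM(CAST(amount AS REAL)) AS revenue
    FROM raw_orders
    GROUP BY order_date
    """
)

daily_revenue = conn.execute(
    "SELECT order_date, n_orders, revenue FROM daily_revenue ORDER BY order_date"
).fetchall()
print(daily_revenue)
```

Because the raw table is kept as loaded, a new analytical view can be added later without re-running the extraction.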

How does this new role fit in and solve the problem?

Traditional data analysts, data scientists, and machine learning engineers want to focus on analyzing and modelling data.

Data Engineers are responsible for designing and implementing robust, scalable and fault-tolerant data pipelines to Extract data from heterogeneous systems (operational, external, IoT/Edge etc.) and Load them to a Data Lake or Data Warehouse. These raw data are rarely available in a form that is well-suited to analytical workloads.

The task of Transforming these raw data into analytical data structures suffered greatly from a lack of separation of concerns. The Data Engineering ETL process was generally implemented without taking domain knowledge into account. As a generic technical process, it produced data structures that were inflexible and not analysis-ready.

The data analysts/scientists have to spend further time wrangling data into the format they want, and the rigid Data Warehouse architecture is not flexible enough for self-service BI for less technical users with high domain knowledge.

The result of this unclear separation of concerns is that the Transform step is notoriously hard to document and maintain. Data quality assumptions are spread over multiple systems. The steps required to rebuild modelled data are neither owned nor executable by a single system. The lineage of data in the user-facing data sources is hard to trace. All these issues have a knock-on effect on debugging.
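To make these problems concrete, here is a minimal, standalone sketch of the opposite situation: the transformations, the data quality assumptions and the lineage all live in one place and can be rebuilt with a single call. Everything in it is an assumption made for illustration: the `MODELS` and `CHECKS` dictionaries, the `build` and `lineage` helpers, and the table names are hypothetical and are not the API of any particular tool. SQLite again stands in for the warehouse.

```python
import sqlite3

# Hypothetical models: name -> (upstream tables, SQL that builds the model).
# Listed in dependency order; Python dicts preserve insertion order.
MODELS = {
    "stg_orders": (
        ["raw_orders"],
        """SELECT order_id,
                  order_date,
                  LOWER(TRIM(customer)) AS customer,
                  CAST(amount AS REAL) AS amount
           FROM raw_orders""",
    ),
    "customer_revenue": (
        ["stg_orders"],
        """SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS revenue
           FROM stg_orders
           GROUP BY customer""",
    ),
}

# Data quality assumptions kept next to the models.
# Each query should return zero rows.
CHECKS = {
    "stg_orders": "SELECT order_id FROM stg_orders "
                  "WHERE order_id IS NULL OR amount < 0",
    "customer_revenue": "SELECT customer FROM customer_revenue "
                        "WHERE customer = ''",
}


def build(conn):
    """Rebuild every model from raw data, failing on a broken assumption."""
    for name, (_sources, sql) in MODELS.items():
        conn.execute(f"DROP TABLE IF EXISTS {name}")
        conn.execute(f"CREATE TABLE {name} AS {sql}")
        failures = conn.execute(CHECKS[name]).fetchall()
        if failures:
            raise ValueError(f"{name}: {len(failures)} rows fail a quality check")


def lineage(name):
    """Return every upstream table of a model, nearest first."""
    upstream = []
    for source in MODELS.get(name, ([], ""))[0]:
        upstream.append(source)
        upstream.extend(lineage(source))
    return upstream


# Hypothetical raw data, loaded untouched.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_orders "
    "(order_id TEXT, order_date TEXT, customer TEXT, amount TEXT)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
    [
        ("o-1", "2022-10-01", "  Alice ", "19.99"),
        ("o-2", "2022-10-01", "bob", "5.00"),
        ("o-3", "2022-10-02", "alice", "12.50"),
    ],
)

build(conn)
result = conn.execute(
    "SELECT customer, n_orders FROM customer_revenue ORDER BY customer"
).fetchall()
print(result)
print(lineage("customer_revenue"))
```

With this layout, a failed check names the model that broke, and `lineage` answers the question of which raw tables feed a user-facing table, which is exactly the information that is hard to recover when the logic is scattered across systems.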

Analytics engineers step into this gap. Their focus is on turning raw data into the building blocks of clean, tidy, documented and discoverable data that enable these other roles to do their best work. The required skills for succeeding at this role are:

By focusing on the technical aspects of data analytics and data transformation to meet the business objectives, analytics engineers release the full value of business data, as data analysts/scientists and non-technical data consumers can get on with the work of extracting insights, building predictive models, and making better decisions. As the amount of data generated by organizations continues to grow, the demand for skilled analytics engineers is likely to increase, making this a promising career path for those interested in the intersection of data and technology.
