Ryan Booz: Using Common Table Expressions: Transforming and Analyzing Data in PostgreSQL, Part 2

Original article in English, 4,280 words, about a 16-minute read.

In the first article in this series on transforming data, I discussed how powerful PostgreSQL can be for ingesting and transforming data for analysis. For the last few decades, this work has traditionally been done with a methodology called Extract-Transform-Load (ETL), which usually requires external tools. The goal of ETL is to do the transformation work outside of the database and import only the final form of the data needed for further analysis and reporting.

However, as databases have improved and matured, they have gained more capabilities for doing much of the raw data transformation inside the database. In doing so, we flip the process slightly to Extract-Load-Transform (ELT), focusing on getting the raw data into the database and transforming it there. In many circumstances this can dramatically speed up development iteration because we can use SQL rather than external tools. While ELT won’t replace every transformation workload, understanding how to do the work can improve many data transformation and analysis workloads.

To demonstrate how SQL and PostgreSQL functions can be used to transform raw data directly in the database, in the first article I used sample data from Advent of Code 2023, Day 7. By the end of that article, I had demonstrated how to take the sample input and transform it into a usable table of data that could be queried and analyzed. If you haven’t read that article, it’s best to start there because you’ll be able to load the sample data, understand the puzzle we are trying to solve, and learn some of the unique PostgreSQL features that improve the process.

To get set up so that you can follow along, this simple script will create the ‘dec07’ table we need and insert a few rows of sample data. In the first article, I demonstrated two ways to do this that are more practical when dealing with raw input. This script is just intended to get you started quickly.
CREATE TABLE dec07 ( id integer generated by default as identity, lin[...]
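
As a rough illustration of the ELT idea described above, the sketch below loads raw text lines into a staging table and then uses a common table expression (CTE) to split each line into typed columns before sorting. The table `raw_demo`, its columns, and the sample rows are hypothetical and chosen only for illustration; they are not the article's `dec07` table or the puzzle input.

```sql
-- Hypothetical staging table for raw input lines (an assumption, not the article's schema).
CREATE TEMPORARY TABLE raw_demo (
    id   integer GENERATED BY DEFAULT AS IDENTITY,
    line text
);

-- Made-up sample rows in a simple "label amount" shape.
INSERT INTO raw_demo (line) VALUES
    ('apple 3'),
    ('pear 5'),
    ('plum 1');

-- The CTE does the "transform" step inside the database:
-- split each raw line on the space and cast the second part to an integer.
WITH parsed AS (
    SELECT id,
           split_part(line, ' ', 1)          AS label,
           split_part(line, ' ', 2)::integer AS amount
    FROM raw_demo
)
-- The outer query can then treat the parsed rows like an ordinary table.
SELECT label, amount
FROM parsed
ORDER BY amount DESC;
```

Naming the parsing step as a CTE keeps the outer query short and lets later steps (aggregation, ranking) build on `parsed` without repeating the string handling.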

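This article covers the steps and techniques for transforming raw data in PostgreSQL using the ELT approach, including transforming data with SQL and PostgreSQL functions and using CTEs to simplify querying and analysis. Finally, it shows how to use CTEs to aggregate and sort data to solve the problem.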

CTE, ELT, PostgreSQL, data transformation, aggregation and sorting