Ryan Booz: Recursive CTEs: Transforming and Analyzing Data in PostgreSQL, Part 3

Original in English, about 4,100 words, roughly a 15-minute read. Published:

The first two articles in this series demonstrated how PostgreSQL is a capable tool for ELT: taking raw input and transforming it into usable data for querying and analyzing. We used sample data from the Advent of Code 2023 to demonstrate some of the ELT techniques in PostgreSQL. In the first article, we discussed functions and features of SQL in PostgreSQL that can be used together to transform and analyze data. In the second article, we introduced Common Table Expressions (CTEs) as a method to build a transformation query one step at a time to improve readability and debugging.

In this final article, I’ll tackle one last feature of SQL that allows us to process data in a loop, referencing previous iterations to perform complex calculations: recursive CTEs.

SQL is Set-based

SQL is primarily a set-based, declarative language. Using standard ANSI SQL and platform-specific functions, a SQL developer declares the desired outcome of a query, not the process by which the database should retrieve and process the data. The query planner typically uses statistics about the distribution of data to determine the best plan to get the desired result and return the full set of rows.

While CTEs and LATERAL joins make it feel like we can use the output of one query to impact another, those are always a one-shot opportunity. As a set-based language, SQL could not originally perform algorithmic calculations, that is, use the output of one query as the input and control for another in a loop. Stated another way, early versions of the SQL standard did not have procedural capabilities. To fill that gap, most database platforms provide their own superset of SQL with procedural capabilities. By default, this is T-SQL in SQL Server and PL/pgSQL in PostgreSQL.

That changed with the SQL:1999 standard, which introduced recursive CTEs. With this new feature, implemented by all major databases, SQL became a Turing-complete language that can solve complex calculations in a single query.

Recursive Common Table Expressions (aka Hierarchical Queries) [...]

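Summary: This article covers PostgreSQL's capabilities as an ELT tool and how to use recursive CTEs for data processing. A recursive CTE lets a query reference the output of the previous iteration in a loop in order to perform complex calculations. The article demonstrates this with examples that use recursive CTEs to generate the sequence of numbers from 1 to 10 and to walk a file system hierarchy. It closes by summarizing the advantages of, and recommendations for, processing data with SQL and PostgreSQL.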
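The two examples named in the summary can be sketched roughly as follows. This is a minimal illustration, not the article's own code: it assumes Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a server (the `WITH RECURSIVE` syntax used here is also accepted by PostgreSQL), and the `fs_entry` table, its columns, and its rows are hypothetical names invented for the demonstration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")

# Count from 1 to 10. The anchor member seeds the loop; the recursive member
# reads the rows produced by the previous iteration until the WHERE clause
# stops producing new rows.
COUNT_TO_TEN = """
WITH RECURSIVE counter(n) AS (
    SELECT 1                 -- anchor member: runs once
    UNION ALL
    SELECT n + 1             -- recursive member: uses the previous iteration
    FROM counter
    WHERE n < 10             -- termination condition
)
SELECT n FROM counter ORDER BY n;
"""

# Hypothetical file system table and rows, for illustration only.
conn.execute(
    "CREATE TABLE fs_entry (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO fs_entry VALUES (?, ?, ?)",
    [(1, None, ""), (2, 1, "home"), (3, 2, "ryan"), (4, 3, "notes.txt"), (5, 1, "etc")],
)

# Walk the hierarchy from the root, building each entry's full path from
# the path of its parent found in the previous iteration.
FULL_PATHS = """
WITH RECURSIVE tree(id, path) AS (
    SELECT id, '/' FROM fs_entry WHERE parent_id IS NULL
    UNION ALL
    SELECT e.id,
           CASE WHEN t.path = '/' THEN '/' || e.name
                ELSE t.path || '/' || e.name END
    FROM fs_entry AS e
    JOIN tree AS t ON e.parent_id = t.id
)
SELECT path FROM tree ORDER BY path;
"""

numbers = [row[0] for row in conn.execute(COUNT_TO_TEN)]
paths = [row[0] for row in conn.execute(FULL_PATHS)]
print(numbers)
print(paths)
```

Both queries follow the same shape: a non-recursive anchor query, `UNION ALL`, and a recursive query that refers to the CTE by name. Without a condition that eventually yields no new rows (here `n < 10`, or running out of child entries), the query would loop indefinitely.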
