Tags

postgres

Related articles:

This is a roundup of Postgres articles covering table bloat and vacuum, conference talk selection, podcasts, partitioning and data retention, distributed sequences, scaling Postgres in Kubernetes, vector databases, and roles and privileges.

Planet PostgreSQL

Pavel Borisov: Postgres Bloat Minimization

Understanding and minimizing Postgres table bloat

Postgres stores data in a heap, with each table divided into 8 kB pages. The vacuum process frees up space within pages and updates the table's free space map and visibility map. Vacuum can run automatically or manually. To return space to the file system, the more aggressive VACUUM FULL mode can be used. Autovacuum parameters can be tuned to control when and how often autovacuum runs, while long-running transactions and locks may prevent it from succeeding. Further optimizations include adjusting the number of autovacuum workers and increasing the autovacuum_work_mem parameter. Vacuum and autovacuum are effective tools for maintaining tables and preventing bloat.
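
A minimal sketch of the knobs summarized above, using plain PostgreSQL commands (the table name and threshold values are hypothetical illustrations, not recommendations):

    -- Reclaim dead-tuple space within pages and refresh the free space
    -- and visibility maps (does not shrink the file on disk):
    VACUUM (VERBOSE, ANALYZE) my_table;

    -- Rewrite the table to return space to the file system; takes an
    -- exclusive lock on the table until it completes:
    VACUUM FULL my_table;

    -- Make autovacuum fire earlier on one busy table:
    ALTER TABLE my_table SET (
        autovacuum_vacuum_scale_factor = 0.05,  -- default 0.2
        autovacuum_vacuum_threshold    = 1000   -- default 50
    );

    -- Give each autovacuum worker more memory for dead-tuple tracking;
    -- a configuration reload is enough, no restart required:
    ALTER SYSTEM SET autovacuum_work_mem = '256MB';
    SELECT pg_reload_conf();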

Planet PostgreSQL

Claire Giordano: About Talk Selection for POSETTE: An Event for Postgres 2024

As promised in the CFP for POSETTE: An Event for Postgres 2024, all of the talk selection decisions were emailed out on April 17th. Our talk selection work has now concluded, with the possible exception of accepting proposals from the Reserve list. So what’s next?

First I want to thank all of you Postgres people who submitted such amazing talk proposals into the CFP for POSETTE, now in its 3rd year. I was so impressed by the submissions and wish we could have accepted more of them. And I also want to thank Alicja Kucharczyk, Daniel Gustafsson, and Melanie Plageman from POSETTE’s Talk Selection Team for contributing their time and expertise to collaborate with me to select the talks for this year’s virtual POSETTE event.

It’s not easy to carefully read through and review 184 talk proposals—in just 8 days—to come up with the program for an event like #PosetteConf. That’s right, 184 talk proposals—from 120 unique speakers. (The CFP had a maximum of 4 submissions per speaker.) With just 38 talks to accept this year, that means POSETTE 2024 has a ~20% talk acceptance rate. Bottom line, we had some difficult decisions to make.

So many great talk proposals, we had to lengthen the POSETTE schedule to make space

The original POSETTE plan for 2024 was to have 4 livestreams with 9 talks each. The math looked like this. Each livestream would have:

1 invited keynote—not selected through the CFP talk selection process, but rather an invited keynote speaker
8 unique talks selected via the CFP process

Hence, 36 talks total: 32 talks selected via the CFP process + 4 unique keynotes. However, the best laid plans of mice and men and all that: we had to throw that math out the window. There were too many good talk proposals. Luckily the talk production team led by Teresa Giacomini was able to rejigger their recording schedules to make room for 6 more talks. So the final POSETTE 2024 schedule will have: 42 talks total: 38 [...]

POSETTE 2024 will have 42 talks in total: 38 selected through the CFP and 4 invited keynotes. A CFP for talk proposals is planned for POSETTE 2025 as well. The post emphasizes transparency and the talk selection process.

Planet PostgreSQL

David Wheeler: 🎙️ Hacking Postgres s02e03

Last week I appeared on s02e03 of the Hacking Postgres podcast.

“The experience I had after my independent consulting gig for 10 years working in companies was, like, bringing up other people and being supportive of other people and hearing from a diversity of voices and perspectives makes everything better. That’s part of why I want to get so much input on and feedback on the stuff that I’m hoping to do with PGXN v2 — or whatever we ultimately call it. But people matter, more than the technology, more than any of the rest of it.”

I quite enjoyed this wide-ranging discussion. We covered my history with the Postgres community, a bunch of the projects I’ve worked on over the years, plans and hopes for the PGXN v2 project, perspectives on people and technology, and exciting new and anticipated features of Postgres. Find it wherever fine podcasts are streamed, including YouTube, Apple Podcasts, Overcast, and Twitter.

More about… Postgres, Podcast, Hacking Postgres, Sqitch, pgTAP, PGXN

Planet PostgreSQL

Keith Fiske: Auto-archiving and Data Retention Management in Postgres with pg_partman

You could be saving money every month on database costs with a smarter data retention policy. One of the primary reasons for partitioning, and a huge benefit of it, is using it to automatically archive your data. For example, you might have a huge log table. For business purposes, you need to keep this data for 30 days. This table grows continually over time, and keeping all the data makes database maintenance challenging. With time-based partitioning, you can simply archive off data older than 30 days.

The nature of most relational databases means that deleting large volumes of data can be very inefficient, and that space is not immediately, if ever, returned to the file system. PostgreSQL does not return the space it reserves to the file system when normal deletion operations are run except under very specific conditions:

the page(s) at the end of the relation are completely emptied
a VACUUM FULL/CLUSTER is run against the relation (exclusively locking it until complete)

If you find yourself needing that space back more immediately, or without intrusive locking, then partitioning can provide a much simpler means of removing old data: drop the table. The removal is nearly instantaneous (barring any transactions locking the table) and immediately returns the space to the file system. pg_partman, the Postgres extension for partitioning, provides a very easy way to manage this for time- and integer-based partitioning.

pg_partman daily partition example

Recently pg_partman 5.1 was released, which includes new features such as list partitioning for single-value integers, controlled maintenance run ordering, and experimental support for numeric partitioning. This new version also includes several bug fixes, so please update to the latest release when possible! All examples were done using this latest version. https://github.com/pgpartman/pg_partman

First let's get a simple, time-based daily partition set going:

    CREATE TABLE public.time_stuff (col1 int , col2 text default 'stuff' [...]

A smart data retention policy can save database costs every month, and partitioning can automatically archive data older than 30 days. pg_partman, the Postgres partitioning extension, offers a simple way to manage time- and integer-based partitioning. The latest release, pg_partman 5.1, adds features such as list partitioning for single-value integers, controlled maintenance run ordering, and experimental support for numeric partitioning, along with several bug fixes.
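
As a rough sketch of the retention workflow described above (the table name and retention period are hypothetical, and the exact create_parent() arguments and part_config columns may vary between pg_partman versions):

    -- The parent table must be declared partitioned; pg_partman 5.x
    -- manages PostgreSQL's native partitioning.
    CREATE TABLE public.app_log (
        logged_at timestamptz NOT NULL DEFAULT now(),
        message   text
    ) PARTITION BY RANGE (logged_at);

    -- Have pg_partman create and maintain daily child partitions.
    SELECT partman.create_parent(
        p_parent_table := 'public.app_log',
        p_control      := 'logged_at',
        p_interval     := '1 day'
    );

    -- Retention: on the next maintenance run, detach and drop
    -- partitions older than 30 days, returning space immediately.
    UPDATE partman.part_config
       SET retention = '30 days',
           retention_keep_table = false
     WHERE parent_table = 'public.app_log';

    -- Typically scheduled via pg_cron or an external scheduler.
    CALL partman.run_maintenance_proc();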

Planet PostgreSQL

Cady Motyka: Introducing Snowflake Sequences in a Postgres Extension

In a PostgreSQL database, sequences provide a convenient way to generate a unique identifier, and are often used for key generation. The PostgreSQL community provides functions and SQL syntax to help manage sequence generation, but the sequences themselves are not without limitations in a multi-master environment. Snowflake sequences from pgEdge work seamlessly in a multi-master PostgreSQL cluster to remove those limitations so your data can thrive at the network edge.

Why are Sequences an Issue?

In a distributed multi-master database system, sequences can get complicated. Ensuring consistency and uniqueness across the nodes in your cluster is a problem if you use PostgreSQL sequences; the Snowflake extension steps up to automatically mitigate this issue. PostgreSQL sequence values are prepared for assignment in a table in your PostgreSQL database; as each sequence value is used, the next sequence value is incremented. Changes to the next available sequence value are not replicated to the other nodes in your replication cluster.

In a simple example, you might have a table on node A with 10 rows, each with a primary key that is assigned a sequence value from 1 to 10; the next prepared sequence value on node A will be 11. Rows are replicated from node A to node B without issue until you add a row on node B. The PostgreSQL sequence value table on node B has not been incrementing sequence values in step with the one on node A. When you add a row on node B, it will try to use the next available sequence value (which will be 1 if you haven't added a row on node B before), and the INSERT will fail because a row with the primary key value of 1 already exists. This disorder can be monitored and corrected by manually coordinating the PostgreSQL sequences between nodes in the cluster, but that quickly becomes complicated and potentially impacts the user experience as you add more nodes to the cluster.

Introducing Snowflake Sequences

An alternative to using PostgreSQL sequences is to use a guaranteed unique Snowflake sequence. Snowflake sequences are repre[...]

In a distributed multi-master database system, PostgreSQL sequences can become complicated. Snowflake sequences solve this problem by guaranteeing consistency and uniqueness across the nodes in a cluster.
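
The failure mode described above can be demonstrated with stock PostgreSQL alone. The sketch below simulates two unsynchronized nodes by giving each its own copy of the sequence (the pgEdge Snowflake API itself is not shown, since the excerpt does not cover it):

    -- Each "node" holds its own sequence, and increments are not
    -- replicated between them.
    CREATE TABLE users (id bigint PRIMARY KEY, name text);
    CREATE SEQUENCE users_id_node_a;
    CREATE SEQUENCE users_id_node_b;

    -- Node A inserts rows 1..10; its sequence now stands at 11.
    INSERT INTO users
    SELECT nextval('users_id_node_a'), 'user ' || g
    FROM generate_series(1, 10) AS g;

    -- The rows replicate to node B, but the sequence increment does
    -- not. Node B's first insert draws the value 1 and collides:
    INSERT INTO users VALUES (nextval('users_id_node_b'), 'new user');
    -- ERROR: duplicate key value violates unique constraint "users_pkey"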

Planet PostgreSQL

Gabriele Bartolini: CloudNativePG Recipe 7: Postgres Vertical Scaling with Storage in Kubernetes - part 2

This is the second article in a series that explores advanced strategies for scaling PostgreSQL databases in Kubernetes with the help of CloudNativePG. This article focuses on horizontal table partitioning and tablespaces and how they can be used to manage large datasets. By partitioning tables based on specific criteria and optimising storage with tablespaces, PostgreSQL users can achieve better scalability and performance in cloud-native environments, just like they could in traditional VMs or bare metal deployments.

This article presents advanced strategies for scaling PostgreSQL databases in Kubernetes with CloudNativePG, covering horizontal table partitioning and the use of tablespaces. By partitioning tables and optimizing storage, PostgreSQL users can achieve better scalability and performance in cloud-native environments.
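
A minimal sketch of the two techniques combined, in plain PostgreSQL DDL (the tablespace names and paths are hypothetical; with CloudNativePG the tablespaces themselves would typically be declared in the Cluster resource rather than created by hand):

    -- Separate storage tiers (paths are illustrative).
    CREATE TABLESPACE ts_hot  LOCATION '/var/lib/postgresql/tbs/hot';
    CREATE TABLESPACE ts_cold LOCATION '/var/lib/postgresql/tbs/cold';

    -- A range-partitioned table whose partitions land on different
    -- tablespaces: recent data on fast storage, older data on cheap.
    CREATE TABLE measurements (
        recorded_at timestamptz NOT NULL,
        value       double precision
    ) PARTITION BY RANGE (recorded_at);

    CREATE TABLE measurements_2024 PARTITION OF measurements
        FOR VALUES FROM ('2024-01-01') TO ('2025-01-01')
        TABLESPACE ts_hot;

    CREATE TABLE measurements_2023 PARTITION OF measurements
        FOR VALUES FROM ('2023-01-01') TO ('2024-01-01')
        TABLESPACE ts_cold;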

Planet PostgreSQL

Adam Hendel: Operationalizing Vector Databases on Postgres

Why do we need vector databases?

The proliferation of embeddings immediately brought forth the need to efficiently store, index, and search these arrays of floats. However, these steps are just a small piece of the overall technology stack required to make use of embeddings. The task of transforming source data to embeddings and the serving of the transformer models that make this happen is often left as a task to the application developer. If that developer is part of a large organization, they might have a machine learning or data engineering team to help them. But in any case, the generation of embeddings is not a one-time task, but a lifecycle that needs to be maintained. Embeddings need to be transformed on every search request, and inevitably new source data is generated or updated, requiring a re-compute of embeddings.

Consistency between model training and inference

Traditionally, machine learning projects have two distinct phases: training and inference. In training, a model is generated from a historical dataset. The data that go into the model training are called features, and typically undergo transformations. At inference, the model is used to make predictions on new data. The data incoming into the model for inference requires precisely the same transformations that were conducted at training. For example, in classical ML, imagine you have a text classification model trained on TF-IDF vectors. At inference, any new text must undergo the same preprocessing (tokenization, stop word removal) and then be transformed into a TF-IDF vector using the same vocabulary as during training. If there’s a discrepancy in this transformation, the model’s output will be unreliable. Similarly, in a vector database used for embedding search, if you’re dealing with text embeddings, a new text query must be converted into an embedding using the same model and preprocessing steps that were used to create the embeddings in the database. Embeddings stored in the database using OpenAI’s text-embedding-a[...]

To store, index, and search these arrays of floats efficiently, we need vector databases, and the transformations used when generating and searching embeddings must stay consistent for the model's output to be reliable. pg_vectorize addresses this: it tracks the transformer model used to generate the embeddings and provides ways to manage the transformations. pg_vectorize supports both scheduled and realtime updates of embeddings, can generate embeddings with different transformer models, and supports OpenAI and Hugging Face embedding models. It is open source and available on GitHub; the official post is at https://tembo.io/blog/operationalizing-vectordbs-on-postgres
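
A sketch of the workflow pg_vectorize enables, modeled on the project's published examples (the table, column, and job names here are hypothetical, and function arguments may differ between versions):

    -- Initialize a vectorize job: the extension records which
    -- transformer produced the embeddings, so later searches reuse
    -- the same model.
    SELECT vectorize.table(
        job_name    => 'product_search',
        "table"     => 'products',
        primary_key => 'product_id',
        columns     => ARRAY['description'],
        transformer => 'sentence-transformers/all-MiniLM-L6-v2'
    );

    -- Search: the query text is embedded with the same model before
    -- the similarity lookup runs.
    SELECT * FROM vectorize.search(
        job_name       => 'product_search',
        query          => 'waterproof hiking boots',
        return_columns => ARRAY['product_id', 'description'],
        num_results    => 3
    );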

Planet PostgreSQL

Andrew Atkinson: 🎙️ Hacking Postgres 🐘 Podcast - Season 2, Ep. 1 - Andrew Atkinson

Recently I joined Ry Walker, CEO of Tembo, as a guest on the Hacking Postgres podcast. Hacking Postgres has had a lot of great Postgres contributors as guests on the show, so I was honored to be a part of it, given that my contributions are more in the form of developer education and advocacy. Ry asked me about when I got started with PostgreSQL and what my role looks like today.

PostgreSQL Origin

Ry has also been a Ruby on Rails programmer, so that was a fun background we shared. We both started on early versions of Ruby on Rails in the 2000s, and were also early users of Heroku in the late 2000s. Since PostgreSQL was the default DB for Rails apps deployed on Heroku, for many Rails programmers it was the first time they used PostgreSQL. Heroku valued the fit and finish of their hosted platform offering, and provided best-in-class documentation and developer experience as a cutting-edge platform as a service (PaaS). The popularity of that platform helped grow the use of PostgreSQL amongst Rails programmers even beyond Heroku. For me, Heroku was where I really started using PostgreSQL and learning some of the performance optimization “basics” as a web app developer.

Meeting The Tembo Team

Besides Ry, I’ve also had the chance to meet more folks from Tembo. Adam Hendel is a founding engineer and also based here in Minnesota. I also met Samay Sharma, PostgreSQL contributor and now CTO of Tembo, at PGConf NYC 2023 last Fall. While not an employee or affiliated with the company at all, it’s been interesting to track what they’re up to, and get little glimpses into starting up a whole company that’s focused on leveraging the power and extensibility of PostgreSQL. If you’d like to learn more about Adam’s background, Adam was the guest for Season 1, Episode 2 of Hacking Postgres, which you can find here: https://tembo.io/blog/hacking-postgres-ep2

Using PostgreSQL with Ruby on Rails Apps

Ruby on Rails as a web development framework has great support via the ORM - [...]

I joined Tembo CEO Ry Walker on the Hacking Postgres podcast to discuss Ruby on Rails programmers' experience with PostgreSQL and performance optimization tactics. I recommend the Hacking Postgres podcast for learning about the PostgreSQL community and its technical innovations.

Planet PostgreSQL

Raminder Singh: Postgres Roles and Privileges

A guide to Postgres roles and privileges

Postgres provides a powerful and flexible privilege model to control access to data. Roles and privileges are used to manage access to database objects. Objects have owners, who can grant privileges to other roles. Default access privileges can be set for objects created in the future. Superuser roles bypass all permission checks, and PUBLIC is a default role that all other roles belong to. Understanding these concepts is essential for effective database administration.
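
A minimal sketch of the concepts the guide covers, using standard PostgreSQL commands (the role and schema names are hypothetical):

    -- A group role that holds privileges, and a login role that
    -- inherits them.
    CREATE ROLE readonly NOLOGIN;
    CREATE ROLE alice LOGIN PASSWORD 'changeme' IN ROLE readonly;

    -- The object owner grants access on existing objects...
    GRANT USAGE ON SCHEMA app TO readonly;
    GRANT SELECT ON ALL TABLES IN SCHEMA app TO readonly;

    -- ...and sets default privileges so tables the owner creates
    -- later are covered too.
    ALTER DEFAULT PRIVILEGES IN SCHEMA app
        GRANT SELECT ON TABLES TO readonly;

    -- PUBLIC is an implicit role every role is a member of; revoking
    -- from it tightens the defaults.
    REVOKE ALL ON SCHEMA public FROM PUBLIC;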
