Posts by David Zhang
Author: David Zhang
David Z: How to setup Lustre file system and run Postgres on it

Feed: Planet PostgreSQL. 1. Overview Similar to PostgreSQL, Lustre file system is also an open source project which started about 20 years ago. According to Wikipedia, Lustre file system is a type of parallel distributed file system, and is designed for large-scale cluster computing with native Remote Direct Memory Access (RDMA) support. Lustre file systems are scalable and can be part of multiple computer clusters with tens of thousands of client nodes, tens of petabytes (PB) of storage on hundreds of servers, and more than a terabyte per second (TB/s) of aggregate I/O throughput. This blog will explain how to ... Read More
David Z: One idea of accessing Primary’s buffer blocks

Feed: Planet PostgreSQL. 1. Overview PostgreSQL is a great open source project for many reasons. One of the reasons I like it is because of the design of buffer blocks addressing. In this blog, I am going to explain a possible way to share a Primary’s buffer blocks with a Standby. If you want to know more about how buffer tag works, you can refer to my previous blog. 2. Primary and Standby In Postgres, a Primary is an active database which accepts connections and performs read-write SQL operations; a Standby is a copy of the active database, and it ... Read More
David Z: How to do backup and restore for PostgreSQL using pg_rman

Feed: Planet PostgreSQL. 1. Overview PostgreSQL is a very popular open-source relational database management system, and it is widely used in many different production environments. To maintain the production environment always functioning, you need to a lot tools, and one of the tools must to have been backup and restore. This blog is going to introduce one backup and restore tools designed for Postgres, i.e. pg_rman. 2. What is pg_rman pg_ramn is a free utility program designed to backup and restore PostgreSQL database. It takes a physical online backup of whole database cluster, archive WALs, and server logs, and restore ... Read More
David Z: global deadlock in a distributed database cluster

Feed: Planet PostgreSQL. 1. Overview Nowadays, supporting distributed transactions is a typical requirement for many use cases, however, the global deadlock detection is one of the key challenging issues if you plan to use PostgreSQL to setup a distributed database solution. There are many discussions about global deadlock, but this blog will provide you a step-by-step procedure about how to create such a global deadlock and share some thoughts based on personal experience. 2. Deadlock First of all, the basic concept of a deadlock is that Process A is trying to acquire Lock2 while it is holding Lock1 and Process ... Read More
David Z: parallel commit in postgres fdw

Feed: Planet PostgreSQL. 1. Overview PostgreSQL is one of the greatest open source databases, not only because of the extensibility and SQL compliance but also the evolution of new features. For example, in postgres_fdw, there is a new feature parallel commit has been added into the main branch and will be released in PG15. This blog is for a quick taste of this new feature. 2. what is parallel commit postgres_fdw If you are a PostgreSQL database developer or if your internal database is built based on PostgreSQL database, and especially you have some applications which are related with the ... Read More
David Z: A snippet to acquire a Lightweight lock

Feed: Planet PostgreSQL. 1. Overview Recently, I was working on an internal issue related with buffer manager in PostgreSQL, and I saw a typical use of the Lightweight lock in buffer manager like below. 1 INIT_BUFFERTAG(newTag, smgr_rnode.node, forkNum, blockNum); 2 newHash = BufTableHashCode(&newTag); 3 newPartitionLock = BufMappingPartitionLock(newHash); 4 LWLockAcquire(newPartitionLock, LW_SHARED); 5 buf_id = BufTableLookup(&newTag, newHash); 6 LWLockRelease(newPartitionLock); Basically, when the buffer manger needs to access a buffer block using buffer tag, it will have to acquire a lightweight lock in either shared or exclusive mode, then find the buffer block and then release the lightweight lock. Since the buffer manager ... Read More
David Z: Transaction ID and Snapshot information functions

Feed: Planet PostgreSQL. 1. Overview I recently investigated one internal issue which was related with snapshot and found there were some changes on transaction id and snapshot information functions in PostgreSQL. Here, I am trying to share what I have learned. Before PostgreSQL 13, all transaction id and snapshot related public functions were named as txid_xxx_yyy, for example,txid_current(), which returns the current toplevel transaction ID.txid_current_if_assigned(), which is similar to txid_current() but doesn’t assign a new xid if there isn’t one.txid_current_snapshot(), which returns current snapshot in txid format with only top-transaction XIDs.txid_status(), which reports the status of a recent transaction ID ... Read More
David Z: How to run a specific regression test

Feed: Planet PostgreSQL. 1. Overview I have been working on an internal project based on PostgreSQL for a while, and from time to time, I need to run some specific test cases to verify my changes. Here, I want to shared a tip to run a specific regression TAP test quickly, especially, when you are focusing on a particular bug and you know which test case can help verify the fix. A details document about the regression test can be found at Running the Tests. 2. Regression test PostgreSQL provides a comprehensive set of regression tests to verify the SQL ... Read More
David Z: Backup Label in PostgreSQL

Feed: Planet PostgreSQL. 1. Overview When I was working on some backup and recovery related features for a project based on Postgres, I noticed that there is file called backup_label. By quickly google search, you can find some very nice blogs or books which discussed this topic, such as, The Internals of PostgreSQL, one of my favourite books. In this blog, I am going to talk it a little more based on my experience. 2. What is backup_label? The backup_label is a file created in $PGDATA folder when there is an exclusive backup triggered by pg_start_backup() and the backup is ... Read More
David Z: A quick test for postgres_fdw batch insertion

Feed: Planet PostgreSQL. 1. Overview In my previous blog, I briefly walked through how the bulk/batch insertion was done for postgres_fdw in PG14. In this blog, I am going to run some basic tests to compare the performance for before and after the batch insertion was introduced in postgres_fdw, so that we can have a general idea about whether this feature makes any difference. 2. PG Servers Setup The key of the blog is to see if there is any difference for batch insertion. To make the testing simple, here is how I set up a simple environment. As this ... Read More
Recent Comments