初遇 PostgreSQL
以现有 Pleroma 数据库为基础学习 postgreSQL
本地安装 PG
用 Homebrew 安装
brew install postgresql@13
添加环境变量
# ~/.zshrc
+ # PostgreSQL
+ export PATH="/opt/homebrew/Cellar/postgresql@13/13.11_1/bin:$PATH"
+ export PGDATA="/opt/homebrew/var/postgresql@13"
让修改生效 & 测试
source ~/.zshrc # 对配置文件的修改生效
pg_ctl -V # 查看版本,return `pg_ctl (PostgreSQL) 13.1`
dmg 安装
修改环境变量
# PostgreSQL
export PATH="/Library/PostgreSQL/13/bin:$PATH"
source ~/.zshrc
远程操作 PG
登录
允许远程登录
locate postgresql.conf # 定位文件位置
# /etc/postgresql/13/main/postgresql.conf
listen_addresses = '*' # 修改内容
添加客户端鉴定条目
# /etc/postgresql/13/main/pg_hba.conf
# 允许用户在提供有效口令(md5)后从某个 IP 地址登录
host all all 0.0.0.0/0 md5
host all all ::0/0 md5
[可选]修改密码
postgres=# alter user postgres with password 'xxxxxxxx';
ALTER ROLE
最后重启 pg
但是不知道为啥(也许以后就知道了)我开启 5432 端口无效:ufw 命令返回结果显示已开启,防火墙规则添加了,ip 检查工具还是显示端口未开放。然后黛黛教了我端口转发
ssh -L 127.0.0.1:5432:127.0.0.1:5432 username@hostname
连上之后保持打开这个窗口,我用 Navicat 连上了数据库,第一次见到了数据库到底是啥结构……(Navicat 是什么请看 用户手册)
.
└── public # public 是默认的数据库模式(schema)。一个数据库可以包含多个模式,而每个模式又可以包含多个表、视图、函数等对象.schema 是一种逻辑容器,用于组织和管理数据库对象。它可以帮助在一个数据库中对不同的对象进行分类和分组
├── Tables # 表是存储数据的基本对象,类似于电子表格或数据库中的表格。它们由行和列组成,用于存储和组织数据
├── Views # 视图是虚拟表,是根据存储在其他表中的数据定义的查询结果。视图可以简化复杂查询,并提供一种方式来访问和使用数据
├── Materialized Views # 材料化视图是一种特殊类型的视图,它在创建时将查询结果缓存为实际的表。与普通视图不同,材料化视图的内容在查询时不会再次计算,而是使用预先计算的结果
├── Functions # 函数是一段封装了特定功能的可重用代码块。在数据库中,函数用于执行特定的计算、处理数据或返回结果
├── Queries
└── Backups
解决数据库增长过快的问题(到底解决了没有呢?)
前提:备份数据库
# 备份
mkdir -p /pleromabackup
chmod 777 /pleromabackup
su - postgres
psql -U postgres
postgres=# \! pg_dump pleroma -f /pleromabackup/pleroma.pgdump
# 检查备份目录大小
[xxx@xxx pleromabackup]# du -hs
341M .
🌟 **查询所有表的行数:**创建一个 count_em_all
函数,然后调用它获得准确的统计信息
- 创建:
CREATE TYPE table_count AS (table_name TEXT, num_rows INTEGER);
CREATE OR REPLACE FUNCTION count_em_all () RETURNS SETOF table_count AS '
DECLARE
the_count RECORD;
t_name RECORD;
r table_count%ROWTYPE;
BEGIN
FOR t_name IN
SELECT
c.relname
FROM
pg_catalog.pg_class c LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE
c.relkind = ''r''
AND n.nspname = ''public''
ORDER BY 1
LOOP
FOR the_count IN EXECUTE ''SELECT COUNT(*) AS "count" FROM '' || t_name.relname
LOOP
END LOOP;
r.table_name := t_name.relname;
r.num_rows := the_count.count;
RETURN NEXT r;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;
- 调用
SELECT
*
FROM
count_em_all ( ) AS r
ORDER BY
r.num_rows DESC;
结果:
table_name | num_rows
--------------------------------------------+----------
activities | 171772
objects | 120664
hashtags_objects | 38433
users | 22054
hashtags | 7719
deliveries | 5420
counter_cache | 876
notifications | 570
oauth_authorizations | 479
oauth_tokens | 474
following_relationships | 472
schema_migrations | 302
apps | 83
user_relationships | 53
conversation_participation_recipient_ships | 36
instances | 34
oban_jobs | 32
conversation_participations | 18
conversations | 10
config | 10
bookmarks | 5
markers | 3
user_notes | 2
data_migrations | 2
backups | 1
moderation_log | 1
password_reset_tokens | 1
lists | 1
oban_peers | 1
announcements | 0
filters | 0
announcement_read_relationships | 0
data_migration_failed_ids | 0
mfa_tokens | 0
chats | 0
chat_message_references | 0
push_subscriptions | 0
registrations | 0
report_notes | 0
scheduled_activities | 0
thread_mutes | 0
user_invite_tokens | 0
(42 rows)
🌟 查询所有表的总大小(括表数据文件、索引文件、TOAST表、Free Space Map等对象的大小)
pleroma=# SELECT relname, pg_size_pretty(pg_total_relation_size(relid)) AS size_mb
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC;
relname | size_mb
--------------------------------------------+---------
objects | 266 MB
activities | 258 MB
oban_jobs | 100 MB
users | 64 MB
hashtags_objects | 4096 kB
hashtags | 1288 kB
deliveries | 960 kB
oauth_tokens | 592 kB
oauth_authorizations | 344 kB
notifications | 240 kB
following_relationships | 232 kB
counter_cache | 208 kB
apps | 176 kB
instances | 96 kB
conversation_participations | 88 kB
lists | 80 kB
markers | 80 kB
oban_peers | 80 kB
schema_migrations | 64 kB
conversation_participation_recipient_ships | 56 kB
user_relationships | 56 kB
data_migrations | 48 kB
config | 48 kB
backups | 48 kB
conversations | 40 kB
bookmarks | 40 kB
user_notes | 40 kB
filters | 32 kB
scheduled_activities | 32 kB
registrations | 32 kB
chat_message_references | 32 kB
moderation_log | 32 kB
password_reset_tokens | 24 kB
push_subscriptions | 24 kB
user_invite_tokens | 16 kB
mfa_tokens | 16 kB
announcement_read_relationships | 16 kB
thread_mutes | 16 kB
announcements | 16 kB
report_notes | 16 kB
chats | 16 kB
data_migration_failed_ids | 16 kB
(42 rows)
列出行数和 size 排前五的表
🌟 activities 表
type 字段的所有值
pleroma=# SELECT DISTINCT data->>'type'
FROM activities;
?column?
------------
Follow
Block
Announce
Move
Delete
Undo
Create
EmojiReact
Remove
Update
Like
Add
Reject
Accept
(14 rows)
列出值是 Announce 的所有行
SELECT * FROM activities WHERE data->>'type' = 'Announce';
./bin/pleroma_ctl database vacuum full
SELECT a."actor", a."data"->>'type' AS type, a."updated_at", o."data"
FROM "activities" AS a
INNER JOIN "objects" AS o
ON (o."data"->>'id') = COALESCE(a."data"->'object'->>'id', a."data"->>'object')
WHERE a."data"->>'actor' = 'https://pleroma.your.site/users/admin'
OR a."data"->'to' ? 'https://pleroma.your.site/users/admin'
OR a."data"->'cc' ? 'https://pleroma.your.site/users/admin'
ORDER BY a."updated_at";
🌟 users 表里有很多很多只有 nickname 的用户,它们所在实例是真实的,但是 ID 不是,我把这些行删了,暂时还没出问题
SELECT * FROM users WHERE name IS NULL;
DELETE FROM users WHERE name IS NULL;
🌟 删除了一个 bot 的记录
pleroma=# SELECT *
FROM objects
WHERE data->>'actor' = 'https://xx.ca/users/xxxBot';
pleroma=# DELETE FROM objects
WHERE data->>'actor' = 'https://xx.ca/users/xxxBot';
DELETE 485
🌟 删除了 activities 表 2023-06-15 之前、与我无讨论关系、书签以外的内容
pleroma=# DELETE FROM activities
WHERE id NOT IN (SELECT activity_id FROM bookmarks)
AND inserted_at < '2023-06-15'
AND id NOT IN (
SELECT a.id
FROM activities AS a
INNER JOIN objects AS o ON (o."data"->>'id') = COALESCE(a."data"->'object'->>'id', a."data"->>'object')
WHERE a."data"->>'actor' = 'https://ponderduck.cc/users/yaya'
OR a."data"->'to' ? 'https://ponderduck.cc/users/yaya'
OR a."data"->'cc' ? 'https://ponderduck.cc/users/yaya'
);
DELETE 89430
🌟 删除了 objects 表与以上内容对应的行
/*
创建一个临时表 temp_objects
来存储那些在 activities 表中找不到对应行的 objects 表的行
*/
pleroma=# CREATE TEMPORARY TABLE temp_objects AS
SELECT o.*
FROM objects o
LEFT JOIN activities a ON o.data->>'id' = a.data->'object'->>'id'
WHERE a.data->'object'->>'id' IS NULL;
SELECT 128652
/*
报错:
执行删除操作时违反了外键约束
objects 表中的某些行仍然被 deliveries 表引用,因此无法直接删除
deliveries 表是用于跟踪消息传递的表,通常在 ActivityPub 协议的实现中使用
它记录了消息传递的相关信息,例如发送方、接收方、消息内容等
每条记录代表一次消息传递或交互。
在 Pleroma 中,deliveries 表用于存储消息的传递状态和相关信息,以确保消息能够正确地传递给目标用户
它与 objects 表和 activities 表之间可能存在外键关系,用于保证数据的完整性
*/
pleroma=# DELETE FROM objects
WHERE id IN (SELECT id FROM temp_objects);
ERROR: update or delete on table "objects" violates foreign key constraint "deliveries_object_id_fkey" on table "deliveries"
DETAIL: Key (id)=(75072) is still referenced from table "deliveries".
/*
所以先删除被 deliveries 表引用的行
*/
pleroma=# DELETE FROM deliveries
WHERE user_id NOT IN (SELECT id FROM activities);
DELETE 6372
/*
继续执行删除操作并删除临时表
*/
pleroma=# DELETE FROM objects
WHERE id IN (SELECT id FROM temp_objects);
DELETE 128652
pleroma=# DROP TABLE temp_objects;
DROP TABLE
最后 su pleroma -s $SHELL -lc "./bin/pleroma_ctl database vacuum full "
,但是!SQL 命令有问题,删了我好多嘟,呜呜呜哭着倒地。重新看了一下数据库大小,现在倒是变得迷你了,164 MB。但是这一切真的值得吗……完全在我计划外的……
常用 psql 命令
psql -U postgres # 指定用户连接PostgreSQL
psql -d postgres # 指定数据库连接PostgreSQL
psql -h 127.0.0.1 -p 5432 -U pg -d postgres # 连接到本地主机(127.0.0.1)的 PostgreSQL 数据库服务器,使用用户名 pg 连接到名为 postgres 的数据库
\du # 查看所有用户
\l # 查看所有数据库
\c{dbname} # 切换到数据库
\dt # 查看当前数据库的所有表
\d {tablename} # 查看指定表
SHOW data_directory; # 查看数据目录
\q # 退出交互查询模式