Description
Description
Recently on my Gitea instance, loading the index page started taking an absolute eternity, 30 seconds+ quite often. This is a bug report / analysis as to why.
I took a look in the logs for the container and noticed this:
2024/09/23 17:16:08 ...eb/routing/logger.go:102:func1() [I] router: completed GET /user/events for 10.250.250.11:0, 200 OK in 292770.1ms @ events/events.go:18(events.Events)
2024/09/23 17:16:11 ...eb/routing/logger.go:68:func1() [W] router: slow GET / for 10.250.250.11:0, elapsed 3380.0ms @ web/home.go:32(web.Home)
2024/09/23 17:16:29 ...activities/action.go:461:GetFeeds() [W] [Slow SQL Query] SELECT `action`.* FROM `action` INNER JOIN `repository` ON `repository`.id = `action`.repo_id WHERE user_id=? AND is_deleted=? ORDER BY `action`.`created_unix` DESC LIMIT 20 [1 false] - 22.823196669s
Hmm... that's odd. Why would that query be so slow?
Running similar query against database using my actual user_id=2 and is_deleted = 0 returns in 0.006 seconds
MariaDB [gitea]> analyze select action.* from action inner join repository on repository.id = action.repo_id where user_id=3 and is_deleted=0 order by action.created_unix desc limit 20;
+------+-------------+------------+--------+-------------------------------------+--------------------+---------+----------------------+------+--------+----------+------------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
+------+-------------+------------+--------+-------------------------------------+--------------------+---------+----------------------+------+--------+----------+------------+-----------------------------+
| 1 | SIMPLE | action | ref | IDX_action_user_id,IDX_action_r_u_d | IDX_action_user_id | 9 | const | 340 | 20.00 | 100.00 | 100.00 | Using where; Using filesort |
| 1 | SIMPLE | repository | eq_ref | PRIMARY | PRIMARY | 8 | gitea.action.repo_id | 1 | 1.00 | 100.00 | 100.00 | Using index |
+------+-------------+------------+--------+-------------------------------------+--------------------+---------+----------------------+------+--------+----------+------------+-----------------------------+
2 rows in set (0.006 sec)
Strange... similar query gets results quite quickly.
Let's try removing the where...
Wow! That makes the query take significantly longer!
MariaDB [gitea]> analyze select action.* from action inner join repository on repository.id = action.repo_id order by action.created_unix desc limit 20;
+------+-------------+------------+-------+------------------+----------------------------+---------+---------------------+------+---------+----------+------------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
+------+-------------+------------+-------+------------------+----------------------------+---------+---------------------+------+---------+----------+------------+----------------------------------------------+
| 1 | SIMPLE | repository | index | PRIMARY | IDX_repository_is_template | 1 | NULL | 60 | 60.00 | 100.00 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | action | ref | IDX_action_r_u_d | IDX_action_r_u_d | 9 | gitea.repository.id | 398 | 1437.77 | 100.00 | 100.00 | |
+------+-------------+------------+-------+------------------+----------------------------+---------+---------------------+------+---------+----------+------------+----------------------------------------------+
2 rows in set (45.996 sec)
Output from the actual select statement:
MariaDB [gitea]> select action.* from action inner join repository on repository.id = action.repo_id order by action.created_unix desc limit 20;
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
| id | user_id | op_type | act_user_id | repo_id | comment_id | is_deleted | ref_name | is_private | content | created_unix |
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
| 86272 | 1 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86274 | 7 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86273 | 8 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86269 | 1 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86271 | 7 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86270 | 8 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86266 | 1 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86268 | 7 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86267 | 8 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86263 | 1 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86265 | 7 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86264 | 8 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86260 | 1 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86262 | 7 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86261 | 8 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86257 | 1 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86259 | 7 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86258 | 8 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86254 | 1 | 8 | 1 | 67 | 0 | 0 | | 0 | Removed for Report | 1727109786 |
| 86256 | 7 | 8 | 1 | 67 | 0 | 0 | | 0 | Removed for Report | 1727109786 |
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
20 rows in set (38.802 sec)
Changing the select to remove the inner join:
MariaDB [gitea]> select * from action order by created_unix desc limit 20;
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
| id | user_id | op_type | act_user_id | repo_id | comment_id | is_deleted | ref_name | is_private | content | created_unix |
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
| 86273 | 8 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86274 | 7 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86272 | 1 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86270 | 8 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86271 | 7 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86269 | 1 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86267 | 8 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86268 | 7 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86266 | 1 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86264 | 8 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86265 | 7 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86263 | 1 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86261 | 8 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86262 | 7 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86260 | 1 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86258 | 8 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86259 | 7 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86257 | 1 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86255 | 8 | 8 | 1 | 67 | 0 | 0 | | 0 | Removed for Report | 1727109786 |
| 86256 | 7 | 8 | 1 | 67 | 0 | 0 | | 0 | Removed for Report | 1727109786 |
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
20 rows in set (0.003 sec)
From reviewing the source code, the columns for repository
are intentionally ignored.
gitea/models/activities/action.go
Lines 456 to 459 in e1f0598
With this knowledge, I was able to build a significantly faster query that achieves the same thing (only getting actions for repositories that exist in the repository
table):
MariaDB [gitea]> select action.* from action where repo_id not in (select unique(repo_id) from action where repo_id not in (select id from repository)) order by action.created_unix desc limit 20;
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
| id | user_id | op_type | act_user_id | repo_id | comment_id | is_deleted | ref_name | is_private | content | created_unix |
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
| 86273 | 8 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86274 | 7 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86272 | 1 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86270 | 8 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86271 | 7 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86269 | 1 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86267 | 8 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86268 | 7 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86266 | 1 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86264 | 8 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86265 | 7 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86263 | 1 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86261 | 8 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86262 | 7 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86260 | 1 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86258 | 8 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86259 | 7 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86257 | 1 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86255 | 8 | 8 | 1 | 67 | 0 | 0 | | 0 | Removed for Report | 1727109786 |
| 86256 | 7 | 8 | 1 | 67 | 0 | 0 | | 0 | Removed for Report | 1727109786 |
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
20 rows in set (0.028 sec)
Look at that! 0.028 seconds! Much better! But wait? What in the world is that query? Let me break it down a bit:
The original query is completing an "inner join" for the apparently sole purpose of determing if the actions belong to repositories that currently exist.
The first thing I tried was to change it from an inner join to a subquery:
MariaDB [gitea]> analyze select action.* from action where repo_id in (select id from repository) order by action.created_unix desc limit 20;
+------+-------------+------------+-------+------------------+----------------------------+---------+---------------------+------+---------+----------+------------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
+------+-------------+------------+-------+------------------+----------------------------+---------+---------------------+------+---------+----------+------------+----------------------------------------------+
| 1 | PRIMARY | repository | index | PRIMARY | IDX_repository_is_template | 1 | NULL | 60 | 60.00 | 100.00 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | action | ref | IDX_action_r_u_d | IDX_action_r_u_d | 9 | gitea.repository.id | 398 | 1437.77 | 100.00 | 100.00 | |
+------+-------------+------------+-------+------------------+----------------------------+---------+---------------------+------+---------+----------+------------+----------------------------------------------+
2 rows in set (42.362 sec)
Unfortunately... this is also just as slow...
So thinking about it... how do you make it faster? What are we trying to do? Exclude actions that have a repo_id that is not in repositories right?
Let's take a look at the queries we just used above!
MariaDB [gitea]> select unique(repo_id) from action;
+---------+
| repo_id |
+---------+
| 1 |
| ... |
| 70 |
+---------+
55 rows in set (0.001 sec)
MariaDB [gitea]> select id from repository;
+-----+
| id |
+-----+
| 1 |
| ... |
| 70 |
+-----+
60 rows in set (0.000 sec)
Ok, both of those look to be pretty similar, but what's happening is MariaDB needs to compare all of the ids against the "select id from repository" part of the subquery.
This is likely the "same thing" that MariaDB is doing when completing the above inner join.
So how do you make this faster?
Let's flip it around and only get the repo_ids of actions that are not in the repository table!
MariaDB [gitea]> select unique(repo_id) from action where repo_id not in (select id from repository);
Empty set (0.001 sec)
Neat! For me it's an empty set! (and it generally should be; however, repo_id is not a foreign key in the current schema)
Now, when you do a select against that subquery, it is much faster to execute as there is "less" that needs to be scanned for each row.
Bringing us back to this...
MariaDB [gitea]> select action.* from action where repo_id not in (select unique(repo_id) from action where repo_id not in (select id from repository)) order by action.created_unix desc limit 20;
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
| id | user_id | op_type | act_user_id | repo_id | comment_id | is_deleted | ref_name | is_private | content | created_unix |
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
| 86273 | 8 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86274 | 7 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86272 | 1 | 8 | 1 | 40 | 0 | 0 | | 0 | Removed for Report | 1727109844 |
| 86270 | 8 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86271 | 7 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86269 | 1 | 8 | 1 | 42 | 0 | 0 | | 0 | Removed for Report | 1727109836 |
| 86267 | 8 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86268 | 7 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86266 | 1 | 8 | 1 | 46 | 0 | 0 | | 0 | Removed for Report | 1727109821 |
| 86264 | 8 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86265 | 7 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86263 | 1 | 8 | 1 | 48 | 0 | 0 | | 0 | Removed for Report | 1727109811 |
| 86261 | 8 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86262 | 7 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86260 | 1 | 8 | 1 | 64 | 0 | 0 | | 0 | Removed for Report | 1727109803 |
| 86258 | 8 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86259 | 7 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86257 | 1 | 8 | 1 | 39 | 0 | 0 | | 0 | Removed for Report | 1727109795 |
| 86255 | 8 | 8 | 1 | 67 | 0 | 0 | | 0 | Removed for Report | 1727109786 |
| 86256 | 7 | 8 | 1 | 67 | 0 | 0 | | 0 | Removed for Report | 1727109786 |
+-------+---------+---------+-------------+---------+------------+------------+----------+------------+--------------------------+--------------+
20 rows in set (0.028 sec)
That is where the odd looking subquery comes from; however, it leads to significantly faster queries that still maintain the "same" restrictions
as the existing inner join!
Finally, let's add user_id=2 and is_deleted=0 back just for a "apples to apples" comparison with original query:
MariaDB [gitea]> analyze select action.* from action where repo_id not in (select unique(repo_id) from action where repo_id not in (select id from repository)) and user_id=2 and is_deleted=0 order by action.created_unix desc limit 20;
+------+--------------+------------+----------------+--------------------+----------------------------+---------+-------+------+--------+----------+------------+-------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
+------+--------------+------------+----------------+--------------------+----------------------------+---------+-------+------+--------+----------+------------+-------------------------------------------------+
| 1 | PRIMARY | action | ref | IDX_action_user_id | IDX_action_user_id | 9 | const | 186 | 20.00 | 100.00 | 100.00 | Using where; Using filesort |
| 2 | SUBQUERY | action | index_subquery | IDX_action_r_u_d | IDX_action_r_u_d | 9 | func | 796 | NULL | 100.00 | NULL | Using index; Using where; Full scan on NULL key |
| 3 | MATERIALIZED | repository | index | PRIMARY | IDX_repository_is_template | 1 | NULL | 60 | 60.00 | 100.00 | 100.00 | Using index |
+------+--------------+------------+----------------+--------------------+----------------------------+---------+-------+------+--------+----------+------------+-------------------------------------------------+
3 rows in set (0.002 sec)
Neat! Just as fast (0.002 seconds), I'm omitting the actual output from this query as it's just an absolute load of json in the content
column, but otherwise looks similar to above samples.
I'd test changing the query in the source code; however, I do not have a Go development environment currently setup. I do hope this knowledge can be incorporated into the next version update for faster index page loads!
Thanks!
Gitea Version
1.22.2
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Screenshots
No response
Git Version
No response
Operating System
AlmaLinux 9.4 - Official Gitea container
How are you running Gitea?
Official gitea container, running on podman:
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/gitea/gitea latest f5b290987d7d 2 weeks ago 168 MB
Database
MySQL/MariaDB