MM-63756: Added index to sidebarchannels table (#30724)

`(s SqlChannelStore) getSidebarCategoriesT` gets called quite frequently, on
each of the following events:
- Team switch
- WS reconnect
- Category created
- Category updated
- Category deleted

Of these, the first two are probably the most common triggers. Given how often
the query runs, the SidebarChannels table is not well optimized for it. Even
though the query time might be reasonable, without an index on CategoryId the
database has to burn a lot of CPU on a sequential scan.

We add a new index to optimize this.

CREATE INDEX idx_sidebarchannels_categoryid ON sidebarchannels(categoryid);

```
Before:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=40854.18..40854.19 rows=4 width=193) (actual time=251.635..251.646 rows=204 loops=1)
   Sort Key: sidebarcategories.sortorder, sidebarchannels.sortorder
   Sort Method: quicksort  Memory: 65kB
   Buffers: shared hit=1203 read=23668
   ->  Nested Loop  (cost=8.87..40854.14 rows=4 width=193) (actual time=251.345..251.455 rows=204 loops=1)
         Buffers: shared hit=1203 read=23668
         ->  Nested Loop  (cost=0.41..9.47 rows=1 width=54) (actual time=0.068..0.074 rows=1 loops=1)
               Buffers: shared hit=5
               ->  Seq Scan on teams  (cost=0.00..1.03 rows=1 width=27) (actual time=0.024..0.026 rows=1 loops=1)
                     Filter: (((id)::text = '3ee5y5ok6jgxicrmqstdnghmfr'::text) AND (deleteat = 0))
                     Rows Removed by Filter: 1
                     Buffers: shared hit=1
               ->  Index Scan using teammembers_pkey on teammembers  (cost=0.41..8.43 rows=1 width=27) (actual time=0.039..0.043 rows=1 loops=1)
                     Index Cond: (((teamid)::text = '3ee5y5ok6jgxicrmqstdnghmfr'::text) AND ((userid)::text = 'tc3p1yqw67d8idcp3g98awexqe'::text))
                     Filter: (deleteat = 0)
                     Buffers: shared hit=4
         ->  Hash Right Join  (cost=8.45..40844.62 rows=4 width=193) (actual time=251.274..251.361 rows=204 loops=1)
               Hash Cond: ((sidebarchannels.categoryid)::text = (sidebarcategories.id)::text)
               Buffers: shared hit=1198 read=23668
               ->  Seq Scan on sidebarchannels  (cost=0.00..37514.77 rows=1265277 width=100) (actual time=0.043..99.345 rows=1265444 loops=1)
                     Buffers: shared hit=1194 read=23668
               ->  Hash  (cost=8.44..8.44 rows=1 width=158) (actual time=0.047..0.047 rows=6 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 10kB
                     Buffers: shared hit=4
                     ->  Index Scan using idx_sidebarcategories_userid_teamid on sidebarcategories  (cost=0.42..8.44 rows=1 width=158) (actual time=0.029..0.037 rows=6 loops=1)
                           Index Cond: (((userid)::text = 'tc3p1yqw67d8idcp3g98awexqe'::text) AND ((teamid)::text = '3ee5y5ok6jgxicrmqstdnghmfr'::text))
                           Buffers: shared hit=4
 Planning:
   Buffers: shared hit=9
 Planning Time: 1.215 ms
 Execution Time: 251.755 ms
(31 rows)

After:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1544.53..1544.54 rows=4 width=192) (actual time=0.834..0.859 rows=204 loops=1)
   Sort Key: sidebarcategories.sortorder, sidebarchannels.sortorder
   Sort Method: quicksort  Memory: 65kB
   Buffers: shared hit=58
   ->  Nested Loop Left Join  (cost=8.53..1544.49 rows=4 width=192) (actual time=0.066..0.252 rows=204 loops=1)
         Buffers: shared hit=58
         ->  Nested Loop  (cost=0.83..17.93 rows=1 width=157) (actual time=0.042..0.098 rows=6 loops=1)
               Buffers: shared hit=34
               ->  Nested Loop  (cost=0.42..9.48 rows=1 width=157) (actual time=0.030..0.049 rows=6 loops=1)
                     Buffers: shared hit=10
                     ->  Index Scan using idx_sidebarcategories_userid_teamid on sidebarcategories  (cost=0.42..8.44 rows=1 width=157) (actual time=0.018..0.022 rows=6 loops=1)
                           Index Cond: (((userid)::text = 'tc3p1yqw67d8idcp3g98awexqe'::text) AND ((teamid)::text = '3ee5y5ok6jgxicrmqstdnghmfr'::text))
                           Buffers: shared hit=4
                     ->  Seq Scan on teams  (cost=0.00..1.03 rows=1 width=27) (actual time=0.002..0.003 rows=1 loops=6)
                           Filter: (((id)::text = '3ee5y5ok6jgxicrmqstdnghmfr'::text) AND (deleteat = 0))
                           Rows Removed by Filter: 1
                           Buffers: shared hit=6
               ->  Index Scan using teammembers_pkey on teammembers  (cost=0.41..8.43 rows=1 width=27) (actual time=0.007..0.007 rows=1 loops=6)
                     Index Cond: (((teamid)::text = '3ee5y5ok6jgxicrmqstdnghmfr'::text) AND ((userid)::text = 'tc3p1yqw67d8idcp3g98awexqe'::text))
                     Filter: (deleteat = 0)
                     Buffers: shared hit=24
         ->  Bitmap Heap Scan on sidebarchannels  (cost=7.69..1522.35 rows=421 width=100) (actual time=0.012..0.017 rows=34 loops=6)
               Recheck Cond: ((categoryid)::text = (sidebarcategories.id)::text)
               Heap Blocks: exact=6
               Buffers: shared hit=24
               ->  Bitmap Index Scan on idx_sidebarchannels_categoryid  (cost=0.00..7.58 rows=421 width=0) (actual time=0.010..0.010 rows=34 loops=6)
                     Index Cond: ((categoryid)::text = (sidebarcategories.id)::text)
                     Buffers: shared hit=18
 Planning:
   Buffers: shared hit=18
 Planning Time: 0.543 ms
 Execution Time: 0.968 ms
(32 rows)
```
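
A comparison like the one above can be reproduced with something along these
lines. The SELECT is only an approximation inferred from the plan nodes: the
joins to teams and teammembers are left out and the column list is a guess.
The real query lives in getSidebarCategoriesT. The two IDs are the ones that
appear in the plans.

```sql
-- Sketch only: approximate shape of the category lookup, inferred from the plans.
-- Run once without and once with idx_sidebarchannels_categoryid in place.
EXPLAIN (ANALYZE, BUFFERS)
SELECT sidebarcategories.id, sidebarchannels.channelid
FROM sidebarcategories
LEFT JOIN sidebarchannels
       ON sidebarchannels.categoryid = sidebarcategories.id
WHERE sidebarcategories.userid = 'tc3p1yqw67d8idcp3g98awexqe'
  AND sidebarcategories.teamid = '3ee5y5ok6jgxicrmqstdnghmfr'
ORDER BY sidebarcategories.sortorder, sidebarchannels.sortorder;
```

With the index present, the sidebarchannels node should change from a Seq Scan
to an index-based scan, as in the "After" plan.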

I also looked at re-ordering the JOINs so that sidebarchannels and
sidebarcategories are joined earlier, but that didn't give a major benefit.

I also looked at adding a compound index on (categoryid, sortorder) to improve
sorting performance, but that didn't give a major benefit over what the
single-column index already gives.
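
For reference, the compound variant would look roughly like the sketch below.
The index name is made up for illustration and is not part of this change. One
plausible reason it helps so little: the ORDER BY leads with
sidebarcategories.sortorder, so an index ordered on sidebarchannels alone
cannot replace the Sort node, and that sort only handles 204 rows in 65kB.

```sql
-- Hypothetical compound index, shown only to illustrate the alternative
-- that was evaluated. The name is an assumption, not from the migration.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_sidebarchannels_categoryid_sortorder
    ON sidebarchannels (categoryid, sortorder);

-- Remove it again after comparing plans.
DROP INDEX CONCURRENTLY IF EXISTS idx_sidebarchannels_categoryid_sortorder;
```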

The `completePopulatingCategoryChannelsT` query also partially benefits
from this. However, the Postgres optimizer sometimes picks the index on
CategoryId and sometimes the one on ChannelId, and both give equivalent
performance. So there is no major improvement there, but no regression either.

```
Original:
[bigdb] # EXPLAIN (ANALYZE, BUFFERS) SELECT Id FROM ChannelMembers LEFT JOIN Channels ON Channels.Id=ChannelMembers.ChannelId WHERE (ChannelMembers.UserId = 'tc3p1yqw67d8idcp3g98awexqe' AND Channels.Type IN ('D','G') AND Channels.DeleteAt = 0 AND NOT EXISTS ( SELECT 1 FROM SidebarChannels JOIN SidebarCategories on SidebarChannels.CategoryId=SidebarCategories.Id WHERE (SidebarChannels.ChannelId = ChannelMembers.ChannelId AND SidebarCategories.UserId = 'tc3p1yqw67d8idcp3g98awexqe' AND SidebarCategories.TeamId = '3ee5y5ok6jgxicrmqstdnghmfr') )) ORDER BY DisplayName ASC;
                                                                                            QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=5864.68..5865.84 rows=463 width=40) (actual time=9.008..9.022 rows=39 loops=1)
   Sort Key: channels.displayname
   Sort Method: quicksort  Memory: 27kB
   Buffers: shared hit=2112
   ->  Nested Loop Anti Join  (cost=1.96..5844.18 rows=463 width=40) (actual time=0.188..8.932 rows=39 loops=1)
         Buffers: shared hit=2112
         ->  Nested Loop  (cost=0.99..3476.66 rows=463 width=67) (actual time=0.159..7.952 rows=39 loops=1)
               Buffers: shared hit=1956
               ->  Index Only Scan using idx_channelmembers_user_id_channel_id_last_viewed_at on channelmembers  (cost=0.56..40.78 rows=470 width=27) (actual time=0.036..0.467 rows=437 loops=1)
                     Index Cond: (userid = 'tc3p1yqw67d8idcp3g98awexqe'::text)
                     Heap Fetches: 45
                     Buffers: shared hit=208
               ->  Memoize  (cost=0.43..7.69 rows=1 width=40) (actual time=0.016..0.016 rows=0 loops=437)
                     Cache Key: channelmembers.channelid
                     Cache Mode: logical
                     Hits: 0  Misses: 437  Evictions: 0  Overflows: 0  Memory Usage: 42kB
                     Buffers: shared hit=1748
                     ->  Index Scan using channels_pkey on channels  (cost=0.42..7.68 rows=1 width=40) (actual time=0.015..0.015 rows=0 loops=437)
                           Index Cond: ((id)::text = (channelmembers.channelid)::text)
                           Filter: ((type = ANY ('{D,G}'::channel_type[])) AND (deleteat = 0))
                           Rows Removed by Filter: 1
                           Buffers: shared hit=1748
         ->  Nested Loop  (cost=0.97..5.10 rows=1 width=27) (actual time=0.023..0.023 rows=0 loops=39)
               Buffers: shared hit=156
               ->  Index Only Scan using sidebarchannels_pkey on sidebarchannels  (cost=0.55..4.56 rows=1 width=92) (actual time=0.022..0.022 rows=0 loops=39)
                     Index Cond: (channelid = (channelmembers.channelid)::text)
                     Heap Fetches: 0
                     Buffers: shared hit=156
               ->  Index Scan using sidebarcategories_pkey on sidebarcategories  (cost=0.42..0.48 rows=1 width=65) (never executed)
                     Index Cond: ((id)::text = (sidebarchannels.categoryid)::text)
                     Filter: (((userid)::text = 'tc3p1yqw67d8idcp3g98awexqe'::text) AND ((teamid)::text = '3ee5y5ok6jgxicrmqstdnghmfr'::text))
 Planning:
   Buffers: shared hit=48 dirtied=1
 Planning Time: 2.222 ms
 Execution Time: 9.142 ms
(35 rows)

New:
[bigdb] # EXPLAIN (ANALYZE, BUFFERS) SELECT Id FROM ChannelMembers LEFT JOIN Channels ON Channels.Id=ChannelMembers.ChannelId WHERE (ChannelMembers.UserId = 'tc3p1yqw67d8idcp3g98awexqe' AND Channels.Type IN ('D','G') AND Channels.DeleteAt = 0 AND NOT EXISTS ( SELECT 1 FROM SidebarChannels JOIN SidebarCategories on SidebarChannels.CategoryId=SidebarCategories.Id WHERE (SidebarChannels.ChannelId = ChannelMembers.ChannelId AND SidebarCategories.UserId = 'tc3p1yqw67d8idcp3g98awexqe' AND SidebarCategories.TeamId = '3ee5y5ok6jgxicrmqstdnghmfr') )) ORDER BY DisplayName ASC;
                                                                                            QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=5059.95..5061.11 rows=463 width=40) (actual time=12.072..12.086 rows=39 loops=1)
   Sort Key: channels.displayname
   Sort Method: quicksort  Memory: 27kB
   Buffers: shared hit=1984
   ->  Nested Loop Anti Join  (cost=9.10..5039.45 rows=463 width=40) (actual time=0.751..12.009 rows=39 loops=1)
         Join Filter: ((sidebarchannels.channelid)::text = (channelmembers.channelid)::text)
         Rows Removed by Join Filter: 7839
         Buffers: shared hit=1984
         ->  Nested Loop  (cost=0.99..3476.66 rows=463 width=67) (actual time=0.161..7.579 rows=39 loops=1)
               Buffers: shared hit=1956
               ->  Index Only Scan using idx_channelmembers_user_id_channel_id_last_viewed_at on channelmembers  (cost=0.56..40.78 rows=470 width=27) (actual time=0.036..0.449 rows=437 loops=1)
                     Index Cond: (userid = 'tc3p1yqw67d8idcp3g98awexqe'::text)
                     Heap Fetches: 45
                     Buffers: shared hit=208
               ->  Memoize  (cost=0.43..7.69 rows=1 width=40) (actual time=0.016..0.016 rows=0 loops=437)
                     Cache Key: channelmembers.channelid
                     Cache Mode: logical
                     Hits: 0  Misses: 437  Evictions: 0  Overflows: 0  Memory Usage: 42kB
                     Buffers: shared hit=1748
                     ->  Index Scan using channels_pkey on channels  (cost=0.42..7.68 rows=1 width=40) (actual time=0.014..0.014 rows=0 loops=437)
                           Index Cond: ((id)::text = (channelmembers.channelid)::text)
                           Filter: ((type = ANY ('{D,G}'::channel_type[])) AND (deleteat = 0))
                           Rows Removed by Filter: 1
                           Buffers: shared hit=1748
         ->  Materialize  (cost=8.11..1535.03 rows=4 width=27) (actual time=0.003..0.046 rows=201 loops=39)
               Buffers: shared hit=28
               ->  Nested Loop  (cost=8.11..1535.01 rows=4 width=27) (actual time=0.099..0.383 rows=201 loops=1)
                     Buffers: shared hit=28
                     ->  Index Scan using idx_sidebarcategories_userid_teamid on sidebarcategories  (cost=0.42..8.44 rows=1 width=65) (actual time=0.047..0.057 rows=6 loops=1)
                           Index Cond: (((userid)::text = 'tc3p1yqw67d8idcp3g98awexqe'::text) AND ((teamid)::text = '3ee5y5ok6jgxicrmqstdnghmfr'::text))
                           Buffers: shared hit=4
                     ->  Bitmap Heap Scan on sidebarchannels  (cost=7.69..1522.35 rows=421 width=92) (actual time=0.028..0.040 rows=34 loops=6)
                           Recheck Cond: ((categoryid)::text = (sidebarcategories.id)::text)
                           Heap Blocks: exact=6
                           Buffers: shared hit=24
                           ->  Bitmap Index Scan on idx_sidebarchannels_categoryid  (cost=0.00..7.58 rows=421 width=0) (actual time=0.023..0.023 rows=34 loops=6)
                                 Index Cond: ((categoryid)::text = (sidebarcategories.id)::text)
                                 Buffers: shared hit=18
 Planning:
   Buffers: shared hit=51
 Planning Time: 2.240 ms
 Execution Time: 12.210 ms
(42 rows)
```

Analysis on MySQL for completeness:
```
Before:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Sort: SidebarCategories.SortOrder, SidebarChannels.SortOrder  (actual time=277.675..277.675 rows=4 loops=1)
    -> Stream results  (cost=138558.36 rows=1287808) (actual time=242.506..277.650 rows=4 loops=1)
        -> Left hash join (<hash>(SidebarChannels.CategoryId)=<hash>(SidebarCategories.Id)), extra conditions: (SidebarChannels.CategoryId = SidebarCategories.Id)  (cost=138558.36 rows=1287808) (actual time=242.498..277.626 rows=4 loops=1)
            -> Index lookup on SidebarCategories using idx_sidebarcategories_userid_teamid (UserId='qdggj9pyobgkjpj8htwzizks1r', TeamId='xmh7bupzajnudqf3h4mm76qapy')  (cost=1.40 rows=4) (actual time=0.092..0.094 rows=4 loops=1)
            -> Hash
                -> Table scan on SidebarChannels  (cost=8394.55 rows=321952) (actual time=0.123..106.334 rows=300002 loops=1)
 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

After:
----------------------------------------------------------------+
| -> Sort: SidebarCategories.SortOrder, SidebarChannels.SortOrder  (actual time=0.739..0.742 rows=4 loops=1)
    -> Stream results  (cost=6.80 rows=7) (actual time=0.468..0.703 rows=4 loops=1)
        -> Nested loop left join  (cost=6.80 rows=7) (actual time=0.456..0.673 rows=4 loops=1)
            -> Index lookup on SidebarCategories using idx_sidebarcategories_userid_teamid (UserId='qdggj9pyobgkjpj8htwzizks1r', TeamId='xmh7bupzajnudqf3h4mm76qapy')  (cost=4.38 rows=4) (actual time=0.302..0.313 rows=4 loops=1)
            -> Index lookup on SidebarChannels using idx_sidebarchannels_categoryid (CategoryId=SidebarCategories.Id)  (cost=0.48 rows=2) (actual time=0.085..0.087 rows=0 loops=4)
 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
```

Timing-wise, adding the index takes around 2s on a Postgres table with 1.2M
rows, and around 5s on a MySQL table with 300K rows. MySQL is slower, but
since both migrations are non-locking, it should be fine.
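
One caveat with a concurrent build on Postgres: if it fails partway, it can
leave an invalid index behind, and a rerun with IF NOT EXISTS would then skip
it because the name already exists. A minimal check against the standard
system catalogs, assuming the index name from this change:

```sql
-- Confirm the concurrently built index exists and is usable.
-- indisvalid = false means the build did not complete; drop and recreate it.
SELECT c.relname AS index_name,
       i.indisvalid,
       i.indisready,
       pg_size_pretty(pg_relation_size(c.oid)) AS index_size
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relname = 'idx_sidebarchannels_categoryid';
```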

https://mattermost.atlassian.net/browse/MM-63756
```release-note
NONE
```
Agniva De Sarker 2025-04-24 12:11:28 +05:30 committed by GitHub
parent 011f179831
commit 131cf039bb
GPG key ID: B5690EEEBB952194
5 changed files with 36 additions and 0 deletions

View file

@ -265,6 +265,8 @@ channels/db/migrations/mysql/000133_add_channel_banner_fields.down.sql
channels/db/migrations/mysql/000133_add_channel_banner_fields.up.sql
channels/db/migrations/mysql/000134_create_access_control_policies.down.sql
channels/db/migrations/mysql/000134_create_access_control_policies.up.sql
channels/db/migrations/mysql/000135_sidebarchannels_categoryid.down.sql
channels/db/migrations/mysql/000135_sidebarchannels_categoryid.up.sql
channels/db/migrations/postgres/000001_create_teams.down.sql
channels/db/migrations/postgres/000001_create_teams.up.sql
channels/db/migrations/postgres/000002_create_team_members.down.sql
@ -531,3 +533,5 @@ channels/db/migrations/postgres/000133_add_channel_banner_fields.down.sql
channels/db/migrations/postgres/000133_add_channel_banner_fields.up.sql
channels/db/migrations/postgres/000134_create_access_control_policies.down.sql
channels/db/migrations/postgres/000134_create_access_control_policies.up.sql
channels/db/migrations/postgres/000135_sidebarchannels_categoryid.down.sql
channels/db/migrations/postgres/000135_sidebarchannels_categoryid.up.sql

View file

@ -0,0 +1,14 @@
SET @preparedStatement = (SELECT IF(
(
SELECT COUNT(*) FROM INFORMATION_SCHEMA.STATISTICS
WHERE table_name = 'SidebarChannels'
AND table_schema = DATABASE()
AND index_name = 'idx_sidebarchannels_categoryid'
) > 0,
'DROP INDEX idx_sidebarchannels_categoryid ON SidebarChannels;',
'SELECT 1'
));
PREPARE removeIndexIfExists FROM @preparedStatement;
EXECUTE removeIndexIfExists;
DEALLOCATE PREPARE removeIndexIfExists;

View file

@ -0,0 +1,14 @@
SET @preparedStatement = (SELECT IF(
(
SELECT COUNT(*) FROM INFORMATION_SCHEMA.STATISTICS
WHERE table_name = 'SidebarChannels'
AND table_schema = DATABASE()
AND index_name = 'idx_sidebarchannels_categoryid'
) > 0,
'SELECT 1',
'CREATE INDEX idx_sidebarchannels_categoryid ON SidebarChannels(CategoryId);'
));
PREPARE createIndexIfNotExists FROM @preparedStatement;
EXECUTE createIndexIfNotExists;
DEALLOCATE PREPARE createIndexIfNotExists;

View file

@ -0,0 +1,2 @@
-- morph:nontransactional
DROP INDEX CONCURRENTLY IF EXISTS idx_sidebarchannels_categoryid;

View file

@ -0,0 +1,2 @@
-- morph:nontransactional
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_sidebarchannels_categoryid ON sidebarchannels(categoryid);