Volatile Functions in Query on Broadcast-like Relation in Greenplum

Some background knowledge

Greenplum is MPP database. The SQL query’s planning and executing model is a distributed computing model. Greenplum introduces the concept of locus to model a path’s data distribution.

  • Function scan of generate_series is General locus (because the information is complete  at every single place, but in Greenplum 5 this function is not treated as General, this may be the historical reason)
  • the scan path of a replicated table is segmenteGeneral locus (because data only stored in segments, not master, and the information is complete at every single segment)
  • ……

For more details, please refer to one of my talk Greenplum 查询优化器关键技术介绍.

The issue of general and segmentGeneral path with volatile functions

General and segmentGeneral locus imply that if the corresponding slice is executed in many different segments should provide the same result data set. Thus, in some cases, General and segmentGeneral can be treated like broadcast.


But what if the segmentGeneral and general locus path contain volatile functions? Volatile functions, by definition, do not guarantee the results of different invokes. So for such cases, they lose the property and cannot be treated as broadcast. Previously, Greenplum planner (also orca) does not handle these cases correctly.

Let’s see some examples of the wrong plans because of this kind of bug, for the complete list, please refer to the Gist: https://gist.github.com/kainwen/ad43f4ae88dc56f3c6eb73056b195e8b

Case 1: random() in a replicated table’s restriction clause, the following plan will execute the scan of replicated table everywhere, and thus the plan is not equivalent to the Postgres one.

explain (costs off, verbose) select * from t_rep, t_hashdist where t_rep.a > random();
 QUERY PLAN                                                                          
-------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)                                            
   Output: t_rep.a, t_rep.b, t_rep.c, t_hashdist.a, t_hashdist.b, t_hashdist.c       
   ->  Nested Loop                                                                   
         Output: t_rep.a, t_rep.b, t_rep.c, t_hashdist.a, t_hashdist.b, t_hashdist.c 
         ->  Seq Scan on public.t_hashdist                                           
               Output: t_hashdist.a, t_hashdist.b, t_hashdist.c                      
         ->  Materialize                                                             
               Output: t_rep.a, t_rep.b, t_rep.c                                     
               ->  Seq Scan on public.t_rep                                          
                     Output: t_rep.a, t_rep.b, t_rep.c                               
                     Filter: ((t_rep.a)::double precision > random())        

Case 2: random() is in the targetlist of a replicated table’s scan which also execute everywhere and break the property of “broadcast-like”:

explain (costs off, verbose) select * from t_hashdist where a in (select random()::int from t_rep);
 QUERY PLAN                                                     
----------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)                       
   Output: t_hashdist.a, t_hashdist.b, t_hashdist.c             
   ->  Hash Join                                                
         Output: t_hashdist.a, t_hashdist.b, t_hashdist.c       
         Hash Cond: (t_hashdist.a = ((random())::integer))      
         ->  Seq Scan on public.t_hashdist                      
               Output: t_hashdist.a, t_hashdist.b, t_hashdist.c 
         ->  Hash                                               
               Output: ((random())::integer)                    
               ->  HashAggregate                                
                     Output: ((random())::integer)              
                     Group Key: (random())::integer             
                     ->  Seq Scan on public.t_rep               
                           Output: (random())::integer         

After reading all the wrong plans, we should understand what the bug is. To emphasize it again: the root cause is that if a general or segmentGeneral locus paths contain volatile functions, they cannot be treated as general or segmentGeneral.

My idea to fix this issue

The idea is: when we find the pattern (a general or segmentGeneral locus paths contain volatile functions), we create a motion path above it to turn its locus to singleQE and then create a projection path. Then the core job becomes how we choose the right places to check:

  1. For a single base rel, we should only check its restriction, this is the at bottom of planner, this is at the function set_rel_pathlist
  2. When creating a join path, if the join locus is general or segmentGeneral, check its joinqual to see if it contains volatile functions
  3. When handling subquery, we will invoke set_subquery_pathlist function, at the end of this function, check the targetlist and havingQual
  4. When creating limit path, the check and change algorithm should also be used
  5. Correctly handle make_subplan

OrderBy clause and Group Clause should be included in targetlist and handled by the above Step 3.

I have open a PR (with detailed commit message, comments and test cases): Correct plan of general & segmentGeneral path with volatiole functions. #10418. Please refer to the code change there.

DMLs on replicated table

Update & Delete Statement on a replicated table is special. These statements have to be dispatched to each segment to execute. So if they contain volatile functions in their targetList or where clause, we should reject such statements:

  1. For targetList, we check it at the function create_motion_path_for_upddel 
  2. For where clause, they will be handled in the query planner and if we find the pattern and want to fix it, do another check if we are updating or deleting replicated table, if so reject the statement.

For insert statement, if we have correct behavior on planner, it is OK automatically. Refer to the  cases in the above gist link and the following gist link. Also, for insert we have to check the upsert related clause in the statement.

The correct plans after fixing

The cases that matched the above wrong ones are listed below. For the whole list, please refer to https://gist.github.com/kainwen/cd96c8fc35b68c9d87d73af7f269ba99

Case 1: A motion is added to broadcast the replicated table scan result.

explain (costs off, verbose) select * from t_rep, t_hashdist where t_rep.a > random();
 QUERY PLAN                                                                          
-------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)                                            
   Output: t_rep.a, t_rep.b, t_rep.c, t_hashdist.a, t_hashdist.b, t_hashdist.c       
   ->  Nested Loop                                                                   
         Output: t_rep.a, t_rep.b, t_rep.c, t_hashdist.a, t_hashdist.b, t_hashdist.c 
         ->  Seq Scan on public.t_hashdist                                           
               Output: t_hashdist.a, t_hashdist.b, t_hashdist.c                      
         ->  Materialize                                                             
               Output: t_rep.a, t_rep.b, t_rep.c                                     
               ->  Broadcast Motion 1:3  (slice2; segments: 1)                       
                     Output: t_rep.a, t_rep.b, t_rep.c                               
                     ->  Result                                                      
                           Output: t_rep.a, t_rep.b, t_rep.c                         
                           ->  Seq Scan on public.t_rep                              
                                 Output: t_rep.a, t_rep.b, t_rep.c                   
                                 Filter: ((t_rep.a)::double precision > random())    

Case 2: A motion is added to broadcast the replicated table scan result.

explain (costs off, verbose) select * from t_hashdist where a in (select random()::int from t_rep);
 QUERY PLAN                                                       
------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)                         
   Output: t_hashdist.a, t_hashdist.b, t_hashdist.c               
   ->  Hash Semi Join                                             
         Output: t_hashdist.a, t_hashdist.b, t_hashdist.c         
         Hash Cond: (t_hashdist.a = ((random())::integer))        
         ->  Seq Scan on public.t_hashdist                        
               Output: t_hashdist.a, t_hashdist.b, t_hashdist.c   
         ->  Hash                                                 
               Output: ((random())::integer)                      
               ->  Redistribute Motion 1:3  (slice2; segments: 1) 
                     Output: ((random())::integer)                
                     Hash Key: ((random())::integer)              
                     ->  Result                                   
                           Output: ((random())::integer)          
                           ->  Seq Scan on public.t_rep           
                                 Output: (random())::integer 

Related Resources

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.