Thursday, January 21, 2021

Demo of another benchmarking framework

At PGCon 2020, I gave a presentation proposing that the PostgreSQL community could use better benchmarking tools to serve the open source developers.  Some of us continued discussing that during the unconference shortly after.  Here I describe a proof of concept for an open source framework to run, but not necessarily limited to, user defined benchmarks that follows the model of loading a database and running a series of queries.

The goal of this framework is to have simpler way create, modify, and execute benchmarks.  I will try to demonstrate that by using dbgen and qgen provided by the TPC-H Benchmark (TM).  My intent is to use this well known benchmark only to show how the framework can be used to customize workloads for testing and development purposes.  For those not familiar, this models a data warehouse type of system that may execute large and complex reporting queries.

I'll give a couple of examples such as changing a table definition and how to modify a query.

The framework itself is just for executing a set of script.  I've named the user defined benchmark referenced here as pgnottpch.  I've created a brief 51 second demo of executing this benchmark.  It doesn't exactly match what I'm describing here anymore, since it's still very volatile, but it will hopefully still provide additional context around the examples shown here.

First let's review the information for the pgnottpch user defined benchmark (this is a snippet of the complete description):

$ tsbyo1 info ~/touchstone-udb/pgnottpch
This is **NOT** a TPC-H.  Go to `<>`_ for a real TPC-H.

This is a set of scripts that perform some tasks that appear to be a TPC-H
benchmark but is absolutely not comparable to a TPC-H benchmark.

I want to highlight the lines displayed under the STAGES heading towards the bottom of the output that correspond to subdirectories under pgnottpch containing executable scripts:

  • 00createdb simply runs the PostgreSQL createdb command
  • 01load is a set of scripts, one per table, to create a table, load data, and create indexes
  • 02power  is a set a of scripts, one per query, to execute

Let's look at the script for building the region table (touchstone-udb/pgnottpch/01load/06build_region).  It's a here-script to run SQL commands through psql:

psql -Xq $DBNAME << __EOF__

        r_regionkey INTEGER NOT NULL,
        r_name CHAR(25) NOT NULL,
        r_comment VARCHAR(152) NOT NULL);


ON region (r_regionkey)
WITH (fillfactor = 100);




This is fairly simple table with three columns and an index where one could experiment with things like:

  • Creating an index on the r_name column
  • Adjusting the fillfactor on an index

There are several ways to apply these changes after the database has been created, but one way to use the framework is to first drop the table, and then run a command to execute the step that builds the region table:

$ tsbyo1 run ~/touchstone-udb/pgnottpch load build_region

Now let's look at how to modify a query.  The pgnottpch scripts include a set of query templates for PostgreSQL.  Here is a snippet from the query 6 template:

        sum(l_extendedprice * l_discount) as revenue
        l_shipdate >= date ':1'
        and l_shipdate < cast(date ':1' + interval '1 year' as date)
        and l_discount between :2 - 0.01 and :2 + 0.01
        and l_quantity < :3;

This is a fairly simple query and could be modified or rewritten (if you allow us to pretend it's actually a larger and complex query like query 7).  Then the framework can execute the single query with the following command:

$ tsbyo1 run ~/touchstone-udb/pgnottpch run power q6

The pgnottpch scripts are written to capture the query execution and the framework saves the output of each execution.  Here is a snippet from what is saved from executing query 6:

    sum(l_extendedprice * l_discount) as revenue
    l_shipdate >= date '1994-01-01'
    and l_shipdate < cast(date '1994-01-01' + interval '1 year' as date)
    and l_discount between 0.08 - 0.01 and 0.08 + 0.01
    and l_quantity < 25;
                                                                                        QUERY PLAN

 Finalize Aggregate  (cost=173045.41..173045.42 rows=1 width=32)
   ->  Gather  (cost=173045.19..173045.40 rows=2 width=32)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=172045.19..172045.20 rows=1 width=32)
               ->  Parallel Seq Scan on lineitem  (cost=0.00..171798.44 rows=49349 width=12)
                     Filter: ((l_shipdate >= '1994-01-01'::date) AND (l_shipdate < '1995-01-01'::date) AND (l_discount >= 0.07) AND (l_discount <= 0.09) AND (l_quantity < '25'::numeric))
(6 rows)

Time: 1.490 ms

The actual query executed with its randomly generated values is saved with the EXPLAIN output, instead of the query result, can be reviewed as well as its execution time.

That's my brief overview of some of the aspects I am hoping is an improvement to running and developing benchmarks.   I'm hoping that additional benchmarks can be more easily created and shared if this proves to be a more useful implementation.


There are many things that I've glossed over that I'll mention here:

  • The framework is PostgreSQL Licensed and written as a (hopefully) POSIX compliant shell script, intended for portability.
  • User defined benchmarks are to be distributed separately.
  • For ease of demoing, dbgen was modified so that it can support direct loading with the PostgreSQL COPY command.  A dbt3 fork with those changes is available in the pgxl10 branch.  There may be additional changes to enable more useful testing with PostgreSQL.
  • The scripts in 02power execute qgen to generate and execute queries, but focusing on the query templates seemed more relevant here.
  • The framework currently can only run each step serially, meaning each table is built and loaded before the next is started, and only one query is executed at a time.
  • Results are saved in timestamp named directories (e.g. 2021-01-19T14:29:50-08:00) in $HOME/Results by default.  The framework saves some timing information, but the user defined benchmark scripts can save additional data.  The pgnottpch scripts saved an additional file per query executed in the power stage.


Labels: ,


Post a Comment

<< Home