Databend on Volcengine TOS Performance
tip
- Hardware: Volcengine(ecs.g1ie.8xlarge), 32 vCPU
- Storage: TOS (cn-beijing)
- Dataset: ontime, 60.8 GB Raw CSV Data, 202687654 records
- Databend: v0.7.88-nightly
- No Local Caching(Data Caching/Query Result Caching)
- Databend is automatic performance optimization with no manual tuning required.
- Analyzing OnTime Datasets with Databend on Volcengine TOS
Q1 ( 7 rows in set (0.15 sec) )
SELECT
    DayOfWeek,
    count(*) AS c
FROM ontime
WHERE (Year >= 2000) AND (Year <= 2008)
GROUP BY DayOfWeek
ORDER BY c DESC
+-----------+---------+
| DayOfWeek | c       |
+-----------+---------+
|         5 | 8497470 |
|         1 | 8476871 |
|         4 | 8455694 |
|         3 | 8415608 |
|         2 | 8371065 |
|         7 | 8033931 |
|         6 | 7299504 |
+-----------+---------+
7 rows in set (0.15 sec)
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                                                           |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: DayOfWeek:UInt8, count() as c:UInt64                                                                                                                                                                                                                  |
|   Sort: count():UInt64                                                                                                                                                                                                                                            |
|     AggregatorFinal: groupBy=[[DayOfWeek]], aggr=[[count()]]                                                                                                                                                                                                      |
|       AggregatorPartial: groupBy=[[DayOfWeek]], aggr=[[count()]]                                                                                                                                                                                                  |
|         Filter: ((Year >= 2000) and (Year <= 2008))                                                                                                                                                                                                               |
|           ReadDataSource: scan schema: [Year:UInt16, DayOfWeek:UInt8], statistics: [read_rows: 77000000, read_bytes: 231000000, partitions_scanned: 77, partitions_total: 209], push_downs: [projections: [0, 4], filters: [((Year >= 2000) AND (Year <= 2008))]] |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q2 ( 7 rows in set (0.25 sec) )
SELECT
    DayOfWeek,
    count(*) AS c
FROM ontime
WHERE (DepDelay > 10) AND (Year >= 2000) AND (Year <= 2008)
GROUP BY DayOfWeek
ORDER BY c DESC
+-----------+---------+
| DayOfWeek | c       |
+-----------+---------+
|         5 | 2038214 |
|         4 | 1870671 |
|         1 | 1749681 |
|         7 | 1741265 |
|         3 | 1598446 |
|         2 | 1500023 |
|         6 | 1359931 |
+-----------+---------+
7 rows in set (0.25 sec)
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                                                                                                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: DayOfWeek:UInt8, count() as c:UInt64                                                                                                                                                                                                                                                            |
|   Sort: count():UInt64                                                                                                                                                                                                                                                                                      |
|     AggregatorFinal: groupBy=[[DayOfWeek]], aggr=[[count()]]                                                                                                                                                                                                                                                |
|       AggregatorPartial: groupBy=[[DayOfWeek]], aggr=[[count()]]                                                                                                                                                                                                                                            |
|         Filter: (((DepDelay > 10) and (Year >= 2000)) and (Year <= 2008))                                                                                                                                                                                                                                   |
|           ReadDataSource: scan schema: [Year:UInt16, DayOfWeek:UInt8, DepDelay:Int32], statistics: [read_rows: 77000000, read_bytes: 539000000, partitions_scanned: 77, partitions_total: 209], push_downs: [projections: [0, 4, 31], filters: [(((DepDelay > 10) AND (Year >= 2000)) AND (Year <= 2008))]] |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q3 ( 10 rows in set (0.43 sec) )
SELECT
    Origin,
    count(*) AS c
FROM ontime
WHERE (DepDelay > 10) AND (Year >= 2000) AND (Year <= 2008)
GROUP BY Origin
ORDER BY c DESC
LIMIT 10
+--------+--------+
| Origin | c      |
+--------+--------+
| ORD    | 828292 |
| ATL    | 799974 |
| DFW    | 587998 |
| LAX    | 383225 |
| PHX    | 383033 |
| LAS    | 343314 |
| DEN    | 338946 |
| EWR    | 284441 |
| DTW    | 282655 |
| IAH    | 277552 |
+--------+--------+
10 rows in set (0.43 sec)
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Limit: 10                                                                                                                                                                                                                                                                                                     |
|   Projection: Origin:String, count() as c:UInt64                                                                                                                                                                                                                                                              |
|     Sort: count():UInt64                                                                                                                                                                                                                                                                                      |
|       AggregatorFinal: groupBy=[[Origin]], aggr=[[count()]]                                                                                                                                                                                                                                                   |
|         AggregatorPartial: groupBy=[[Origin]], aggr=[[count()]]                                                                                                                                                                                                                                               |
|           Filter: (((DepDelay > 10) and (Year >= 2000)) and (Year <= 2008))                                                                                                                                                                                                                                   |
|             ReadDataSource: scan schema: [Year:UInt16, Origin:String, DepDelay:Int32], statistics: [read_rows: 77000000, read_bytes: 1313594870, partitions_scanned: 77, partitions_total: 209], push_downs: [projections: [0, 14, 31], filters: [(((DepDelay > 10) AND (Year >= 2000)) AND (Year <= 2008))]] |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q4 ( 20 rows in set (0.16 sec) )
SELECT
    IATA_CODE_Reporting_Airline AS Carrier,
    count()
FROM ontime
WHERE (DepDelay > 10) AND (Year = 2007)
GROUP BY Carrier
ORDER BY count() DESC
+---------+---------+
| Carrier | count() |
+---------+---------+
| WN      |  296293 |
| AA      |  176203 |
| MQ      |  145630 |
| US      |  135987 |
| UA      |  128174 |
| OO      |  127426 |
| EV      |  101796 |
| XE      |   99915 |
| DL      |   93675 |
| NW      |   90429 |
| CO      |   76662 |
| YV      |   67905 |
| FL      |   59460 |
| OH      |   59034 |
| B6      |   50740 |
| 9E      |   46948 |
| AS      |   42830 |
| F9      |   23035 |
| AQ      |    4299 |
| HA      |    2746 |
+---------+---------+
20 rows in set (0.16 sec)
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                                                                                                  |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: IATA_CODE_Reporting_Airline as Carrier:String, count():UInt64                                                                                                                                                                                                                                |
|   Sort: count():UInt64                                                                                                                                                                                                                                                                                   |
|     AggregatorFinal: groupBy=[[IATA_CODE_Reporting_Airline]], aggr=[[count()]]                                                                                                                                                                                                                           |
|       AggregatorPartial: groupBy=[[IATA_CODE_Reporting_Airline]], aggr=[[count()]]                                                                                                                                                                                                                       |
|         Filter: ((DepDelay > 10) and (Year = 2007))                                                                                                                                                                                                                                                      |
|           ReadDataSource: scan schema: [Year:UInt16, IATA_CODE_Reporting_Airline:String, DepDelay:Int32], statistics: [read_rows: 27000000, read_bytes: 433050686, partitions_scanned: 27, partitions_total: 209], push_downs: [projections: [0, 8, 31], filters: [((DepDelay > 10) AND (Year = 2007))]] |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q5 ( 20 rows in set (0.19 sec) )
SELECT
    IATA_CODE_Reporting_Airline AS Carrier,
    avg(CAST(DepDelay > 10, Int8)) * 1000 AS c3
FROM ontime
WHERE Year = 2007
GROUP BY Carrier
ORDER BY c3 DESC
+---------+--------------------+
| Carrier | c3                 |
+---------+--------------------+
| EV      |  355.6390924907593 |
| US      |  280.1273877477871 |
| AA      | 277.98541311368336 |
| MQ      | 269.43869867195565 |
| AS      |  267.3783437899928 |
| B6      |  265.0300339514233 |
| UA      |  261.5785241692891 |
| WN      | 253.48648396615198 |
| OH      | 250.11015455531455 |
| CO      | 237.23274877688758 |
| F9      | 235.62806873977087 |
| YV      | 230.68534661403305 |
| XE      |  229.8095787916913 |
| FL      | 225.94705102238572 |
| NW      | 218.15036933750838 |
| OO      |  213.1297250284338 |
| DL      |  196.8421207466447 |
| 9E      |  181.3707499681284 |
| AQ      |  92.73080241587576 |
| HA      |  48.88295505117935 |
+---------+--------------------+
20 rows in set (0.19 sec)
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                                                                                |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: IATA_CODE_Reporting_Airline as Carrier:String, (avg(cast((DepDelay > 10) as TINYINT)) * 1000) as c3:Float64                                                                                                                                                                |
|   Sort: (avg(cast((DepDelay > 10) as TINYINT)) * 1000):Float64                                                                                                                                                                                                                         |
|     Expression: IATA_CODE_Reporting_Airline:String, (avg(cast((DepDelay > 10) as TINYINT)) * 1000):Float64 (Before OrderBy)                                                                                                                                                            |
|       AggregatorFinal: groupBy=[[IATA_CODE_Reporting_Airline]], aggr=[[avg(cast((DepDelay > 10) as TINYINT))]]                                                                                                                                                                         |
|         AggregatorPartial: groupBy=[[IATA_CODE_Reporting_Airline]], aggr=[[avg(cast((DepDelay > 10) as TINYINT))]]                                                                                                                                                                     |
|           Expression: IATA_CODE_Reporting_Airline:String, cast((DepDelay > 10) as Int8):Int8 (Before GroupBy)                                                                                                                                                                          |
|             Filter: (Year = 2007)                                                                                                                                                                                                                                                      |
|               ReadDataSource: scan schema: [Year:UInt16, IATA_CODE_Reporting_Airline:String, DepDelay:Int32], statistics: [read_rows: 27000000, read_bytes: 433050686, partitions_scanned: 27, partitions_total: 209], push_downs: [projections: [0, 8, 31], filters: [(Year = 2007)]] |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q6 ( 25 rows in set (0.47 sec) )
SELECT
    IATA_CODE_Reporting_Airline AS Carrier,
    avg(CAST(DepDelay > 10, Int8)) * 1000 AS c3
FROM ontime
WHERE (Year >= 2000) AND (Year <= 2008)
GROUP BY Carrier
ORDER BY c3 DESC
+---------+--------------------+
| Carrier | c3                 |
+---------+--------------------+
| EV      | 273.75673908484157 |
| AS      |  249.3750261156234 |
| B6      |  242.9765484752507 |
| FL      | 235.78579837350685 |
| WN      | 234.25767821665684 |
| YV      | 232.75640004870877 |
| XE      | 225.37125427720642 |
| MQ      | 223.90123340736127 |
| UA      |   216.795406879701 |
| F9      | 215.44602190134705 |
| DH      | 215.11592809811052 |
| OH      | 207.80180638232238 |
| HP      |  205.4787666304912 |
| AA      |  203.3885253342485 |
| US      | 194.92934952964256 |
| TW      |  185.2254167220212 |
| OO      | 181.59144617288132 |
| CO      | 179.07975376134564 |
| DL      | 178.57936839556356 |
| 9E      | 171.55638804818648 |
| NW      |  169.9350396739337 |
| RU      | 165.93486875604194 |
| TZ      | 159.30095696228443 |
| AQ      | 111.66529559984713 |
| HA      |  57.99147625931697 |
+---------+--------------------+
25 rows in set (0.47 sec)
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: IATA_CODE_Reporting_Airline as Carrier:String, (avg(cast((DepDelay > 10) as TINYINT)) * 1000) as c3:Float64                                                                                                                                                                                       |
|   Sort: (avg(cast((DepDelay > 10) as TINYINT)) * 1000):Float64                                                                                                                                                                                                                                                |
|     Expression: IATA_CODE_Reporting_Airline:String, (avg(cast((DepDelay > 10) as TINYINT)) * 1000):Float64 (Before OrderBy)                                                                                                                                                                                   |
|       AggregatorFinal: groupBy=[[IATA_CODE_Reporting_Airline]], aggr=[[avg(cast((DepDelay > 10) as TINYINT))]]                                                                                                                                                                                                |
|         AggregatorPartial: groupBy=[[IATA_CODE_Reporting_Airline]], aggr=[[avg(cast((DepDelay > 10) as TINYINT))]]                                                                                                                                                                                            |
|           Expression: IATA_CODE_Reporting_Airline:String, cast((DepDelay > 10) as Int8):Int8 (Before GroupBy)                                                                                                                                                                                                 |
|             Filter: ((Year >= 2000) and (Year <= 2008))                                                                                                                                                                                                                                                       |
|               ReadDataSource: scan schema: [Year:UInt16, IATA_CODE_Reporting_Airline:String, DepDelay:Int32], statistics: [read_rows: 77000000, read_bytes: 1235063452, partitions_scanned: 77, partitions_total: 209], push_downs: [projections: [0, 8, 31], filters: [((Year >= 2000) AND (Year <= 2008))]] |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q7 ( 25 rows in set (0.47 sec) )
SELECT
    IATA_CODE_Reporting_Airline AS Carrier,
    avg(DepDelay) * 1000 AS c3
FROM ontime
WHERE (Year >= 2000) AND (Year <= 2008)
GROUP BY Carrier
+---------+--------------------+
| Carrier | c3                 |
+---------+--------------------+
| OO      |  7156.105931260668 |
| NW      |  5953.887737746666 |
| YV      | 12465.722388227468 |
| HA      | -538.8768939038356 |
| RU      |   6137.49191332561 |
| AA      |  9192.396795017481 |
| CO      |  7723.676547969114 |
| F9      |  6078.962351003004 |
| HP      |  8606.658580393145 |
| AQ      | 1569.9276465368148 |
| UA      | 10474.691240783097 |
| EV      | 13248.640462884143 |
| OH      |  9174.990431086686 |
| DH      |  9562.764009429655 |
| FL      |   10091.3806677191 |
| 9E      |  7700.845777541507 |
| US      |  7003.805255113001 |
| MQ      |   9004.73935505603 |
| WN      |  9800.984032216535 |
| TZ      |  5604.692697405455 |
| AS      |  9446.867494009617 |
| TW      |  7715.900299305268 |
| B6      | 11177.923627230142 |
| XE      | 11066.582047557225 |
| DL      |  7316.332086536578 |
+---------+--------------------+
25 rows in set (0.47 sec)
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                                                                                                   |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: IATA_CODE_Reporting_Airline as Carrier:String, (avg(DepDelay) * 1000) as c3:Float64                                                                                                                                                                                                           |
|   Expression: IATA_CODE_Reporting_Airline:String, (avg(DepDelay) * 1000):Float64 (Before Projection)                                                                                                                                                                                                      |
|     AggregatorFinal: groupBy=[[IATA_CODE_Reporting_Airline]], aggr=[[avg(DepDelay)]]                                                                                                                                                                                                                      |
|       AggregatorPartial: groupBy=[[IATA_CODE_Reporting_Airline]], aggr=[[avg(DepDelay)]]                                                                                                                                                                                                                  |
|         Filter: ((Year >= 2000) and (Year <= 2008))                                                                                                                                                                                                                                                       |
|           ReadDataSource: scan schema: [Year:UInt16, IATA_CODE_Reporting_Airline:String, DepDelay:Int32], statistics: [read_rows: 77000000, read_bytes: 1235063452, partitions_scanned: 77, partitions_total: 209], push_downs: [projections: [0, 8, 31], filters: [((Year >= 2000) AND (Year <= 2008))]] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q8 ( 35 rows in set (0.43 sec) )
SELECT
    Year,
    avg(DepDelay)
FROM ontime
GROUP BY Year
+------+--------------------+
| Year | avg(DepDelay)      |
+------+--------------------+
| 1987 |  7.942636447211749 |
| 1988 |  6.642095416924255 |
| 1989 |  8.082143933983971 |
| 1990 |  6.840675574328676 |
| 1991 | 5.7044772180010535 |
| 1992 |   5.62925082631977 |
| 1993 | 6.0503186963181745 |
| 1994 |  6.574114757237771 |
| 1995 |  8.135946473302818 |
| 1996 |   9.75112383578199 |
| 1997 |  8.086793537802187 |
| 1998 |  8.782369968657616 |
| 1999 |  9.076455475549054 |
| 2000 |  10.90851967263336 |
| 2001 |  7.838910616678229 |
| 2002 |  5.463908832617927 |
| 2003 |  5.365668232952322 |
| 2004 |  7.750883470537657 |
| 2005 |   8.67799405902929 |
| 2006 |  9.921313478360586 |
| 2007 | 11.154102001513522 |
| 2008 |  9.778738997786789 |
| 2009 |  7.875202103472947 |
| 2010 |  8.122003058239098 |
| 2011 |   8.30051069128936 |
| 2012 |  7.697317526910186 |
| 2013 |   9.60551109179679 |
| 2014 | 10.417677824932802 |
| 2015 |  9.231430609551786 |
| 2016 |  8.837048463968436 |
| 2017 |   9.58803416122416 |
| 2018 |  9.807826800117446 |
| 2019 | 10.731779968221662 |
| 2020 | 1.9366636990295527 |
| 2021 |  8.846637428189743 |
+------+--------------------+
35 rows in set (0.43 sec)
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                          |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: Year:UInt16, avg(DepDelay):Float64                                                                                                                                                                   |
|   AggregatorFinal: groupBy=[[Year]], aggr=[[avg(DepDelay)]]                                                                                                                                                      |
|     AggregatorPartial: groupBy=[[Year]], aggr=[[avg(DepDelay)]]                                                                                                                                                  |
|       ReadDataSource: scan schema: [Year:UInt16, DepDelay:Int32], statistics: [read_rows: 200950100, read_bytes: 1205700600, partitions_scanned: 209, partitions_total: 209], push_downs: [projections: [0, 31]] |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q9 ( 35 rows in set (0.22 sec) )
SELECT
    Year,
    count(*) AS c1
FROM ontime
GROUP BY Year
+------+---------+
| Year | c1      |
+------+---------+
| 1987 | 1311826 |
| 1988 | 5202096 |
| 1989 | 5041200 |
| 1990 | 5270893 |
| 1991 | 5076925 |
| 1992 | 5092157 |
| 1993 | 5070501 |
| 1994 | 5180048 |
| 1995 | 5327435 |
| 1996 | 5351983 |
| 1997 | 5411843 |
| 1998 | 5384721 |
| 1999 | 5527884 |
| 2000 | 5683047 |
| 2001 | 5967780 |
| 2002 | 5271359 |
| 2003 | 5961237 |
| 2004 | 7129270 |
| 2005 | 5930344 |
| 2006 | 7141922 |
| 2007 | 7455458 |
| 2008 | 7009726 |
| 2009 | 6450285 |
| 2010 | 6450117 |
| 2011 | 6085281 |
| 2012 | 6096762 |
| 2013 | 6369482 |
| 2014 | 5819811 |
| 2015 | 5819079 |
| 2016 | 5617658 |
| 2017 | 5674621 |
| 2018 | 7213446 |
| 2019 | 7422037 |
| 2020 | 4688354 |
| 2021 | 5443512 |
+------+---------+
35 rows in set (0.22 sec)
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: Year:UInt16, count() as c1:UInt64                                                                                                                                               |
|   AggregatorFinal: groupBy=[[Year]], aggr=[[count()]]                                                                                                                                       |
|     AggregatorPartial: groupBy=[[Year]], aggr=[[count()]]                                                                                                                                   |
|       ReadDataSource: scan schema: [Year:UInt16], statistics: [read_rows: 200950100, read_bytes: 401900200, partitions_scanned: 209, partitions_total: 209], push_downs: [projections: [0]] |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q10 ( 1 row in set (0.33 sec) )
SELECT avg(cnt)
FROM
(
    SELECT
        Year,
        Month,
        count(*) AS cnt
    FROM ontime
    WHERE DepDel15 = 1
    GROUP BY
        Year,
        Month
) AS a
+-------------------+
| avg(cnt)          |
+-------------------+
| 81318.87223587223 |
+-------------------+
1 row in set (0.33 sec)
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                                                             |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: avg(cnt):Float64                                                                                                                                                                                                                                        |
|   AggregatorFinal: groupBy=[[]], aggr=[[avg(cnt)]]                                                                                                                                                                                                                  |
|     AggregatorPartial: groupBy=[[]], aggr=[[avg(cnt)]]                                                                                                                                                                                                              |
|       Projection: Year:UInt16, Month:UInt8, count() as cnt:UInt64                                                                                                                                                                                                   |
|         AggregatorFinal: groupBy=[[Year, Month]], aggr=[[count()]]                                                                                                                                                                                                  |
|           AggregatorPartial: groupBy=[[Year, Month]], aggr=[[count()]]                                                                                                                                                                                              |
|             Filter: (DepDel15 = 1)                                                                                                                                                                                                                                  |
|               ReadDataSource: scan schema: [Year:UInt16, Month:UInt8, DepDel15:Int32], statistics: [read_rows: 200950100, read_bytes: 1406650700, partitions_scanned: 209, partitions_total: 209], push_downs: [projections: [0, 2, 33], filters: [(DepDel15 = 1)]] |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q11 ( 1 row in set (0.32 sec) )
SELECT avg(c1)
FROM
(
    SELECT
        Year,
        Month,
        count(*) AS c1
    FROM ontime
    GROUP BY
        Year,
        Month
) AS a
+-------------------+
| avg(c1)           |
+-------------------+
| 493734.8894348894 |
+-------------------+
1 row in set (0.32 sec)
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                           |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection: avg(c1):Float64                                                                                                                                                                                       |
|   AggregatorFinal: groupBy=[[]], aggr=[[avg(c1)]]                                                                                                                                                                 |
|     AggregatorPartial: groupBy=[[]], aggr=[[avg(c1)]]                                                                                                                                                             |
|       Projection: Year:UInt16, Month:UInt8, count() as c1:UInt64                                                                                                                                                  |
|         AggregatorFinal: groupBy=[[Year, Month]], aggr=[[count()]]                                                                                                                                                |
|           AggregatorPartial: groupBy=[[Year, Month]], aggr=[[count()]]                                                                                                                                            |
|             ReadDataSource: scan schema: [Year:UInt16, Month:UInt8], statistics: [read_rows: 200950100, read_bytes: 602850300, partitions_scanned: 209, partitions_total: 209], push_downs: [projections: [0, 2]] |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q12 ( 10 rows in set (4.85 sec) )
SELECT
    OriginCityName,
    DestCityName,
    count(*) AS c
FROM ontime
GROUP BY
    OriginCityName,
    DestCityName
ORDER BY c DESC
LIMIT 10
+-------------------+-------------------+--------+
| OriginCityName    | DestCityName      | c      |
+-------------------+-------------------+--------+
| San Francisco, CA | Los Angeles, CA   | 516662 |
| Los Angeles, CA   | San Francisco, CA | 513827 |
| New York, NY      | Chicago, IL       | 454198 |
| Chicago, IL       | New York, NY      | 446857 |
| Chicago, IL       | Minneapolis, MN   | 435750 |
| Minneapolis, MN   | Chicago, IL       | 431627 |
| Los Angeles, CA   | Las Vegas, NV     | 428198 |
| Las Vegas, NV     | Los Angeles, CA   | 422165 |
| New York, NY      | Boston, MA        | 417470 |
| Boston, MA        | New York, NY      | 414192 |
+-------------------+-------------------+--------+
10 rows in set (4.85 sec)
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                              |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Limit: 10                                                                                                                                                                                                                            |
|   Projection: OriginCityName:String, DestCityName:String, count() as c:UInt64                                                                                                                                                        |
|     Sort: count():UInt64                                                                                                                                                                                                             |
|       AggregatorFinal: groupBy=[[OriginCityName, DestCityName]], aggr=[[count()]]                                                                                                                                                    |
|         AggregatorPartial: groupBy=[[OriginCityName, DestCityName]], aggr=[[count()]]                                                                                                                                                |
|           ReadDataSource: scan schema: [OriginCityName:String, DestCityName:String], statistics: [read_rows: 200950100, read_bytes: 8598200494, partitions_scanned: 209, partitions_total: 209], push_downs: [projections: [15, 24]] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q13 ( 10 rows in set (1.25 sec) )
SELECT
    OriginCityName,
    count(*) AS c
FROM ontime
GROUP BY OriginCityName
ORDER BY c DESC
LIMIT 10
+-----------------------+----------+
| OriginCityName        | c        |
+-----------------------+----------+
| Chicago, IL           | 12482940 |
| Atlanta, GA           | 10841412 |
| Dallas/Fort Worth, TX |  8967109 |
| Houston, TX           |  6803425 |
| Los Angeles, CA       |  6669311 |
| New York, NY          |  6282937 |
| Denver, CO            |  6271605 |
| Phoenix, AZ           |  5634743 |
| Washington, DC        |  4968979 |
| San Francisco, CA     |  4665553 |
+-----------------------+----------+
10 rows in set (1.25 sec)
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Limit: 10                                                                                                                                                                                                   |
|   Projection: OriginCityName:String, count() as c:UInt64                                                                                                                                                    |
|     Sort: count():UInt64                                                                                                                                                                                    |
|       AggregatorFinal: groupBy=[[OriginCityName]], aggr=[[count()]]                                                                                                                                         |
|         AggregatorPartial: groupBy=[[OriginCityName]], aggr=[[count()]]                                                                                                                                     |
|           ReadDataSource: scan schema: [OriginCityName:String], statistics: [read_rows: 200950100, read_bytes: 4299031113, partitions_scanned: 209, partitions_total: 209], push_downs: [projections: [15]] |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Q14 ( 1 row in set (0.01 sec) )
SELECT
       count(*)
FROM ontime;
+-----------+
| count()   |
+-----------+
| 200950100 |
+-----------+
1 row in set (0.01 sec)
+-----------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                 |
+-----------------------------------------------------------------------------------------------------------------------------------------+
| Projection: count():UInt64                                                                                                              |
|   Projection: 200950100 as count():UInt64                                                                                               |
|     Expression: 200950100:UInt64 (Exact Statistics)                                                                                     |
|       ReadDataSource: scan schema: [dummy:UInt8], statistics: [read_rows: 1, read_bytes: 1, partitions_scanned: 1, partitions_total: 1] |
+-----------------------------------------------------------------------------------------------------------------------------------------+
Block Statistics
CALL system$fuse_snapshot('default', 'ontime');
+----------------------------------+---------------------------------------------------+----------------+----------------------------------+---------------+-------------+-----------+--------------------+------------------+
| snapshot_id                      | snapshot_location                                 | format_version | previous_snapshot_id             | segment_count | block_count | row_count | bytes_uncompressed | bytes_compressed |
+----------------------------------+---------------------------------------------------+----------------+----------------------------------+---------------+-------------+-----------+--------------------+------------------+
| dbca8a63c1ed4152bcabfe345fa93be8 | 1/38/_ss/dbca8a63c1ed4152bcabfe345fa93be8_v1.json |              1 | 96aa4a7e69b140ef8967716c1521498f |            16 |         209 | 200950100 |       146170203439 |       9169260438 |
+----------------------------------+---------------------------------------------------+----------------+----------------------------------+---------------+-------------+-----------+--------------------+------------------+