LogoLogo
latest
latest
  • Introduction
  • Basics
    • Concepts
      • Pinot storage model
      • Architecture
      • Components
        • Cluster
          • Tenant
          • Server
          • Controller
          • Broker
          • Minion
        • Table
          • Segment
            • Deep Store
            • Segment threshold
            • Segment retention
          • Schema
          • Time boundary
        • Pinot Data Explorer
    • Getting Started
      • Running Pinot locally
      • Running Pinot in Docker
      • Quick Start Examples
      • Running in Kubernetes
      • Running on public clouds
        • Running on Azure
        • Running on GCP
        • Running on AWS
      • Create and update a table configuration
      • Batch import example
      • Stream ingestion example
      • HDFS as Deep Storage
      • Troubleshooting Pinot
      • Frequently Asked Questions (FAQs)
        • General
        • Pinot On Kubernetes FAQ
        • Ingestion FAQ
        • Query FAQ
        • Operations FAQ
    • Indexing
      • Bloom filter
      • Dictionary index
      • Forward index
      • FST index
      • Geospatial
      • Inverted index
      • JSON index
      • Native text index
      • Range index
      • Star-tree index
      • Text search support
      • Timestamp index
      • Vector index
    • Release notes
      • 1.3.0
      • 1.2.0
      • 1.1.0
      • 1.0.0
      • 0.12.1
      • 0.12.0
      • 0.11.0
      • 0.10.0
      • 0.9.3
      • 0.9.2
      • 0.9.1
      • 0.9.0
      • 0.8.0
      • 0.7.1
      • 0.6.0
      • 0.5.0
      • 0.4.0
      • 0.3.0
      • 0.2.0
      • 0.1.0
    • Recipes
      • Connect to Streamlit
      • Connect to Dash
      • Visualize data with Redash
      • GitHub Events Stream
  • For Users
    • Query
      • Querying Pinot
      • Query Syntax
        • Explain Plan (Single-Stage)
        • Filtering with IdSet
        • GapFill Function For Time-Series Dataset
        • Grouping Algorithm
        • JOINs
        • Lookup UDF Join
      • Query Options
      • Query Quotas
      • Query Cancellation
      • Query Correlation ID
      • Query using Cursors
      • Multi-stage query
        • Understanding Stages
        • Stats
        • Optimizing joins
        • Join strategies
          • Random + broadcast join strategy
          • Query time partition join strategy
          • Colocated join strategy
          • Lookup join strategy
        • Hints
        • Operator Types
          • Aggregate
          • Filter
          • Join
          • Intersect
          • Leaf
          • Literal
          • Mailbox receive
          • Mailbox send
          • Minus
          • Sort or limit
          • Transform
          • Union
          • Window
        • Stage-Level Spooling
      • Explain plan
    • APIs
      • Broker Query API
        • Query Response Format
      • Broker GRPC API
      • Controller Admin API
      • Controller API Reference
    • External Clients
      • JDBC
      • Java
      • Python
      • Golang
    • Tutorials
      • Use OSS as Deep Storage for Pinot
      • Ingest Parquet Files from S3 Using Spark
      • Creating Pinot Segments
      • Use S3 as Deep Storage for Pinot
      • Use S3 and Pinot in Docker
      • Batch Data Ingestion In Practice
      • Schema Evolution
  • For Developers
    • Basics
      • Extending Pinot
        • Writing Custom Aggregation Function
        • Segment Fetchers
      • Contribution Guidelines
      • Code Setup
      • Code Modules and Organization
      • Dependency Management
      • Update documentation
    • Advanced
      • Data Ingestion Overview
      • Ingestion Aggregations
      • Ingestion Transformations
      • Null value support
      • Use the multi-stage query engine (v2)
      • Advanced Pinot Setup
    • Plugins
      • Write Custom Plugins
        • Input Format Plugin
        • Filesystem Plugin
        • Batch Segment Fetcher Plugin
        • Stream Ingestion Plugin
    • Design Documents
      • Segment Writer API
  • For Operators
    • Deployment and Monitoring
      • Set up cluster
      • Server Startup Status Checkers
      • Set up table
      • Set up ingestion
      • Decoupling Controller from the Data Path
      • Segment Assignment
      • Instance Assignment
      • Rebalance
        • Rebalance Servers
          • Examples and Scenarios
        • Rebalance Brokers
        • Rebalance Tenant
      • Separating data storage by age
        • Using multiple tenants
        • Using multiple directories
      • Pinot managed Offline flows
      • Minion merge rollup task
      • Consistent Push and Rollback
      • Access Control
      • Monitoring
      • Tuning
        • Tuning Default MMAP Advice
        • Real-time
        • Routing
        • Query Routing using Adaptive Server Selection
        • Query Scheduling
      • Upgrading Pinot with confidence
      • Managing Logs
      • OOM Protection Using Automatic Query Killing
      • Pause ingestion based on resource utilization
    • Command-Line Interface (CLI)
    • Configuration Recommendation Engine
    • Tutorials
      • Authentication
        • Basic auth access control
        • ZkBasicAuthAccessControl
      • Configuring TLS/SSL
      • Build Docker Images
      • Running Pinot in Production
      • Kubernetes Deployment
      • Amazon EKS (Kafka)
      • Amazon MSK (Kafka)
      • Monitor Pinot using Prometheus and Grafana
      • Performance Optimization Configurations
      • Segment Operations Throttling
      • Reload a table segment
  • Configuration Reference
    • Cluster
    • Controller
    • Broker
    • Server
    • Table
    • Ingestion
    • Schema
    • Database
    • Ingestion Job Spec
    • Monitoring Metrics
    • Plugin Reference
      • Stream Ingestion Connectors
      • VAR_POP
      • VAR_SAMP
      • STDDEV_POP
      • STDDEV_SAMP
    • Dynamic Environment
  • Manage Data
    • Import Data
      • SQL Insert Into From Files
      • Upload Pinot segment Using CommandLine
      • Batch Ingestion
        • Spark
        • Flink
        • Hadoop
        • Backfill Data
        • Dimension table
      • Stream Ingestion
        • Ingest streaming data from Apache Kafka
        • Ingest streaming data from Amazon Kinesis
        • Ingest streaming data from Apache Pulsar
        • Configure indexes
        • Stream ingestion with CLP
      • Upsert and Dedup
        • Stream ingestion with Upsert
        • Segment compaction on upserts
        • Stream ingestion with Dedup
      • Supported Data Formats
      • File Systems
        • Amazon S3
        • Azure Data Lake Storage
        • HDFS
        • Google Cloud Storage
      • Complex Type (Array, Map) Handling
        • Complex Type Examples (Unnest)
      • Ingest records with dynamic schemas
  • Functions
    • Aggregation Functions
    • Transformation Functions
    • Array Functions
    • Binary Functions
    • DateTime Functions
    • Funnel Analysis Functions
    • GeoSpatial Functions
    • Hash Functions
    • JSON Functions
    • Math Functions
    • String Functions
    • User-Defined Functions (UDFs)
    • URL Functions
    • Unique Count and cardinality Estimation Functions
  • Window Functions
  • Function List
    • ABS
    • ADD
    • ago
    • EXPR_MIN / EXPR_MAX
    • ARRAY_AGG
    • arrayConcatDouble
    • arrayConcatFloat
    • arrayConcatInt
    • arrayConcatLong
    • arrayConcatString
    • arrayContainsInt
    • arrayContainsString
    • arrayDistinctInt
    • arrayDistinctString
    • arrayIndexOfInt
    • arrayIndexOfString
    • ARRAYLENGTH
    • arrayRemoveInt
    • arrayRemoveString
    • arrayReverseInt
    • arrayReverseString
    • arraySliceInt
    • arraySliceString
    • arraySortInt
    • arraySortString
    • arrayUnionInt
    • arrayUnionString
    • AVGMV
    • Base64
    • caseWhen
    • ceil
    • CHR
    • codepoint
    • concat
    • count
    • COUNTMV
    • COVAR_POP
    • COVAR_SAMP
    • day
    • dayOfWeek
    • dayOfYear
    • DISTINCT
    • DISTINCTCOUNT
    • DISTINCTCOUNTMV
    • DISTINCT_COUNT_OFF_HEAP
    • SEGMENTPARTITIONEDDISTINCTCOUNT
    • DISTINCTCOUNTBITMAP
    • DISTINCTCOUNTBITMAPMV
    • DISTINCTCOUNTHLL
    • DISTINCTCOUNTHLLMV
    • DISTINCTCOUNTRAWHLL
    • DISTINCTCOUNTRAWHLLMV
    • DISTINCTCOUNTSMARTHLL
    • DISTINCTCOUNTHLLPLUS
    • DISTINCTCOUNTULL
    • DISTINCTCOUNTTHETASKETCH
    • DISTINCTCOUNTRAWTHETASKETCH
    • DISTINCTSUM
    • DISTINCTSUMMV
    • DISTINCTAVG
    • DISTINCTAVGMV
    • DIV
    • DATETIMECONVERT
    • DATETRUNC
    • exp
    • FIRSTWITHTIME
    • FLOOR
    • FrequentLongsSketch
    • FrequentStringsSketch
    • FromDateTime
    • FromEpoch
    • FromEpochBucket
    • FUNNELCOUNT
    • FunnelCompleteCount
    • FunnelMaxStep
    • FunnelMatchStep
    • GridDistance
    • Histogram
    • hour
    • isSubnetOf
    • JSONFORMAT
    • JSONPATH
    • JSONPATHARRAY
    • JSONPATHARRAYDEFAULTEMPTY
    • JSONPATHDOUBLE
    • JSONPATHLONG
    • JSONPATHSTRING
    • jsonextractkey
    • jsonextractscalar
    • LAG
    • LASTWITHTIME
    • LEAD
    • length
    • ln
    • lower
    • lpad
    • ltrim
    • max
    • MAXMV
    • MD5
    • millisecond
    • min
    • minmaxrange
    • MINMAXRANGEMV
    • MINMV
    • minute
    • MOD
    • mode
    • month
    • mult
    • now
    • percentile
    • percentileest
    • percentileestmv
    • percentilemv
    • percentiletdigest
    • percentiletdigestmv
    • percentilekll
    • percentilerawkll
    • percentilekllmv
    • percentilerawkllmv
    • quarter
    • regexpExtract
    • regexpReplace
    • remove
    • replace
    • reverse
    • round
    • roundDecimal
    • ROW_NUMBER
    • rpad
    • rtrim
    • second
    • sha
    • sha256
    • sha512
    • sqrt
    • startswith
    • ST_AsBinary
    • ST_AsText
    • ST_Contains
    • ST_Distance
    • ST_GeogFromText
    • ST_GeogFromWKB
    • ST_GeometryType
    • ST_GeomFromText
    • ST_GeomFromWKB
    • STPOINT
    • ST_Polygon
    • strpos
    • ST_Union
    • SUB
    • substr
    • sum
    • summv
    • TIMECONVERT
    • timezoneHour
    • timezoneMinute
    • ToDateTime
    • ToEpoch
    • ToEpochBucket
    • ToEpochRounded
    • TOJSONMAPSTR
    • toGeometry
    • toSphericalGeography
    • trim
    • upper
    • Url
    • UTF8
    • VALUEIN
    • week
    • year
    • Extract
    • yearOfWeek
    • FIRST_VALUE
    • LAST_VALUE
    • ST_GeomFromGeoJSON
    • ST_GeogFromGeoJSON
    • ST_AsGeoJSON
  • Reference
    • Single-stage query engine (v1)
    • Multi-stage query engine (v2)
    • Troubleshooting
      • Troubleshoot issues with the multi-stage query engine (v2)
      • Troubleshoot issues with ZooKeeper znodes
      • Realtime Ingestion Stopped
  • RESOURCES
    • Community
    • Team
    • Blogs
    • Presentations
    • Videos
  • Integrations
    • Tableau
    • Trino
    • ThirdEye
    • Superset
    • Presto
    • Spark-Pinot Connector
  • Contributing
    • Contribute Pinot documentation
    • Style guide
Powered by GitBook
On this page
  • Multi-stage stats visualizer
  • The JSON format

Was this helpful?

Edit on GitHub
Export as PDF
  1. For Users
  2. Query
  3. Multi-stage query

Stats

Learn more about multi-stage stats and how to use them to improve your queries.

PreviousUnderstanding StagesNextOptimizing joins

Last updated 11 days ago

Was this helpful?

Multi-stage stats (MSE) are more complex but also more expressive than single-stage stats. While in single-stage stats Apache Pinot returns a single set of statistics for the query, in multi-stage stats Apache Pinot returns a set of statistics for each operator of the query execution. These stats are collected by default and included in the response of any MSE query.

Each operator has its own set of statistics, which are collected during the execution of the query. See the section to learn more about the different operator types and their statistics.

Multi-stage stats visualizer

The recommended way to analyze the multi-stage stats is to use the visualizer included in the Pinot UI. It can be accessed by running a query in the Pinot controller UI and clicking on the Visual button.

Then, the view is changed to only show the multi-stage stats in a graph format like the following, where each node represents an operator. Inside each node, you can see the operator type and the statistics collected for that operator. Nodes are connected with edges that represent the relationship between the operators. Parent operators are above their children, and the edges' width represents the time spent on the child operator.

For example, the following query in ColocatedJoinQuickStart:

select * 
from userAttributes a 
join userGroups g
on a.userUUID = g.userUUID
join userFactEvents fe
on fe.userUUID = g.userUUID

Creates the following graph:

select *
from userFactEvents fe
join (
    select *
    from userAttributes a
    join userGroups g
    on a.userUUID = g.userUUID
) as g
on fe.userUUID = g.userUUID

By default, the visualizer will only show the most important stats. To show all the stats, click on Show details button in the bottom left corner of the visualizer.

The JSON format

The Pinot UI stats visualizer is a convenient way to see the multi-stage stats, but sometimes you may want to see the raw JSON format. For example, you may want to analyze the stats programmatically or use a different visualization tool. To do so, you can read the stageStats field in the JSON response of the query.

For example, the same query used in the previous section returns: Returns the following stageStats:

{
  ...,
  "stageStats": {
    "type": "MAILBOX_RECEIVE",
    "executionTimeMs": 18,
    "emittedRows": 2494,
    "fanIn": 4,
    "rawMessages": 18,
    "deserializedBytes": 219393,
    "upstreamWaitMs": 80,
    "children": [
      {
        "type": "MAILBOX_SEND",
        "executionTimeMs": 75,
        "emittedRows": 2494,
        "stage": 1,
        "parallelism": 4,
        "fanOut": 1,
        "rawMessages": 14,
        "serializedBytes": 216854,
        "serializationTimeMs": 4,
        "children": [
          {
            "type": "HASH_JOIN",
            "executionTimeMs": 70,
            "emittedRows": 2494,
            "timeBuildingHashTableMs": 73,
            "children": [
              {
                "type": "MAILBOX_RECEIVE",
                "emittedRows": 2494,
                "fanIn": 4,
                "inMemoryMessages": 18,
                "rawMessages": 12,
                "deserializedBytes": 2085,
                "upstreamWaitMs": 131,
                "children": [
                  {
                    "type": "MAILBOX_SEND",
                    "executionTimeMs": 23,
                    "emittedRows": 2494,
                    "stage": 2,
                    "parallelism": 4,
                    "fanOut": 4,
                    "inMemoryMessages": 14,
                    "children": [
                      {
                        "type": "HASH_JOIN",
                        "executionTimeMs": 21,
                        "emittedRows": 2494,
                        "timeBuildingHashTableMs": 20,
                        "children": [
                          {
                            "type": "MAILBOX_RECEIVE",
                            "executionTimeMs": 1,
                            "emittedRows": 10000,
                            "fanIn": 2,
                            "inMemoryMessages": 6,
                            "rawMessages": 18,
                            "deserializedBytes": 221576,
                            "deserializationTimeMs": 3,
                            "upstreamWaitMs": 61,
                            "children": [
                              {
                                "type": "MAILBOX_SEND",
                                "executionTimeMs": 11,
                                "emittedRows": 10000,
                                "stage": 3,
                                "parallelism": 2,
                                "fanOut": 4,
                                "inMemoryMessages": 4,
                                "rawMessages": 12,
                                "serializedBytes": 220890,
                                "serializationTimeMs": 6,
                                "children": [
                                  {
                                    "type": "LEAF",
                                    "table": "userAttributes",
                                    "executionTimeMs": 8,
                                    "emittedRows": 10000,
                                    "numDocsScanned": 10000,
                                    "totalDocs": 10000,
                                    "numEntriesScannedPostFilter": 40000,
                                    "numSegmentsQueried": 4,
                                    "numSegmentsProcessed": 4,
                                    "numSegmentsMatched": 4,
                                    "threadCpuTimeNs": 4733524
                                  }
                                ]
                              }
                            ]
                          },
                          {
                            "type": "MAILBOX_RECEIVE",
                            "executionTimeMs": 7,
                            "emittedRows": 2494,
                            "fanIn": 2,
                            "inMemoryMessages": 10,
                            "rawMessages": 26,
                            "deserializedBytes": 46102,
                            "deserializationTimeMs": 3,
                            "upstreamWaitMs": 40,
                            "children": [
                              {
                                "type": "MAILBOX_SEND",
                                "executionTimeMs": 4,
                                "emittedRows": 2494,
                                "stage": 4,
                                "parallelism": 2,
                                "fanOut": 4,
                                "inMemoryMessages": 8,
                                "rawMessages": 20,
                                "serializedBytes": 45422,
                                "serializationTimeMs": 4,
                                "children": [
                                  {
                                    "type": "LEAF",
                                    "table": "userGroups",
                                    "executionTimeMs": 5,
                                    "emittedRows": 2494,
                                    "numDocsScanned": 2494,
                                    "totalDocs": 2494,
                                    "numEntriesScannedPostFilter": 4988,
                                    "numSegmentsQueried": 8,
                                    "numSegmentsProcessed": 8,
                                    "numSegmentsMatched": 8,
                                    "threadCpuTimeNs": 1423051
                                  }
                                ]
                              }
                            ]
                          }
                        ]
                      }
                    ]
                  }
                ]
              },
              {
                "type": "MAILBOX_RECEIVE",
                "executionTimeMs": 48,
                "emittedRows": 40000,
                "fanIn": 2,
                "inMemoryMessages": 10,
                "rawMessages": 30,
                "deserializedBytes": 1755012,
                "deserializationTimeMs": 7,
                "upstreamWaitMs": 133,
                "children": [
                  {
                    "type": "MAILBOX_SEND",
                    "executionTimeMs": 30,
                    "emittedRows": 40000,
                    "stage": 5,
                    "parallelism": 2,
                    "fanOut": 4,
                    "inMemoryMessages": 8,
                    "rawMessages": 24,
                    "serializedBytes": 1754652,
                    "serializationTimeMs": 15,
                    "children": [
                      {
                        "type": "LEAF",
                        "table": "userFactEvents",
                        "executionTimeMs": 21,
                        "emittedRows": 40000,
                        "numDocsScanned": 40000,
                        "totalDocs": 40000,
                        "numEntriesScannedPostFilter": 320000,
                        "numSegmentsQueried": 8,
                        "numSegmentsProcessed": 8,
                        "numSegmentsMatched": 8,
                        "threadCpuTimeNs": 32716947
                      }
                    ]
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  },
  ...
}

Each node in the tree represents an operation that is executed and the tree structure form is similar (but not equal) to the logical plan of the query that can be obtained with the EXPLAIN PLAN command.

Here we can see there are 5 stages (one for each MAILBOX_SEND operator). A significant part of the time is spent in HASH_JOIN on stage 1, followed by the read on userFactEvents. We can also see that stage 5, the one that reads from userFactEvents , returns 40000 rows while the other stage returns 2494 rows, so as explained in , it is better to have the smaller table on the right side of the join, so the query would be faster if written as:

The graph being drawn is usually a tree-like structure, but it can be a directed acyclic graph (DAG) in some cases, like when using .

The stats are always a tree structure when using the JSON format, even when are used. In that case, the spooled stages will be included more than once in the tree. You will need to create the DAG yourself by looking at the stage field for each operator and connect the operators with the same stage ID.

Optimizing joins
spools
spools
Operator Types