Trino Cheat Sheet
Overview
Trino is a distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources.
Trino Concepts
Server Types
There are two types of Trino servers - coordinators and workers
Data Sources
A data source is a system where data is retrieved from. You can query systems such as distributed object storage, RDBMS, NoSQL databases, document databases, and many others.
Connectors
A connector translates the query and storage concepts of an underlying data source, such as a relational database management system (RDBMS), object storage, or a key-value store, to the SQL and Trino concepts of tables, columns, rows and data types. These can be simple SQL-to-SQL translations and mappings, but also much more complicated translations from SQL to object storage or NoSQL systems.
Catalogs
Catalogs define and name the configuration to connect to and query a data source.
Schema
Schemas are a way to organize tables. Together, a catalog and schema define a set of tables that can be queried.
Table
A table is a set of unordered rows, which are organized into named columns with types. This is the same as in any relational database. The mapping from source data to tables is defined by the connector.
Query Execution Model
Trino executes SQL statements and turns these statements into queries, that are executed across a distributed cluster of coordinator and workers.
Trino Connectors
Accumulo
Atop
BigQuery
Black Hole
Cassandra
ClickHouse
Delta Lake *
Druid
Elasticsearch
Exasol
Google Sheets
Hive *
Hudi
Iceberg *
Ignite
JMX
Kafka
Kinesis
Kudu
Local File
MariaDB
Memory
MongoDB
MySQL *
OpenSearch
Oracle
Phoenix
Pinot
PostgreSQL *
Prometheus
Redis
Redshift
SingleStore
SQL Server *
System
Thrift
TPCDS
TPCH
supports Table Statistics *
Table Statistics
Trino supports statistics based optimizations for queries. For a query to take advantage of these optimizations, Trino must have statistical information for the tables in that query. Table statistics are provided to the query planner by connectors.
SHOW STATS
SHOW STATS FOR table
SHOW STATS FOR (query)
Returns approximated statistics for the named table or for the results of a query. Returns NULL for any statistics that are not populated or unavailable on the data source.
ANALYZE
ANALYZE table_name [WITH (property_name = expression [, ...])]
Collects table and column statistics for a given table. The optional WITH clause can be used to provide connector-specific properties.
SQL Language
Trino is an ANSI SQL compliant query engine. This standard compliance allows Trino users to integrate their favorite data tools, including BI and ETL tools with any underlying data source.
Data types
Boolean
Integer
Floating-point
Fixed-precision
String
Date and time
Structural
Network address
UUID
HyperLogLog
SetDigest
Quantile digest
T-Digest
Functions & Operators
Aggregate
Array
Binary
Bitwise
Color
Comparison
Conditional
Conversion
Date and time
Decimal
Geospatial
HyperLogLog
IP address
JSON
Lambda
Logical
Machine learning
Map
Math
Quantile digest
Regular expression
Session
Set digest
String
System
Table
Teradata
T-Digest
URL
UUID
Window
Edition
August 2024