Flink SQL #
Flink SQL enables you to develop streaming and batch applications using standard SQL. Flink’s SQL support is based on Apache Calcite, which implements the SQL standard. Queries have the same semantics and produce the same results regardless of whether the input is continuous (streaming) or bounded (batch).
Flink SQL integrates seamlessly with the Table API and Flink’s DataStream API, so you can easily switch between all APIs and the libraries that build upon them. For instance, you can detect patterns in a table using the MATCH_RECOGNIZE clause and later use the DataStream API to build alerting based on the matched patterns.
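As a sketch of the pattern-detection use case, the following query finds a price dip followed by a recovery. It assumes a hypothetical `ticker` table with `symbol` and `price` columns and a `rowtime` event-time attribute; the table, columns, and pattern variables are illustrative, not part of any shipped schema.

```sql
SELECT *
FROM ticker
    MATCH_RECOGNIZE (
        PARTITION BY symbol          -- detect patterns per stock symbol
        ORDER BY rowtime             -- order rows by event time
        MEASURES
            START_ROW.rowtime AS dip_start,
            LAST(UP.rowtime)  AS recovery_time
        ONE ROW PER MATCH
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (START_ROW DOWN+ UP) -- one or more falling rows, then a rise
        DEFINE
            DOWN AS DOWN.price < LAST(DOWN.price, 1) OR
                    (LAST(DOWN.price, 1) IS NULL AND DOWN.price < START_ROW.price),
            UP   AS UP.price > LAST(DOWN.price, 1)
    );
```

Each match produces one summary row, which downstream code (for example, a DataStream API job) could consume to trigger alerts.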
Ways to Use Flink SQL #
Flink SQL can be used through several interfaces depending on your use case:
| Interface | Description | Use Case |
|---|---|---|
| SQL Client | Interactive command-line interface | Ad-hoc queries, development, debugging |
| SQL Gateway | REST and HiveServer2 endpoints | Remote SQL submission, integration with BI tools |
| JDBC Driver | Standard JDBC connectivity | Application integration, BI tool connectivity |
| Table API | Programmatic SQL execution | Embedded SQL in Java/Scala/Python applications |
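Whichever interface you choose, the statements themselves are the same Flink SQL. A minimal sketch, using the built-in `datagen` connector with an illustrative table name and schema:

```sql
-- Define a source table backed by the 'datagen' connector,
-- which generates random rows (schema is illustrative).
CREATE TABLE orders (
    order_id   BIGINT,
    amount     DECIMAL(10, 2),
    order_time TIMESTAMP(3)
) WITH (
    'connector' = 'datagen'
);

-- The same query can be submitted from the SQL Client, the SQL Gateway,
-- a JDBC connection, or programmatically via the Table API.
SELECT order_id, amount
FROM orders
WHERE amount > 100;
```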
Key Concepts #
Flink SQL is built on the concept of dynamic tables, which represent both bounded (batch) and unbounded (streaming) data. SQL queries on dynamic tables produce continuously updating results as new data arrives.
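For example, a simple aggregation over a stream is itself a dynamic table. Assuming a hypothetical `clicks` table fed by a stream, the following query does not return a single final result; instead, the count for a user is updated each time a new click for that user arrives:

```sql
-- The result is a dynamic table: each incoming click row
-- updates (retracts and re-emits) that user's count.
SELECT user_id, COUNT(*) AS click_cnt
FROM clicks
GROUP BY user_id;
```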
For a deeper understanding of how Flink SQL processes streaming data, see:
- Dynamic Tables - How streaming data is represented as tables
- Time Attributes - Event time and processing time in SQL
- Streaming Concepts - Handling updating results and state
Where to Go Next #
- SQL Client: Interactive CLI for running SQL queries without programming.
- SQL Reference: Complete SQL syntax reference (DDL, DML, queries).
- Built-in Functions: Available SQL functions for data transformation.
- SQL Gateway: REST and HiveServer2 service for remote SQL access.
- JDBC Driver: Connect to Flink SQL from any JDBC-compatible tool.
- Catalogs: Manage metadata for databases, tables, and functions.
- Hive Compatibility: Integration with Apache Hive metastore and syntax.
- Materialized Tables: Incrementally maintained query results.
- Connect to External Systems: Available connectors and formats for reading and writing data.
- Data Types: Pre-defined data types and their properties.
- Time Zone: How time zones affect timestamp processing.
- Table API: Programmatic API for relational operations.
- OLAP Quickstart: Set up a Flink OLAP service for high-throughput SQL queries.