Joining Druid datasources

polaris Events July 17, 2020 | 0

In a relational database, joining tables is a fundamental operation that can rarely be avoided. In a NoSQL-style database such as Druid, joins are not strictly required, but it is sometimes necessary to refer to values from another datasource, for example to resolve the reference values of coded fields. For this reason, Apache Druid provides two ways to join data:

Query-time lookup
Join operation
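
To make the difference between the two approaches concrete, here is a minimal conceptual sketch in plain Python. It is not Druid code: the table contents and the helper names (apply_lookup, inner_join) are made up for illustration. It only shows that a lookup maps one key to one value, while a join matches rows against another datasource that can carry several columns.

```python
# Conceptual sketch only: plain Python, not Druid internals.
# All data and function names below are hypothetical.

# A lookup is a single key -> value map that is prepared ahead of time.
country_lookup = {"KR": "South Korea", "US": "United States"}

events = [
    {"user": "a", "country_code": "KR"},
    {"user": "b", "country_code": "US"},
    {"user": "c", "country_code": "JP"},  # no entry in the lookup
]


def apply_lookup(rows, key_field, lookup, out_field):
    """Add one column by translating a key through a key-value map."""
    return [{**row, out_field: lookup.get(row[key_field])} for row in rows]


# A join matches rows against a second datasource with any number of columns.
countries = [
    {"code": "KR", "name": "South Korea", "region": "Asia"},
    {"code": "US", "name": "United States", "region": "Americas"},
]


def inner_join(left, right, left_key, right_key):
    """Inner join: index the right side by key, then probe with the left side."""
    index = {}
    for r in right:
        index.setdefault(r[right_key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[left_key], [])]


looked_up = apply_lookup(events, "country_code", country_lookup, "country_name")
joined = inner_join(events, countries, "country_code", "code")
```

In this sketch the lookup keeps every event and yields None where the key is unknown, whereas the inner join drops the unmatched event but brings in both the name and region columns.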

A query-time lookup works only with a single key-value map and requires a fair amount of preparation in advance, which makes it hard to use for ad hoc queries. Join operations between datasources are therefore the main tool in an ad hoc query environment. The Druid distribution shipped with Metatron extends the join functionality of Apache Druid with some additional features:

Support for right and full outer joins
Support for join algorithms other than hash join, such as sort-merge join
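
For readers unfamiliar with these terms, the following is a small textbook-style sketch in plain Python of a hash join and of a sort-merge join that produces a full outer result. It does not reflect how Metatron's Druid implements them; the function names and sample rows are assumptions made for illustration only.

```python
# Textbook-style sketch, not Metatron/Druid code. Names and data are hypothetical.

def hash_join(left, right, key):
    """Inner hash join: build a hash table on `right`, probe it with `left`."""
    table = {}
    for r in right:
        table.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in table.get(l[key], [])]


def sort_merge_full_outer_join(left, right, key):
    """Full outer join: sort both sides by key, then merge them in one pass."""
    # Columns of each side, used to fill None where the other side has no match.
    null_left = {c: None for row in left for c in row}
    null_right = {c: None for row in right for c in row}
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:                      # row exists only on the left
            out.append({**null_right, **left[i]})
            i += 1
        elif lk > rk:                    # row exists only on the right
            out.append({**null_left, **right[j]})
            j += 1
        else:                            # equal keys: pair up both runs
            i_end, j_end = i, j
            while i_end < len(left) and left[i_end][key] == lk:
                i_end += 1
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            out.extend({**l, **r} for l in left[i:i_end] for r in right[j:j_end])
            i, j = i_end, j_end
    # Whatever remains on either side has no partner.
    out.extend({**null_right, **row} for row in left[i:])
    out.extend({**null_left, **row} for row in right[j:])
    return out


orders = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}, {"id": 4, "amount": 40}]
customers = [{"id": 2, "name": "bo"}, {"id": 3, "name": "cy"}, {"id": 1, "name": "al"}]

inner_rows = hash_join(orders, customers, "id")
full_rows = sort_merge_full_outer_join(orders, customers, "id")
```

The hash join needs one side to fit in a hash table, while the sort-merge variant only needs both inputs ordered by the join key, which is the usual reason to offer both.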

For more details, please refer to the previous blog post, “Running TPC-H using SQL in Druid”.

Now let’s run SQL queries against Druid datasources in Discovery’s workbench. To use Druid SQL in the workbench, you first need to create a connection for Druid. Go to Management -> Data Storage -> Data Connection and create a new Data Connection.

Create the data connection by entering the broker URL and port for Druid, as shown in the figure. Then finish the setup by sharing the created Data Connection with the desired workspace. Finally, go to a workspace where the Data Connection is shared, create a workbench for Druid, and execute your SQL statements.
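
Outside the workbench, the same kind of join statement can in principle be sent straight to the broker. The sketch below builds such a request in Python, assuming the broker exposes the Apache Druid SQL HTTP endpoint (POST to /druid/v2/sql with a JSON body containing a "query" field). Whether Metatron's Druid distribution behaves identically is an assumption, and the broker address and the sales and countries datasources with their columns are hypothetical.

```python
import json
import urllib.request


def build_druid_sql_request(broker_url, sql):
    """Build (but do not send) a POST request for the Druid SQL HTTP endpoint."""
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=broker_url.rstrip("/") + "/druid/v2/sql",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical datasources and columns, used only to show the shape of a join.
JOIN_SQL = """
SELECT s.order_id, s.amount, c.country_name
FROM sales AS s
JOIN countries AS c ON s.country_code = c.country_code
LIMIT 10
"""

# Hypothetical broker address; use the same URL and port as the Data Connection.
request = build_druid_sql_request("http://localhost:8082", JOIN_SQL)

# To run it against a live broker, uncomment the following lines:
# with urllib.request.urlopen(request) as response:
#     rows = json.load(response)
```

Nothing is sent until urlopen is called, so the request can be inspected or logged first.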
