Skip to main content

ClickHouse Quick Start

Tip

This page helps you set up open-source ClickHouse on your own machine. The fastest way to deploy ClickHouse and to get access to our exclusive SQL Console is to use ClickHouse Cloud.

New users get $300 in free trial credits. Click here to sign up.

1: Download the binary

ClickHouse runs natively on Linux, FreeBSD and macOS, and runs on Windows via the WSL. The simplest way to download ClickHouse locally is to run the following curl command. It determines if your operating system is supported, then downloads an appropriate ClickHouse binary:

curl https://clickhouse.com/ | sh

2: Start the server

Run the following command to start the ClickHouse server:

./clickhouse server

3: Start the client

Use the clickhouse-client to connect to your ClickHouse service. Open a new Terminal, change directories to where your clickhouse binary is saved, and run the following command:

./clickhouse client

You should see a smiling face as it connects to your service running on localhost:

my-host :)

4: Create a table

Use CREATE TABLE to define a new table. Typical SQL DDL commands work in ClickHouse with one addition - tables in ClickHouse require an ENGINE clause. Use MergeTree to take advantage of the performance benefits of ClickHouse:

CREATE TABLE my_first_table
(
user_id UInt32,
message String,
timestamp DateTime,
metric Float32
)
ENGINE = MergeTree
PRIMARY KEY (user_id, timestamp)

5. Insert data

You can use the familiar INSERT INTO TABLE command with ClickHouse, but it is important to understand that each insert into a MergeTree table causes a part (folder) to be created in storage. To minimize parts, bulk insert lots of rows at a time (tens of thousands or even millions at once).

INSERT INTO my_first_table (user_id, message, timestamp, metric) VALUES
(101, 'Hello, ClickHouse!', now(), -1.0 ),
(102, 'Insert a lot of rows per batch', yesterday(), 1.41421 ),
(102, 'Sort your data based on your commonly-used queries', today(), 2.718 ),
(101, 'Granules are the smallest chunks of data read', now() + 5, 3.14159 )

6. Query your new table

You can write a SELECT query just like you would with any SQL database:

 SELECT *
FROM my_first_table
ORDER BY timestamp

Notice the response comes back in a nice table format:

┌─user_id─┬─message────────────────────────────────────────────┬───────────timestamp─┬──metric─┐
│ 102 │ Insert a lot of rows per batch │ 2022-03-21 00:00:00 │ 1.41421 │
│ 102 │ Sort your data based on your commonly-used queries │ 2022-03-22 00:00:00 │ 2.718 │
│ 101 │ Hello, ClickHouse! │ 2022-03-22 14:04:09 │ -1 │
│ 101 │ Granules are the smallest chunks of data read │ 2022-03-22 14:04:14 │ 3.14159 │
└─────────┴────────────────────────────────────────────────────┴─────────────────────┴─────────┘

4 rows in set. Elapsed: 0.008 sec.

7: Insert your own data

The next step is to get your current data into ClickHouse. We have lots of table functions and integrations for ingesting data. We have some examples in the tabs below, or check out our Integrations for a long list of technologies that integrate with ClickHouse.

Use the s3 table function to read files from S3. It's a table function - meaning that the result is a table that can be:

  1. used as the source of a SELECT query (allowing you to run ad-hoc queries and leave your data in S3), or...
  2. insert the resulting table into a MergeTree table (when you are ready to move your data into ClickHouse)

An ad-hoc query looks like:

SELECT
passenger_count,
avg(toFloat32(total_amount))
FROM s3(
'https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_0.gz',
'TabSeparatedWithNames'
)
GROUP BY passenger_count
ORDER BY passenger_count;

Moving the data into a ClickHouse table looks like the following, where nyc_taxi is a MergeTree table:

INSERT INTO nyc_taxi
SELECT * FROM s3(
'https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_0.gz',
'TabSeparatedWithNames'
)
SETTINGS input_format_allow_errors_num=25000;

View our collection of AWS S3 documentation pages for lots more details and examples of using S3 with ClickHouse.

What's Next?