Shared Table Multi-Tenant DB Architecture in MYSQL

Saravanan A R
3 min readMay 15, 2021

Say, you want to develop a cloud-based application that involves storing and retrieving customers' data in a cloud data layer.
Essentially, you’ll have many questions in your mind about how you’re going to store/isolate your customers' data? What DB Architecture to achieve this?

In this blog, we’ll see one such DB Architecture. ie. Shared Table Multi-Tenant DB Architecture.

To explain the architecture in a simpler term, I’m taking MYSQL Relational database without a cluster model.

MY_CONVERTER Application

As an example, I’m naming my application as MY_CONVERTER. It’s a simple converter application that will convert Meter to Kilometer and vise versa.

Also, this application will store the history of conversion done by a specific user. In another word, the user can log in to this application and can view the history of conversion he did in the past.

First, we’ll design a table schema for a single tenant. Later, we’ll convert it to support multi-users.

Conversion_history table for single-tenant looks like,

CREATE TABLE conversion_history(
id BIGINT,
input_value BIGINT,
conversion_mode INT,
result BIGINT,
PRIMARY KEY(id)
);

How can we change this “conversion_history” table schema to support multiple users? Oh! Yes! We can introduce a tenant identifier column that will denote which tenant owns this row.

Multi-tenant table schema looks like,

CREATE TABLE conversion_history(
id BIGINT,
tenant_identifier BIGINT,
input_value BIGINT,
conversion_mode INT,
result BIGINT,
PRIMARY_KEY(id)
);

Say, three users signup for our application(Users: 1, 2 & 3), our conversion_history table looks like,

(a) User 1, 2 & 3 sample data set

tenant_identifier column denotes who owns the row. To retrieve data of single users, add tenant_identifier condition in where condition. Also, to make the tenant_identifier query faster, we are creating an index on the tenant_identifier column.

Creating an Index on the tenant_identifier column,

ALTER TABLE conversion_history ADD INDEX (tenant_identifier);

To retrieve data of the user (2),

SELECT * FROM conversion_history where tenant_identifier = 2;

This query runs faster as we have an index on tenant_identifier. Looks everything fine now. But, What will happen if our application grows? Say, our application grows to 1M users. Does our schema withstand 1Million users? How does our table behave in such a condition?

In such a condition, the tenant_identifier index tree will have 1Million nodes. The compute node may not have enough memory to have the full index tree in memory. The compute node will load the index tree page by page and do traversing. ie. The compute node will do hard disk reads and load a page of the index tree into the memory. This hard disk read is costly which will slower the queries.

How can we make the schema more perfect?
To make it more perfect, add LIST PARTITION in our schema. To learn about partitioning in MySQL refer to this link.

The schema looks like,

CREATE TABLE conversion_history(
id BIGINT NOT NULL,
tenant_identifier BIGINT,
input_value BIGINT,
conversion_mode INTEGER,
result BIGINT,
PRIMARY KEY(id, tenant_identifier)
)
PARTITION BY LIST(tenant_identifier)(
PARTITION user_1 VALUES IN (1),
PARTITION user_2 VALUES IN (2)
);

Using partition, we’re splitting the large index tree into “n” number of subtrees corresponding to the number of users.

How does the below query behave after partition,

SELECT * FROM conversion_history WHERE tenant_identifier = 2;

Partition pruning will decide which partition to query using the tenant_identifier condition and eliminate unwanted partition from traversing.

(b) MYSQL EXPLAIN RESULT

You can see from the above explain statement that our select query chooses the partition user_2.

How to add a new partition for new users?

When a new user signs up to the application, create a new partition for that new user.

ALTER TABLE conversion_history ADD PARTITION (PARTITION user_3 VALUES IN (3));

--

--