What is a materialized view?
Materialized View (MV in short) is a global index. When you create a MV, a new table gets created and distributed to different nodes using the standard table distribution mechanisms.
To create a materialized view, you need to define a query that you want to run against the base table. A base table is the table you create the MV on top of. Once the MV is created, it is kept in-sync with the base table automatically. It’s not possible to directly update a MV, its content is changed when the base table changes – this includes INSERTs, UPDATES, DELETEs. It’s only possible to query the MV and not change it directly.
Performance considerations
Reads from a MV are just as fast as regular reads from a table and just as scalable. Changes (including writes, deletes and updates as well) are applied on the MV automatically and asynchronously once the MV is created. As expected, updates to a base table with materialized views are slower than regular updates since these updates need to update both the base table and the materialized view and ensure the consistency of both updates. However, doing this in the application without server help would be even slower. In case the server gets overloaded, base table writes and the availability of the base table are prioritized over MV updates hence it might take some time for all changes to apply on the MV.
Some common use cases for MVs:
Filtering by a different column – avoid full-table scan
Different sort order (different clustering key)
Pre-compute query
Usage
Create table (in the context of MVs, this is called the base table):
CREATE TABLE buildings (
name TEXT,
city TEXT,
height INT,
PRIMARY KEY (name)
);
Create a materialized view:
CREATE MATERIALIZED VIEW building_by_city AS
SELECT * FROM buildings
WHERE city IS NOT NULL
PRIMARY KEY(city, name);
Query:
SELECT * FROM building_by_city WHERE city = ‘Paris’
Or
SELECT * FROM building_by_city WHERE city = ‘Paris’ AND name = ‘Eiffel Tower’
Drop MV:
DROP MATERIALIZED VIEW building_by_city;
Modify table attributes (e.g. compaction strategy):
ALTER MATERIALIZED VIEW building_by_city
WITH COMPACTION = {‘class’: ‘LeveledCompactionStrategy’};
A MV is another table, which will take space, and that depending on your partition key can be on different nodes than the base table.
In the write path whenever you write a new record to the base table, ScyllaDB will automatically create a record in the corresponding view.
On the read path, you should query your view directly and reads on the base table will not touch the views. So there is no implicit joining.
When can you use materialized views?
You can use MVs when you want to filter by a column other than the partition key and you need the query to be performant.