Slides:
Now that I’ve given you an introduction, let’s talk about the basic architecture and some basic definitions in Scylla. The first one, which I already mentioned, is the node. A node is simply ScyllaDB software running on a computer, on a server. A node consists of shards, and each node in ScyllaDB contains a part of the database content. As I mentioned, data is replicated, and each node is responsible for some part of our data. Typically we have clusters, which are collections of nodes, and this is how we think of a cluster: as a node ring. In this case there are five nodes. In production we would typically have anywhere from three nodes, which is the minimum number that we recommend in production, all the way up to hundreds of nodes in a single cluster, depending on your requirements and your needs. All the nodes are created equal. We don’t have any concept of leader nodes or master/slave, however you want to call it. All nodes in Scylla are equal, and that goes together with high availability: there’s no single point of failure. If one of the nodes goes down, we can still serve requests, and the other nodes make sure the system stays up and running.
I mentioned when I talked about scaling that the size of the cluster can change over time. It’s not that you start a cluster and that’s it, you can never change the number of nodes. You can add or remove nodes based on your requirements, and you can do that without any downtime, while the system is running and serving requests.
I mentioned that in Scylla, as well as in Cassandra by the way, data is replicated across different nodes, so each node holds a certain amount of data. The way it works under the hood is this: say we have a piece of data, in this example a row with ID, name, address, and phone, and ID is defined as the partition key. When we write this row to the database, Scylla applies a consistent hash function to the partition key column, and according to the hash it knows which nodes are responsible for that piece of data. Each node is assigned a range of tokens, so from the hash of the partition key Scylla knows which node is responsible for which piece of data. That’s why choosing a partition key is highly important.
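To make the token idea concrete, here is a minimal Python sketch of a token ring. It is an illustration under stated assumptions, not Scylla's implementation: the `TokenRing` class, the node names, and the token values are hypothetical, each node owns a single token, and MD5 is only a stand-in for the hash function the database actually uses.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Map a partition key to a signed 64-bit token (MD5 is a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class TokenRing:
    """Toy ring: each node owns the token range that ends at its own token."""

    def __init__(self, node_tokens):
        # Sort (token, node) pairs so the ring can be searched with bisect.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    def owner_of_token(self, token: int) -> str:
        # First node whose token is >= the key's token; wrap around past the end.
        index = bisect.bisect_left(self._tokens, token) % len(self._ring)
        return self._ring[index][1]

    def owner(self, partition_key: str) -> str:
        return self.owner_of_token(token_for(partition_key))


# Hypothetical five-node cluster with evenly spaced tokens.
ring = TokenRing({
    "node-1": -6_000_000_000_000_000_000,
    "node-2": -3_000_000_000_000_000_000,
    "node-3": 0,
    "node-4": 3_000_000_000_000_000_000,
    "node-5": 6_000_000_000_000_000_000,
})

print(ring.owner("user-42"))  # the same key always maps to the same node
```

Because the owner depends only on the hash of the partition key, a key with few distinct values would pile its rows onto a few nodes, which is one way to see why the choice of partition key matters.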
The replication factor, also abbreviated as RF, is something that you can configure. You set it when you define the keyspace, and it determines the number of nodes where the data is replicated. As an example, an RF of 3 means that each piece of data is replicated three times, to three different nodes. The replication happens automatically, so you don’t have to worry about it, but it’s important to know that it is happening. There’s a trade-off here: with a very high replication factor we get more redundancy and better availability, but that doesn’t come for free, because we need more resources, more storage, more computers. That’s something to think about when setting up your specific use case. As I mentioned before my talk, one of the tips I can give you is that if you’re migrating from another database, or you have a use case you’re not sure about, get in touch with us. We have a team of field engineers and solution architects with a lot of experience from the field, and they can help you with your specific use case. Another good place for questions is our community forum, where you can ask questions, or maybe the question you want to ask has already been asked, and you can get help either from the community or from our engineers.
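As a rough illustration of what a replication factor means on a ring, the following Python sketch picks RF nodes by starting at the node that owns a token and walking to the next nodes in ring order. This is a simplified assumption for teaching purposes: the `RING` layout, the node names, and both helper functions are hypothetical, and real replica placement also takes racks and data centers into account.

```python
import bisect

# Hypothetical five-node ring: (token, node) pairs sorted by token.
RING = [(-60, "node-1"), (-30, "node-2"), (0, "node-3"), (30, "node-4"), (60, "node-5")]


def replicas_for(token: int, rf: int, ring=RING) -> list:
    """Return rf distinct nodes: the token's owner, then the next nodes on the ring."""
    if not 1 <= rf <= len(ring):
        raise ValueError("rf must be between 1 and the number of nodes")
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token)
    # Modulo wraps the walk around the end of the ring.
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]


def total_storage_gb(dataset_gb, rf: int):
    """Raw storage needed across the cluster: every row is stored rf times."""
    return dataset_gb * rf


print(replicas_for(token=10, rf=3))  # ['node-4', 'node-5', 'node-1']
print(total_storage_gb(500, rf=3))   # 1500
```

The second function is the trade-off in one line: going from RF 3 to RF 5 for the same 500 GB dataset raises the raw storage from 1500 GB to 2500 GB.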
So I mentioned RF, the replication factor. Another important concept is the consistency level. The consistency level, abbreviated as CL, is the number of nodes that must acknowledge a read or a write request. Some examples are ONE, QUORUM, and ALL. We can set the consistency level per operation, so for one specific write operation the consistency level can be one value and for another operation it can be a different value. That’s configurable.
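The relationship between RF and CL can be worked out with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it covers just the three levels named in the talk, with QUORUM taken as a majority of the replicas, that is, floor(RF / 2) + 1.

```python
def required_acks(consistency_level: str, rf: int) -> int:
    """Number of replicas that must acknowledge a request at a given CL."""
    levels = {
        "ONE": 1,
        "QUORUM": rf // 2 + 1,  # a majority of the replicas
        "ALL": rf,
    }
    return levels[consistency_level.upper()]


def can_satisfy(consistency_level: str, rf: int, live_replicas: int) -> bool:
    """True if enough replicas are up to meet the consistency level."""
    return live_replicas >= required_acks(consistency_level, rf)


# With RF = 3 and one replica down, two replicas are still live.
for cl in ("ONE", "QUORUM", "ALL"):
    print(cl, required_acks(cl, rf=3), can_satisfy(cl, rf=3, live_replicas=2))
# ONE 1 True
# QUORUM 2 True
# ALL 3 False
```

With RF 3, a request at QUORUM still succeeds when one replica is down, while a request at ALL does not, which is why the level is worth choosing per operation.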
Scylla is built in a way that’s aware of the data center topology. It can run on multiple data centers while taking that topology into account, and also racks within a data center. That has two important benefits. One is, again, redundancy: even if one data center goes down, as long as we have the data replicated, the other data centers can still serve requests. The other is performance: in this example it would make more sense to serve a user from the US from the US data center, just because it’s physically closer and we wouldn’t have to wait so long in terms of latency.
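Both benefits can be seen in a very small Python sketch. Everything here is a hypothetical illustration, including the data center names, the per-data-center replication map, and the `serving_datacenter` function; it only shows the idea of preferring the client's own data center and falling back to another one that holds replicas.

```python
# Hypothetical topology: replication factor per data center.
REPLICATION = {"us-east": 3, "eu-west": 3}


def serving_datacenter(client_dc: str, down=frozenset(), replication=REPLICATION) -> str:
    """Prefer the client's data center; otherwise use another one that is up."""
    # Only data centers that hold replicas and are not down can serve requests.
    candidates = [dc for dc, rf in replication.items() if rf > 0 and dc not in down]
    if not candidates:
        raise RuntimeError("no data center with replicas is available")
    return client_dc if client_dc in candidates else candidates[0]


print(serving_datacenter("us-east"))                    # us-east (lowest latency)
print(serving_datacenter("us-east", down={"us-east"}))  # eu-west (redundancy)
```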