Mastering MySQL Group Replication
http://insidemysql.libsyn.com/mastering-mysql-group-replication
00;00;00;00 - 00;00;31;20 Welcome to Inside MySQL: Sakila Speaks, a podcast dedicated to all things MySQL. We bring you the latest news from the MySQL team, MySQL product updates, and insightful interviews with members of the MySQL community. Sit back and enjoy as your hosts bring you the latest updates on your favorite open-source database. Let's get started. 00;00;31;23 - 00;00;58;27 Welcome to Inside MySQL: Sakila Speaks. I'm leFred. And I'm Scott Stroz. Today we are joined by Luis Soares. Hi, Luis. Hello. Hi. So, you are the MySQL replication team lead. You are responsible for the MySQL binary log and replication-related code base and the mysqlbinlog tool. Yeah, that's correct. I'm happy to lead a bunch of great, very knowledgeable people, 00;00;58;27 - 00;01;22;22 so that comes easy. So, we all know you as the face of all things replication in MySQL, and this is why you are also responsible for HA, point-in-time recovery, and channels in MySQL HeatWave. So, for HA, we decided to eat our own dog food. Am I correct that HA in MySQL HeatWave is also using group replication? 00;01;22;24 - 00;01;52;10 Yeah. So, along with other OCI technology, we use group replication under the hood, deployed in single-primary mode, to build the fault-tolerant DB system in the MySQL HeatWave database service. By the way, a "DB System", for those wondering what it is, is the abstraction that captures all the things that are managed by the service on behalf of the user, 00;01;52;12 - 00;02;22;19 like setting up replication, keeping everything running perfectly fine, orchestrating everything related to backups, and so on and so forth. In terms of how businesses can benefit, what are some of the benefits of using group replication? In the MySQL HeatWave service, there's the DB system. That's what users relate to. And under the hood we have group replication to provide fault tolerance, right?
00;02;22;21 - 00;02;51;09 Group replication at its core relies on a quorum to commit a transaction. Therefore, if the primary server fails within the cluster, there is a guarantee that, if a majority survives the failure event, the changes that have been produced so far will continue to be in the cluster, right? So, in other words, the data is preserved 00;02;51;12 - 00;03;34;13 if there is a surviving majority in the event of a failure. The act of switching over the application when a failure happens is also relatively fast, because if there's a failure, there's a standby in the cluster ready to take over. We call that a secondary. So, usually, the switchover is relatively fast. So, just for clarification, when you say as long as a majority of the nodes are unaffected, you're saying that if there is a five-node cluster and two of the nodes go down, then the transaction is still going to be committed, correct? 00;03;34;15 - 00;04;05;27 Yeah. In group replication, it happens like that, right? As long as you have the majority surviving the failure, the change will be carried forward. In the MySQL HeatWave service, clusters typically have three nodes: a primary and two secondaries. And so that's how it works. I think it's also because, as we are operating this, when we have one failure, our guys can jump in directly and fix everything. 00;04;05;27 - 00;04;31;13 So, we don't need to have too many nodes there either, I guess, right? Yeah, I think so. It's a combination of automation and sometimes manual work. Right. So, since your team, Luis, is operating all the clusters in the MySQL HeatWave service, did group replication get some improvements related to that? Yeah. I mean, over the years group replication has always been evolving, right?
00;04;31;13 - 00;05;20;28 And with the need to power a cloud service, of course group replication had to... well, it had to keep up with that task, or with the requirements for that task. And therefore there have been a lot of enhancements to observability, especially with more memory instrumentation. So, I think over the years, if you look back at the replication Performance Schema tables, the replication-related status variables, and so on, you'll see an increase in the things that have been instrumented and exposed through new columns in Performance Schema tables or new status variables. 00;05;20;28 - 00;05;51;10 These things are extremely useful when you're running things at scale. You need to observe to understand what exactly is going on and how exactly to operate and intervene in the system. So, observability is paramount, right? Moreover, we've been making group replication more efficient and more resilient, to cope with the different types of failures. 00;05;51;10 - 00;06;25;00 Some of them are edge cases, but they happen. It is also able to deal more gracefully with slow instances, so that they don't affect the overall state of the cluster so much. And we've also extended GR to be able to automatically recover from a few edge cases too. So, there is a whole plethora of new things that have been happening in GR over the years. 00;06;25;02 - 00;06;58;09 Yeah. And we can also say that by improving group replication for HeatWave HA, you are also improving group replication in general. And then, for example, InnoDB Cluster benefits from that too, which is great for everybody. Yeah, I think you can say that. When I think of replication, I think of making sure that the data is redundant: replicated to a different server to serve as a backup, a read replica, or something along those lines.
00;06;58;11 - 00;07;37;12 But how else can we use replication in the MySQL HeatWave database service? Let me start by saying that MySQL replication has traditionally had a lot of different uses within the MySQL context, right? So, replication can be used for implementing things like high availability and read scale-out, by replicating to many different servers and then reading from those servers, thereby offloading the read load from the primary server. 00;07;37;17 - 00;08;07;26 It can also be used for things like point-in-time restore, in particular by leveraging the binary log, and even some sort of change data capture by mining the binary log, which itself is a log that captures deltas, or changes, to your data over time. So, in the MySQL HeatWave service, we follow more or less the same pattern, right? 00;08;07;28 - 00;08;42;14 Like I said in answer to a previous question, we use group replication to implement fault-tolerant DB systems by having redundant servers, so if there is a failure, we can fail over from the failed server to one of the secondaries. But we also use replication for things like replicating from on-premises to the MySQL HeatWave service and vice versa, 00;08;42;14 - 00;09;17;02 right? This is what we typically call inbound and outbound, respectively. Some of the use cases are things like live migrations or hybrid setups. If you couple that with replication filters, you can even do things like partial replication from one place to another, having parts of the data, for instance, on premises and parts of the data on 00;09;17;02 - 00;09;43;04 MHS, right? So, this is a very flexible tool, if you will, for these types of setups. As I mentioned before, the binary log can be used for point-in-time restore. We use it for point-in-time restore in the cloud, too.
00;09;43;04 - 00;10;29;25 So, replication as a whole is present everywhere in the MySQL HeatWave service. Thank you, Luis. You were talking about the filters that we have integrated in MySQL HeatWave, because you said we can migrate from on-prem to HeatWave. But I have also seen that we can migrate from another cloud to MySQL HeatWave in OCI, and the team has predefined some features to migrate from non-vanilla MySQL, so from Amazon or whatever, to our own cloud very easily for the users. 00;10;29;25 - 00;11;18;00 Right. Now, when you replicate from an external source, anything can come down the replication pipe, and anything that comes down the replication pipe might conflict with what you're running on your target instance. So, you mentioned other, well, how to call it, other cloud providers that can have their own modified syntax for commands that are executed against their own versions of MySQL. 00;11;18;00 - 00;11;45;05 All these things, of course, may not be applicable on a vanilla MySQL and therefore need to be filtered out. So, this is the type of thing that these filters can help you with: avoiding these corner cases, these specific cases where you have incompatible traffic coming down the pipe and you would like to avoid it without breaking replication, right? 00;11;45;07 - 00;12;24;06 So that's about it. So, thank you very much, Luis. Keep improving HA, keep improving MySQL HeatWave high availability and, in the same way, everything replication: InnoDB Cluster, ReplicaSet, ClusterSet, all of that has replication at the root, so it's cool and everybody likes it. This is why MySQL is still very well known for its replication. I would like to say one thing before we close... we talked about replication as if it's a cool new thing and so on.
00;12;24;06 - 00;13;03;16 But it's very interesting that replication has traditionally been around within MySQL for a very long time, right? So, it's very interesting how things have evolved over the years and continue to evolve, right? Like Fred was saying: let's continue to improve it, it's everywhere, and so on and so forth. So, it's a very interesting thing to observe over the years, seeing this gradual evolution of replication itself. 00;13;03;18 - 00;13;23;24 It's nothing new in MySQL, but it's been evolving so much over the years, and it's very, very interesting. I mean, I'm biased, but from where I stand it's a great thing to observe, I think. Thank you, Luis. That's a wrap on this episode of Inside MySQL: Sakila Speaks. Thanks for hanging out with us. 00;13;23;29 - 00;13;47;04 If you enjoyed listening, please click subscribe to get all the latest episodes. We would also love your reviews and ratings on your podcast app. Be sure to join us for the next episode of Inside MySQL: Sakila Speaks.
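Editor's note on the majority rule discussed between 00;02;22 and 00;04;05: the arithmetic behind "a surviving majority" can be sketched in a few lines of Python. This is only an illustration of the majority rule as described in the conversation, not MySQL code; the function names (`majority`, `has_quorum`, `max_tolerated_failures`) are hypothetical and do not correspond to any MySQL API.

```python
def majority(group_size: int) -> int:
    """Smallest number of members that forms a strict majority of the group."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return group_size // 2 + 1


def has_quorum(group_size: int, failed: int) -> bool:
    """True if the members still alive form a majority of the original group."""
    if not 0 <= failed <= group_size:
        raise ValueError("failed must be between 0 and group_size")
    return group_size - failed >= majority(group_size)


def max_tolerated_failures(group_size: int) -> int:
    """Largest number of simultaneous failures that still leaves a majority."""
    return group_size - majority(group_size)


if __name__ == "__main__":
    # The two group sizes mentioned in the episode: five nodes and three nodes.
    for size in (5, 3):
        print(size, "members tolerate", max_tolerated_failures(size), "failure(s)")
```

Under this rule, the five-node cluster from Scott's question keeps its majority with two nodes down, and the three-node DB system described by Luis keeps it with one node down.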