Previously…

We discussed some initial prep tasks for containerizing an app, and the key premise was that the way to get data into a container is through environment variables.

In this post, we will discuss why you would or wouldn’t run MySQL (or a persistent service) in a container.

What do I require?

Generally, for topics like this I tend to avoid jumping straight to a technology solution and ask some questions first. This is because all too often a problem is rammed into a solution, as opposed to taking an objective approach.

Some good things to ask: What requirements do I have? What constraints do I have? How am I currently operating? What am I willing to sacrifice and change, and what won’t/can’t I change?

Here are some things my solution must do, or criteria it must meet:

Must be as cheap as possible
Must perform to some standard at least, or be easily configurable to scale if need be
Mustn’t be much more complex than what I’m currently doing
Mustn’t restrict me further than I currently am, or should at least let me move towards the more serverless model this series is aiming for.

What are your/our/the options?

You’ve decided to modernize your app. We’ve discussed above what we need and don’t need. Here are some options we can look at:

Keep your DB on-premise, and connect to it from your k8s cluster (over something like Azure ExpressRoute or a site-to-site VPN)
Install your database into a VM in the cloud, next to your k8s cluster (IaaS pay-as-you-go)
Create/manage/run the cluster yourself, in kubernetes
Use a PaaS service, like Azure SQL/Azure SQL MI for your relational data needs

For each option, we will give a score out of 5 for (roughly) each of the aspects we identified as requirements above:

Cost
Complexity to implement
Performance
Administrative overhead (Development, administering, troubleshooting etc.)

0 is terrible and 5 is great, so the lower the score, the less attractive an option is.

But first, why would you run a persistent storage service in k8s?

There are so many decision points you have to think about when doing this, and the shared and stateless characteristics of k8s make things hard. It definitely sounds/looks/feels like right tool, wrong job. K8s is great for stateless services (such as web services) but is difficult with persistent, scalable and distributed services such as a database.

What are some things you have to think about running this service in k8s?

How do you share storage between pods?
How do you manage replication/kitchen-sink of this storage?
How do you manage backup and DR of this storage and pods?
How do you manage HA of the database cluster?
How do you route traffic in a way the database cluster expects?
How do you effectively monitor the cluster and respond to events?

For and against, for each option above

Leaving data on-prem, connecting over a given layer-2.5/3/4 method

For

You don’t have to change a thing with your database, more or less.
Once your app is containerized, you can deploy it into your k8s cluster and it should work with minimal effort
Excluding the new container development process, the development process remains relatively unmodified.

Against

Connecting on-prem is slow. If it’s over something like ExpressRoute, you have to pay for that (and it can take a while for your provider to configure). Either way, you now need infrastructure to do so, and you have to manage it.
You will have to get firewall ports opened or something configured in your DMZ. Either way, connectivity back to this on-prem server has to be dealt with from a security perspective.
You’re still managing your cluster, backups, DR, hardware, storage, networking, kitchen-sink etc.

Overall

Cost	Complexity	Performance	Administrative	Total
3	3	0	5	11

Run your DB in IaaS

For

You will have a guaranteed like-for-like for features and compatibility
Somewhat better flexibility to scale your service (can scale VM)
Ability to take advantage of other cloud products e.g. Azure Security Center, one-click backups etc.

Against

Connecting on-prem is slow, as above.
Is basically the most expensive way to run a DB (MSSQL for example). Cost of VM plus licensing.
Still have to manage your SQL infrastructure - cluster etc. Plus you now have to manage a VM in the cloud.

Overall

Cost	Complexity	Performance	Administrative	Total
0	3	3	4	10

Run DB in kubernetes (in the cloud we assume)

For

You perhaps are already paying for the kubernetes compute
Somewhat aligned from an architecture standpoint with your other services
Will ensure your HA/DR is thoroughly tested, either properly tested or straight from production
Yay, you’re in containers

Against

As above, persistent storage in kubernetes isn’t ideal
Still managing your cluster but now with the complexity of kubernetes
Did I mention there’s complexity involved?

Overall

Cost	Complexity	Performance	Administrative	Total
3	0	3	1	7

Run your DB in a PaaS service

For

No administrative overhead
HA/DR/active-active etc. with the click of a button
Options for near like-for-like compatibility with what you’re currently using
Guaranteed and up-front performance, easy to scale
Point your service at your instance and go

Against

You now have to pay for a PaaS service
Can’t emulate locally for local development, in most instances. Cosmos DB has a local emulator though, for example.

Overall

Cost	Complexity	Performance	Administrative	Total
3	4	4	5	16

Overall

	Cost	Complexity	Performance	Administrative	Total
On-prem	3	3	0	5	11
IaaS	0	3	3	4	10
K8S	3	0	3	1	7
PaaS	3	4	4	5	16
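
Each total in the table above is just the sum of the four scores for that option. As a minimal sketch in Python (the dictionary layout and the function name `rank_options` are my own and purely illustrative), the same arithmetic and the resulting ranking can be reproduced like this:

```python
# Scores copied from the table above, in the order:
# (cost, complexity, performance, administrative). 0 is terrible, 5 is great.
SCORES = {
    "On-prem": (3, 3, 0, 5),
    "IaaS": (0, 3, 3, 4),
    "K8S": (3, 0, 3, 1),
    "PaaS": (3, 4, 4, 5),
}


def rank_options(scores):
    """Return (option, total) pairs sorted from most to least attractive."""
    totals = {name: sum(values) for name, values in scores.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, total in rank_options(SCORES):
        print(f"{name}: {total}")
```

Every aspect carries equal weight here. If cost mattered more to you than performance, say, you could multiply each score by a weight before summing, and the ranking may well change.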

What will we be doing with our review site? I began by objectively looking at running the MySQL backend component in a container. But given the assessment above, I’m just going to use a cheap PaaS MySQL service (for under $10 a month).

It makes absolute sense to offload all of the administrative overhead, cluster config, maintenance and everything else to a cloud provider for that cost. Yes, I’m already paying for the compute for k8s, but from an opportunity cost perspective it makes sense.

The PaaS service guarantees certain performance, is in the same DC as my k8s cluster (more or less) and because it’s hosted MySQL I can point my PHP app straight at it, change a connection string and I’m done.
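
To make "change a connection string" a little more concrete, here is a rough sketch. The review site itself is PHP; Python is used here only for illustration. The environment variable names (`DB_HOST`, `DB_PORT`, `DB_NAME`, `DB_USER`, `DB_PASSWORD`), the helper `build_mysql_url` and the example values are all assumptions of mine, not something the app or any provider defines. It follows the premise from the previous post that configuration reaches the container through environment variables.

```python
import os
from urllib.parse import quote


def build_mysql_url(env=None):
    """Build a mysql:// URL from environment-style settings.

    The variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    host = env["DB_HOST"]  # required: raises KeyError if missing
    port = env.get("DB_PORT", "3306")  # 3306 is MySQL's default port
    name = env["DB_NAME"]
    # Percent-encode credentials so characters like "@" don't break the URL.
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    return f"mysql://{user}:{password}@{host}:{port}/{name}"


if __name__ == "__main__":
    # Placeholder values; in k8s these would be injected into the pod.
    example = {
        "DB_HOST": "db.example.com",
        "DB_NAME": "reviews",
        "DB_USER": "app",
        "DB_PASSWORD": "p@ss word",
    }
    print(build_mysql_url(example))
```

With this shape, moving from a local MySQL to the hosted one means changing the values injected into the container, not the application code.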

Does that mean that you shouldn’t run your DB or other persistent services in k8s? As always, “it depends”, and there certainly are situations when that’s the right thing to do, just not for me and this specific scenario. It’s always best not to talk about technology at the start, but instead to identify the requirements, constraints and bounds that you or your company/organization/customer are operating within.