Let's boot up a Linux VM under a better hypervisor and run our code and Docker containers inside it. Then let's automate it for a seamless development experience and an impressive performance gain.
Important: This is part 2/3. Today we are moving Docker into a VM for an order-of-magnitude performance boost.
Part 1 — Mitigation strategies for common D4M performance issues.
Part 2 — Replace xhyve and build our own higher-performing hypervisor.
Part 3 — Go container-first and switch to developing on a VPS.
Wow, these posts blew up and became the #1 search result on Google.
It started with my previous post about poor Docker performance on macOS. About a year ago, I spent an incredible amount of time trying impractical workarounds for Docker's slow performance on macOS.
I was wrong the whole time; trying to beat D4M at its own game was a bad idea. I should have replaced it from the start.
This is a long and fun post about the discovery process. If you're impatient, jump right into the GitHub repository!
Take a look at the summary and choose:
- 💡 Brief summary of how we got here
  - A bit about shared file systems
  - A bit about how Docker for Mac works
  - A bit about Docker contexts and docker-machine
- 🔭 The research part: looking into the past
  - Parallels file system sharing
  - Vagrant environments
- 🛠 Putting it all together
  - Using remote interpreters: Visual Studio Code
  - Using remote interpreters: JetBrains IDEs
  - Future prospects: container-first development
- 💿 Update: Docker on the M1 Max
I was leading a pretty big team on a pretty big project at a pretty big company. We needed Docker. We all had Macs. Every solution had to be one-click and dummy-proof.
- Docker for Mac performs terribly on some very common workloads (think: Ruby, NodeJS, PHP…).
- The performance bottleneck is the shared file system. The more the disk churns, the higher the CPU load climbs, a runaway effect.
- A 400% CPU load is common.
- In the previous post, we tried different coping strategies.
- Back then, Mutagen was simply the best option.
- Around the same time, the D4M team integrated Mutagen, but the implementation left a lot to be desired.
- We also explored various caching strategies you can use in your application to make the pain bearable. K8s and databases were still a big problem, and code editors lacked their advanced intelligence, but overall it was better than plain Docker for Mac.
Here's some more information on what exactly is happening (though not why), courtesy of Lee Hambley. As he explains, these particular "problem" workloads hit the shared filesystem hard with sheer operation counts.
Worth knowing: cache building in Symfony can cause hundreds of thousands of filesystem operations while looking for circular references.
“Ruby applications also generate an incredibly high number of `stat` and `lstat` calls. I'm sure it's due to bootsnap (trying to cache bytecode to boot faster); these particular system calls are very slow on osxfuse, compared to 70 microseconds or less on a native OS.”
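To get a feel for the scale, here's a back-of-envelope sketch in shell. The call count and the shared-filesystem latency are assumptions for illustration, not measured osxfs numbers; only the 70 µs native figure comes from the quote above:

```shell
# Total time an app boot spends in stat() alone, at two per-call latencies.
CALLS=500000        # assumed number of stat()/lstat() calls during one boot
NATIVE_US=70        # ~70 microseconds per call on a native filesystem
SHARED_US=1000      # assumed 1 ms per call over a shared filesystem

echo "native: $(( CALLS * NATIVE_US / 1000000 )) s"   # prints "native: 35 s"
echo "shared: $(( CALLS * SHARED_US / 1000000 )) s"   # prints "shared: 500 s"
```

Even a modest per-call penalty multiplies into minutes once a framework issues filesystem calls by the hundred thousand.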
A bit about shared file systems
Sharing a file system is tricky, as you saw in the previous post. If you don't believe me, just check out the more than 600 comments across the GitHub issues. Much smarter minds than mine work on Docker and have tried at least 4 solutions that I know of, two of which involved building closed-source filesystems. No luck. Their best performer so far is the new gRPC-FUSE file system. Unfortunately, the benchmarks still aren't much fun.
Different file systems are optimized for different workloads.
You might be wondering how Mutagen performed so well compared to osxfs. Basically, it bypasses the whole file-system-sharing shebang and uses rsync-style syncing to keep all files in sync externally. It includes fancy stuff like ignore syntax and fsevents support.
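Conceptually, Mutagen's model is closer to rsync than to a shared filesystem: copy changed files across instead of proxying every syscall. A minimal sketch of that idea with plain rsync (the `/tmp/demo-*` paths are examples; real Mutagen adds a sync daemon, fsevents watching, and richer ignore syntax):

```shell
# One-shot "sync" of a project folder, skipping VCS metadata the way
# Mutagen's ignore rules would.
mkdir -p /tmp/demo-src/.git /tmp/demo-dst
echo 'puts "hello"' > /tmp/demo-src/app.rb

rsync -a --exclude '.git' /tmp/demo-src/ /tmp/demo-dst/

ls /tmp/demo-dst    # app.rb arrives; .git does not
```

The reads and writes happen on each side's native filesystem, which is exactly why this approach sidesteps the per-syscall overhead.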
A bit about how Docker for Mac works
D4M implements a Docker context (more on this later) that communicates with a Linux virtual machine (on the xhyve hypervisor) running on your system. It is exactly what it sounds like: a real virtual machine.
The virtual machine "shares" your entire home folder. When you run a container locally, it's actually running inside the VM, but since all the folders have exactly the same paths, you never notice.
A bit about Docker contexts and the Docker machine
You can create contexts. A context is just a named reference to a local or remote Docker engine.
docker-machine is an older utility that works quite similarly, relying on environment variables instead.
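As a sketch (the `workbox` host alias is a placeholder for whatever your SSH config names the VM): pointing the CLI at another engine is just one variable, and a context makes the same thing persistent.

```shell
# docker-machine style: every docker invocation in this shell now
# targets the engine behind this endpoint instead of a local daemon.
export DOCKER_HOST="ssh://vagrant@workbox"
echo "Docker CLI will talk to: $DOCKER_HOST"

# The modern, persistent equivalent (requires the docker CLI installed):
#   docker context create workbox --docker "host=ssh://vagrant@workbox"
#   docker context use workbox
```

Once a context is set as current, plain `docker ps`, `docker build`, and friends transparently go to the VM.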
I love it when we grind away with tunnel vision on a solution to a problem, only to find out that some guy at MIT did a PhD on the subject back in the '70s. This happens very often in software development.
Parallels file system sharing
If you really sit down and read through the older GitHub issues for docker-for-mac (I did, all 600), you'll occasionally find comments that briefly mention Parallels. They usually don't go beyond "works much better" and are quickly ignored by the Docker team.
So I sat down, set up a simple Ubuntu Server VM, installed Docker, shared my project folder, and started the largest project I have. Oh boy, was it better! I ran `composer install` on a project that consumes 6 GB of RAM under D4M, and my CPU did not go above 50%. And that was in a fully shared folder, without caches or tricks.
I'm not the first to notice this, as a very popular project shows. But that project is based on a really outdated and unmaintained distro that I covered earlier, so we'll skip it for today. Still, it's always good to see how others have solved the problem.
Then I got interested in why Parallels is so much better. To be honest, there is no real explanation, and it's no miracle either. This is a niche problem with a niche product, and the solution is a closed-source commercial application.
The feature set they sacrificed (and the speed they achieved) points to NFS, yet they still manage to fake permissions and even fsevents, unlike NFS. Maybe some proprietary wrapper around it.
Gonzo narrative: essentially, someone at Parallels said, "The native stuff performs awfully, we have money, let's build our own and make more money."
The reason the Docker team decided not to go down the same path is (in their own words) that they didn't want to sacrifice certain POSIX guarantees that Parallels didn't need to provide to get the speed, as they strongly believe that D4M is "good enough" for most users.
I was hoping to learn more, so I asked around on a few platforms and got a reply from a user on the Elixir Slack. He didn't have a definitive answer either, but he had a better explanation than most of what seems to be going on:
It's the proprietary driver, in my opinion. On the Linux side, an fs driver that probably talks to the VM's hypervisor via shared memory or a similarly very fast pipe, which translates it to macOS fs calls and vice versa.
Skipping all the emulation layers and having a very fast path between the fs driver and the hypervisor-mediated macOS is pretty much a requirement for the speed that Parallels achieves. Pro tip: run `lsmod` in the VM to see which drivers are loaded to begin with.
~ Cees de Groot, Senior Software Engineer @ Canarias Monitoring Inc.
An added blessing is that managing virtual machines is the Parallels folks' day job, and they seem to be pretty good at it. The VM behaves much better with power management and throttles the CPU when idle. My Mac can breathe again; the days of sweaty palms when I touch the keyboard are over.
Vagrant environments
There is a small company called HashiCorp that has been working on this cloud stuff for quite some time. Terraform and Consul came from there, and they also made Vagrant.
It's simply a standardized wrapper around local virtual machines that gives us "infrastructure as code" for the desktop: write down a small file and deploy an identical VM repeatedly with a single command.
Open the repository and read the readme file. I did my best to keep it short.
We use Vagrant to create a nice little virtual machine, and use contexts and SSH hosts to replicate the "local Docker" behavior in a similar (albeit more primitive) way to Docker for Mac.
Every time you want to use Docker, you no longer click the Docker icon. Instead you run `vagrant up`, as the repository explains, and once it's up, go about your business as usual.
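A minimal Vagrantfile for this kind of setup might look like the following. This is a sketch, not the repository's actual file: the box name, resource sizes, and shared path are assumptions, and it presumes the `vagrant-parallels` provider plugin is installed:

```ruby
# Vagrantfile (sketch): an Ubuntu VM under Parallels with a shared project folder.
Vagrant.configure("2") do |config|
  config.vm.box = "bento/ubuntu-22.04"   # assumed Parallels-compatible box

  config.vm.provider "parallels" do |prl|
    prl.memory = 4096                    # tune to your machine
    prl.cpus   = 4
  end

  # Share the host project folder via Parallels' fast native sharing.
  config.vm.synced_folder "~/projects", "/home/vagrant/projects",
    type: "parallels"
end
```

After `vagrant up`, point a Docker context at the VM over SSH and carry on as usual.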
To do this, you need Parallels. It doesn't work with VMware or VirtualBox; they suffer from the same problems as D4M.
Perhaps the part I'm proudest of is that we can now safely remove almost all of our caching mechanisms, container volumes, workarounds, and similar dirty hacks from the infrastructure and get back on the path of dev/prod parity.
Immediately after logging in, you'll want to start removing your local interpreters: the various asdf or Homebrew installations of PHP, Elixir, Python.
If you do that, your IDE/editor will cry, because it no longer has access to a language interpreter.
Using remote interpreters: Visual Studio Code
VS Code can connect to remote servers via SSH. This also works for containers (remote or otherwise), as long as your docker context points to a Docker engine.
Select the container you want and a new window opens, running literally out of the container! It is almost completely isolated from the outside world and offers a view of exactly what your application sees around it. Best of all, it all happens in the container, so it uses the exact same interpreter as the app and has access to the same caches, dependencies, metadata, and build files.
PS: There may or may not be a bug in VSCode where the connection sometimes fails. It's weird, and it can be fixed by forcing the following setting in VSCode: `"docker.host": "ssh://vagrant@workbox"`
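In your settings.json, that setting looks like this (the `vagrant@workbox` target is whatever SSH alias reaches your VM):

```json
{
  "docker.host": "ssh://vagrant@workbox"
}
```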
Using remote interpreters: JetBrains IDEs
JetBrains has a slightly different way of doing this (unless we count Projector). You run your IDE on your own machine, but tell it to use a "remote interpreter" instead of a local one.
AFAIK, all the rich JetBrains editors have this functionality. So far I have only used IntelliJ IDEA Ultimate, PyCharm, and PhpStorm.
You can also spin up a Projector container inside the VM and just use it that way. Whatever you like!
Future prospects: container-first development
This concludes the second in a series of three articles. The next one will be released soon, and it will teach you how to take container development one step further.
I have been involved in more than 30 projects in the last few months. My battery lasts 8 hours a day and my CPU never goes above 30%. I even used an iPad for development once. I'm planning to trade in my BTO 16-inch MacBook Pro for an M1 machine soon.
Note: This is part 2 of a three-part series. In part 3, we go a step further and learn how to work in a container-first environment. With that, we get full native Linux performance, plus 17 hours of battery life and zero fan spin-ups.
Part 1 — Overcome performance issues with D4M.
Part 2 — Replace xhyve and build our own higher-performing hypervisor.
Part 3 — Use a container-first development environment.
💿 Update: Docker on the M1 Max
I completely switched to working on a VPS (read part 3 here) and won't come back.
However, some use cases still require a Mac. On the M1 Macs, D4M seems to run about twice as fast: definitely better than an Intel Mac or WSL2, but still significantly worse than Linux.
Some people with M-series CPUs have also used a UTM virtual machine for development; it is said to be very well optimized for M processors. Looking at the numbers below, my best guess is that the benchmark is compiling C.
THESE ARE NOT MY BENCHMARKS. Copied from the MacRumors forums:
- macOS Intel (D4M) — 460 s
- macOS M1 Max (D4M) — 220 s
- Dell XPS 17 (Pop!_OS / WSL2) — 70 s
- macOS M1 Max (UTM Debian) — 9 s