Lots of people make the mistake of thinking there are only two vectors you can take to improve performance: high or wide. There’s a third direction you can go; I call it “going deep”.
The author talks about “high” and “wide” as hardware changes, but the same idea applies to software too. It’s easier to throw a cache at a slow piece of code than to go deep and fix it.
We’re adding heavy runtimes to support multiple platforms instead of staying close to the metal, and we pay the price in performance.
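To make the cache point concrete, here is a minimal sketch of the two approaches. The duplicate-check task and all three function names are made up for illustration; they are not from the article. The cached version only pays off when the exact same input comes back, while the "deep" version changes the algorithm so that every call is cheap.

```python
from functools import lru_cache


def has_duplicates_slow(items):
    """Hypothetical slow code: compares every pair, O(n^2)."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


@lru_cache(maxsize=None)
def has_duplicates_cached(items):
    """The quick fix: memoize the slow function.

    The argument must be hashable (a tuple, not a list), and the first
    call for any given input is still O(n^2).
    """
    return has_duplicates_slow(items)


def has_duplicates_deep(items):
    """Going deep: track seen values in a set, O(n) on average."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

The cache leaves the quadratic loop in place and adds memory use plus a hashability constraint; the rewrite removes the cost itself, which is what I take "going deep" to mean here.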