Patents covering key breakthrough technologies often attract attention, but many fields of development produce large numbers of patents, each covering a relatively small improvement. This is especially true in telecoms, where engineers seek to make new technology realise the promise of higher data rate services. That developmental process can take years, giving the opportunity for refinement and improvement, and with it many more patents.
All communications engineers know that the Shannon-Hartley theorem sets a theoretical limit on the data rate achievable over a communications channel. With each new generation of cellular network, there has been at least one revolutionary patented technology, for example turbo codes for 3G (see US-5,446,747) and OFDMA for 4G (see US-3,488,445). The initial patents for each technology were filed long before the ideas were implemented in the relevant telecoms standards.
In the meantime, these fundamental patents were followed by many (typically thousands of) further patented inventions. Each additional invention refined the implementation, making communication significantly more efficient and bringing it closer to the theoretical upper bound: higher data rates without the need for more power. The economic incentive of a patent may have helped to spur on this development, especially over the long adoption period.
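To make the theoretical upper bound concrete, the Shannon-Hartley limit can be written as C = B * log2(1 + S/N), where B is the channel bandwidth in hertz and S/N is the signal-to-noise ratio as a linear power ratio. The short Python sketch below is illustrative only: the function names, the 20 MHz bandwidth and the 20 dB signal-to-noise ratio are assumptions chosen for the example, not figures taken from any standard or patent.

```python
import math


def db_to_linear(snr_db: float) -> float:
    """Convert a signal-to-noise ratio in decibels to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def shannon_capacity_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley limit C = B * log2(1 + S/N), in bits per second."""
    if bandwidth_hz <= 0 or snr_linear < 0:
        raise ValueError("bandwidth must be positive and SNR non-negative")
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def fraction_of_limit(achieved_bps: float, bandwidth_hz: float,
                      snr_linear: float) -> float:
    """How close an achieved data rate comes to the theoretical limit."""
    return achieved_bps / shannon_capacity_bps(bandwidth_hz, snr_linear)


if __name__ == "__main__":
    # Hypothetical channel: 20 MHz wide, 20 dB signal-to-noise ratio.
    bandwidth_hz = 20e6
    snr = db_to_linear(20.0)
    limit = shannon_capacity_bps(bandwidth_hz, snr)
    print(f"Theoretical limit: {limit / 1e6:.1f} Mbit/s")
    # A hypothetical system delivering 100 Mbit/s over the same channel.
    print(f"Fraction of limit: {fraction_of_limit(100e6, bandwidth_hz, snr):.2f}")
```

With these assumed figures the limit is roughly 133 Mbit/s. A refinement that lifts the achieved rate towards that figure at the same bandwidth and signal-to-noise ratio delivers a higher data rate without the need for more power, which is the kind of incremental gain described above.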
Massive MIMO dates back to the early 1990s (see US-5,515,378) and could provide efficiency gains for 5G, but the jury is still out on how revolutionary it will be. The technology may also have its drawbacks, with power consumption potentially being a problem, at least in early implementations. Further innovations to address these issues seem likely, resulting in more patents in this area.
Looking beyond specific 5G technologies such as massive MIMO, there is a general issue: even if a new technology is more energy efficient, or simply consumes less energy, it will take time before it is adopted widely enough to make a noticeable difference across a network.