What should machine learning and artificial intelligence (ML/AI) do for IT operations, DevOps, and container management? The following table is my quick outline of the key challenges and the specific problems ML/AI needs to address. The table is based on the belief that ML/AI needs to look over the shoulder of IT ops, DevOps, and business management teams to learn from their decision making. After all, every virtualization administrator fulfills infrastructure provisioning or upgrade requests a little bit differently. Please regard the table below as a preliminary outline and basis for discussion. At this point, and probably at no future point either, I won’t claim to know the ‘ultimate truth.’

Key Challenges and Specific Problems Machine Learning and Artificial Intelligence Need to Address

Key Challenges: Specific Problems to Address
Minimizing manual effort needed to keep the lights on: Applications have to be deployed and managed with minimal manual effort. There are too many manual tasks remaining in managing data center and cloud application infrastructure. Trivial daily, weekly, monthly and quarterly administration tasks use up nearly 50% of staff time. This dramatically reduces the organization’s ability to focus on innovative new projects that create competitive advantages for the business.
Enabling fully rational deployment decisions: Application infrastructure and architecture have to be selected based on a rational cost versus risk comparison. Enterprises need to make decisions under incomplete information because…

… they do not understand exactly what resources their apps need. EMA research shows that enterprises typically do not have answers to questions such as “how much database latency can my application stomach?” or “how much memory per user do I need to ensure a satisfactory end user experience?”

… they do not understand app usage patterns, which users matter more than others, and what is needed to shift infrastructure resources toward an optimal experience for VIP users while standard users still receive an adequate one.

… they do not understand what resources are available and what the economics behind these resources are.

… they cannot stay on top of constantly changing public cloud and data center offerings.

… they do not have the skills and knowledge to leverage the optimal type of resources, e.g., for apps that show spotty use.

Enterprise IT is still siloed, with NOC, storage, server, OS, middleware, and app management leveraging separate tools and scripts.

Personal bias of operators and developers leads to the adoption of a diverse set of tools that are not well integrated.

Ensuring optimal business alignment: Application management must happen within the business context. It should be based on how important the application is for strategic and tactical business goals, and it needs to be tailored to how many IT operations staff are available. For optimal business alignment, the enterprise needs to be able to provide business metrics to IT. IT then uses these metrics to score the importance of investments and issue resolution activities.

These metrics could be:
Which and how many customers are affected?
Are any strategic customers or geographies affected?
How does it affect the revenue base versus future markets?
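
To make the scoring idea more concrete, here is a minimal sketch of how IT could turn the three questions above into a single priority score per issue. Everything in it is an assumption for illustration: the `Issue` fields, the weights, and the weighted-sum formula are hypothetical and not taken from any product or research. In practice, the weights are exactly what ML/AI would have to learn from how the business actually decides.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """Hypothetical business metrics the enterprise provides to IT."""
    name: str
    customers_affected: int
    total_customers: int
    strategic_customer_affected: bool  # any strategic customer or geography hit?
    revenue_base_impact: float         # share of current revenue at risk, 0.0-1.0
    future_market_impact: float        # estimated impact on future markets, 0.0-1.0


# Assumed weights; a real system would tune or learn these.
WEIGHTS = {
    "reach": 0.3,
    "strategic": 0.3,
    "revenue_base": 0.3,
    "future_markets": 0.1,
}


def priority_score(issue: Issue) -> float:
    """Weighted sum of the business metrics, between 0.0 and 1.0."""
    reach = (
        issue.customers_affected / issue.total_customers
        if issue.total_customers > 0
        else 0.0
    )
    strategic = 1.0 if issue.strategic_customer_affected else 0.0
    score = (
        WEIGHTS["reach"] * reach
        + WEIGHTS["strategic"] * strategic
        + WEIGHTS["revenue_base"] * issue.revenue_base_impact
        + WEIGHTS["future_markets"] * issue.future_market_impact
    )
    return round(score, 3)


def rank_issues(issues: list[Issue]) -> list[Issue]:
    """Most important issue first."""
    return sorted(issues, key=priority_score, reverse=True)


if __name__ == "__main__":
    issues = [
        Issue("report delay", 50, 1000, False, 0.02, 0.0),
        Issue("checkout outage", 500, 1000, True, 0.4, 0.1),
    ]
    for issue in rank_issues(issues):
        print(issue.name, priority_score(issue))
```

A plain weighted sum is used here only because it is easy to read and to argue about with business stakeholders; it is not meant as the right model.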

Constant optimization: Application environments need to be constantly optimized based on shifting business goals and the availability of better technology. Serverless functions, container management, and voice recognition are only three examples of new technologies that can change the economics behind IT operations.