CEMEX was faced with a flood of false-positive alerts from its system monitoring software. Ultimately, the company used software and best practices to cut the false positives for improved monitoring and a better experience for employees and customers.
Riddle me this! When is too much of a good thing not good? Well, for Juan Murguia Castillo, CEMEX IT operations director, and his team in CEMEX’s Operations Control Center (OCC), that question presented itself every day.
Headquartered in Mexico, CEMEX is a building materials supplier and cement producer. It manufactures and distributes cement, ready-mix concrete, and aggregates in more than 50 countries.
And this is important – it has 66 cement plants, 2,000 ready-mix-concrete facilities, 400 quarries, 260 distribution centers, 80 marine terminals, and more than 41,000 employees worldwide. Whew!
So, how the heck does CEMEX manage all those facilities and people? One way is with its OCC.
Overseeing operations worldwide
In 2014, the company launched its center, utilizing the SAP technology platform to monitor 22 production systems and more than 18,000 users worldwide. The platform was designed so that when something goes wrong with the systems, an alert is automatically created — a good thing in theory — making it possible for the OCC team to detect and solve any problems before the business is affected.
But what happens when you’re flooded with those “good things,” and most are false alarms? That was the case for the OCC team. Finding the true issues in the crowd became a needle-in-a-haystack search.
Following a false trail
False-positive alerts of any kind, in any industry, and from any source can cause disruptions and drain the pocketbook of a business. Take security alerts, for example: “46% of all application downtime [is] caused by false positives. . . 75% of businesses spent as much, or more, time chasing false positives than they did dealing with actual security incidents,” according to Castillo. The average enterprise receives more than 10,000 security alerts a day. While often fewer in number, alerts not related to security can still be overwhelming, as the OCC discovered.
“We would often get an average of 200 alerts a day that had to be tracked down, taking up a lot of time and resources just to discover most were false positives,” says Castillo.
A solution that needed updating
The reason for the flood lay with the technology platform’s SAP Solution Manager. The older version that CEMEX had adopted in the OCC could not adapt to recurring behavior variations, which resulted in the high false-alert rate, nor could it cover all unanticipated behavior.
The OCC team would set up alert thresholds with parameters to determine how, when, and where an alert should be created. But if an anomaly was under the threshold, the anomaly could not be detected.
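The limitation can be illustrated with a minimal sketch. It is not CEMEX’s or SAP Solution Manager’s actual logic: the readings, the threshold of 90, and the function names are all hypothetical. It contrasts a fixed-threshold check with a simple check against the metric’s own recent baseline.

```python
from statistics import mean, stdev

def static_alert(value, threshold):
    """Fixed-threshold rule: alert only when the value exceeds the threshold."""
    return value > threshold

def baseline_alert(history, value, max_sigma=3.0):
    """Alert when the value deviates strongly from its own recent history."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > max_sigma

# Hypothetical CPU-usage readings (percent) that normally hover around 20.
history = [19, 21, 20, 22, 18, 20, 21, 19]
reading = 60  # unusual for this system, yet below a threshold of 90

print(static_alert(reading, threshold=90))  # False
print(baseline_alert(history, reading))     # True
```

The fixed rule stays silent because 60 is under the threshold, while the baseline check flags it because the reading is far outside what this system normally does.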
The team had limited visibility into SAP applications, making it difficult to determine the status of processes. Alert predictions were also hampered by the Solution Manager’s lack of process intelligence. Ultimately, data created by the software had to be reviewed for reliability, all of which kept Castillo’s team from pursuing higher-value work.
Raising the Solution Manager IQ
In the last two years, the OCC team and SAP have worked together to teach CEMEX’s SAP Solution Manager to run SAP operations intelligently, utilizing machine learning and artificial intelligence, supported by best practices. The Machine Learning Extension for ELP SAP Solution Manager 7.2 has been implemented as a set of micro-services, running in the CEMEX SAP HANA XS Advanced system space. The result?
The OCC team now has an expanded view of all SAP applications running in the CEMEX enterprise. Issues are easier to predict. Recurring metric and alert variations, unanticipated situations, and anomalies under the threshold are recognized and handled. Redundant alerts have been filtered out, cutting redundancy by 82%.
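As a rough illustration of what filtering redundant alerts can mean, the sketch below collapses repeated alerts for the same system and metric within a time window. The alert fields, the 10-minute window, and the function name are assumptions made for this example, not details of the CEMEX implementation.

```python
from datetime import datetime, timedelta

def filter_redundant(alerts, window=timedelta(minutes=10)):
    """Keep an alert only if no alert with the same (system, metric) key
    was kept within the preceding window. Expects alerts sorted by time."""
    last_kept = {}
    kept = []
    for alert in alerts:
        key = (alert["system"], alert["metric"])
        previous = last_kept.get(key)
        if previous is None or alert["time"] - previous >= window:
            kept.append(alert)
            last_kept[key] = alert["time"]
    return kept

# Hypothetical alert stream: three near-identical ERP alerts in four minutes.
start = datetime(2022, 1, 1, 8, 0)
alerts = [
    {"system": "ERP", "metric": "response_time", "time": start},
    {"system": "ERP", "metric": "response_time", "time": start + timedelta(minutes=2)},
    {"system": "ERP", "metric": "response_time", "time": start + timedelta(minutes=4)},
    {"system": "CRM", "metric": "queue_length", "time": start + timedelta(minutes=5)},
    {"system": "ERP", "metric": "response_time", "time": start + timedelta(minutes=15)},
]

unique = filter_redundant(alerts)
print(len(alerts), "->", len(unique))  # 5 -> 3
```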
A win across the board
Overall, the intelligent Solution Manager has improved the reliability of operations. “The number of false-positive alerts has dropped, making it easier to find the ones that matter. The faster we can identify and resolve the issues, the less disruption to our business,” says Martha Leal Tapia, CEMEX SAP service delivery lead. The net benefit is that teams can focus on proactive work that enhances the experience of doing business with CEMEX.
To learn more about CEMEX’s creative solution that earned them a finalist’s position at the SAP Innovation Awards for 2022, check out their pitch deck.