On Friday a flawed update by endpoint detection and response (EDR)
provider CrowdStrike triggered a global outage that crippled the travel
industry’s IT infrastructure. Airports ground to a halt, airlines canceled
flights en masse and frustrated travelers faced long queues and delays.
The financial repercussions will be substantial. The damage to customer
trust and the industry’s reputation will be even more profound. The incident is
a reminder of the golden rule that everything in technology can break
– and eventually will.
There are segments of the industry that have fully embraced this
principle by avoiding single points of failure. Any modern commercial aircraft,
for example, will have a cascade of redundant backup systems (usually three).
You can fly and land an aircraft with only one engine and bring it to a standstill
on a runway just by using the wheel brakes.
The “Master Minimum Equipment List” for an airworthy Airbus A380-800, as
published by the United States Federal Aviation Administration, is 216 pages
long! We live in an era where fail-safes for fail-safes for fail-safes are
mandated.
And yet, many in the industry still don’t apply this mindset to their IT.
Now, the future of travel hinges not just on recovering from this massive IT
failure but also on what comes next.
Besides having to re-deploy staff to manual IT operations, travel
companies will have to rethink their system setup to prevent such a catastrophic
failure from happening again anytime soon.
Embedded artificial intelligence systems can play a proactive role in building
resilience and ensuring seamless operations become the norm, because the next black
swan technology event is always just over the horizon.
We must be better prepared.
Today’s AI-powered solution: RAG-based script deployment
Something companies could implement tomorrow is a RAG (retrieval-augmented
generation) AI framework. Such a system would be designed to remotely diagnose and
resolve CrowdStrike-like issues, using several AI technologies working in concert.
A RAG framework collects data (such as logs and system metrics) from
endpoints in the systems it monitors. It then uses AI-driven tech like BERT [bidirectional
encoder representations from transformers] and GPT to add context by analyzing
the data.
BERT and GPT then scrutinize this enriched data to detect anomalies and
identify potential issues. Advanced language models like GPT-4o, Claude or
Llama then generate human-readable insights and recommendations, while a
library of custom scripts executes predefined troubleshooting steps remotely.
Finally, the RAG framework’s strict security protocols and continuous
learning mechanisms ensure compliance and data protection.
This integrated approach could have prevented or significantly
mitigated the effects of the CrowdStrike outage.
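The collect, retrieve, generate and execute steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the knowledge base entries, the script names and every function name are hypothetical. Retrieval uses simple word overlap where a real system would use an embedding model such as BERT, and the `generate` step is a placeholder where a call to a language model would go.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class KnownIssue:
    """One entry in a hypothetical knowledge base of past incidents."""
    title: str
    symptoms: str
    script_id: str  # key into the allowlisted script library below


# Hypothetical knowledge base; a real one would be built from incident history.
KNOWLEDGE_BASE = [
    KnownIssue("Boot loop after driver update",
               "blue screen boot loop after security driver content update",
               "rollback_last_update"),
    KnownIssue("Disk almost full",
               "disk usage above threshold, writes failing, no space left",
               "clear_temp_files"),
    KnownIssue("Service memory leak",
               "memory usage growing steadily, service slow, out of memory",
               "restart_service"),
]

# Allowlist of predefined remediation steps. The descriptions stand in for
# real scripts; nothing is executed in this sketch.
SCRIPT_LIBRARY = {
    "rollback_last_update": "Roll back the most recent update and reboot.",
    "clear_temp_files": "Delete temporary files and rotate logs.",
    "restart_service": "Restart the affected service.",
}


def tokenize(text: str) -> Counter:
    """Lowercase bag of words; a stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(log_text: str, k: int = 1) -> list:
    """Retrieve: rank known issues by similarity to the endpoint's log text."""
    query = tokenize(log_text)
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda issue: cosine(query, tokenize(issue.symptoms)),
                    reverse=True)
    return ranked[:k]


def build_prompt(log_text: str, issues: list) -> str:
    """Augment: combine the raw log with the retrieved context."""
    context = "\n".join(f"- {i.title}: {i.symptoms}" for i in issues)
    return (f"Endpoint log:\n{log_text}\n\n"
            f"Similar known issues:\n{context}\n\n"
            "Explain the likely cause and recommend a next step.")


def generate(prompt: str) -> str:
    """Generate: placeholder where a language model call would go."""
    return "DRAFT (no model attached):\n" + prompt


def diagnose(log_text: str) -> dict:
    """Run the full pipeline and propose one allowlisted script."""
    issues = retrieve(log_text)
    best = issues[0]
    return {
        "matched_issue": best.title,
        "recommended_script": best.script_id,
        "script_description": SCRIPT_LIBRARY[best.script_id],
        "summary": generate(build_prompt(log_text, issues)),
    }
```

Calling `diagnose("endpoint stuck in boot loop with blue screen after content update")` matches the first knowledge base entry and proposes the `rollback_last_update` script. Restricting the output to an allowlist of predefined scripts, rather than letting a model write commands freely, is one way to keep remote remediation auditable.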
In the future: the promise of embedded AI
Embedded AI, also known as Edge AI, has gained some momentum this year. Just
a few weeks ago Apple announced the launch of the fully embedded “Apple
Intelligence” on its devices, and Meta is already running Llama 3 as part of
its Instagram and WhatsApp apps.
By integrating AI directly into systems and devices (versus running it
solely in the cloud), travel companies can achieve unparalleled levels of system
independence, predictive maintenance, autonomous response and continuous
learning.
All of the above can be used to build more resilient systems. Black swans can’t
be stopped, but their impact can be heavily mitigated.
Here are some of the core benefits:
- Embedded predictive maintenance: AI can continuously monitor system
performance, predict potential failures, and trigger preventive measures before
issues escalate (or cascade, as they did in recent days). For example, a
computer mainboard is running hot. The AI script gets notified by the hardware
sensors that the temperature is off and starts checking the load of its top
running programs. Using a nested AI logical search, it then identifies an
element of the code that is stuck and has a memory leak. It checks via an API
for known issues and what needs to be done to apply a fix, or identifies the
latest working backup date, and escalates to an end user or admin for action.
Or, if it doesn’t receive a response at all, it can define its own action
within a certain timeframe.

- Device-level autonomous response: In the event of disruption, embedded AI
can take immediate action by autonomously initiating recovery protocols. This
reduces downtime and keeps critical systems running, minimizing the impact on
operations and customers.

- Single-node continuous learning: Embedded AI systems can learn from each
incident, constantly improving their ability to detect and mitigate future
threats. Each node of the system could then develop further measures and share
them. This dynamic learning capability means that the more the nodes are
exposed to issues, the better the overall AI foundation becomes at preventing
similar problems.

- Stand-alone enhanced security: With real-time threat detection and
mitigation, AI can bolster the security of travel IT systems. Each node becomes
better able to protect itself against cyber threats that could otherwise
exploit system-wide vulnerabilities during outages.

- Dynamic risk assessments: AI can provide continuous risk assessments,
identifying vulnerabilities as they emerge and suggesting proactive measures to
address them. A network of AI-embedded nodes can work with a “master” AI to
bolster its response to possible incidents.

- Employee empowerment: Training employees to work alongside AI tools
ensures that human expertise complements technological advancements, creating a
robust defense against black swan events.
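The overheating example in the predictive maintenance point above can be made concrete with a short Python sketch. It is a simplified illustration, not a description of any shipping product: the 85°C limit, the process names, the leak heuristic and the `ask_admin` callback are all hypothetical, and a plain rule stands in for the AI-driven search the example describes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass
class ProcessSample:
    """Hypothetical snapshot of one running program on the device."""
    name: str
    cpu_percent: float
    memory_mb_history: Sequence[float]  # oldest reading first


def looks_like_leak(history: Sequence[float], min_growth_mb: float = 50.0) -> bool:
    """Crude heuristic: memory never shrinks and grows by at least min_growth_mb."""
    if len(history) < 3:
        return False
    non_decreasing = all(b >= a for a, b in zip(history, history[1:]))
    return non_decreasing and (history[-1] - history[0]) >= min_growth_mb


def find_suspect(processes: Sequence[ProcessSample]) -> Optional[ProcessSample]:
    """Return the busiest process that also shows leak-like memory growth."""
    leaking = [p for p in processes if looks_like_leak(p.memory_mb_history)]
    return max(leaking, key=lambda p: p.cpu_percent, default=None)


def handle_overheat(
    temp_c: float,
    processes: Sequence[ProcessSample],
    known_fixes: Dict[str, str],
    ask_admin: Callable[[str], Optional[bool]],
    limit_c: float = 85.0,  # assumed threshold, not a vendor figure
) -> str:
    """Sensor alert -> find suspect -> look up fix -> escalate -> fallback."""
    if temp_c < limit_c:
        return "ok"

    suspect = find_suspect(processes)
    if suspect is None:
        return "escalate: overheating, no suspect process found"

    # Use a known fix if one exists, otherwise fall back to the latest backup.
    proposed = known_fixes.get(suspect.name, "restore latest working backup")

    # ask_admin returns True (approve), False (reject) or None (no response
    # within the agreed timeframe).
    decision = ask_admin(f"{suspect.name} looks stuck; proposed fix: {proposed}")
    if decision is None:
        return f"auto: {proposed} ({suspect.name})"
    if decision:
        return f"approved: {proposed} ({suspect.name})"
    return f"rejected: no action on {suspect.name}"
```

With a process list in which a hypothetical `booking-sync` service runs at high CPU while its memory climbs from 400 MB to 700 MB, and an `ask_admin` callback that returns `None` to simulate a timeout, the function falls through to the automatic action. Keeping the human decision as an explicit step, with autonomy only as the timeout fallback, mirrors the escalation order in the example.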
Looking ahead: The future of travel in an AI-enhanced world
The global CrowdStrike outage is a wake-up call – a challenge which
cannot be ignored. Using embedded AI will allow the travel industry to move
beyond mere recovery and build a future where disruptions are anticipated and
mitigated.
We can’t prevent black swans. However, we can significantly reduce the
risk of their recurrence and the damage they cause.
The future of travel lies in intelligent, proactive and resilient
systems powered by AI. The question should no longer be “if” the industry will
adopt these technologies, but “when,” and how quickly it can do so to
safeguard its operations and earn back the trust of customers.