EventOrOutage uses LLMs to help SREs determine whether a drop in traffic is due to an external event (holiday, election, sporting event...) rather than an outage. For each event, it shows the probability that the event could affect traffic, which geographies are affected, and how many people may be involved.
This standalone prototype shows how such a feature could be useful as part of an AI SRE or embedded in a monitoring tool.
$ eventoroutage -d "february 9th, 2025"
Super Bowl LVIII – 85%:
- Number of people involved: Over 100 million
- Countries involved: United States
Lunar New Year Celebrations – 80%:
- Number of people involved: Approximately 1.5 billion
- Countries involved: China, Singapore, Malaysia, Indonesia, Philippines
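If the tool is embedded in a monitoring pipeline, its text output would need to be turned into structured data. The following is a minimal sketch of such a parser, assuming the output always follows the layout of the sample above; the `Event` class and `parse_events` function are hypothetical names and are not part of EventOrOutage.

```python
import re
from dataclasses import dataclass, field

# Header lines look like "Super Bowl LVIII – 85%:" in the sample output.
# The separator is matched loosely (en dash or hyphen) because the exact
# character is an assumption based on that one sample.
HEADER = re.compile(r"^(?P<name>.+?)\s+[–-]\s+(?P<pct>\d+)%:?\s*$")
# Detail lines look like "- Countries involved: United States".
DETAIL = re.compile(r"^-\s*(?P<key>[^:]+):\s*(?P<value>.+)$")


@dataclass
class Event:
    name: str
    probability: int  # percentage between 0 and 100
    details: dict[str, str] = field(default_factory=dict)


def parse_events(text: str) -> list[Event]:
    """Parse EventOrOutage-style text output into a list of Event objects."""
    events: list[Event] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("$"):
            continue  # skip blank lines and the echoed shell command
        detail = DETAIL.match(line)
        if detail and events:
            # Attach the detail to the most recently seen event.
            events[-1].details[detail["key"].strip()] = detail["value"].strip()
            continue
        header = HEADER.match(line)
        if header:
            events.append(Event(header["name"].strip(), int(header["pct"])))
    return events


SAMPLE = """\
Super Bowl LVIII – 85%:
- Number of people involved: Over 100 million
- Countries involved: United States
Lunar New Year Celebrations – 80%:
- Number of people involved: Approximately 1.5 billion
- Countries involved: China, Singapore, Malaysia, Indonesia, Philippines
"""

for event in parse_events(SAMPLE):
    print(event.name, event.probability, event.details)
```

Because the output is generated by an LLM, real runs may deviate from this layout, so a production integration would need to tolerate lines that match neither pattern.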
- Python > 3.10
- A .env file with an OpenAI/Gemini/Anthropic API key (at least one)
Optional:
- A HOLIDAY_API_KEY in the .env file for Holiday API
- A CALENDARIFIC_API_KEY in the .env file for Calendarific
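A quick way to confirm the configuration before running the tool is to check that at least one LLM key is set. This is a minimal sketch, not the project's own validation: the LLM variable names (`OPENAI_API_KEY`, `GEMINI_API_KEY`, `ANTHROPIC_API_KEY`) are assumptions, since only `HOLIDAY_API_KEY` and `CALENDARIFIC_API_KEY` are named in the requirements above, and `check_keys` is a hypothetical helper.

```python
import os

# Assumed variable names: the requirements only say the .env file needs an
# OpenAI, Gemini, or Anthropic key, not what each variable is called.
LLM_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY", "ANTHROPIC_API_KEY")
# These two names come from the requirements list.
OPTIONAL_KEYS = ("HOLIDAY_API_KEY", "CALENDARIFIC_API_KEY")


def check_keys(env):
    """Return (llm_keys_found, optional_keys_missing) for a mapping of settings.

    Raises RuntimeError when no LLM key is present at all.
    """
    found = [key for key in LLM_KEYS if env.get(key)]
    if not found:
        raise RuntimeError("Set at least one of: " + ", ".join(LLM_KEYS))
    missing = [key for key in OPTIONAL_KEYS if not env.get(key)]
    return found, missing


if __name__ == "__main__":
    try:
        found, missing = check_keys(os.environ)
        print("LLM keys found:", found)
        print("Optional keys missing:", missing)
    except RuntimeError as error:
        print(error)
```

Note that `os.environ` only sees the values in `.env` once they have been loaded into the process environment, for example with a loader such as python-dotenv.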
Run the following in a terminal:
python -m venv .venv
source .venv/bin/activate
pip install .
eventoroutage
Here are a few ways you can use EventOrOutage:
- eventoroutage – looks for events happening today
- eventoroutage -d "February 14, 2025" -m "gpt-4o" – looks for events on a specific date, using a specific model
- eventoroutage -l IN – looks for events in a specific location, here India
- eventoroutage -i "Social Media" – looks for events that could impact a specific industry, here social media websites
- eventoroutage -f "artifacts/traffic_events.csv" – analyzes traffic logs from a file
- generatedata -d . – generates synthetic traffic logs
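To call the CLI from a script, the flags listed above can be assembled programmatically. This is a minimal sketch: `build_command` is a hypothetical helper, it uses only the `-d`, `-m`, `-l`, `-i`, and `-f` flags shown above, and it assumes they can be freely combined (only `-d` with `-m` is shown together in the examples).

```python
import shlex


def build_command(date=None, model=None, location=None, industry=None, file=None):
    """Assemble an eventoroutage argument list from the documented flags."""
    cmd = ["eventoroutage"]
    flags = (
        ("-d", date),       # specific date, e.g. "February 14, 2025"
        ("-m", model),      # specific model, e.g. "gpt-4o"
        ("-l", location),   # location code, e.g. "IN" for India
        ("-i", industry),   # industry, e.g. "Social Media"
        ("-f", file),       # path to a traffic log file
    )
    for flag, value in flags:
        if value is not None:
            cmd += [flag, value]
    return cmd


cmd = build_command(date="February 14, 2025", model="gpt-4o")
# shlex.join quotes arguments so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Once the package is installed, the resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` to capture the tool's output.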
- LLMs: GPT-4, Claude, Gemini and self-hosted (DeepSeek).
- Agent: Hugging Face smolagents
- Data Sources: External APIs for holidays, news, and event tracking
Back when Jeba and Sylvain were working at LinkedIn, they faced a situation where a large chunk of the site's traffic suddenly disappeared. Leadership panicked, and engineering could not find the cause.
It turned out that a major holiday was happening in India and people were busy celebrating instead of browsing LinkedIn. Knowing about every potential major event in every country where your product is used isn't possible, but LLMs are great at this type of task.
- Add support for additional data sources such as everyeventapi and the Google Calendar API for holidays
- Integrate data from logging tools such as Loggly and Splunk to analyze traffic anomalies
This project was developed by the Rootly AI Labs. The AI Labs is a fellow-led program designed to redefine reliability and system operations. We develop innovative prototypes, create open-source tools, and produce research reports we share with the community.