Risk Scoring
Risk scoring in Alert Manager Enterprise (AME) assigns numerical values to alerts and observables so that IT Ops and Security teams can prioritize and investigate potential threats within Splunk Enterprise and Splunk Cloud.
Overview
Risk scoring in AME provides a structured way to assess the severity and urgency of alerts and events, enabling teams to focus on the most critical issues. By integrating with Splunk’s alerting capabilities and AME’s event management system, risk scoring helps reduce alert fatigue, minimize false positives, and streamline security and operational workflows. It leverages data from observables, alert triggers, and contextual information to calculate a risk score, which can guide decision-making and resource allocation.
Key Capabilities
- Prioritization: Assigns risk scores to alerts and events based on severity, impact, and confidence, allowing teams to prioritize high-risk issues.
- Enrichment: Integrates with observables (e.g., assets, identities) to enrich alerts with contextual data, such as device criticality or user roles, improving risk assessment accuracy.
- Automation: Supports automated responses and notifications based on risk thresholds, reducing manual triage time.
- Customization: Allows users to define risk impact, confidence levels, and modifiers to tailor scoring to specific organizational needs.
How Risk Scoring Works
In AME, risk scoring is calculated using a combination of factors, including:
- Risk Impact: The potential negative effect of an alert or event on your environment, such as data loss, system downtime, or security breaches. This is typically assigned a numerical value (e.g., 1–100 or 1–10) based on the severity of the detected activity.
- Risk Confidence: The reliability or certainty that the alert represents a true positive (e.g., a genuine threat) rather than a false positive. Higher confidence levels increase the risk score.
- Risk Modifier: A tuning factor that adjusts the score based on contextual data, such as the criticality of the affected asset or user (e.g., a privileged account or domain controller increases the score).
These factors are combined to generate a risk score, which is stored in the AME Risk Index—a centralized repository for all risk-related data. This score can be used to trigger actions, such as escalating alerts, notifying stakeholders, or initiating Splunk Workflow Actions (e.g., opening a Jira issue or sending a mobile notification).
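The combination of the three factors can be sketched in a few lines of Python. This is only an illustration: it assumes a simple additive model, mirroring the worked example in the Example section, and the `RiskFactors` class and `combine_risk` function are hypothetical names, not part of AME.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskFactors:
    """Hypothetical container for the three factors described above."""
    impact: int      # potential negative effect of the alert
    confidence: int  # certainty that the alert is a true positive
    modifier: int    # contextual adjustment, e.g. for a critical asset


def combine_risk(factors: RiskFactors) -> int:
    """Combine the factors additively.

    Assumption: the additive form follows this page's worked example;
    AME's internal formula may differ.
    """
    return factors.impact + factors.confidence + factors.modifier


score = combine_risk(RiskFactors(impact=50, confidence=90, modifier=30))
print(score)  # 170
```

Keeping the factors in one small record makes it easy to tune each value independently during testing.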
You can view detailed risk events associated with an alert or event in the AME Event Details interface. Within the event details panel, navigate to the Risk Events tab to see a breakdown of risk occurrences, including:
- Occurrence: The timestamp when the risk event was detected (e.g., 2025-02-24 06:45:00.000).
- Type: The type of observable or data contributing to the risk, such as asset or identity.
- Matched Value: The specific value or field that triggered the risk (e.g., an IP address like 168.127.246.121).
- Risk Change: The numerical adjustment to the risk score caused by this event (e.g., 100).
For example, in the event details for a vulnerability alert, the "Risk Events" tab might display an asset-based risk event with an IP address match that increases the event’s risk score by 100, reflecting the criticality of the affected asset.
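To make the four columns concrete, the following Python sketch models a risk event as a small record and sums the Risk Change values for one alert. The `RiskEvent` class, its field names, and `total_risk_change` are assumptions chosen to mirror the tab's columns; they do not describe AME's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RiskEvent:
    """Hypothetical record mirroring the columns of the Risk Events tab."""
    occurrence: datetime   # when the risk event was detected
    observable_type: str   # "asset" or "identity"
    matched_value: str     # value that triggered the risk, e.g. an IP address
    risk_change: int       # adjustment applied to the risk score


def total_risk_change(events):
    """Sum the risk changes contributed by all risk events of one alert."""
    return sum(event.risk_change for event in events)


events = [
    RiskEvent(datetime(2025, 2, 24, 6, 45), "asset", "168.127.246.121", 100),
]
print(total_risk_change(events))  # 100
```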
Configuration
To configure risk scoring in AME, follow these steps:
- Set Up Observables: Ensure your observables (assets and identities) are properly configured in AME, including assigning priority levels or criticality (e.g., “high,” “low”) to devices and users. This data enriches alerts and influences risk modifiers (see Observables for details).
- Define Risk Rules: In the AME Templates or Rules settings, create or modify risk rules to specify:
  - Impact Level: Assign a numerical impact value (e.g., 100 for high-impact alerts) based on the potential severity.
  - Confidence Level: Set a confidence value (e.g., 80% certainty) to indicate the likelihood of a true positive.
  - Risk Modifier: Adjust the score for critical assets or users (e.g., increase the score by 50 for privileged accounts).
- Map Alert Fields: In the AME Template configuration, use the “Observable Matching” section to map alert fields (e.g., ip, hostname) to observable fields (e.g., ip, name) for enriched risk scoring (see Templates for details).
- Set Thresholds: Define risk score thresholds in AME Rules or Event Summary settings to trigger automated actions, such as notifications or escalations, when scores exceed a certain value (e.g., 80 out of 100).
- Test and Tune: Run test alerts and review the Risk Index to refine impact, confidence, and modifier values, ensuring scores accurately reflect organizational priorities.
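The field-mapping step can be pictured with a short Python sketch. It is a simplified model, not AME's matching engine: the observable records, the host names, the second IP address, and the modifier assigned to each priority level are all illustrative assumptions.

```python
# Hypothetical observable records; names and priorities are illustrative.
OBSERVABLES = [
    {"type": "asset", "ip": "168.127.246.121", "name": "dc01", "priority": "high"},
    {"type": "asset", "ip": "10.0.0.15", "name": "kiosk-7", "priority": "low"},
]

# Alert field -> observable field, as in the "Observable Matching" section.
FIELD_MAP = {"ip": "ip", "hostname": "name"}

# Assumed score adjustment per priority level.
PRIORITY_MODIFIER = {"high": 50, "low": 0}


def match_observables(alert, observables=OBSERVABLES, field_map=FIELD_MAP):
    """Return observables for which at least one mapped field equals the alert's value."""
    matches = []
    for obs in observables:
        for alert_field, obs_field in field_map.items():
            value = alert.get(alert_field)
            if value is not None and obs.get(obs_field) == value:
                matches.append(obs)
                break  # one matching field is enough for this observable
    return matches


def risk_modifier(alert):
    """Sum the assumed modifiers of every observable matched by the alert."""
    return sum(
        PRIORITY_MODIFIER.get(obs["priority"], 0)
        for obs in match_observables(alert)
    )


print(risk_modifier({"ip": "168.127.246.121"}))  # 50
```

During the Test and Tune step, a table like `PRIORITY_MODIFIER` is the kind of value you would adjust until the scores reflect your priorities.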
Use Cases
- Reduce Alert Fatigue: Prioritize high-risk alerts (e.g., those with scores above 80) so the team focuses on critical threats instead of being overwhelmed by low-risk or false-positive alerts.
- Detect Advanced Threats: Identify “low and slow” attacks by aggregating risk scores over time, tracking patterns across multiple alerts or observables (e.g., user behavior spanning MITRE ATT&CK tactics).
- Enhance Incident Response: Automatically escalate high-risk events to stakeholders via email, Slack, Teams, or Splunk Mobile, speeding up resolution and minimizing damage.
- Improve Reporting: Use risk scores in the Event Summary or dashboards to provide clear, quantifiable metrics for management, demonstrating the effectiveness of security measures.
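The "low and slow" use case relies on summing risk per observable over a time window. The Python sketch below shows one way to do that outside of AME; the 24-hour window, the cutoff of 80, and the sample users are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_risk(events, now, window=timedelta(hours=24)):
    """Sum risk changes per observable for events inside the look-back window.

    `events` is an iterable of (timestamp, observable, risk_change) tuples.
    """
    totals = defaultdict(int)
    cutoff = now - window
    for timestamp, observable, risk_change in events:
        if cutoff <= timestamp <= now:
            totals[observable] += risk_change
    return dict(totals)


now = datetime(2025, 2, 24, 12, 0)
events = [
    (datetime(2025, 2, 24, 6, 45), "alice", 30),
    (datetime(2025, 2, 24, 9, 10), "alice", 30),
    (datetime(2025, 2, 24, 11, 30), "alice", 30),
    (datetime(2025, 2, 22, 8, 0), "alice", 30),  # outside the window, ignored
    (datetime(2025, 2, 24, 10, 0), "bob", 20),
]
totals = aggregate_risk(events, now)
flagged = sorted(name for name, total in totals.items() if total > 80)
print(flagged)  # ['alice']
```

No single event here crosses the cutoff, but the three in-window events for the same user do, which is the pattern this use case aims to surface.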
Best Practices
- Standardize Scoring: Use a consistent scale (e.g., 0–100) across your organization to ensure all teams interpret risk scores uniformly.
- Leverage Machine Learning: Where possible, integrate Splunk’s machine learning capabilities to refine risk scoring over time, reducing false positives and improving accuracy.
- Monitor and Adjust: Regularly review risk scores in the Risk Index and adjust impact, confidence, and modifiers based on evolving threats and organizational priorities.
- Integrate with Frameworks: Map risk scoring to security frameworks like MITRE ATT&CK or the CIS Critical Security Controls (formerly the CIS Top 20) to align with industry best practices and identify gaps in your security posture.
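If different teams currently score on different scales (for example 1–10 and 1–100), a linear rescale onto one shared range is a simple way to standardize. The helper below is a hypothetical Python sketch, not an AME function; it clamps out-of-range inputs before rescaling.

```python
def rescale(value, src_min, src_max, dst_min=0, dst_max=100):
    """Linearly map a score from [src_min, src_max] onto [dst_min, dst_max]."""
    if src_max == src_min:
        raise ValueError("source range must not be empty")
    clamped = min(max(value, src_min), src_max)  # keep input inside the source range
    ratio = (clamped - src_min) / (src_max - src_min)
    return round(dst_min + ratio * (dst_max - dst_min))


print(rescale(10, 1, 10))  # 100
print(rescale(1, 1, 10))   # 0
```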
Example
Consider an alert triggered by a suspicious login from an IP address. Suppose risk scoring for this alert is configured as follows:
- Risk Impact: 50 (potential unauthorized access to a server).
- Risk Confidence: 90 (high certainty based on login patterns and geolocation mismatch).
- Risk Modifier: +30 (the target server is a domain controller, increasing criticality).
The resulting risk score is 170 (50 + 90 + 30). If the threshold is set at 150, AME escalates the alert, notifies the security team via Slack, and triggers a Splunk Workflow Action for investigation. You can view the associated risk event in the "Risk Events" tab of the event details, which shows the IP match (168.127.246.121) and a risk change of 100 that contributes to the overall score.
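The threshold decision in this example can be expressed as a small Python check. The action names are placeholders standing in for the escalation, Slack notification, and workflow action described above; they are not AME identifiers.

```python
def actions_for(score, threshold=150):
    """Return placeholder action names to run when the score exceeds the threshold."""
    if score > threshold:
        return ["escalate", "notify_slack", "run_workflow_action"]
    return []


score = 50 + 90 + 30  # impact + confidence + modifier from the example
print(score, actions_for(score))
```

A score exactly equal to the threshold triggers nothing in this sketch, because the text speaks of scores that exceed the threshold; whether AME treats the boundary the same way is not stated here.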
Limitations
- Risk scoring requires accurate and up-to-date observables and alert data to function effectively. Incomplete or misconfigured data may lead to inaccurate scores.
- Initial setup and tuning may require time and expertise to align with organizational risk tolerance and operational needs.
- Advanced features, such as machine learning integration, may depend on additional Splunk or AME configurations or subscriptions.
For more details on integrating risk scoring with AME’s event management and Splunk workflows, see Event Summary and Templates.