7.4 — Security Logging & Monitoring

Response & Improvement · 90 min · All Roles

Learning Objectives

  • Design an application security logging strategy that captures necessary security events while protecting sensitive data.
  • Implement structured logging with standard fields, appropriate levels, and tamper-evident storage.
  • Integrate application logs with SIEM platforms for correlation, alerting, and investigation.
  • Design security monitoring and alerting that detects real threats without creating alert fatigue.
  • Implement AI-specific audit logging to track AI tool usage, AI-generated code provenance, and AI-attributable defects.
  • Meet compliance requirements for security logging across PCI-DSS, HIPAA, SOC 2, and FedRAMP frameworks.

1. Why Application Security Logging Matters

Security logging is the immune system’s memory. Without it, every attack is a zero-day — not because the vulnerability is unknown, but because the organization has no ability to detect exploitation, reconstruct what happened, or learn from the event.

Detection: Security logs are the primary data source for identifying attacks in progress. A well-instrumented application can detect brute force attacks, injection attempts, privilege escalation, data exfiltration, and unauthorized access through its own logs, often before network-level monitoring tools see anything.

Investigation: When an incident occurs, application logs provide the evidence needed to understand what happened: who did what, when, how, and to what data. Without application-level logging, incident responders can see network traffic but cannot understand application-level semantics.

Compliance: Every major regulatory framework requires security logging. PCI-DSS Requirement 10, HIPAA Security Rule 164.312(b), SOC 2 CC7.2, and FedRAMP AU controls all mandate logging of security-relevant events.

Forensics: In the event of litigation, regulatory investigation, or law enforcement involvement, application logs serve as forensic evidence. Logs that are structured, timestamped, and tamper-evident carry significantly more evidentiary weight than ad-hoc log files.

CISA Secure by Design Goal 7 — Evidence of Intrusions: CISA’s Secure by Design initiative (Goal 7) requires software manufacturers to provide logs sufficient to detect and investigate intrusions. Specifically:

  • Provide audit logs at no extra charge (not as a premium tier).
  • Include sufficient detail to detect security-relevant events.
  • Support integration with enterprise security tools.
  • Retain logs for a meaningful period.

This is a direct rebuke to vendors who charge extra for security logging or provide only basic access logs while hiding detailed audit logs behind expensive license tiers.


2. What to Log: Security Events

The following categories of events must be logged for security purposes. This is not optional instrumentation for debugging — this is mandatory security telemetry.

Authentication Events

| Event | Details to Log | Why |
| --- | --- | --- |
| Successful login | User ID, timestamp, source IP, authentication method (password, SSO, MFA), user agent | Baseline for anomaly detection; evidence of account access |
| Failed login | Attempted user ID, timestamp, source IP, failure reason (bad password, account locked, MFA failed) | Brute force detection, credential stuffing detection |
| Account lockout | User ID, timestamp, lockout trigger (failed attempts count), source IP | Attack detection, false positive analysis |
| Password change | User ID, timestamp, change method (self-service, admin, reset) | Account takeover detection |
| MFA events | User ID, timestamp, MFA method, success/failure, MFA enrollment/unenrollment | MFA bypass detection, enrollment anomalies |
| Token issuance | User ID, timestamp, token type (JWT, session, API key), expiration, scope | Token theft detection, scope analysis |
| Token revocation | User ID, timestamp, token ID, revocation reason | Incident response verification |
| Session creation/termination | User ID, session ID, timestamp, duration, termination reason (logout, timeout, forced) | Session hijacking detection |

Authorization Events

| Event | Details to Log | Why |
| --- | --- | --- |
| Access granted | User ID, resource, action, timestamp, authorization rule applied | Access pattern baseline, compliance audit |
| Access denied | User ID, resource, action, timestamp, denial reason | Privilege escalation attempts, misconfiguration detection |
| Privilege change | User ID, old role/permissions, new role/permissions, changed by whom, timestamp | Unauthorized privilege escalation detection |
| Role assignment/removal | User ID, role, assigned/removed by whom, timestamp | Access governance audit |
| Permission delegation | Delegator, delegate, permissions, expiration, timestamp | Delegation abuse detection |

Data Access Events

| Event | Details to Log | Why |
| --- | --- | --- |
| Sensitive data read | User ID, data type (e.g., PII, financial), record identifiers, timestamp, access method | Data exfiltration detection, compliance audit |
| Sensitive data create/update | User ID, data type, record identifiers, timestamp, fields modified (not values) | Data integrity audit |
| Sensitive data delete | User ID, data type, record identifiers, timestamp, soft/hard delete | Data destruction audit |
| Bulk data export | User ID, data type, record count, export format, destination, timestamp | Mass data exfiltration detection |
| Data sharing/transfer | User ID, recipient, data type, record count, transfer method, timestamp | Unauthorized sharing detection |

Administrative Events

| Event | Details to Log | Why |
| --- | --- | --- |
| Configuration change | User ID, setting name, old value, new value, timestamp | Unauthorized configuration change detection |
| User management | Admin user ID, target user ID, action (create, disable, delete), timestamp | Account management audit |
| Feature flag change | User ID, flag name, old value, new value, timestamp | Security control change detection |
| API key management | User ID, key ID (not the key itself), action (create, rotate, revoke), scope, timestamp | Key management audit |
| Deployment event | Deployer, artifact version, environment, timestamp, deployment method | Change tracking, rollback reference |

Security-Relevant Business Events

Application-specific events that have security implications:

  • Financial transactions above a threshold.
  • Approval workflow actions (approve, reject, delegate).
  • Data classification changes.
  • Consent management actions (grant, revoke).
  • Report generation for sensitive data.
  • External integrations activated or modified.

API Access Events

| Event | Details to Log | Why |
| --- | --- | --- |
| API request | Timestamp, client ID, endpoint, HTTP method, source IP, user agent, request ID | API abuse detection, rate limit monitoring |
| API response | Timestamp, request ID, status code, response time | Error pattern detection, performance anomalies |
| Rate limit events | Client ID, endpoint, limit, current count, action taken (throttle, block) | DDoS/abuse detection |
| API authentication failure | Client ID, endpoint, failure reason, source IP | Credential compromise detection |

Error Events

| Event | Details to Log | Why |
| --- | --- | --- |
| Unhandled exception | Timestamp, exception type, stack trace (sanitized), request context | Security control failure detection |
| Security control failure | Timestamp, control name, failure mode, affected request | Defense degradation detection |
| Input validation failure | Timestamp, endpoint, validation rule, input type (not the input value, which may contain attack payloads or PII) | Attack pattern detection |
| Dependency failure | Timestamp, dependency name, failure type, impact | Supply chain compromise indicator |

3. What NOT to Log

Logging too much sensitive data creates its own security risk. Log files become high-value targets if they contain credentials, payment data, or PII. The following must never appear in logs:

Absolute Prohibitions

Passwords: Never log passwords, whether plaintext, hashed, or encrypted. Not in authentication logs, not in debug logs, not in error logs. If a password appears in a stack trace, sanitize the stack trace before logging.

Session tokens, API keys, and secrets: Never log authentication tokens, session identifiers, API keys, bearer tokens, or any other secret material. Log a reference identifier (e.g., last 4 characters of a token hash) if you need to correlate events to a specific token.

Credit card numbers: PCI-DSS explicitly prohibits logging full primary account numbers (PANs). At most, log the first 6 and last 4 digits (BIN and last 4).
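The BIN-plus-last-4 rule can be enforced with a small helper before any card reference reaches a log line. A sketch — the `mask_pan` name and the 13-digit plausibility floor are illustrative choices, not from any PCI library:

```python
def mask_pan(pan: str) -> str:
    """Mask a primary account number down to BIN (first 6) + last 4.

    PCI-DSS permits at most the first 6 and last 4 digits in logs;
    everything in between is replaced with asterisks.
    """
    digits = "".join(ch for ch in pan if ch.isdigit())
    if len(digits) < 13:          # not a plausible PAN; redact entirely
        return "[REDACTED]"
    return f"{digits[:6]}{'*' * (len(digits) - 10)}{digits[-4:]}"

# mask_pan("4111 1111 1111 1111") → "411111******1111"
```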

Social Security Numbers: Never log SSNs. If you need to reference a specific individual in a log, use an internal user ID.

PHI/ePHI: HIPAA restricts logging of protected health information. Log the minimum necessary for security purposes — typically a patient ID or encounter ID, never diagnosis, treatment, or clinical details.

Full PII beyond necessity: Log the minimum PII necessary for identification. A user ID or email address for audit trail purposes is acceptable. Logging full names, addresses, phone numbers, dates of birth, or biometric data is not necessary for security logging and creates risk.

Encryption keys: Never log encryption keys, key material, or initialization vectors. If you need to reference a key, log a key identifier.

Sanitization Practices

  • Implement a logging sanitizer that automatically detects and redacts patterns matching credit card numbers, SSNs, email addresses (where not needed), and common secret formats.
  • Use structured logging with typed fields so sanitization can be applied by field type rather than pattern matching.
  • Review log output as part of code review, specifically checking for sensitive data leakage.
  • Periodically audit log stores for sensitive data that may have been logged inadvertently.
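A pattern-based sanitizer along these lines might look like the following sketch; the pattern set is deliberately minimal and would need extension (and tuning for false positives) in a real environment:

```python
import re

# Patterns for data that must never reach log storage. Illustrative only —
# extend for your environment's secret formats and identifiers.
_REDACTIONS = [
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[PAN-REDACTED]"),       # card numbers
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN-REDACTED]"),        # US SSNs
    (re.compile(r"(?i)bearer\s+[a-z0-9._-]+"), "[TOKEN-REDACTED]"),  # bearer tokens
    (re.compile(r"(?i)(password|secret|api[_-]?key)\s*[=:]\s*\S+"),
     r"\1=[REDACTED]"),                                              # key=value secrets
]

def sanitize(message: str) -> str:
    """Redact sensitive patterns from a log message before emission."""
    for pattern, replacement in _REDACTIONS:
        message = pattern.sub(replacement, message)
    return message
```

With typed fields, the same redaction can instead be applied per field (e.g., always hash `token_id`, always mask `pan`), which is more reliable than pattern matching on free text.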

4. Log Format and Structure

Structured Logging

JSON is the preferred log format. Plain text logs (unstructured) require complex regex parsing for each log source, break when formats change, and resist automated analysis. JSON logs are machine-parseable, self-describing, and extensible.

Example: Unstructured (bad):

2026-03-19 14:32:15 INFO UserController - User john.doe@example.com logged in from 10.0.1.50 using SSO

Example: Structured (good):

{
  "timestamp": "2026-03-19T14:32:15.123Z",
  "level": "INFO",
  "event_type": "authentication.login.success",
  "service": "user-service",
  "version": "2.3.1",
  "environment": "production",
  "trace_id": "abc123def456",
  "actor": {
    "user_id": "usr_9f8e7d6c",
    "type": "human"
  },
  "action": "login",
  "outcome": "success",
  "details": {
    "auth_method": "sso",
    "mfa_used": true,
    "source_ip": "10.0.1.50",
    "user_agent": "Mozilla/5.0..."
  }
}
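Emitting events in this shape needs no special tooling. A stdlib-only sketch of a builder for the schema above — the service name and version are hard-coded here for illustration; a real implementation would inject them from configuration:

```python
import json
from datetime import datetime, timezone

def security_event(event_type: str, actor: dict, action: str,
                   outcome: str, trace_id: str, **details) -> str:
    """Build a JSON security log event following the standard field schema."""
    return json.dumps({
        # ISO 8601 UTC with millisecond precision, "Z" suffix
        "timestamp": datetime.now(timezone.utc)
                     .isoformat(timespec="milliseconds")
                     .replace("+00:00", "Z"),
        "level": "INFO",
        "event_type": event_type,
        "service": "user-service",    # illustrative; inject from config
        "version": "2.3.1",
        "environment": "production",
        "trace_id": trace_id,
        "actor": actor,
        "action": action,
        "outcome": outcome,
        "details": details,
    })
```

In practice a structured logging library (structlog, python-json-logger, and the equivalents listed in Section 10) handles the timestamping, level, and context binding for you.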

Standard Fields

Every security log event should include these fields:

| Field | Format | Description |
| --- | --- | --- |
| timestamp | ISO 8601 UTC (e.g., 2026-03-19T14:32:15.123Z) | When the event occurred. Always UTC to avoid timezone confusion during cross-system correlation. |
| level | Enum: ERROR, WARN, INFO, DEBUG | Severity of the log entry. Security events should be at INFO minimum. |
| event_type | Dotted notation (e.g., authentication.login.failure) | Categorized event type for filtering and alerting. |
| service | String | The service that generated the event. |
| version | SemVer | The version of the service. Critical for correlating events with deployments. |
| environment | Enum: production, staging, development | The environment. |
| trace_id | String (UUID or distributed trace ID) | Correlation ID linking this event to a request trace. |
| actor | Object: { user_id, type } | Who performed the action. Type is human, service, system, or ai. |
| action | String | What was done. |
| target | Object: { type, id } | What was acted upon. |
| outcome | Enum: success, failure, error | Result of the action. |
| source_ip | String | IP address of the requestor. |
| details | Object | Event-specific additional details. |

Log Levels for Security Events

| Level | Use for Security | Example |
| --- | --- | --- |
| ERROR | Security control failure, unhandled exception with security implication | WAF bypass detected, encryption service unavailable |
| WARN | Suspicious but not confirmed malicious activity | Multiple failed login attempts, unusual API call pattern |
| INFO | Normal security events that need to be recorded | Successful login, access granted, configuration change |
| DEBUG | Detailed security context for investigation | Full request/response details (sanitized), detailed authorization decision chain |

Security events must be logged at INFO level minimum. Do not log security events at DEBUG level because DEBUG logging is typically disabled in production. If you log a security event at DEBUG, you lose visibility in production — precisely when you need it most.

Immutable Logging

Logs must be tamper-evident to have forensic value. An attacker who compromises a system will attempt to modify or delete logs to cover their tracks.

Append-only storage: Use log storage that supports append-only writes. Once written, log entries cannot be modified or deleted by the application that created them.

Separate log infrastructure: Send logs to a separate system from the application. The application should not have delete or modify access to its own logs.

Cryptographic integrity: Consider log integrity verification using hash chains (each log entry includes a hash of the previous entry) or digital signatures.

Retention locks: Cloud providers offer immutable storage with retention locks (AWS S3 Object Lock, Azure Blob Storage immutability policies). Configure these for log buckets.
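The hash-chain idea can be sketched in a few lines of Python. This illustrates the mechanism only — a production log store would anchor the chain in append-only storage and periodically sign or externally timestamp the head hash:

```python
import hashlib
import json

def append_entry(chain: list, event: dict) -> dict:
    """Append a log event with a hash linking it to the previous entry.

    Modifying any earlier entry invalidates every subsequent hash.
    """
    prev_hash = chain[-1]["entry_hash"] if chain else "0" * 64
    payload = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    entry = {"event": event, "prev_hash": prev_hash, "entry_hash": entry_hash}
    chain.append(entry)
    return entry

def verify_chain(chain: list) -> bool:
    """Recompute every hash in order; False indicates tampering."""
    prev_hash = "0" * 64
    for entry in chain:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != expected:
            return False
        prev_hash = entry["entry_hash"]
    return True
```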


5. Log Aggregation and SIEM Integration

Individual application logs are useful for debugging. Aggregated, correlated logs across all applications and infrastructure are essential for security.

Centralized Log Collection

All application logs must flow to a centralized platform where they can be searched, correlated, and analyzed.

Platform options:

| Platform | Type | Strengths |
| --- | --- | --- |
| Elastic (ELK/EFK) | Open-source / commercial | Flexible, powerful search, large community, self-hosted option |
| Splunk | Commercial | Enterprise standard, powerful SPL query language, extensive app ecosystem |
| Datadog | SaaS | Integrated APM + logs + security, easy setup, good for cloud-native |
| Sumo Logic | SaaS | Cloud-native, ML-powered analytics, compliance dashboards |
| Grafana Loki | Open-source | Cost-effective, label-based indexing, excellent Grafana integration |
| CrowdStrike LogScale (Humio) | Commercial | Real-time streaming, minimal indexing, excellent for high-volume |
| Microsoft Sentinel | SaaS | Native Azure integration, SOAR capabilities, AI-powered detection |
| Google Chronicle | SaaS | Massive scale, 12-month hot retention, YARA-L detection rules |

Log Shipping Methods

Agent-based: Deploy a log shipping agent on each host (Filebeat, Fluentd, Fluent Bit, Vector, Datadog Agent). The agent reads log files, applies parsing and filtering, and ships to the central platform.

Advantages: Works with any log format, handles backpressure and buffering, supports log rotation. Disadvantages: Requires agent deployment and management, consumes host resources.

Direct API: Application sends logs directly to the central platform’s API (e.g., Datadog’s log intake API, Splunk’s HEC).

Advantages: No agent to manage, immediate delivery, structured data preserved. Disadvantages: Application must handle delivery failures, buffering, and retries. Tight coupling to the logging platform.

Syslog (RFC 5424): Traditional syslog forwarding, either via UDP (unreliable) or TCP/TLS (reliable).

Advantages: Universal support, no vendor lock-in, works with legacy systems. Disadvantages: Limited structure, message size limits, requires parsing at the destination.

Cloud-native: Use cloud provider log services as the initial collection point (AWS CloudWatch Logs, Azure Monitor Logs, GCP Cloud Logging), then forward to the central SIEM.

Advantages: Native integration, minimal configuration, handles scale automatically. Disadvantages: Cloud-vendor-specific, additional cost for forwarding, potential latency.

Retention Policies

Log retention must balance compliance requirements, investigation needs, and storage costs:

| Requirement Source | Minimum Retention |
| --- | --- |
| PCI-DSS (Requirement 10.7) | 12 months, with 3 months immediately accessible |
| HIPAA | 6 years for audit logs |
| SOX | 7 years |
| SOC 2 | 1 year (typical) |
| FedRAMP | 1 year minimum, 3 years for AU-11 |
| GDPR | As long as necessary for the purpose (minimize) |
| General recommendation | 90 days hot (searchable), 1 year warm (retrievable), 7 years cold (archived) |

Log Correlation and Enrichment

Raw logs become actionable intelligence through correlation and enrichment:

Correlation: Linking events across multiple sources using shared identifiers (trace ID, user ID, session ID, source IP). A failed login attempt on the authentication service, followed by a successful login from a different IP, followed by an unusual data export — each event is unremarkable alone, but correlated they indicate account compromise.

Enrichment: Adding context to log events from external sources:

  • GeoIP: Resolve IP addresses to geographic locations.
  • Threat intelligence: Flag known-malicious IPs, domains, or user agents.
  • Asset inventory: Add asset criticality and ownership information.
  • User directory: Add user role, department, and last-known-good access patterns.
  • Vulnerability data: Correlate with known vulnerabilities in the affected service.

6. Security Monitoring and Alerting

Logs without monitoring are just expensive storage. Security monitoring transforms log data into actionable detection and response.

Real-Time Alerting on Critical Security Events

Configure alerts for events that require immediate human attention:

Critical alerts (page the on-call):

  • Multiple failed logins followed by a success (credential compromise indicator).
  • Admin account login from an unusual location or at an unusual time.
  • Bulk data export or access pattern anomaly.
  • Security control failure (WAF down, encryption service unavailable).
  • Privilege escalation: user gains admin role.
  • Known attack signature detected (SQL injection, SSRF, path traversal).

High alerts (notify during business hours):

  • New API key created for a service account.
  • Configuration change to security-relevant settings.
  • Failed access to sensitive resources above threshold.
  • Unusual deployment activity (deployment outside normal windows).
  • Certificate or secret approaching expiration.

Informational (dashboard only):

  • Login/logout activity summary.
  • API usage trends.
  • Error rate trends.
  • Dependency health status.

Anomaly Detection

Rule-based alerting catches known attack patterns. Anomaly detection catches unknown patterns by establishing behavioral baselines and alerting on deviations.

Behavioral baselines:

  • Normal API call volume per user/service over time.
  • Normal data access patterns (which users access which data, when, how much).
  • Normal authentication patterns (login times, locations, devices).
  • Normal deployment patterns (who deploys, when, how often).

Deviation detection:

  • Statistical: Alert when a metric exceeds N standard deviations from the baseline.
  • ML-based: Machine learning models trained on historical behavior that score each event’s anomaly probability.
  • Peer comparison: Alert when a user’s behavior differs significantly from their peer group.
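The statistical approach can be sketched with the stdlib; the 3-sigma default and the 10-event minimum baseline are illustrative thresholds:

```python
import statistics

def is_anomalous(history: list, value: float, n_sigma: float = 3.0) -> bool:
    """Flag values more than n_sigma standard deviations from the baseline mean.

    Requires a minimum baseline size; with too little history, report normal
    rather than guess.
    """
    if len(history) < 10:
        return False
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return value != mean
    return abs(value - mean) > n_sigma * stdev
```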

Alert Fatigue Management

Alert fatigue is the single biggest threat to security monitoring effectiveness. When analysts receive too many alerts, they start ignoring all of them — including the real ones.

Tuning:

  • Regularly review alert volumes and false positive rates.
  • Suppress or reduce severity of alerts with consistently high false positive rates.
  • Adjust thresholds based on actual observed behavior, not vendor defaults.
  • Remove duplicate alerts that fire from multiple systems for the same event.

Prioritization:

  • Use risk-based priority: asset criticality times alert confidence times threat relevance.
  • Auto-close alerts that correlate with known benign activity (e.g., scheduled jobs, authorized pen tests).
  • Group related alerts into a single incident rather than alerting on each component separately.

Progressive alerting:

  • First occurrence: log only.
  • Repeated occurrence within window: create a ticket.
  • Sustained occurrence: page the on-call.
  • This prevents single anomalous events from generating noise while catching persistent issues.
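The escalation ladder above reduces to a small mapping from occurrence counts within the window to actions — a sketch with illustrative thresholds:

```python
from enum import Enum

class Action(Enum):
    LOG = "log"        # first occurrence: record only
    TICKET = "ticket"  # repeated within window: create a ticket
    PAGE = "page"      # sustained: page the on-call

def progressive_action(occurrences_in_window: int,
                       ticket_after: int = 3,
                       page_after: int = 10) -> Action:
    """Map an event's occurrence count in the current window to a response."""
    if occurrences_in_window >= page_after:
        return Action.PAGE
    if occurrences_in_window >= ticket_after:
        return Action.TICKET
    return Action.LOG
```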

7. Application-Level Monitoring

Runtime Application Self-Protection (RASP)

RASP instruments the application runtime to detect and block attacks from inside the application:

  • Intercepts function calls related to common vulnerability types (SQL execution, file system access, command execution, deserialization).
  • Validates inputs at the point of use, not just at the perimeter.
  • Can operate in monitoring mode (detect and log) or blocking mode (detect, log, and block).
  • Provides application-level context that network-based tools cannot see.

Tools: Contrast Security, Sqreen (now Datadog Application Security), Hdiv Security, OpenRASP.

Consideration: RASP adds runtime overhead (typically 2-5% latency). Evaluate the performance impact for latency-sensitive applications. RASP is most valuable for legacy applications that are difficult to patch or for applications with a high attack surface.

Application Performance Monitoring (APM) with Security Context

Modern APM tools provide distributed tracing, error tracking, and performance monitoring. When integrated with security context, APM becomes a security tool:

  • Distributed tracing: Follow a request through every service it touches. When a security event occurs, the trace shows the complete request path, making it possible to identify the entry point, propagation path, and impact scope.
  • Error tracking with security classification: Classify errors as security-relevant or operational. A NullPointerException is operational. An InvalidCipherTextException may be security-relevant.
  • Performance anomaly detection: Sudden performance changes can indicate security issues. A database query that suddenly takes 10x longer may indicate a SQL injection time-based attack. An API endpoint that suddenly receives 100x normal traffic may be under attack.

Synthetic Monitoring for Security Controls

Use synthetic monitoring to verify that security controls are functioning:

  • Authentication endpoint monitoring: Regularly verify that login pages respond correctly, MFA is enforced, and account lockout works.
  • Authorization boundary testing: Synthetic requests that verify access controls are enforced (attempt to access resource A with credentials for resource B, expect 403).
  • Security header verification: Monitor response headers to ensure Content-Security-Policy, Strict-Transport-Security, X-Frame-Options, and other security headers are present.
  • Certificate monitoring: Verify TLS certificate validity and warn before expiration.
  • WAF health monitoring: Send known-bad requests and verify they are blocked.

8. AI Audit Logging Requirements

The integration of AI tools into the development lifecycle creates new logging requirements. Organizations must track AI usage, AI-generated code provenance, and AI-related security events.

What to Log for AI

AI code generation events:

{
  "timestamp": "2026-03-19T14:32:15.123Z",
  "event_type": "ai.code_generation.completion",
  "actor": {
    "user_id": "usr_9f8e7d6c",
    "type": "human"
  },
  "ai_tool": {
    "name": "github-copilot",
    "version": "1.234.0",
    "model": "gpt-4o-2025-05-13"
  },
  "action": "code_suggestion",
  "outcome": "accepted",
  "details": {
    "language": "python",
    "file_path": "src/api/auth.py",
    "lines_generated": 15,
    "suggestion_id": "sug_abc123",
    "confidence_score": 0.87,
    "security_sensitive_area": true,
    "review_status": "pending"
  }
}

AI suggestion acceptance tracking:

  • Track the acceptance rate of AI suggestions by developer, team, language, and code area.
  • Track whether accepted suggestions are later modified during code review.
  • Track whether accepted suggestions introduce defects (security or functional).

AI tool usage patterns:

  • Which AI tools are being used.
  • How frequently each tool is used.
  • What types of tasks each tool is used for.
  • Usage patterns that deviate from approved tool policies.

Shadow AI detection:

  • Network-level monitoring for connections to known AI service endpoints that are not on the approved list.
  • DNS monitoring for AI service domain lookups.
  • Proxy/CASB logs for AI service API calls.
  • DLP monitoring for code or data being submitted to unapproved AI services.

AI tool API calls and data submitted:

  • Log the API calls made to AI services (endpoint, method, request size).
  • Log metadata about the data submitted (data type, classification, size) without logging the data itself.
  • Log the AI service’s response metadata (response size, latency, error codes).
  • Alert on submissions that may contain sensitive data (PII patterns, secret patterns, proprietary code markers).

AI-attributable defects post-deployment:

  • Track which production defects were introduced in AI-generated code.
  • Compare defect rates between AI-generated and human-written code.
  • Track defect types to identify patterns (e.g., “AI consistently generates insecure deserialization patterns”).
  • Feed this data back into AI tool configuration and developer training.

AI for Monitoring: AIOps

AI is not just something to monitor — it is increasingly the tool doing the monitoring.

ML-based anomaly detection (e.g., Datadog Watchdog): AIOps platforms use machine learning to:

  • Automatically detect anomalies in metrics, logs, and traces.
  • Correlate anomalies across services to identify root causes.
  • Predict future issues based on trends (e.g., “disk will be full in 3 days at current growth rate”).
  • Reduce alert noise by clustering related alerts and suppressing known benign anomalies.

AI-powered log analysis:

  • Natural language query interfaces (“show me all failed login attempts from external IPs in the last 24 hours”).
  • Automatic pattern extraction from unstructured log data.
  • Summarization of long log sequences into human-readable narratives.
  • Identification of rare or novel log entries that may indicate security events.

AI-powered threat detection:

  • User and Entity Behavior Analytics (UEBA): ML models that establish normal behavior patterns for each user and entity, alerting on deviations.
  • Attack chain detection: AI that identifies multi-step attacks across multiple log sources.
  • Threat hunting assistance: AI that suggests investigation queries based on observed indicators.

Predictive alerting:

  • Predict security incidents before they occur based on leading indicators.
  • Identify conditions that historically preceded breaches (e.g., unusual account creation patterns often precede insider threats).
  • Forecast vulnerability exploitation likelihood based on environmental factors.

9. Compliance Requirements

PCI-DSS (Requirement 10)

PCI-DSS Requirement 10 specifies detailed logging requirements for cardholder data environments:

  • 10.2: Implement automated audit trails for all system components to reconstruct the following events:
  • 10.2.1: Log all individual user access to cardholder data.
  • 10.2.2: Log all actions taken by any individual with root or administrative privileges.
  • 10.2.3: Log access to all audit trails.
  • 10.2.4: Log invalid logical access attempts.
  • 10.2.5: Log use of and changes to identification and authentication mechanisms.
  • 10.2.6: Log initialization, stopping, or pausing of audit logs.
  • 10.2.7: Log creation and deletion of system-level objects.
  • 10.3: Record specific details for each auditable event (user ID, type, date/time, success/failure, origination, identity/name of affected data/component).
  • 10.5: Secure audit trails so they cannot be altered.
  • 10.7: Retain audit trail history for at least one year, with a minimum of three months immediately available.

HIPAA (Security Rule)

  • 164.312(b): Implement hardware, software, and procedural mechanisms that record and examine activity in information systems that contain or use ePHI.
  • 164.308(a)(1)(ii)(D): Implement procedures to regularly review records of information system activity (audit logs, access reports, security incident tracking reports).
  • Minimum necessary principle: Log only the minimum ePHI necessary for security purposes.

SOC 2 (Trust Services Criteria)

  • CC7.2: The entity monitors system components and the operation of those components for anomalies that are indicative of malicious acts, natural disasters, and errors affecting the entity’s ability to meet its objectives; anomalies are analyzed to determine whether they represent security events.
  • CC7.3: The entity evaluates security events to determine whether they could or have resulted in a failure of the entity to meet its objectives (security incidents) and, if so, takes actions to prevent or address such failures.
  • Logging must demonstrate continuous monitoring and detection capability.

FedRAMP

  • AU-2: Auditable Events: Define events to be audited and the content of audit records.
  • AU-3: Content of Audit Records: Ensure records contain sufficient information.
  • AU-6: Audit Review, Analysis, and Reporting: Review and analyze system audit records regularly.
  • AU-9: Protection of Audit Information: Protect audit information and tools from unauthorized access, modification, and deletion.
  • AU-11: Audit Record Retention: Retain audit records for a defined period.
  • AU-12: Audit Generation: Generate audit records for defined auditable events.
  • Continuous monitoring requirement: FedRAMP requires ongoing monitoring with monthly reporting.

10. Implementation Guide

Logging Library Selection

Choose a structured logging library for each language in your stack:

| Language | Library | Features |
| --- | --- | --- |
| Java | Logback + SLF4J + Logstash Encoder | Structured JSON output, MDC for context propagation |
| Python | structlog or python-json-logger | Native structured logging, context binding |
| Node.js | Pino or Winston | High-performance JSON logging, child loggers for context |
| Go | Zap or Zerolog | Zero-allocation structured logging |
| .NET | Serilog | Structured logging with typed properties |
| Ruby | Semantic Logger | Structured, high-performance, built-in sanitization |

Implementation Checklist

Phase 1: Foundation (Week 1-2)

  • Select and configure structured logging library for each service.
  • Define standard field schema (timestamp, event_type, actor, action, target, outcome).
  • Implement log sanitization to prevent sensitive data leakage.
  • Configure authentication event logging (login, logout, failure, lockout, MFA).
  • Configure authorization event logging (access granted, denied, privilege changes).

Phase 2: Coverage (Week 3-4)

  • Implement data access event logging for sensitive data operations.
  • Implement administrative event logging (config changes, user management).
  • Implement API access logging (requests, responses, rate limit events).
  • Implement error event logging with security classification.
  • Configure log shipping to centralized platform.

Phase 3: Monitoring (Week 5-6)

  • Create alerting rules for critical security events.
  • Configure anomaly detection baselines.
  • Build security monitoring dashboards.
  • Implement synthetic monitoring for security controls.
  • Configure alert routing and escalation.

Phase 4: AI and Compliance (Week 7-8)

  • Implement AI tool usage logging.
  • Configure shadow AI detection monitoring.
  • Verify compliance coverage (PCI-DSS, HIPAA, SOC 2, FedRAMP as applicable).
  • Configure log retention policies and immutable storage.
  • Document the logging architecture for audit purposes.

11. OWASP Logging Cheat Sheet

The OWASP Logging Cheat Sheet provides a comprehensive reference for application security logging. Key recommendations:

  1. Use a common logging framework across all applications in the organization.
  2. Define log levels consistently and enforce them through code review.
  3. Include sufficient context in each log entry to support investigation without requiring access to the application.
  4. Protect log data with appropriate access controls, encryption in transit and at rest, and integrity verification.
  5. Do not log sensitive data (passwords, tokens, PII beyond necessity).
  6. Log security events at the appropriate level (not DEBUG).
  7. Ensure logging does not introduce vulnerabilities (log injection, CRLF injection, XXE in log parsing).
  8. Test logging as part of the application’s test suite — verify that security events generate the expected log entries.
  9. Monitor logs in near-real-time for security events.
  10. Review and update logging regularly as the application evolves.
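Recommendation 8 — testing that security events generate the expected log entries — can live in the ordinary test suite. A minimal sketch using the standard library; `app_login` is a hypothetical stand-in for real application code:

```python
import io
import json
import logging

# Capture the security logger's output in memory for assertions.
logger = logging.getLogger("security")
logger.setLevel(logging.INFO)
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))

def app_login(username: str, ok: bool) -> None:
    """Hypothetical application code path that must emit a security event."""
    logger.info(json.dumps({
        "event_type": "authn.login_success" if ok else "authn.login_failure",
        "actor": username,
        "outcome": "success" if ok else "failure",
    }))

def test_failed_login_is_logged():
    app_login("mallory", ok=False)
    entry = json.loads(stream.getvalue().strip().splitlines()[-1])
    assert entry["event_type"] == "authn.login_failure"
    assert entry["outcome"] == "failure"
    assert "password" not in entry  # sensitive fields must never appear

test_failed_login_is_logged()
print("logging test passed")
```

A test like this fails the build when someone refactors the login path and silently drops the log call — exactly the "logging that breaks silently" failure mode.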

Log Injection Prevention

A commonly overlooked vulnerability: if user-supplied data is included in log entries without sanitization, attackers can inject malicious content into logs.

Risks:

  • Log forging: Attacker injects fake log entries that look legitimate, potentially covering tracks or framing other users.
  • CRLF injection: Attacker injects newline characters to create new log entries.
  • Log parser exploitation: If logs are processed by tools that interpret special characters (e.g., ANSI escape codes, XML/HTML parsers), injected content can exploit those parsers.

Mitigations:

  • Use structured logging (JSON) — the logging framework handles escaping.
  • If using plain text logs, sanitize user input before including it in log entries (encode or remove newlines, tabs, and control characters).
  • Never allow user input to control the log level or event type.
  • Validate log entry content length to prevent log flooding.
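The difference between the vulnerable and mitigated paths above can be shown directly. The hostile username below attempts CRLF forging; the concrete strings are illustrative:

```python
import json
import re

# Hostile input: a newline followed by a plausible-looking forged entry.
username = "alice\n2024-01-01T00:00:00Z INFO authn.login_success actor=admin"

# Plain-text logging without sanitization: the forged second line looks real.
plain = f"authn.login_failure actor={username}"
print(plain.count("\n"))  # the attacker has created an extra log line

# Mitigation 1: strip newlines, tabs, and control characters before logging.
sanitized = re.sub(r"[\x00-\x1f\x7f]", " ", username)
print("\n" in f"authn.login_failure actor={sanitized}")

# Mitigation 2: structured JSON — the serializer escapes the newline as \n,
# so the entry remains a single machine-parseable line.
entry = json.dumps({"event_type": "authn.login_failure", "actor": username})
print("\n" in entry)
```

Note that JSON escaping preserves the attacker's input verbatim for investigation while preventing it from being interpreted as a new log entry.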

Key Takeaways

  1. Application security logging is mandatory infrastructure, not optional instrumentation. Without it, detection, investigation, compliance, and forensics all fail.
  2. Log what matters, protect what is sensitive. Log security events comprehensively while strictly excluding passwords, tokens, PII, and secrets.
  3. Structure enables automation. JSON-structured logs with standard fields enable automated parsing, correlation, alerting, and investigation at scale.
  4. Monitoring without tuning creates alert fatigue. Alert fatigue is the most common failure mode of security monitoring. Invest in tuning, prioritization, and progressive alerting.
  5. AI audit logging is a new requirement. Track AI tool usage, AI-generated code provenance, shadow AI, and AI-attributable defects. This is the new dimension of application security logging.
  6. Logs must be immutable and retained per compliance requirements. Append-only storage, separate infrastructure, and defined retention policies are non-negotiable.
  7. Test your logging. Verify that security events generate expected log entries as part of your test suite. Logging that breaks silently is worse than no logging.

Practical Exercise

Exercise 1: Logging Design Review

Review the logging in one of your team’s services. For each category in Section 2, determine whether the events are logged, whether they include sufficient detail, and whether any sensitive data is being logged that should not be. Create a gap analysis document and prioritize the missing logging by security value.

Exercise 2: Alert Fatigue Assessment

Audit your current security alerts for the past 30 days. For each alert rule:

  • How many times did it fire?
  • What percentage were true positives?
  • What was the average response time?
  • Was the alert actionable (did the responder know what to do)?

Identify the three noisiest alerts and propose tuning strategies.

Exercise 3: AI Logging Design

Design the logging schema for AI tool usage in your organization. Define the events to log, the fields for each event, the retention policy, and three monitoring rules that would detect concerning AI usage patterns.


Study Guide

Key Takeaways

  1. Security events at INFO level minimum — DEBUG is typically disabled in production; logging security events at DEBUG loses visibility when you need it most.
  2. Never log passwords, tokens, or PII beyond necessity — Log files become high-value targets if they contain credentials or sensitive data.
  3. JSON structured logging enables automation — Machine-parseable, self-describing, extensible; prevents log injection through framework-handled escaping.
  4. Alert fatigue is the biggest monitoring threat — Analysts receiving too many alerts start ignoring all of them including real threats.
  5. Progressive alerting reduces noise — First occurrence: log only; repeated: create ticket; sustained: page on-call.
  6. AI audit logging is a new requirement — Track AI tool usage, code generation events, acceptance rates, and AI-attributable defects.
  7. CISA Secure by Design Goal 7 — Software manufacturers must provide intrusion-detection logs at no extra charge.

Important Definitions

  • RASP: Runtime Application Self-Protection; instruments the application runtime to detect/block attacks, with a typical 2-5% latency overhead.
  • Alert Fatigue: Analysts ignoring all alerts, including real threats, due to excessive alert volume.
  • Progressive Alerting: Graduated response based on occurrence frequency: log only -> ticket -> page.
  • Log Injection: An attacker injecting malicious content into log entries through unsanitized user input.
  • Immutable Logging: Append-only storage preventing modification or deletion of log entries.
  • Structured Logging: JSON-format logs with typed fields enabling automated parsing and correlation.
  • UEBA: User and Entity Behavior Analytics; ML models detecting deviations from normal behavior.
  • Shadow AI Detection: Network/DNS/DLP monitoring for connections to unapproved AI service endpoints.
  • Trace ID: Correlation identifier linking events across multiple services in a request path.
  • CSAF: Common Security Advisory Framework, an OASIS standard for machine-readable security advisories.

Quick Reference

  • Retention Requirements: PCI-DSS 1 year (3 months hot), HIPAA 6 years, SOX 7 years, SOC 2 1 year, FedRAMP 1-3 years
  • Standard Log Fields: timestamp (ISO 8601 UTC), level, event_type, service, version, environment, trace_id, actor, action, outcome, source_ip
  • Never Log: Passwords, session tokens, API keys, credit card numbers, SSNs, PHI details, encryption keys
  • SIEM Platforms: Elastic, Splunk, Datadog, Sumo Logic, Grafana Loki, CrowdStrike LogScale, Sentinel, Chronicle
  • Common Pitfalls: Security events at DEBUG level, logging sensitive data, unstructured plain text logs, no alert tuning, missing AI audit logging, no log integrity verification

Review Questions

  1. Why must security events be logged at INFO minimum and not DEBUG, and what happens in production when DEBUG is disabled?
  2. Design a structured JSON logging schema for authentication events including all fields necessary for correlation and investigation.
  3. Your monitoring has 200 alerts per day with 85% false positives — describe a strategy to reduce alert fatigue while maintaining detection.
  4. What AI audit logging events should be captured, and how do these logs enable tracking AI-attributable defect rates?
  5. Explain log injection, provide an example attack, and describe how structured JSON logging prevents it.
Knowledge Check

  1. Why should security events be logged at INFO level minimum, not DEBUG level?
  2. What data must never appear in application logs?
  3. What does CISA Secure by Design Goal 7 specifically require from software manufacturers?
  4. What is the recommended log format for application security logging, and why?
  5. What retention does PCI-DSS Requirement 10.7 mandate for audit trail history?
  6. What is alert fatigue, and why is it the biggest threat to security monitoring?
  7. What type of logging should track when AI coding assistants generate and accept code suggestions in security-sensitive areas?
  8. What is log injection, and how is it prevented?
  9. What is RASP, and what is its typical performance overhead?
  10. In progressive alerting, what happens on the first occurrence of a suspicious event?