Discovery/Mapping (SNMP/CDP/LLDP) #287

Closed
opened 2026-03-28 04:23:07 +00:00 by mfreeman451 · 1 comment

Imported from GitHub.

Original GitHub issue: #792
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/792
Original created: 2025-05-16T03:31:04Z


Implement Discovery Engine for ServiceRadar

Overview

Implement a discovery engine that probes network devices using SNMP to gather detailed information about devices, interfaces, and interconnections. This component will publish discovered data to Timeplus Proton streams for eventual ingestion into ArangoDB's network knowledge graph.

Requirements

Core Functionality

  • Implement SNMP communication module using gosnmp library:

    • Support for SNMP v2c (Community Strings)
    • Support for SNMP v3 (AuthNoPriv, AuthPriv with MD5/SHA auth and DES/AES privacy)
    • Configurable timeouts and retries
    • SNMP GET, GETNEXT, and WALK operations
  • Create seeding mechanism:

    • Support manual configuration of seed IP addresses
    • Support manual configuration of seed IP subnets/ranges (e.g., 192.168.1.0/24)
    • Load seed configuration from file or KV store (per ADR-01)
  • Build data collection core:

    • System information collection (sysDescr, sysObjectID, sysName, etc.)
    • Interface information collection (ifIndex, ifDescr, ifName, ifAlias, etc.)
    • IP address mapping for interfaces
    • Neighbor discovery via LLDP
    • Neighbor discovery via CDP
  • #802

    • Publish basic device info to sweep_results stream with discovery_source='snmp_discovery'
    • Publish interface details to discovered_interfaces stream
    • Publish neighbor relationships to topology_discovery_events stream
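
The seeding requirement above (single IPs plus subnets such as 192.168.1.0/24) could be met by expanding each seed into a flat list of target addresses. The following is a minimal sketch using only the standard library. The `expandSeed` name is hypothetical, and skipping the network and broadcast addresses for IPv4 prefixes shorter than /31 is an assumption, not something the issue specifies.

```go
package main

import (
	"fmt"
	"net/netip"
)

// expandSeed turns a seed (single IP or CIDR prefix) into target addresses.
// Hypothetical helper: for IPv4 prefixes shorter than /31 it drops the
// network and broadcast addresses, which are not usable host targets.
// Large prefixes (IPv6 in particular) would need a size cap, omitted here.
func expandSeed(seed string) ([]netip.Addr, error) {
	// A bare IP address is a single target.
	if addr, err := netip.ParseAddr(seed); err == nil {
		return []netip.Addr{addr}, nil
	}
	prefix, err := netip.ParsePrefix(seed)
	if err != nil {
		return nil, fmt.Errorf("invalid seed %q: %w", seed, err)
	}
	prefix = prefix.Masked() // normalize to the network address

	var all []netip.Addr
	for a := prefix.Addr(); a.IsValid() && prefix.Contains(a); a = a.Next() {
		all = append(all, a)
	}
	if prefix.Addr().Is4() && prefix.Bits() < 31 && len(all) > 2 {
		all = all[1 : len(all)-1]
	}
	return all, nil
}

func main() {
	targets, err := expandSeed("192.168.1.0/30")
	if err != nil {
		panic(err)
	}
	fmt.Println(targets)
}
```

Expanding seeds up front keeps the scanner itself unaware of whether a target came from a single-IP seed or a subnet seed.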

Configuration Management

  • Develop configuration module:
    • Global SNMP v2c community strings
    • Global SNMP v3 credentials (username, auth/priv protocols and passwords)
    • Per-target/subnet credential overrides
    • Configurable OID lists
    • Discovery schedule/interval
    • Configuration loading from file or KV store

Engine Operation

  • Implement discovery workflow:
    • Reachability check (ping/connect)
    • SNMP connection with configured credentials
    • Collect system, interface, and neighbor data
    • Data formatting and publishing
    • Efficient subnet scanning with concurrency controls

Error Handling and Logging

  • Create comprehensive logging:
    • Log all discovery attempts, successes, and failures
    • Log SNMP communication errors
    • Gracefully handle devices that do not support certain MIBs/OIDs

Technical Details

Project Structure

/serviceradar-discovery/
  /cmd/
    /serviceradar-discovery/
      main.go                  # Entry point
  /internal/
    /config/                   # Configuration loading
    /discovery/                # Core discovery engine
      discovery_engine.go      # Main engine logic
      snmp_client.go           # SNMP operations wrapper
      topology.go              # LLDP/CDP processing
      device_info.go           # System info processing
      interfaces.go            # Interface info processing
    /models/                   # Data models matching Proton streams
    /publisher/                # Proton stream publishing
    /utils/                    # Utility functions
  /pkg/                        # Public packages (if any)
  go.mod
  go.sum
  Dockerfile
  config.example.json

Data Models

Implement structures that match these Proton stream schemas:

  1. For sweep_results stream:
type SweepResult struct {
    AgentID         string                 `json:"agent_id"`
    PollerID        string                 `json:"poller_id"`
    DiscoverySource string                 `json:"discovery_source"` // Always "snmp_discovery"
    IP              string                 `json:"ip"`
    MAC             string                 `json:"mac,omitempty"`
    Hostname        string                 `json:"hostname"`
    Timestamp       time.Time              `json:"timestamp"`
    Available       bool                   `json:"available"`
    Metadata        map[string]interface{} `json:"metadata"`
}
  2. For discovered_interfaces stream:
type DiscoveredInterface struct {
    Timestamp     time.Time              `json:"timestamp"`
    AgentID       string                 `json:"agent_id"`
    PollerID      string                 `json:"poller_id"`
    DeviceIP      string                 `json:"device_ip"`
    DeviceID      string                 `json:"device_id"`
    IfIndex       int                    `json:"ifIndex"`
    IfName        string                 `json:"ifName"`
    IfDescr       string                 `json:"ifDescr"`
    IfAlias       string                 `json:"ifAlias"`
    IfSpeed       int64                  `json:"ifSpeed"`
    IfPhysAddress string                 `json:"ifPhysAddress"`
    IPAddresses   []string               `json:"ip_addresses"`
    IfAdminStatus int                    `json:"ifAdminStatus"`
    IfOperStatus  int                    `json:"ifOperStatus"`
    Metadata      map[string]interface{} `json:"metadata"`
}
  3. For topology_discovery_events stream:
type TopologyDiscoveryEvent struct {
    Timestamp               time.Time              `json:"timestamp"`
    AgentID                 string                 `json:"agent_id"`
    PollerID                string                 `json:"poller_id"`
    LocalDeviceIP           string                 `json:"local_device_ip"`
    LocalDeviceID           string                 `json:"local_device_id"`
    LocalIfIndex            int                    `json:"local_ifIndex"`
    LocalIfName             string                 `json:"local_ifName"`
    ProtocolType            string                 `json:"protocol_type"` // "LLDP" or "CDP"
    NeighborChassisID       string                 `json:"neighbor_chassis_id"`
    NeighborPortID          string                 `json:"neighbor_port_id"`
    NeighborPortDescr       string                 `json:"neighbor_port_descr"`
    NeighborSystemName      string                 `json:"neighbor_system_name"`
    NeighborManagementAddr  string                 `json:"neighbor_management_address"`
    Metadata                map[string]interface{} `json:"metadata"`
}

SNMP OIDs to Query

  • System Information:

    • sysDescr: '1.3.6.1.2.1.1.1.0'
    • sysObjectID: '1.3.6.1.2.1.1.2.0'
    • sysName: '1.3.6.1.2.1.1.5.0'
    • sysUpTime: '1.3.6.1.2.1.1.3.0'
    • sysContact: '1.3.6.1.2.1.1.4.0'
    • sysLocation: '1.3.6.1.2.1.1.6.0'
  • Interface Information:

    • ifIndex: '1.3.6.1.2.1.2.2.1.1'
    • ifDescr: '1.3.6.1.2.1.2.2.1.2'
    • ifName: '1.3.6.1.2.1.31.1.1.1.1'
    • ifAlias: '1.3.6.1.2.1.31.1.1.1.18'
    • ifType: '1.3.6.1.2.1.2.2.1.3'
    • ifSpeed: '1.3.6.1.2.1.2.2.1.5'
    • ifPhysAddress: '1.3.6.1.2.1.2.2.1.6'
    • ifAdminStatus: '1.3.6.1.2.1.2.2.1.7'
    • ifOperStatus: '1.3.6.1.2.1.2.2.1.8'
  • IP Address Information:

    • ipAdEntAddr: '1.3.6.1.2.1.4.20.1.1'
    • ipAdEntIfIndex: '1.3.6.1.2.1.4.20.1.2'
    • ipAdEntNetMask: '1.3.6.1.2.1.4.20.1.3'
  • LLDP Information:

    • lldpLocPortId: '1.0.8802.1.1.2.1.3.7.1.3'
    • lldpLocSysName: '1.0.8802.1.1.2.1.3.3.0'
    • lldpRemChassisId: '1.0.8802.1.1.2.1.4.1.1.5'
    • lldpRemPortId: '1.0.8802.1.1.2.1.4.1.1.7'
    • lldpRemPortDesc: '1.0.8802.1.1.2.1.4.1.1.8'
    • lldpRemSysName: '1.0.8802.1.1.2.1.4.1.1.9'
    • lldpRemSysDesc: '1.0.8802.1.1.2.1.4.1.1.10'
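    • lldpRemManAddr: '1.0.8802.1.1.2.1.4.2.1.3' (this column is lldpRemManAddrIfSubtype; lldpRemManAddr itself is a not-accessible index of lldpRemManAddrTable, so the address is decoded from the OID suffix of the walked rows)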
  • CDP Information:

    • cdpCacheAddress: '1.3.6.1.4.1.9.9.23.1.2.1.1.4'
    • cdpCacheVersion: '1.3.6.1.4.1.9.9.23.1.2.1.1.5'
    • cdpCacheDeviceId: '1.3.6.1.4.1.9.9.23.1.2.1.1.6'
    • cdpCacheDevicePort: '1.3.6.1.4.1.9.9.23.1.2.1.1.7'
    • cdpCachePlatform: '1.3.6.1.4.1.9.9.23.1.2.1.1.8'
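
The interface OIDs listed above are table columns, so a walk returns one row per interface with the ifIndex appended to the column OID. The following sketch shows how that suffix and an ifPhysAddress value might be decoded. The helper names `ifIndexFromOID` and `formatMAC` are hypothetical, and the code works on plain strings and byte slices rather than on any particular SNMP library's types.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// ifDescrOID is the ifDescr column from the list above.
const ifDescrOID = "1.3.6.1.2.1.2.2.1.2"

// ifIndexFromOID extracts the trailing ifIndex from a walked OID such as
// ".1.3.6.1.2.1.2.2.1.2.7", given the column's base OID.
func ifIndexFromOID(base, oid string) (int, error) {
	oid = strings.TrimPrefix(oid, ".")
	base = strings.TrimPrefix(base, ".") + "."
	if !strings.HasPrefix(oid, base) {
		return 0, fmt.Errorf("oid %q is not under %q", oid, base)
	}
	// ifTable rows are indexed by a single integer.
	return strconv.Atoi(oid[len(base):])
}

// formatMAC renders an ifPhysAddress octet string as a colon-separated
// MAC address. Empty values (common on loopback or virtual interfaces)
// yield an empty string.
func formatMAC(raw []byte) string {
	if len(raw) == 0 {
		return ""
	}
	return net.HardwareAddr(raw).String()
}

func main() {
	idx, err := ifIndexFromOID(ifDescrOID, ".1.3.6.1.2.1.2.2.1.2.7")
	if err != nil {
		panic(err)
	}
	fmt.Println(idx, formatMAC([]byte{0x00, 0x1a, 0x2b, 0x3c, 0x4d, 0x5e}))
}
```

Keying collected values by the decoded ifIndex is what allows ifTable, ifXTable, and ipAddrTable rows to be merged into a single interface record.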

Implementation Details

  1. Configuration:

    • Use a JSON configuration format with sections for:
      • SNMP credentials (global and per-target)
      • Target seeds (IPs and subnets)
      • Scan schedule/interval
      • Proton connection details
      • OID configurations (for extensibility)
  2. Engine Operation Workflow:

    • Initialize engine with config
    • For each seed IP/subnet:
      • Check basic reachability
      • Try SNMP connection with credentials
      • If successful:
        • Collect system information
        • Collect interface details
        • Collect IP address mappings
        • Collect LLDP/CDP neighbor data
      • Format and publish collected data to appropriate streams
  3. Concurrency Model:

    • Use a worker pool pattern for efficient scanning
    • Configurable concurrency limits to prevent network/device overload
    • Cancellable context for clean shutdown
  4. Error Handling:

    • Graceful handling of timeouts and retries
    • Skip devices/OIDs that consistently fail
    • Detailed error logging for troubleshooting

Testing Strategy

  • Unit tests for individual components:
    • SNMP client wrapper
    • Data parsers
    • Configuration loading
  • Integration tests with mock SNMP responses
  • End-to-end tests against test devices (if available)

Dependencies

  • github.com/gosnmp/gosnmp
  • Configuration library (viper or similar)
  • Logging library (logrus or similar)
  • Proton client SDK

Acceptance Criteria

  1. Engine successfully discovers devices and publishes to correct Proton streams
  2. Proper handling of different SNMP versions and credentials
  3. Complete collection of system, interface, and neighbor data where available
  4. Efficient scanning with proper error handling
  5. Proper security handling of credentials
  6. Comprehensive logging for troubleshooting

Related Documentation

  • ServiceRadar PRD: "ServiceRadar SNMP Discovery Engine" v1.1
  • ADR-01: Configuration Management
  • ADR-02: ArangoDB Sync Service

Next Steps

After completion, consider these enhancements:

  • Discovery of VLANs (Q-BRIDGE-MIB)
  • Discovery of LAGs (IEEE8023-LAG-MIB)
  • Custom MIB support for vendor-specific information
  • More sophisticated device role identification

Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/792#issuecomment-2919682807
Original created: 2025-05-29T15:02:46Z


closing this out as completed, will address the snmpv3 stuff later
