opnsense-src/sys/contrib/openzfs/cmd/zed/agents
Martin Matuska 9a5f0cb5b6 zfs: merge openzfs/zfs@256659204 (zfs-2.2-release) into stable/14
OpenZFS release 2.2.4

Notable upstream pull request merges:
 #15076 fdd97e009 Refactor dmu_prefetch()
 #15225 5972bb856 Use ASSERT0P() to check that a pointer is NULL
 #15381 7ea833100 ZIL: Detect single-threaded workloads
 #15515 8b1a132de ZIO: Optimize zio_flush()
 #15225 d6da6cbd7 Clean up existing VERIFY*() macros
 #15225 5dda8c091 Add VERIFY0P() and ASSERT0P() macros
 #15436 61f3638a3 Add prefetch property
 #15509 6f323353d Add ashift validation when adding devices to a pool
 #15539 ea3f7c12a Extend import_progress kstat with a notes field
 #15635 25ea8ce94 ZIL: Improve next log block size prediction
 #15784 16c223eec Do no use .cfi_negate_ra_state within the assembly on
                  Arm64
 #15839 706307445 vdev probe to slow disk can stall mmp write checker
 #15879 86b39b41a zpool: Fix locale-specific time
 #15927 fa5de0c5c Update resume token at object receive
 #15941 fdd8c0aea BRT: Skip duplicate BRT prefetches
 #15942 889152ce4 Give better message from 'zpool get' with invalid pool
                  name
 #15950 3e91a9c52 BRT: Skip getting length in brt_entry_lookup()
 #15951 19bf54b76 ZAP: Massively switch to _by_dnode() interfaces
 #15954 f7c1db636 BRT: Change brt_pending_tree sorting order
 #15955 457e62d7c BRT: Relax brt_pending_apply() locking
 #15967 c94f73007 BRT: Make BRT block sizes configurable
 #15976 dced953b6 ZAP: Some cleanups/micro-optimizations
 #15983 531572b59 Fix panics when truncating/deleting files
 #15992 5fc134ff2 zvol: use multiple taskq
 #16007 2ea370a4e BRT: Fix holes cloning
 #16008 67995229a zpool: Fix option string, adding -e and fixing order
 #16015 8a5604713 Add support for zfs mount -R <filesystem>
 #16022 026fe7964 Speculative prefetch for reordered requests
 #16040 575872cc3 L2ARC: Relax locking during write
 #16042 d5fb6abd3 Improve dbuf_read() error reporting
 #16051 5d859a2e2 xdr: header cleanup
 #16052 602b5dca7 Fix read errors race after block cloning
 #16057 97d7228f4 Remove db_state DB_NOFILL checks from syncing context
 #16072 f4ce02ae4 Small fix to prefetch ranges aggregation
 #16074 97889c037 return NULL at end of send_progress_thread
 #16086 7aaf6ce9d Add the BTI elf note to the AArch64 SHA2 assembly
 #16094 4d17e200d Add zfetch stats in arcstats
 #16128 3d4d61988 Fix updating the zvol_htable when renaming a zvol
 #16141 b3b37b84e Fix arcstats for FreeBSD after zfetch support

Obtained from:	OpenZFS
OpenZFS commit:	2566592045
OpenZFS tag:	zfs-2.2.4
2024-05-03 23:52:01 +02:00

Fault Management Logic for ZED

The integration of Fault Management Daemon (FMD) logic from illumos is being deployed in three phases. This logic is encapsulated in several software modules inside ZED.

ZED+FM Phase 1

All of the phase 1 work is in the current master branch. Phase 1 work includes:

  • Add new paths to the persistent VDEV label for device matching.
  • Add a disk monitor for generating disk-add and disk-change events.
  • Add support for VDEV auto-online, auto-replace and auto-expand.
  • Expand the statechange event to include all VDEV state transitions.

ZED+FM Phase 2 (WIP)

The phase 2 work primarily entails the Diagnosis Engine and the Retire Agent modules. It also includes infrastructure to support a crude FMD environment to host these modules. For additional information see the FMD Components in ZED and Implementation Notes sections below.

ZED+FM Phase 3

Future work will add additional functionality and will likely include:

  • Add FMD module garbage collection (periodically call fmd_module_gc()).
  • Add real module property retrieval (currently hard-coded in accessors).
  • Additional diagnosis telemetry (like latency outliers and SMART data).
  • Export FMD module statistics.
  • Zedlet parallel execution and resiliency (add watchdog).

ZFS Fault Management Overview

The primary purpose of ZFS fault management is automated diagnosis and isolation of VDEV faults. A fault is something we can associate with an impact (e.g. loss of data redundancy) and a corrective action (e.g. offline or replace a disk). A typical ZFS fault management stack consists of error detectors (e.g. zfs_ereport_post()), a disk monitor, a diagnosis engine and response agents.

After detecting a software error, the ZFS kernel module sends error events to the ZED user daemon, which in turn routes the events to its internal FMD modules based on their event subscriptions. Likewise, if a disk is added or changed in the system, the disk monitor sends disk events, which are consumed by a response agent.
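
The subscription-based routing described above can be sketched as a small lookup table. This is an illustration only, not the ZED dispatch code: the struct, the function names, the agent labels and the class patterns in the table are all hypothetical, and the trailing ".*" wildcard is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subscription entry: which agent wants which event class. */
struct subscription {
	const char *agent;   /* illustrative agent label */
	const char *pattern; /* event class, optionally ending in ".*" */
};

/* Illustrative table; these are NOT the real ZED subscriptions. */
static const struct subscription subs[] = {
	{ "diagnosis", "ereport.fs.zfs.*" },
	{ "retire",    "fault.fs.zfs.*" },
	{ "disk-add",  "example.disk.*" },
};

/* Return 1 if cls matches pattern; a trailing ".*" matches any suffix. */
int class_match(const char *cls, const char *pattern)
{
	assert(cls != NULL && pattern != NULL);
	size_t plen = strlen(pattern);

	if (plen >= 2 && strcmp(pattern + plen - 2, ".*") == 0) {
		/* Compare the prefix up to and including the final dot. */
		return strncmp(cls, pattern, plen - 1) == 0;
	}
	return strcmp(cls, pattern) == 0;
}

/* Store the names of subscribed agents in out[]; return how many matched. */
size_t route_event(const char *cls, const char **out, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < sizeof (subs) / sizeof (subs[0]) && n < max; i++) {
		if (class_match(cls, subs[i].pattern))
			out[n++] = subs[i].agent;
	}
	return n;
}
```

With this table, an event class such as "ereport.fs.zfs.checksum" would be routed to the "diagnosis" entry only, and an unrecognized class would be delivered to no agent.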

FMD Components in ZED

There are three FMD modules (aka agents) that are now built into ZED.

  1. A Diagnosis Engine module (agents/zfs_diagnosis.c)
  2. A Retire Agent module (agents/zfs_retire.c)
  3. A Disk Add Agent module (agents/zfs_mod.c)

First, the Diagnosis Engine consumes per-vdev I/O and checksum ereports and feeds them into a Soft Error Rate Discrimination (SERD) algorithm, which generates a corresponding fault diagnosis when the tracked VDEV encounters N events within a time window T. The initial N and T values for the SERD algorithm are estimates inherited from illumos (10 errors in 10 minutes).
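
The "N events within T" rule can be illustrated with a small ring buffer of timestamps. This is a simplified sketch and not the code in fmd_serd.c: the struct layout, the names serd_init() and serd_record(), the fixed capacity, the use of seconds as the time unit and the inclusive window comparison are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	SERD_MAX_N	32	/* hypothetical capacity for this sketch */

struct serd {
	unsigned n;			/* events required to fire */
	uint64_t t;			/* window length (seconds here) */
	uint64_t stamps[SERD_MAX_N];	/* ring of the most recent n times */
	unsigned count;			/* valid entries in the ring */
	unsigned head;			/* index of the oldest entry */
};

void serd_init(struct serd *s, unsigned n, uint64_t t)
{
	assert(s != NULL && n >= 1 && n <= SERD_MAX_N);
	s->n = n;
	s->t = t;
	s->count = 0;
	s->head = 0;
}

/*
 * Record one event at time `now` (assumed non-decreasing).
 * Returns 1 when the last n events all fall within t, else 0.
 */
int serd_record(struct serd *s, uint64_t now)
{
	if (s->count == s->n) {
		/* Ring is full: forget the oldest event. */
		s->head = (s->head + 1) % s->n;
		s->count--;
	}
	s->stamps[(s->head + s->count) % s->n] = now;
	s->count++;

	if (s->count == s->n && now - s->stamps[s->head] <= s->t) {
		/* Threshold reached: fire and start over. */
		s->count = 0;
		s->head = 0;
		return 1;
	}
	return 0;
}
```

Under these assumptions, the inherited defaults correspond to serd_init(&s, 10, 600): ten errors spread over more than ten minutes never fire, while ten errors inside any ten-minute span do.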

In turn, the Retire Agent responds to diagnosed faults by isolating the faulty VDEV. It notifies the ZFS kernel module of the new VDEV state (degraded or faulted). The Retire Agent is also responsible for managing hot spares across all pools. When it encounters a device fault or a device removal, it replaces the device with an appropriate spare, if one is available.
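
As a rough illustration of the hot spare step, the sketch below picks a replacement from a list of spares. The text does not define what makes a spare "appropriate", so the criteria used here (not already in use, and at least as large as the failed device) are assumptions, and struct spare and pick_spare() are hypothetical names rather than libzfs interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a hot spare; not a libzfs structure. */
struct spare {
	const char *name;	/* device name, for illustration */
	int in_use;		/* nonzero if already replacing a device */
	uint64_t size;		/* usable size in bytes */
};

/*
 * Return the index of the first idle spare that is at least `needed`
 * bytes, or -1 if no spare qualifies.
 */
int pick_spare(const struct spare *spares, size_t count, uint64_t needed)
{
	assert(spares != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (!spares[i].in_use && spares[i].size >= needed)
			return (int)i;
	}
	return -1;
}
```

A return value of -1 corresponds to the "if one is available" case in the text: the faulted device is still isolated, but nothing takes its place.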

Finally, the Disk Add Agent responds to events from a libudev disk monitor (EC_DEV_ADD or EC_DEV_STATUS) and will online, replace or expand the associated VDEV. On the illumos platform, this agent is also known as zfs_mod or the Sysevent Loadable Module (SLM). The added disk is matched to a specific VDEV using its device id, physical path or VDEV GUID.
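
The three matching keys can be sketched as a simple comparison function. This is an illustration under stated assumptions, not the logic in zfs_mod.c: struct disk_ident and match_vdev() are hypothetical, the order in which the keys are tried is assumed, and a GUID of 0 is used here to mean "unknown".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum match_kind { MATCH_NONE, MATCH_DEVID, MATCH_PHYSPATH, MATCH_GUID };

/* Hypothetical identity record for a disk or a leaf VDEV. */
struct disk_ident {
	const char *devid;	/* device id, may be NULL */
	const char *physpath;	/* physical path, may be NULL */
	uint64_t guid;		/* VDEV GUID, 0 if unknown (assumption) */
};

/* Two strings match only if both are present and equal. */
static int str_eq(const char *a, const char *b)
{
	return a != NULL && b != NULL && strcmp(a, b) == 0;
}

/* Report which key, if any, ties the added disk to the given VDEV. */
enum match_kind match_vdev(const struct disk_ident *disk,
    const struct disk_ident *vdev)
{
	assert(disk != NULL && vdev != NULL);

	if (str_eq(disk->devid, vdev->devid))
		return MATCH_DEVID;
	if (str_eq(disk->physpath, vdev->physpath))
		return MATCH_PHYSPATH;
	if (disk->guid != 0 && disk->guid == vdev->guid)
		return MATCH_GUID;
	return MATCH_NONE;
}
```

Returning the kind of match, rather than a plain yes or no, lets a caller treat a physical-path match (a new disk in the same slot) differently from a GUID match (a disk that already carries a label for that VDEV).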

Note that the auto-replace feature (aka hot plug) is opt-in: you must set the pool's autoreplace property to enable it. The new disk will be matched to the corresponding leaf VDEV by physical location and labeled with a GPT partition before replacing the original VDEV in the pool.

Implementation Notes

  • The FMD module API required for logic modules is emulated and implemented in the fmd_api.c and fmd_serd.c source files. This support includes module registration, memory allocation, module property accessors, basic case management, one-shot timers and SERD engines. For detailed information on the FMD module API, see the "Fault Management Daemon Programmer's Reference Manual".

  • The event subscriptions for the modules (located in a module-specific configuration file on illumos) are currently hard-coded into the ZED zfs_agent_dispatch() function.

  • The FMD modules are called one at a time from a single thread that consumes events queued to the modules. These events are sourced from the normal ZED events and also include events posted from the diagnosis engine and the libudev disk event monitor.

  • The FMD code modules have minimal changes and were intentionally left as similar as possible to their upstream source files.

  • The sysevent namespace in ZED differs from illumos. For example:

    • illumos uses "resource.sysevent.EC_zfs.ESC_ZFS_vdev_remove"
    • Linux uses "sysevent.fs.zfs.vdev_remove"

  • The FMD Modules port was produced by Intel Federal, LLC under award number B609815 between the U.S. Department of Energy (DOE) and Intel Federal, LLC.