Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent
2.1.14
Requires NuGet 3.3.0 or higher.
dotnet add package Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent --version 2.1.14
NuGet\Install-Package Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent -Version 2.1.14
<PackageReference Include="Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent" Version="2.1.14" />
paket add Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent --version 2.1.14
#r "nuget: Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent, 2.1.14"
// Install Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent as a Cake Addin
#addin nuget:?package=Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent&version=2.1.14
// Install Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent as a Cake Tool
#tool nuget:?package=Microsoft.ServiceFabricApps.ClusterObserver.Windows.FrameworkDependent&version=2.1.14
ClusterObserver
ClusterObserver (CO) is a stateless singleton Service Fabric service that runs on one node in a cluster. CO observes aggregated cluster health and sends telemetry when the cluster is in an Error or Warning state. CO shares a very small subset of FabricObserver's (FO) code. It is designed to be completely independent of the FO sources, but it lives in this repo (and SLN) because it is very useful to have both services deployed, especially if you want cluster-level health observation and reporting in addition to the node-level, user-defined resource monitoring, health event creation, and health reporting done by FO. FabricObserver generates Service Fabric health events based on user-defined resource usage Warning and Error thresholds, and ClusterObserver sends those events to your log analytics and alerting service.
By design, CO will send an Ok health state report when a cluster goes from Warning or Error state to Ok.
CO only sends telemetry when something is wrong or when something that was previously wrong recovers. This limits the amount of data sent to your log analytics service. As with FabricObserver, you can plug in whatever analytics backend you want by implementing the IObserverTelemetryProvider interface. Implementations are already provided for both Azure ApplicationInsights and Azure LogAnalytics.
The core idea is that you use the aggregated cluster Error/Warning/Ok health state information from ClusterObserver to fire alerts and/or trigger some other action that gets your attention or engages an SF on-call engineer by auto-creating a support incident (an Ok signal would then auto-mitigate the related incident/ticket).
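For a quick manual look at the kind of aggregated state that CO reports on, the Service Fabric PowerShell module provides Get-ServiceFabricClusterHealth. The sketch below is only for ad hoc inspection and assumes an open cluster connection; it is not a description of how CO itself is implemented, and the variable names are illustrative.

```powershell
# Assumes Connect-ServiceFabricCluster has already succeeded.
$health = Get-ServiceFabricClusterHealth

# AggregatedHealthState holds the cluster-level state (for example Ok, Warning, or Error).
$state = $health.AggregatedHealthState
Write-Host "Aggregated cluster health: $state"

# When the cluster is not Ok, list the evaluations that explain why.
if ($state -ne "Ok") {
    $health.UnhealthyEvaluations | Format-List
}
```

An alerting rule built on CO telemetry would typically key off the same three states: open an incident on Warning or Error, and close it on the next Ok.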
You can change ClusterObserver configuration parameters by doing a versionless Application Parameter Upgrade. This means you can change settings for CO without having to redeploy the application or any packages.
Application Parameter Upgrade Example:
Open an Admin PowerShell console.
Connect to your Service Fabric cluster using the Connect-ServiceFabricCluster command.
Create a variable that contains all the settings you want to update:
$appParams = @{ "RunInterval" = "00:10:00"; "MaxTimeNodeStatusNotOk" = "04:00:00"; }
Then execute the application upgrade with
Start-ServiceFabricApplicationUpgrade -ApplicationName fabric:/ClusterObserver -ApplicationTypeVersion 2.1.12 -ApplicationParameter $appParams -Monitored -FailureAction rollback
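One caveat with the upgrade above: as far as we are aware, an application upgrade applies the parameter set you pass in, so overrides from an earlier upgrade that are missing from $appParams may fall back to their manifest defaults. The sketch below assumes an open cluster connection and an application named fabric:/ClusterObserver. It copies the existing overrides first, applies the new values, reuses the currently deployed application type version instead of a hard-coded one, and then checks the upgrade state. Treating both settings as TimeSpan strings is an assumption based on the example values.

```powershell
# Assumes Connect-ServiceFabricCluster has already succeeded.
$appName = "fabric:/ClusterObserver"
$app = Get-ServiceFabricApplication -ApplicationName $appName

# Start from the overrides that are already in effect so they are not lost.
$appParams = @{}
foreach ($pair in $app.ApplicationParameters) {
    $appParams[$pair.Name] = $pair.Value
}

# Apply the new values (assumed to be TimeSpan strings, hh:mm:ss).
$appParams["RunInterval"] = "00:10:00"
$appParams["MaxTimeNodeStatusNotOk"] = "04:00:00"

# Fail early if a value does not parse as a TimeSpan.
[void][TimeSpan]::Parse($appParams["RunInterval"])
[void][TimeSpan]::Parse($appParams["MaxTimeNodeStatusNotOk"])

# Versionless upgrade: reuse the currently deployed application type version.
Start-ServiceFabricApplicationUpgrade -ApplicationName $appName `
    -ApplicationTypeVersion $app.ApplicationTypeVersion `
    -ApplicationParameter $appParams -Monitored -FailureAction Rollback

# Check progress; run again until UpgradeState reports completion.
Get-ServiceFabricApplicationUpgrade -ApplicationName $appName |
    Select-Object ApplicationName, UpgradeState
```

Reading the version from the deployed application avoids a mismatch between the -ApplicationTypeVersion argument and whatever version is actually running in the cluster.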
Example Configuration:
<?xml version="1.0" encoding="utf-8" ?>
<Settings xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.microsoft.com/2011/01/fabric">
<Section Name="ObserverManagerConfiguration">
<Parameter Name="ObserverLoopSleepTimeSeconds" Value="30" />
<Parameter Name="ObserverExecutionTimeout" Value="3600" />
<Parameter Name="AsyncOperationTimeoutSeconds" Value="120" />
<Parameter Name="ObserverLogPath" Value="cluster_observer_logs" />
<Parameter Name="EnableVerboseLogging" Value="false" />
<Parameter Name="EnableETWProvider" Value="true" />
<Parameter Name="EnableTelemetry" Value="true" />
<Parameter Name="TelemetryProvider" Value="AzureLogAnalytics" />
<Parameter Name="AppInsightsInstrumentationKey" Value="" />
<Parameter Name="LogAnalyticsWorkspaceId" Value="" />
<Parameter Name="LogAnalyticsSharedKey" Value="" />
<Parameter Name="LogAnalyticsLogType" Value="ClusterObserver" />
<Parameter Name="ObserverShutdownGracePeriodInSeconds" Value="1" />
</Section>
<Section Name="ClusterObserverConfiguration">
<Parameter Name="AsyncOperationTimeoutSeconds" Value="" MustOverride="true" />
<Parameter Name="Enabled" Value="" MustOverride="true" />
<Parameter Name="EnableVerboseLogging" Value="" MustOverride="true" />
<Parameter Name="EmitHealthWarningEvaluationDetails" Value="" MustOverride="true" />
<Parameter Name="MaxTimeNodeStatusNotOk" Value="" MustOverride="true" />
<Parameter Name="RunInterval" Value="" MustOverride="true" />
<Parameter Name="MonitorRepairJobs" Value="" MustOverride="true" />
<Parameter Name="EnableOperationalTelemetry" Value="" MustOverride="true" />
</Section>
</Settings>
Example LogAnalytics Query
You should configure FabricObserver to monitor ClusterObserver, of course. 😃
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
.NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.0
  - No dependencies.
Version | Downloads | Last updated |
---|---|---|
2.3.0 | 92 | 8/7/2024 |
2.2.8 | 175 | 1/31/2024 |
2.2.6 | 199 | 10/9/2023 |
2.2.5 | 188 | 8/14/2023 |
2.2.4 | 194 | 4/27/2023 |
2.2.3 | 279 | 3/17/2023 |
2.2.2 | 259 | 3/7/2023 |
2.2.1.960 | 437 | 9/27/2022 |
2.2.1.831 | 432 | 9/27/2022 |
2.2.0.960 | 472 | 7/13/2022 |
2.2.0.831 | 463 | 7/12/2022 |
2.1.14 | 511 | 3/15/2022 |
2.1.13 | 466 | 2/9/2022 |
2.1.12 | 5,608 | 11/23/2021 |
2.1.11 | 359 | 10/11/2021 |
2.1.10 | 428 | 7/13/2021 |
2.1.9 | 407 | 5/6/2021 |
2.1.8 | 406 | 4/27/2021 |
2.1.7 | 383 | 4/16/2021 |
2.1.6 | 363 | 4/7/2021 |
2.1.5 | 352 | 3/10/2021 |
2.1.4 | 366 | 2/25/2021 |
Important Bug Fix in Telemetry config processing.
Reverted EnableTelemetry back to non-App parameter.
Updated Cluster and Application Upgrade monitoring impl.
Added support for latest FO Error/Warning Codes.
Code improvements with better error handling.