Local Outlier Factor (LOF)

· Local Outlier Factor (LOF) is an unsupervised anomaly detection method that flags data points whose local density is significantly lower than that of their neighbours.

· It compares the local density of a point to the densities of its k-nearest neighbours.

· A LOF score well above 1.0 indicates a potential outlier: the point lies in a sparser region than its neighbours.

· LOF is effective in detecting local anomalies, which might not stand out globally but deviate from nearby data.

How LOF Works (Detection Phase)

· For each data point, LOF computes:

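Its local reachability density: the inverse of the average reachability distance to its k nearest neighbours.
A score, the ratio of its neighbours' average local density to its own, representing how isolated the point is relative to its neighbourhood.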
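
The two quantities above can be sketched in plain Python. This is a minimal brute-force illustration, not the implementation used by any particular library: the function name `lof_scores` and the toy data are hypothetical, each neighbourhood is assumed to hold exactly k points (ties at the k-th distance are not expanded), and the data is assumed to contain no duplicate points.

```python
import math


def lof_scores(points, k):
    """Return the LOF score of every point (brute force, O(n^2))."""
    n = len(points)
    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours = []  # indices of the k nearest neighbours of each point
    k_distance = []  # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reachability distance to j is max(k_distance[j], dist[i][j]).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


# Toy data (illustrative only): four corners of a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = lof_scores(points, k=2)
```

On this toy data the four square corners share the same density as their neighbours, so their scores sit at 1.0, while the distant point has a much lower density than its neighbours and scores well above 1.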

· The algorithm assigns a LOF score:

~1.0 → normal
>1.5 (configurable) → likely outlier
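
As a hedged illustration of the thresholds above, the sketch below assumes scikit-learn's `LocalOutlierFactor` is the underlying implementation (the source does not state this). scikit-learn stores the negated score in `negative_outlier_factor_`, so the sign is flipped before comparing against the cutoff. The synthetic data and the variable names are made up for the example.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Synthetic data (illustrative only): one dense cluster plus one distant point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)), [[8.0, 8.0]]])

lof = LocalOutlierFactor(n_neighbors=20)
lof.fit(X)

# scikit-learn exposes the negated LOF; flip the sign to get the usual score.
scores = -lof.negative_outlier_factor_

threshold = 1.5  # configurable cutoff, as described above
is_outlier = scores > threshold
```

Points inside the cluster receive scores close to 1.0, while the distant point's score exceeds the cutoff. Raising or lowering `threshold` trades missed outliers against false alarms.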

Outlier Correction Approach

· Local Outlier Factor (LOF) is designed for detecting outliers but does not modify the data itself.

· To correct the detected outliers:

First, compute the median or mean of the values classified as normal (anomaly == 1) for each feature.
Then, replace the values identified as outliers (anomaly == -1) with the corresponding computed median or mean.
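
The two correction steps can be sketched with pandas and scikit-learn. This is one possible reading of the approach, not a definitive implementation: the column names `temp` and `pressure`, the injected outlier, and the choice of median over mean are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import LocalOutlierFactor

# Synthetic data (illustrative only) with one injected outlier in row 10.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(50.0, 5.0, size=(100, 2)), columns=["temp", "pressure"])
df.loc[10] = [500.0, -300.0]

features = ["temp", "pressure"]

# fit_predict returns 1 for normal rows and -1 for outliers.
detector = LocalOutlierFactor(n_neighbors=20, contamination="auto")
df["anomaly"] = detector.fit_predict(df[features])

# Step 1: per-feature median of the rows classified as normal.
normal_median = df.loc[df["anomaly"] == 1, features].median()

# Step 2: overwrite the outlier rows with those medians, on a copy.
corrected = df.copy()
outlier_rows = corrected["anomaly"] == -1
for col in features:
    corrected.loc[outlier_rows, col] = normal_median[col]
```

Swapping `.median()` for `.mean()` gives the mean-based variant. The median is usually the safer choice when the normal rows are skewed.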

Local Outlier Factor (LOF) Parameters

Param Name	Description	Default Value	Possible Values
N-NEIGHBORS	Number of neighbors to use for LOF calculation	20	Integer ≥ 1
CONTAMINATION	Proportion of outliers in the data	`auto`	Float in (0, 0.5] or `"auto"`
METRIC	Distance metric used	`euclidean`	`'euclidean'`, `'manhattan'`, etc.
N-JOBS	Number of parallel jobs to run	-1	Integer or `None` (`None` uses 1 core, `-1` uses all cores)
INTERPOLATION-METHOD	Method for interpolation when imputing missing values	`linear`	`'linear'`, `'nearest'`, `'spline'`, etc.
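
For orientation, the sketch below shows how these parameters might map onto library calls. It assumes, without confirmation from the source, that the first four settings are passed to scikit-learn's `LocalOutlierFactor` and that INTERPOLATION-METHOD is handed to pandas' `Series.interpolate`. The `params` dictionary is a hypothetical stand-in for however the settings are actually supplied.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import LocalOutlierFactor

# Hypothetical settings dictionary mirroring the defaults in the table.
params = {
    "N-NEIGHBORS": 20,
    "CONTAMINATION": "auto",
    "METRIC": "euclidean",
    "N-JOBS": -1,
    "INTERPOLATION-METHOD": "linear",
}

# Assumed mapping of the detection settings onto scikit-learn's estimator.
lof = LocalOutlierFactor(
    n_neighbors=params["N-NEIGHBORS"],
    contamination=params["CONTAMINATION"],
    metric=params["METRIC"],
    n_jobs=params["N-JOBS"],
)

# Assumed use of the interpolation setting: fill a missing value in a series.
series = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0])
filled = series.interpolate(method=params["INTERPOLATION-METHOD"])
```

With `linear`, the missing third value is filled midway between its neighbours. In pandas, methods such as `nearest` and `spline` additionally require SciPy to be installed.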