Anomaly / outlier detection using isolation forest in Scala


What’s the goal?

That sounds simple?

Multiple dimensions

Real-time detection

Isolation forest algorithm

Point anomaly detection?

Isolation forests algorithm

Intuition behind isolation forests?

Kafka ingestion



Building a tree

/** ADT for the Isolation tree */
sealed trait ITree[+A]

case class ExternalNode[A](value: A, featureName: String, featureIndex: Int) extends ITree[A]

case class InternalNode[A](
    splitValue: Double,
    featureName: String,
    featureIndex: Int,
    left: ITree[A],
    right: ITree[A]
) extends ITree[A]
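
To make the ADT concrete, here is a minimal sketch of how a tree of this shape could be grown: pick a random feature, pick a random split value between that feature's minimum and maximum, and recurse on both halves. The names `IsolationTreeSketch`, `Record`, `build` and `leafTotal` are illustrative, and the choice to store the number of records as the leaf payload `A` is an assumption, not something the ADT dictates.

```scala
import scala.util.Random

object IsolationTreeSketch {
  // The ADT from above, repeated so this sketch compiles on its own.
  sealed trait ITree[+A]
  case class ExternalNode[A](value: A, featureName: String, featureIndex: Int) extends ITree[A]
  case class InternalNode[A](splitValue: Double, featureName: String, featureIndex: Int,
                             left: ITree[A], right: ITree[A]) extends ITree[A]

  type Record = Vector[Double]

  /** Grows one isolation tree by splitting on a random feature at a random value.
    * Assumption: the leaf payload is the number of records that reached the leaf.
    * `featureNames` must be non-empty and match the width of each record. */
  def build(data: Vector[Record], featureNames: Vector[String],
            maxDepth: Int, rng: Random, depth: Int = 0): ITree[Int] = {
    val idx  = rng.nextInt(featureNames.size)
    val name = featureNames(idx)
    if (data.size <= 1 || depth >= maxDepth) ExternalNode(data.size, name, idx)
    else {
      val column   = data.map(_(idx))
      val (lo, hi) = (column.min, column.max)
      if (lo == hi) ExternalNode(data.size, name, idx) // nothing to split on this feature
      else {
        val split = lo + rng.nextDouble() * (hi - lo)  // uniform in [lo, hi)
        val (left, right) = data.partition(_(idx) < split)
        InternalNode(split, name, idx,
          build(left, featureNames, maxDepth, rng, depth + 1),
          build(right, featureNames, maxDepth, rng, depth + 1))
      }
    }
  }

  /** Sums the leaf payloads; every input record should land in exactly one leaf. */
  def leafTotal(tree: ITree[Int]): Int = tree match {
    case ExternalNode(n, _, _)       => n
    case InternalNode(_, _, _, l, r) => leafTotal(l) + leafTotal(r)
  }
}
```

Stopping at `maxDepth` is what keeps the trees shallow: anomalies tend to be isolated after only a few splits, so there is little to gain from growing a tree all the way down.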

Checking the depth of a record in our tree structure
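
A hedged sketch of the depth check: walk from the root, going left when the record's feature value is below the split and right otherwise, and count the edges until a leaf is reached. Following the original isolation forest paper, a leaf that still holds several records adds the correction term c(size). The object and function names are illustrative, and the `Int` leaf payload (number of records in the leaf) is an assumption.

```scala
object PathLengthSketch {
  // The ADT from above, repeated so this sketch compiles on its own.
  sealed trait ITree[+A]
  case class ExternalNode[A](value: A, featureName: String, featureIndex: Int) extends ITree[A]
  case class InternalNode[A](splitValue: Double, featureName: String, featureIndex: Int,
                             left: ITree[A], right: ITree[A]) extends ITree[A]

  private val EulerGamma = 0.5772156649

  /** c(n): average path length of an unsuccessful binary search tree lookup over n records. */
  def c(n: Int): Double =
    if (n <= 1) 0.0
    else if (n == 2) 1.0
    else 2.0 * (math.log(n - 1.0) + EulerGamma) - 2.0 * (n - 1.0) / n

  /** Number of edges from the root to the leaf that `record` falls into,
    * plus c(leaf size) for leaves that were not split any further. */
  @annotation.tailrec
  def pathLength(record: Vector[Double], tree: ITree[Int], depth: Int = 0): Double =
    tree match {
      case ExternalNode(size, _, _) => depth + c(size)
      case InternalNode(split, _, idx, left, right) =>
        if (record(idx) < split) pathLength(record, left, depth + 1)
        else pathLength(record, right, depth + 1)
    }
}
```

The comparison `record(idx) < split` has to match whichever rule was used to partition the data when the tree was built; otherwise records end up in the wrong leaf.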

Building the forest
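
A forest is a collection of trees, each grown from its own small random subsample of the data. The sketch below shows only that sampling step and is deliberately generic: `buildTree` stands in for whatever tree builder is in use, and `buildForest` and `maxDepthFor` are illustrative names. The depth limit of ceil(log2(sampleSize)) comes from the original isolation forest paper, which also suggests around 100 trees and a subsample of 256 records as defaults.

```scala
import scala.util.Random

object IsolationForestSketch {
  type Record = Vector[Double]

  /** Builds `numTrees` trees, each from its own random subsample drawn without replacement.
    * `buildTree` is a placeholder for the actual tree-building function. */
  def buildForest[T](data: Vector[Record], numTrees: Int, sampleSize: Int, rng: Random)
                    (buildTree: Vector[Record] => T): Vector[T] =
    Vector.fill(numTrees) {
      val sample = rng.shuffle(data).take(sampleSize)
      buildTree(sample)
    }

  /** Depth limit ceil(log2(sampleSize)), computed with integer arithmetic. */
  def maxDepthFor(sampleSize: Int): Int = {
    require(sampleSize >= 1)
    32 - Integer.numberOfLeadingZeros(sampleSize - 1)
  }
}
```

Because the trees are independent of each other, this step parallelises naturally if building the forest becomes a bottleneck.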

Calculating the anomaly score
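
As a sketch of the scoring step, the formula from the original isolation forest paper is s(x, n) = 2^(-E[h(x)] / c(n)), where E[h(x)] is the record's average path length across all trees and c(n) normalises by the average path length expected for a subsample of n records. Scores close to 1 point to anomalies, while scores well below 0.5 point to normal records. The object and function names below are illustrative.

```scala
object AnomalyScoreSketch {
  private val EulerGamma = 0.5772156649

  /** c(n): average path length of an unsuccessful binary search tree lookup over n records. */
  def c(n: Int): Double =
    if (n <= 1) 0.0
    else if (n == 2) 1.0
    else 2.0 * (math.log(n - 1.0) + EulerGamma) - 2.0 * (n - 1.0) / n

  /** s(x, n) = 2^(-E[h(x)] / c(n)); `pathLengths` holds h(x) from every tree in the forest. */
  def score(pathLengths: Seq[Double], sampleSize: Int): Double = {
    require(pathLengths.nonEmpty && sampleSize > 1)
    val avg = pathLengths.sum / pathLengths.size
    math.pow(2.0, -avg / c(sampleSize))
  }
}
```

A record whose average path length equals c(n) scores exactly 0.5, which is why 0.5 is the usual dividing line; the threshold actually used for alerting is a tuning decision.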

Reading a Kafka topic


End note


