Anomaly / outlier detection using isolation forest in Scala


What’s the goal?

That sounds simple?

Multiple dimensions

Real-time detection

Isolation forest algorithm

Point anomaly detection?

Isolation forests algorithm

Intuition behind isolation forests?

Kafka ingestion



Building a tree

/** ADT for the Isolation tree */
sealed trait ITree[+A]
case class ExternalNode[A](value: A, featureName: String, featureIndex: Int) extends ITree[A]
case class InternalNode[A](splitValue: Double, featureName: String, featureIndex: Int, left: ITree[A], right: ITree[A]) extends ITree[A]
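
As a rough illustration of how a tree of this shape could be grown, here is a minimal sketch. It makes several assumptions that the ADT itself does not confirm: records are `Array[Double]`, the leaf's `value` stores the number of records that ended up in it, a leaf repeats the feature of its parent's split, and values below `splitValue` go left. The names `BuildTreeSketch`, `build`, `maxDepth` and `parentFeature` are hypothetical. As a simplification, a randomly picked feature that is constant on the current subset ends the branch instead of triggering another pick. The ADT is repeated so the snippet compiles on its own.

```scala
import scala.util.Random

object BuildTreeSketch {
  sealed trait ITree[+A]
  case class ExternalNode[A](value: A, featureName: String, featureIndex: Int) extends ITree[A]
  case class InternalNode[A](splitValue: Double, featureName: String, featureIndex: Int,
                             left: ITree[A], right: ITree[A]) extends ITree[A]

  /** Recursively splits `data` on a random feature at a random value.
    * Assumption: the leaf `value` holds the number of records in that leaf.
    */
  def build(data: Vector[Array[Double]], names: Vector[String], maxDepth: Int, rng: Random,
            depth: Int = 0, parentFeature: Int = 0): ITree[Int] = {
    // Assumption: a leaf carries the feature its parent split on
    def leaf: ITree[Int] = ExternalNode(data.size, names(parentFeature), parentFeature)

    if (depth >= maxDepth || data.size <= 1) leaf
    else {
      val idx = rng.nextInt(names.size)            // random feature
      val column = data.map(_(idx))
      val (lo, hi) = (column.min, column.max)
      if (lo == hi) leaf                           // simplification: constant feature ends the branch
      else {
        val split = lo + rng.nextDouble() * (hi - lo) // random split value in [lo, hi)
        val (l, r) = data.partition(_(idx) < split)
        InternalNode(split, names(idx), idx,
          build(l, names, maxDepth, rng, depth + 1, idx),
          build(r, names, maxDepth, rng, depth + 1, idx))
      }
    }
  }
}
```

The original isolation forest paper caps the tree height at roughly the base-2 logarithm of the sample size, which is what `maxDepth` is meant to carry here.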

Checking the depth of a record in our tree structure
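
A minimal sketch of such a depth check over the ADT from the previous section. It assumes records are `Array[Double]` indexed by `featureIndex` and that values below `splitValue` go to the left subtree; neither is stated by the ADT. `PathLengthSketch` and `pathLength` are hypothetical names, and the ADT is repeated so the snippet compiles on its own. The sketch counts edges only; the original paper additionally adds an adjustment based on the number of training records in the leaf, which is left out here.

```scala
object PathLengthSketch {
  sealed trait ITree[+A]
  case class ExternalNode[A](value: A, featureName: String, featureIndex: Int) extends ITree[A]
  case class InternalNode[A](splitValue: Double, featureName: String, featureIndex: Int,
                             left: ITree[A], right: ITree[A]) extends ITree[A]

  /** Number of edges walked from the root until `record` reaches a leaf.
    * ITree is covariant, so any ITree[A] can be passed as ITree[Any].
    */
  @annotation.tailrec
  def pathLength(record: Array[Double], tree: ITree[Any], depth: Int = 0): Int = tree match {
    case _: ExternalNode[_] => depth
    case InternalNode(split, _, idx, left, right) =>
      // Assumption: values strictly below the split go left
      if (record(idx) < split) pathLength(record, left, depth + 1)
      else pathLength(record, right, depth + 1)
  }
}
```

Records that are easy to isolate end up in shallow leaves, so a small result hints at an anomaly.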

Building the forest
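
One way a forest could be assembled is sketched below: each tree is trained on its own random subsample drawn without replacement. The names `ForestSketch` and `buildForest` are hypothetical, and the tree builder is passed in as a function so the sketch does not depend on any particular tree implementation. The original paper suggests 100 trees and a subsample size of 256 as reasonable defaults.

```scala
import scala.util.Random

object ForestSketch {
  /** Builds `numTrees` trees, each from a fresh random subsample of `data`.
    * `buildTree` is whatever function turns a subsample into a tree (an assumption here).
    */
  def buildForest[T](data: Vector[Array[Double]], numTrees: Int, sampleSize: Int, rng: Random)
                    (buildTree: Vector[Array[Double]] => T): Vector[T] =
    Vector.fill(numTrees) {
      // Shuffle then take: sampling without replacement, capped at data.size
      val sample = rng.shuffle(data).take(sampleSize)
      buildTree(sample)
    }
}
```

Keeping the subsample small is part of the technique rather than only a performance shortcut: with fewer points, anomalies are less likely to be masked by dense clusters around them.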

Calculating the anomaly score
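
The score defined in the original isolation forest paper is s(x, n) = 2^(-E(h(x)) / c(n)), where E(h(x)) is the average path length of record x over all trees and c(n) is the average path length of an unsuccessful binary search tree lookup over n points. The sketch below implements that formula; whether the code in this article matched it exactly is not known, and `AnomalyScoreSketch`, `averagePathLength` and `score` are hypothetical names.

```scala
object AnomalyScoreSketch {
  private val EulerGamma = 0.5772156649

  /** c(n) = 2 H(n - 1) - 2 (n - 1) / n, with H(i) approximated by ln(i) + Euler's constant. */
  def averagePathLength(n: Int): Double =
    if (n <= 1) 0.0
    else if (n == 2) 1.0
    else 2.0 * (math.log(n - 1.0) + EulerGamma) - 2.0 * (n - 1.0) / n

  /** Anomaly score in (0, 1): close to 1 means likely anomaly, well below 0.5 means normal.
    * `pathLengths` holds the depth of one record in every tree of the forest.
    */
  def score(pathLengths: Seq[Double], sampleSize: Int): Double = {
    require(pathLengths.nonEmpty, "need at least one path length")
    require(sampleSize > 1, "sample size must be greater than 1")
    val avg = pathLengths.sum / pathLengths.size
    math.pow(2.0, -avg / averagePathLength(sampleSize))
  }
}
```

A record whose average depth equals `averagePathLength(sampleSize)` scores exactly 0.5, and shorter paths push the score towards 1.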

Reading a Kafka topic


End note


father, husband and data engineer / ML engineer @ Continuum Consulting
