Anomaly / outlier detection using isolation forest in Scala

Introduction

What’s the goal?

That sounds simple?

Multiple dimensions

Real-time detection

Isolation forest algorithm

Point anomaly detection?

Isolation forests algorithm

Intuition behind isolation forests?

Kafka ingestion

Code

Configuration

Building a tree

/** ADT for the Isolation tree */
sealed trait ITree[+A]
case class ExternalNode[A](value: A, featureName: String, featureIndex: Int) extends ITree[A]
case class InternalNode[A](splitValue: Double, featureName: String, featureIndex: Int, left: ITree[A], right: ITree[A]) extends ITree[A]

Checking the depth of a record in our tree structure
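
As a rough illustration of this step, here is a minimal sketch of how the depth of a record could be computed against the ITree ADT shown earlier. It is not necessarily the article's own implementation. The name `depth`, the wrapper object `DepthSketch`, the choice of `Vector[Double]` as the record type, and the rule that values strictly below `splitValue` go to the left branch are all assumptions made for the example.

```scala
object DepthSketch {
  // Copy of the isolation tree ADT so that this sketch compiles on its own.
  sealed trait ITree[+A]
  case class ExternalNode[A](value: A, featureName: String, featureIndex: Int) extends ITree[A]
  case class InternalNode[A](splitValue: Double, featureName: String, featureIndex: Int,
                             left: ITree[A], right: ITree[A]) extends ITree[A]

  /** Counts the edges walked from the root until the record lands in a leaf.
    * Assumption: a record is a Vector[Double] indexed by featureIndex, and
    * values strictly below splitValue follow the left branch.
    */
  @annotation.tailrec
  def depth[A](record: Vector[Double], tree: ITree[A], acc: Int = 0): Int = tree match {
    case ExternalNode(_, _, _) => acc
    case InternalNode(split, _, idx, left, right) =>
      if (record(idx) < split) depth(record, left, acc + 1)
      else depth(record, right, acc + 1)
  }
}
```

A record that is isolated after only a few splits ends up with a small depth, which is the signal the anomaly score builds on.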

Building the forest

Calculating the anomaly score
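
For reference, the standard isolation forest score is s(x, n) = 2^(-E[h(x)] / c(n)), where E[h(x)] is the average path length of record x over all trees and c(n) is the average path length of an unsuccessful binary search tree lookup over n points. The sketch below shows one way this could be written in Scala. The names `ScoreSketch`, `c`, and `anomalyScore` are hypothetical, and the harmonic number is approximated as ln(i) plus the Euler-Mascheroni constant; the article's own code may differ.

```scala
object ScoreSketch {
  // Euler-Mascheroni constant, used to approximate the harmonic number H(i) ~ ln(i) + gamma.
  private val EulerGamma = 0.5772156649

  /** c(n): average path length of an unsuccessful BST search over n points. */
  def c(n: Int): Double =
    if (n > 2) 2.0 * (math.log(n - 1.0) + EulerGamma) - 2.0 * (n - 1.0) / n
    else if (n == 2) 1.0
    else 0.0

  /** s(x, n) = 2^(-E[h(x)] / c(n)).
    * pathLengths holds one path length per tree for a single record;
    * sampleSize is the number of records each tree was built from (assumed > 1).
    */
  def anomalyScore(pathLengths: Seq[Double], sampleSize: Int): Double = {
    require(pathLengths.nonEmpty, "need at least one path length")
    require(sampleSize > 1, "sample size must be greater than 1")
    val meanPathLength = pathLengths.sum / pathLengths.size
    math.pow(2.0, -meanPathLength / c(sampleSize))
  }
}
```

Scores close to 1 point to likely anomalies, while scores well below 0.5 suggest normal records. A record whose average path length equals c(n) scores exactly 0.5.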

Reading a Kafka topic

Conclusion

End note

References

father, husband and data engineer / ml engineer@continuum consulting
