|
As most of you probably know, almost every type of algorithm has been applied, sooner or later, to anomaly detection. Mileage varies: sometimes the idea is good, sometimes it is plainly crazy. Host-based anomaly detection through the analysis of system call sequences has been done in almost every way you can think of, but something almost no one has tinkered with until now is how to deal with system call arguments.
Even informally, you can see that the arguments of a system call can be much more indicative of anomalous activity than the call itself. For instance, an "open" may not be suspicious per se, but a "read-write" open of the "/etc/passwd" file by a process which does not usually add users to the system may well look suspicious.
We have developed a tool which analyzes each argument of each system call, models its contents, and then compares it against a "normal" model built from previous calls. It is able to cluster system calls and thus detect "different uses" of the same syscall at different points of different programs. It then builds a Markovian model of the sequence, which is used to trace execution and flag anomalies.
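To make the idea concrete, here is a minimal sketch in Python. It is not the tool itself: the `label` function is a hypothetical stand-in for real argument clustering (it simply groups calls by name, top-level directory and access mode), the `(name, path, mode)` tuple format is an assumption, and the 0.05 threshold is arbitrary. It only illustrates the general shape of the approach: turn calls plus arguments into cluster labels, learn a first-order Markov chain over those labels, and flag transitions that were rare or never seen in training.

```python
from collections import Counter, defaultdict


def label(call):
    """Map a (name, path, mode) tuple to a coarse cluster label.

    Hypothetical stand-in for real clustering: calls with the same name,
    the same top-level directory and the same access mode share a label.
    """
    name, path, mode = call
    top = "/" + path.strip("/").split("/")[0] if path else ""
    return f"{name}:{top}:{mode}"


class MarkovModel:
    """First-order Markov chain over cluster labels."""

    def __init__(self):
        # transitions[prev][cur] = times cur followed prev in training
        self.transitions = defaultdict(Counter)

    def train(self, trace):
        labels = ["<start>"] + [label(c) for c in trace]
        for prev, cur in zip(labels, labels[1:]):
            self.transitions[prev][cur] += 1

    def probability(self, prev, cur):
        row = self.transitions.get(prev)
        if not row:
            return 0.0  # previous state never seen during training
        return row[cur] / sum(row.values())

    def anomalies(self, trace, threshold=0.05):
        """Return (index, call) pairs whose transition probability is low."""
        labels = ["<start>"] + [label(c) for c in trace]
        flagged = []
        for i, (prev, cur) in enumerate(zip(labels, labels[1:])):
            if self.probability(prev, cur) < threshold:
                flagged.append((i, trace[i]))
        return flagged


# Assumed toy traces: a process that normally only reads /etc/passwd...
normal = [
    ("open", "/etc/passwd", "r"),
    ("read", "/etc/passwd", ""),
    ("close", "/etc/passwd", ""),
]
# ...and a run where the same file is opened read-write.
suspect = [
    ("open", "/etc/passwd", "rw"),
    ("read", "/etc/passwd", ""),
    ("close", "/etc/passwd", ""),
]

model = MarkovModel()
model.train(normal)
flagged = model.anomalies(suspect)
```

In this toy run the read-write open is flagged even though "open" itself is a perfectly normal call for this process, and the call right after it is flagged too, because it follows a state the model has never seen. A model that looked only at syscall names would see nothing unusual in the second trace.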
|