Detection Talking Points
I am going to take a moment to enumerate some of the known limitations of the different methods.
- Certain methods of short-sequence system call profiling have a weakness demonstrated in Wagner et al.'s "Mimicry Attacks" paper, specifically the look-ahead pairs model used by Somayaji in pH. I would argue that practical constraints in the target system and in the exploit methods themselves make mimicry attacks very difficult. More importantly, the other methods of short-sequence analysis are not vulnerable to the mimicry class of attack.
- Neither short-sequence analysis nor system call flow graphs generally concern themselves with the actual semantics of each system call, which is part of why mimicry is possible at all. Pragmatically speaking, though, this is a very reasonable trade-off: it avoids a high degree of complexity and overhead at the cost of possible susceptibility to a difficult attack.
- Semantic constraint of system calls lacks knowledge of the program flow, so it operates only atomically, at the time of each system call. For example, network services often need to query DNS, so they are allowed to connect over UDP to port 53 on any host and to send and receive data. Without knowing where in the program DNS queries are expected, it is difficult or impossible for something like Systrace to stop injected code from using UDP on port 53 as a control channel. The program may be constrained so that it cannot exec arbitrary programs or access arbitrary files, but that does not matter if the kernel has a vulnerable system call like brk.
- Machine learning techniques always come with issues associated with learning. These issues affect most of the techniques here, but honestly there are pragmatic ways of dealing with them. It is too specific a topic to address today, but I do believe the dogma around "training" is unwarranted in general.
- "Loose" signatures, as I call them, try to broaden the detection range of static signatures by looking for base attributes associated with a class of attack. There are certain things we know just should not happen on a system under normal use, and this method tries to define signatures that match such conditions. Really, in my opinion this sounds better suited to co-stimulation than to detection, as I hope to highlight shortly.
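To make the look-ahead pairs point concrete, here is a minimal sketch of that style of profiling. It is not pH's actual implementation: the function names, the window size, and the toy traces are all assumptions chosen for illustration. The profile is a set of (call, offset, later call) triples, and a trace is scored by how many of its triples are missing from the profile.

```python
def train_profile(traces, window=3):
    """Collect every (call, offset, later_call) triple seen in normal traces."""
    profile = set()
    for trace in traces:
        for i, call in enumerate(trace):
            for k in range(1, window):
                if i + k < len(trace):
                    profile.add((call, k, trace[i + k]))
    return profile


def count_mismatches(profile, trace, window=3):
    """Count triples in the trace that never appeared during training."""
    misses = 0
    for i, call in enumerate(trace):
        for k in range(1, window):
            if i + k < len(trace) and (call, k, trace[i + k]) not in profile:
                misses += 1
    return misses


# Toy "normal" behaviour (hypothetical traces, not real program output).
normal_traces = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "write", "close"],
    ["open", "read", "read", "read", "write", "close"],
]
profile = train_profile(normal_traces)

# An obvious attack introduces pairs the profile has never seen.
attack = ["open", "read", "execve"]

# A sequence that never occurred in training, but is assembled only
# from pairs already in the profile, passes without a single mismatch.
unseen_but_accepted = ["open", "read", "read", "read", "read", "write", "close"]

print(count_mismatches(profile, attack))               # 2
print(count_mismatches(profile, unseen_but_accepted))  # 0
```

The second result is the gap a mimicry attack aims for: the model checks pairs, not whole sequences, so anything that can be built from known pairs looks normal. Building a useful attack under that restriction is the hard part.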
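The DNS example from the semantic-constraint point can be sketched the same way. This is a hypothetical per-call policy check, not Systrace's policy language or API. The rule set, the function name, and the addresses (taken from the documentation ranges) are assumptions. The point is that the decision sees only one call and its arguments, never where in the program the call came from.

```python
def policy_allows(name, args):
    """Decide on a single system call in isolation (no program-flow context)."""
    if name == "connect":
        # DNS must work, so UDP to port 53 is allowed on any host.
        return args.get("proto") == "udp" and args.get("port") == 53
    if name == "open":
        # Hypothetical rule: only files under the service's own directory.
        return args.get("path", "").startswith("/var/www/")
    # Everything else, including execve, is denied in this toy policy.
    return False


# A genuine resolver lookup and an injected control channel look identical
# to the policy: same call, same protocol, same port.
resolver_query = ("connect", {"proto": "udp", "host": "192.0.2.53", "port": 53})
injected_channel = ("connect", {"proto": "udp", "host": "203.0.113.66", "port": 53})
spawn_shell = ("execve", {"path": "/bin/sh"})

for name, args in (resolver_query, injected_channel, spawn_shell):
    print(name, args.get("host", args.get("path")), policy_allows(name, args))
```

The shell is blocked, but both connect calls are allowed, because nothing in the arguments distinguishes the resolver from injected code.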