Context Approximation and Refinement (CAR) is Michelle Wong's latest innovation for making the analysis of Android applications for malware and vulnerability detection more tractable. Accepted to appear at AsiaCCS 2022 in June, it presents a better way of performing symbolic analysis.
Symbolic code analysis has always had to contend with the tension between precision and scalability. Precision is desirable because it yields accurate results: if code indicating malicious behavior is found, the app is more likely to be truly malicious, rather than the analysis having flagged something that turned out to be incorrect. However, highly precise symbolic analysis doesn't scale; it is slow and consumes a lot of memory and CPU. We ran our base case experiments on a machine with over 512GB of memory, and it could barely analyze any of the applications at full precision. We can improve scalability by reducing precision and making assumptions about the code (fixing some variables to constant values, for example), but those assumptions may turn out to be wrong, leading the analysis to incorrectly report malicious behavior or, even worse, to miss it.

CAR makes progress on this classic problem by approximating the "context" in which a code path is analyzed (the context being global variables and application state), and then iteratively refining the precision of that context, filling in variables and state if and only if the code being analyzed needs them. For example, it might replace an approximated object with a real one harvested from elsewhere in the code, but only if the analyzed code tries to access that object.
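To make the refine-on-demand idea concrete, here is a minimal Python sketch of the general approach, not CAR's actual implementation: a placeholder stands in for unknown application state and is swapped for a concrete object harvested elsewhere only when the analyzed path first touches it. The names `ApproxObject` and `HarvestedDevice` are hypothetical.

```python
class ApproxObject:
    """Placeholder for application state whose concrete value is unknown.

    Refinement is lazy: the harvested concrete object is only produced
    the first time the analyzed code actually accesses a field, so paths
    that never touch this state pay no precision cost.
    """

    def __init__(self, name, refine_fn):
        self._name = name
        self._refine_fn = refine_fn  # yields a harvested concrete object
        self._concrete = None        # stays None until first access

    def __getattr__(self, attr):
        # Called only for attributes not found normally, i.e. fields of
        # the approximated object. Refine on first such access.
        if self._concrete is None:
            object.__setattr__(self, "_concrete", self._refine_fn())
        return getattr(self._concrete, attr)


# Hypothetical usage: a concrete object "harvested" from elsewhere in
# the code under analysis.
class HarvestedDevice:
    imei = "000000000000000"

ctx = ApproxObject("device", lambda: HarvestedDevice())
assert ctx._concrete is None      # no refinement yet: nothing accessed
print(ctx.imei)                   # first field access triggers refinement
assert ctx._concrete is not None  # context is now concrete for this path
```

The design point this illustrates is that precision is bought per path and per object, rather than up front for the whole program.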
Our experiments show that CAR achieves better accuracy and scalability than previous dynamic tools: it finds 3.1x more interesting behaviors in Android apps than the state of the art while keeping the false positive rate to 9%. You can read the full paper, which has just been posted, here. Congratulations to Michelle on the fantastic work!