SpartaNews

Local explanation of Machine Learning models with shapkit, a Python module that approximates Shapley Values

4th Oct 2021

Nowadays, Machine Learning models are used for various applications, with already successful or promising results. Unfortunately, a common criticism is the lack of transparency of the decisions made by these algorithms. This is mainly due to a greater interest in performance (measurable on specific tasks) at the expense of a complete understanding of the model, which results in a limited knowledge of the internal workings of the algorithm by both the developer and the end user. The most obvious consequences are, firstly, the difficulty for an expert to correct the algorithm (different assumptions, removing outliers, adding new variables or more diverse samples) and, secondly, a limited adoption by operational staff. Hence the urgent need for explainable Artificial Intelligence (AI).

There is no single definition of interpretability or explainability of a model prediction, and therefore several ways to proceed. Assessing them objectively is a real problem because there are no unanimous criteria. Most studies analyze the feedback from a panel of individuals, expert or not, to demonstrate the contribution of a method in terms of understanding. Methods are rarely compared directly with each other, but rather against the absence of any interpretation of the algorithm's decisions. However, some references try to create quantitative indicators to evaluate the complexity of a model. Methods can nevertheless be separated along two dimensions: whether the method is local or global, and whether its approach is model-agnostic or, on the contrary, inherent to the model. A global method aims at explaining the general behaviour of a model, whereas a local method focuses on each individual decision. The agnostic category (also called post-hoc explanation) considers the model as a black box. On the other hand, inherent or non-agnostic methods can modify the structure of a model or the learning process to create intrinsically transparent algorithms.

Shapley values offer a solution for local explanation, belonging to the class of additive feature importance measures and ensuring desirable theoretical properties. A prediction can be explained by assuming that each feature value of the instance is a "player" in a game where the prediction is the payout. The objective is to fairly distribute the payout among all features to obtain the prediction. A major challenge for Shapley values is the overall computational cost, which grows exponentially with the number of features.
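To make this cost explicit, here is the standard game-theoretic definition (not specific to shapkit): the Shapley value of feature j is the weighted average of its marginal contributions over all coalitions S of the remaining features,

\phi_j = \sum_{S \subseteq \{1,\dots,p\} \setminus \{j\}} \frac{|S|!\,(p - |S| - 1)!}{p!} \, \big[ v(S \cup \{j\}) - v(S) \big]

where p is the number of features and v(S) denotes the prediction obtained when only the features in S take the values of the instance to explain, the others being replaced by reference values. The sum runs over 2^{p-1} coalitions, hence the exponential cost that motivates approximation schemes such as Monte Carlo sampling.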

During the project, we implemented shapkit, a module dedicated to the computation of Shapley Values in the context of local machine learning explanation (a sketch of the underlying sampling idea is given after the list below). The method used is:

  • Agnostic: no particular information on the model is needed; it works with black-box algorithms. 
  • Local: the explanation is computed at instance level. 
  • More suitable for tabular data with meaningful features.
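To illustrate the underlying idea, below is a minimal, self-contained Monte Carlo sketch of the permutation-sampling estimator that this kind of approximation relies on. It is not the shapkit API itself: the function name monte_carlo_shapley, the black-box callable f and the single reference instance ref are assumptions made for the example.

import numpy as np

def monte_carlo_shapley(f, x, ref, n_iter=1000, seed=0):
    # f: black-box scoring function taking a 2-D array and returning one score per row
    # x: 1-D array, instance to explain; ref: 1-D array, reference instance
    rng = np.random.default_rng(seed)
    p = len(x)
    phi = np.zeros(p)
    for _ in range(n_iter):
        # Start from the reference and switch features to x following a random order;
        # each score difference along the way is one marginal contribution.
        order = rng.permutation(p)
        hybrid = np.asarray(ref, dtype=float).copy()
        prev = f(hybrid.reshape(1, -1))[0]
        for j in order:
            hybrid[j] = x[j]
            new = f(hybrid.reshape(1, -1))[0]
            phi[j] += new - prev
            prev = new
    return phi / n_iter

By construction, the estimated contributions sum to f(x) - f(ref) for every sampled permutation, which preserves the additivity (efficiency) property while replacing the exponential sum by a controllable number of model calls.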

Moreover, we have implemented a dashboard for the local explanation of machine learning models, based on Dash, that uses shapkit. For instance, it can be used for network intrusion detection, which protects a computer network from unauthorized users, possibly including insiders. The objective is to build a predictive model capable of distinguishing between illegitimate (intrusions) and legitimate connections, and to be able to understand both the predictions made and the global behaviour of the model. This second task is important for the operational staff who use the model, so that they can detect false positives and effectively categorize the attacks. The common features are the basic features of individual TCP connections (e.g. duration of the connection in seconds, type of protocol, etc.), some content features within a connection suggested by domain knowledge (e.g. number of failed login attempts, number of "root" accesses, etc.) and traffic features computed using a two-second time window (e.g. number of connections to the same host as the current connection in the past two seconds, etc.).

We focus on DoS attacks, without distinguishing between the different categories of DoS attacks. A denial-of-service attack is a cyber-attack in which the attacker seeks to make a machine or network resource unavailable to its intended users by temporarily or indefinitely disrupting the services of a host connected to the Internet. A ML model is trained to distinguish between normal connections and DoS attacks. This model performs perfectly on a testing set (AUC and accuracy equal to 1). Then, we use Shapley Values with two objectives: understand which elements lead to an alert, and use these elements to try to refine the characterization of the attack undergone.

We apply the component to an instance that has been predicted as a DoS attack by the model. For this instance, the protocol used is ICMP. The feature src_bytes is the number of data bytes from source to destination; for this instance, this value is high. count is the number of connections to the same host as the current connection in the past two seconds; again, this value is high, as is srv_count, the number of connections to the same service as the current connection in the past two seconds. We select as reference sub-population 1000 random instances that have been predicted as normal by the ML model. On the plot, we represent only the ten features with the largest absolute Shapley Values. The protocol used, ICMP, and the number of data bytes from source to destination, which is high, contribute most to the high score. These elements are characteristic of a smurf attack, a distributed DoS attack in which large numbers of ICMP packets with the intended victim's spoofed source IP are broadcast to a computer network using an IP broadcast address.
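As an illustration of this workflow, here is a hedged sketch under assumptions: a preprocessed KDD-style table kdd_dos_vs_normal.csv with a binary label column (1 = DoS, 0 = normal), a RandomForestClassifier standing in for the actual model, and the monte_carlo_shapley function sketched earlier; the file name, exact columns and hyper-parameters are hypothetical.

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical preprocessed KDD-style table: numeric columns such as src_bytes,
# count, srv_count and one-hot encoded protocol_type, plus a binary label.
data = pd.read_csv("kdd_dos_vs_normal.csv")
y = data.pop("label").values
features = data.columns
X = data.values

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
score = lambda a: clf.predict_proba(a)[:, 1]  # probability of the DoS class

# One connection flagged as DoS and a reference sub-population of 1000
# connections predicted as normal by the model.
preds = clf.predict(X)
x_alert = X[preds == 1][0]
rng = np.random.default_rng(0)
normals = X[preds == 0]
refs = normals[rng.choice(len(normals), size=1000, replace=False)]

# Average the Monte Carlo estimate over a subset of references
# (monte_carlo_shapley is the sketch shown earlier).
phi = np.mean(
    [monte_carlo_shapley(score, x_alert, r, n_iter=50) for r in refs[:100]],
    axis=0,
)

# Ten features with the largest absolute Shapley Values.
top10 = pd.Series(phi, index=features).sort_values(key=np.abs, ascending=False).head(10)
print(top10)

Averaging over several references, rather than relying on a single baseline, reduces the dependence of the explanation on the choice of reference instance.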