Role Mining with Probabilistic Models
Mario Frank, Joachim M. Buhmann, David Basin

Role mining pursues the goal of finding a role-based access control (RBAC) configuration 
that is extracted from the assignment of users to access permissions given by an 
access-control matrix. Most approaches to role mining construct a large set of 
candidate roles and then use a greedy selection strategy to iteratively pick a small 
subset such that the differences between the resulting RBAC configuration and the 
access-control matrix are minimized. In this paper, we advocate an alternative approach 
that recasts role mining as an inference problem rather than a lossy compression 
problem. Instead of using combinatorial algorithms to minimize the number of roles 
needed to represent the access-control matrix, we derive probabilistic models to learn 
the RBAC configuration that most likely underlies the given matrix.
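
To make the objects in this problem concrete, the following minimal Python sketch shows how an RBAC configuration (a user-role assignment together with a role-permission assignment) implies a user-permission matrix, and how its differences from a given access-control matrix can be counted. The function names and the toy data are hypothetical and chosen only for illustration; the sketch shows the representation that role mining works with, not the probabilistic models proposed in this paper.

```python
def boolean_product(user_roles, role_perms):
    """Return the user-permission matrix implied by an RBAC configuration.

    A user holds a permission if at least one of the user's roles grants it.
    """
    n_roles = len(role_perms)
    n_perms = len(role_perms[0]) if role_perms else 0
    return [
        [
            int(any(ur[k] and role_perms[k][j] for k in range(n_roles)))
            for j in range(n_perms)
        ]
        for ur in user_roles
    ]


def deviations(matrix, reconstructed):
    """Count entries where the RBAC configuration disagrees with the matrix."""
    return sum(
        a != b
        for row_a, row_b in zip(matrix, reconstructed)
        for a, b in zip(row_a, row_b)
    )


# Hypothetical toy data: 3 users, 4 permissions, 2 roles.
access_matrix = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
user_roles = [  # rows: users, columns: roles
    [1, 0],
    [1, 1],
    [0, 1],
]
role_perms = [  # rows: roles, columns: permissions
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

reconstructed = boolean_product(user_roles, role_perms)
mismatch = deviations(access_matrix, reconstructed)
```

In this toy example the two roles reproduce the matrix except for a single entry: the third user receives the fourth permission through the second role although the access-control matrix does not grant it. A compression-oriented method would try to drive this count down, whereas an inference-oriented method may treat such an entry as a possible exception to the underlying roles.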

Our models are generative in that they reflect the way that permissions are assigned to 
users in a given RBAC configuration. We additionally model how user-permission 
assignments that conflict with an RBAC configuration emerge, and we investigate the 
influence of constraints on role hierarchies and on the number of assignments. In 
experiments with access-control matrices from real-world enterprises, we compare our 
proposed models with other role mining methods. Our results show that our probabilistic 
models infer roles that generalize well to new system users for a broad variety of data, 
while other models' generalization abilities depend on the particular dataset given.