Universiteit Leiden


Dissertation

Data science for tax administration

This PhD thesis describes several new and existing data science applications, with a particular focus on tax administrations.

Author
Pijnenburg, M.G.F.
Date
24 June 2020
Links
Thesis in Leiden Repository

This PhD thesis describes several new and existing data science applications, with a particular focus on tax administrations. One chapter covers the managerial side of analytics and gives a balanced overview of the pros and cons of applying analytics within taxpayer supervision. A second topic is (tax) fraud detection with unsupervised anomaly detection techniques: a new type of outlier, the singular outlier, is described, and an algorithm is provided for finding such outliers. A third topic is improving risk selection models. Most current algorithms are noted to handle interactions between categorical variables with many levels poorly. An extension of logistic regression that uses Factorization Machines is provided, which resulted in a ten percent improvement in precision. A fourth topic is statistical testing of whether similar cases receive similar treatment; the contribution here is an algorithm that performs this test on the basis of process logs. The thesis also contains a benchmark study of different anomaly detection algorithms. Finally, HR Analytics, Reinforcement Learning, and applications of fuzzy sets are briefly described.
