Universiteit Leiden

DAG Meeting: Text mining & automated feature detection

Thursday 12 October 2017
Van Steenis
Einsteinweg 2
2333 CC Leiden

Alex Brandsen: Text mining techniques to unlock the information in archaeological grey literature

More than 60,000 Dutch archaeological reports are available online, and this number is growing by around 4,000 a year. Currently it is only possible to search the metadata (or descriptions) of these documents, not the full text. As a result, potentially important information remains hidden. One example is 'by-catch' finds: a single find from a different time period than the rest of the site that is not mentioned in the metadata and is therefore impossible to find. Archaeological terms often have multiple meanings, and multiple terms can share one meaning, which makes searching even harder. By applying Natural Language Processing techniques it is possible to unlock the knowledge in these reports for scientific research.

Wouter Verschoof-van der Vaart: Automated feature detection in LiDAR data

Nowadays the surface of the earth is constantly monitored by a multitude of airborne and satellite sensors that record a wide variety of environmental parameters. This produces a huge influx of complex, high-quality data. To cope with this ever-growing dataset, archaeologists started developing computer-aided methods for the (semi-)automated detection of archaeological features. These handcrafted algorithms are highly specialized for specific object categories and data sources, which limits their use in other contexts and their general usability for archaeological prospection.

The research project 'Automating archaeological object detection in remotely sensed data' will explore the implementation of advanced computational methods to develop a generic, flexible and robust automated detection method for archaeological features in remotely sensed data. The proposed new technique is based on recent developments in deep learning and Convolutional Neural Networks (CNNs). A CNN is a type of computer program that learns to classify images (or objects within them) by being ‘trained’ on a large generic image set. Deep learning CNNs have gained much ground in other fields, in tasks such as image classification and face recognition. Recent case studies show that CNNs have great potential for archaeological applications.
