I am working as a Senior Data Scientist at AIT's Digital Insight Lab. My research interest lies in finding and applying quantitative methods for gaining new insights from large-scale, connected datasets. I often act as a bridge researcher between fields and contribute practical methods and tools drawn from machine learning, network analytics, and text mining. My current research topics are:
- Cryptocurrency Analytics: I am leading the GraphSense project and spend most of my time investigating and developing new methods and tools for analyzing the structure and dynamics of cryptocurrency ecosystems, such as Bitcoin.
- Predictive Maintenance: in cooperation with industry partners, I am working on algorithms for predicting machine outages in order to lower maintenance costs of manufacturing or production plants.
- Knowledge Graph Engineering: I work on developing novel text mining and machine learning methods for constructing (enterprise) knowledge graphs from large document collections.
Recent Publications (see all ...)
Paquet-Clouston, Masarah and Haslhofer, Bernhard and Dupont, Benoit: Ransomware Payments in the Bitcoin Ecosystem. 17th Annual Workshop on the Economics of Information Security (WEIS), 2018.
Haslhofer, Bernhard and Isaac, Antoine and Simon, Rainer: Knowledge Graphs in the Libraries and Digital Humanities Domain. Encyclopedia of Big Data Technologies, 2018.
Rörden, Jan and Revenko, Artem and Haslhofer, Bernhard and Blumauer, Andreas: Network-based Knowledge Graph Assessment. SEMANTICS 2017, Amsterdam, 2017.
Filtz, Erwin and Polleres, Axel and Karl, Roman and Haslhofer, Bernhard: Evolution of the Bitcoin Address Graph - An Exploratory Longitudinal Study. International Data Science Conference (DSC 2017), Salzburg, Austria, 2017.
Haslhofer, Bernhard and Karl, Roman and Filtz, Erwin: O Bitcoin Where Art Thou? Insight into Large-Scale Transaction Graphs. In: SEMANTICS 2016, Leipzig, Germany, 2016.
Haslhofer, Bernhard and Sanderson, Robert and Simon, Rainer and van de Sompel, Herbert: Open Annotations on multimedia Web resources. In: Multimedia Tools and Applications 70(2), pp. 847-867, Springer US, 2014.
Momeni, Elaheh and Haslhofer, Bernhard and Tao, Ke and Houben, Geert-Jan: Sifting useful comments from Flickr Commons and YouTube. In: International Journal on Digital Libraries 1-19, 2014.
Open Source Software Contributions
GraphSense: A Scalable Cryptocurrency Analytics Platform built on Apache Spark and Cassandra
Wikigrouth: A Python tool for extracting entity mentions from a collection of Wikipedia documents.
qSKOS: A command line tool and API for finding quality issues in SKOS vocabularies.
DSNotify: A generic change detection framework for Linked Data sources that informs data-consuming actors about the various types of events (create, remove, move, update) that can occur in data sources.
Open Data Publishing Contributions
Ransomware Payments in the Bitcoin Ecosystem: This dataset contains 7,222 Bitcoin seed addresses related to 67 ransomware families as well as addresses that were identified by applying the expansion procedure described in our paper.
Grants and Third-Party Funded Projects
TRAVELOGUES (04/2018-03/2018), FWF DACH, Co-Principal Investigator. This interdisciplinary international (DACH FWF-DFG) digital humanities project aims at gaining insight into the perception of the Other (focusing on the Orient) by analyzing an extensive collection of German-language travelogues covering the period from 1500 until 1875. It will bring together a team of researchers from history, computer science, as well as library and information science from Austria and Germany. They will jointly develop a novel mixed qualitative and quantitative method for the serial analysis of large-scale text corpora and apply that method to a comprehensive corpus of travelogues originally published in the German language (ca. 3,000 - 3,500 books) and drawn from the Austrian Books Online (ABO) project (ca. 600,000 books) of the Austrian National Library.
VIRTCRIME (01/2018-12/2019), FFG KIRAS, Principal Investigator. The goal of the VIRTCRIME project lies in the development of novel algorithms and methods for tracing criminal transactions in post-Bitcoin-era cryptocurrencies, while considering illegitimate activities in darknet marketplaces. Orthogonally, the project will provide novel criminological procedures and law enforcement approaches, and investigate legal pre-conditions and consequences.
TITANIUM (05/2017-05/2020), EU Horizon 2020, Senior Scientist. TITANIUM will research, develop, and validate novel data-driven techniques and solutions designed to support Law Enforcement Agencies (LEAs) charged with investigating criminal or terrorist activities involving virtual currencies and/or underground markets in the darknet.
GraphSense (09/2015-11/2017), FFG - IKT der Zukunft, Project Lead, Principal Investigator. The goal of the GraphSense project is to research and develop novel algorithmic solutions for detecting anomalies in large-scale, dynamically changing graph datasets. The focus will be on developing anomaly detection techniques for transaction networks constructed from virtual currencies (Bitcoin) and investigating their applicability for enterprise financial fraud detection settings.
BITCRIME (10/2014-09/2016), Bilateral: BMBF (DE) + BMVIT (AT), Scientist, WP Lead. Research and develop methods to prevent and prosecute organised crime in virtual currencies. The project also investigates novel Anti-Money Laundering (AML) strategies taking into account the pseudo-anonymity of Bitcoin users.
ResourceSync (12/2011-04/2014), Alfred P. Sloan Foundation, Researcher. Research, develop, prototype, test, and deploy mechanisms for the large-scale synchronization of web resources. Building on the OAI-PMH strategies for synchronizing metadata, this project will enhance that specification using modern web technologies and will allow for the synchronization of the objects themselves, not just their metadata.
SciLink (03/2011-02/2014), EU PEOPLE IOF (Marie Curie), Research Fellow (beneficiary). Research on (i) interactive link discovery in scholarly publications, (ii) strategies for maintaining link integrity, and (iii) novel Web-based resource aggregation and presentation interfaces for scholarly publication workflows.
Maphub (12/2011-02/2013), Andrew W. Mellon Foundation, Principal Investigator. Examine the application of the Open Annotation Specification in the context of digitized historical maps. Design and build a collaborative Web environment in which scholars and citizens can contribute their knowledge to digitized high-resolution online maps. We experimented with designs that integrate the annotation process with the re-use of data from public data sources, such as Wikipedia.
MEKETRE (07/2009-12/2012), Austrian Research Fund (FWF), Proposal Co-author. An interdisciplinary project with the Institute for Egyptology at the University of Vienna. It aimed at building a collaborative Web-based solution for efficiently organizing the collected and digitized content objects from the Egyptian Middle Kingdom period by means of open, collaboratively developed vocabularies.
ResourceSync Framework Specification (co-editor): describes a synchronization framework for the web consisting of various capabilities that allow third party systems to remain synchronized with a server's evolving resources.
Open Annotation Data Model (contributor): specifies an interoperable framework for creating associations between related resources, annotations, using a methodology that conforms to the Architecture of the World Wide Web. Open Annotations can easily be shared between platforms, with sufficient richness of expression to satisfy complex requirements while remaining simple enough to also allow for the most common use cases, such as attaching a piece of text to a single web resource.
Semantic Knowledge Representation and Linked Data, (2017, University of Applied Sciences - FH Technikum Wien), Instructor: The goal of this course is to introduce the design principles and technologies for building global information, data, and financial networks, show the practical applications of such systems, and discuss their design and their social and policy context.
Application Development in Media Informatics, (2015-2017, University of Vienna), Instructor: An undergraduate course involving the development of an application related to media informatics.
Technology Applications, (Spring 2014, University of Salzburg), Instructor: A masters-level course introducing technologies for building data-centric Web information systems in the library domain. Discussion of cross-cutting issues such as Linked (Open) Data.
INFO/CS 4302 - Web Information Systems, (2011-2012, Cornell University), Instructor: This course introduces technologies for building data-centric information systems on the World Wide Web, shows the practical applications of such systems, and discusses their design and their social and policy context by examining cross-cutting issues such as citizen science, data journalism, and open government.
CS 5999 - Master of Engineering Project, (2011-2012, Cornell University), Instructor: Independent or group project under the direction of a CS field member or researcher. Projects involve the development of a computer science application (software or hardware) useful in exploring and/or solving an engineering problem with a computer science focus.
Multimedia Information Systems 2, (2007-2011, University of Vienna), Co-instructor: A masters-level course in Media Informatics examining technologies and available applications for building (multimedia) Web information systems. Focus on XML, Semantic Web technologies, and metadata standards.
Multimedia Information Retrieval, (2009-2011, University of Vienna), Co-instructor: An advanced masters-level course focusing on the principles of information retrieval in distributed environments such as the Web, with a special focus on multimedia information.
Information System Technologies for Multimedia Applications, (2008-2010, University of Vienna), Co-instructor: An undergraduate course focusing on the technical properties of various media types (image, audio, video) and their technical processing (e.g., with Java Media Framework) in multimedia applications.
Media Informatics Student Projects, (2008-2011, University of Vienna), Instructor: An undergraduate course involving the development of an application related to the media informatics field.
Modeling Techniques and Methods, (2007-2011, University of Vienna), Co-instructor: An undergraduate introductory course covering basic data modeling standards such as EER, UML, etc.
Invited Talks and Panels
Insight into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-based Ecosystems, Austrian Economic Chambers, February 2018, Vienna, Austria. (slides)
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics, Oesterreichische Nationalbank (OeNB), Research Seminar, January 2018, Vienna, Austria. (slides)
Cryptocurrency Analytics (Bitcoin and beyond), Monero Meetup Vienna, December 2017, Vienna, Austria. (slides)
Panel: Cryptocurrencies & eCrime, APWG.EU eCrime Cyber-Security Symposium, October 2017, Porto, Portugal. (slides)
Insight Into Virtual Currencies and Darknet Market Activities, Media4Sec - Policing the Dark Web Workshop, September 2017, The Hague. (slides)
Blockchain und Cyber-Währung (Panel), Digital Days 2017, September 2017, Vienna, Austria. (slides)
Insight into Virtual Currency Ecosystems (by making use of Big Data technology), Big Data Europe (BDE) Webinar, February 2017, Online. (slides)
Virtual Currencies and Cybercrime, KSÖ Workshop Urban Security, November 2016, Vienna, Austria. (slides)
Insights into Virtual Currency Ecosystems, APWG eCrime.EU Symposium 2016, October 2016, Bratislava, Slovakia. (slides)
Exploring and Tracking Bitcoin Transactions, 3rd Virtual Currencies Conference (EUROPOL), May 2016, The Hague. (slides)
Mind the Gap - Data Science Meets Software Engineering, Vienna Semantic Web Meetup, March 2016, Vienna, Austria. (slides)
GraphSense - Real-time Insight into Virtual Currency Ecosystems, Austrian Data Forum, November 2015, MQ, Vienna. (slides)
Bitcoin Panel, APWG - Symposium on Electronic Crime Research, May 2015, Barcelona, Spain. (slides)
Bitcoin - Introduction, Technical Aspects and Ongoing Developments, FMA - Austrian Financial Market Authority, April 2015. (slides)
Maphub und Pelagios: Anwendung von Linked Data in den Digitalen Geisteswissenschaften, Workshop - Linked Data Within The Humanities and Beyond, December 2014, Austrian Academy of Sciences. (slides)
The value of open data and the OpenGLAM network, Putting Linked Library Data to Work: the DM2E Showcase, November 2014, Austrian National Library, Vienna, Austria. (slides)
Things, not Strings, ADV Tagung - Suchstrategien für heute und morgen, November 2014, Vienna, Austria. (slides)
Offene Daten im Kulturbereich - Die pragmatische Perspektive, Alles Offen, alles frei. Open Data in Kultureinrichtungen, June 2014, Wien Museum, Vienna, Austria. (slides)
Open Data - Principles and Techniques (Guest Lecture), Technical University of Vienna, May 2014, Vienna, Austria. (slides)
The Story behind Maphub, Open Knowledge Conference (OKCon), September 2013, Geneva, Switzerland. (slides)
Semantic Tagging for old maps...and other things on the Web, The Web As Literature Symposium, June 2013, British Library, London, UK. (slides)
Linked Open Data (Guest Lecture), Technical University of Vienna, May 2013, Vienna, Austria. (slides)
Maphub and Annotorious, iAnnotate 2013, April 2013, San Francisco, USA. (slides)
Maphub - Annotations and Semantic Tags on Historical Maps, Stanford University - Open Annotation Rollout, April 2013, Palo Alto, USA. (slides)
Old Maps, Annotations, and Open Data Networks, Harvard University, January 2013, Cambridge, USA. (slides)
Linked Data and SKOS, Workshop on Physics Classification, December 2011, Boston, USA. (slides)
Linked Data in Scholarly Communication, Cornell University - AAHEP5 Information Provider Summit, Cornell University, October 2011, Ithaca, USA. (slides)
Metadata is back! (Keynote), Semantic Web Technologies for Libraries and Readers Workshop, co-located with JCDL 2011, June 2011, Ottawa, Canada. (slides)
Research on Scholarly Practices and Communication at Cornell Information Science. (with Carl Lagoze), Microsoft Research, May 2011, Redmond, USA. (slides)
Linked Data als Perspektive für die bibliothekarische Inhaltserschließung, Österreichisches Online-Informationstreffen und Österreichischer Dokumentartag (ODOK), September 2010, Leoben, Austria. (slides)
Linked Data im Kontext Digitaler Bibliothekssysteme, Semantic Web in Bibliotheken (SWIB), September 2009, Cologne, Germany. (slides)
CIDOC CRM in Practice - Experiences, Problems, and Possible Solutions, Workshop Vernetzte Datenwelten, October 2009, Berlin, Germany. (slides)
Linked Data Tutorial, Vlaams Theater Instituut, June 2009, Brussels, Belgium. (slides)
Research Visit Los Alamos National Labs, May 2012
Open Humanities Award, 2013
"Certificate of Appreciation", University of Vienna, Faculty of Computer Science, 2010, 2011
13th International Conference on Semantic Systems (SEMANTICS 2017), Data Science track chair
11th International Conference on Web Engineering (ICWE 2011), Doctoral consortium co-chair
Very Large Databases Conference (VLDB 2007), local organization
Linked Data Camp 2009, Museumsquartier (MQ) Vienna
Web of Data Practitioner’s Days 2008, University of Vienna
Program Committees and Reviewing for Scientific Journals
Journal of Web Semantics (JWS) (2013, 2014, 2017)
Semantic Web Journal (SWJ) (2014)
Future Internet (2013)
Multimedia Tools and Applications (2010, 2011)
International Journal on Digital Libraries (2009, 2012, 2017)
ACM Computing Surveys (2009)
Conferences, Workshops, Symposia
Digital Libraries (2014)
Dublin Core Conference (DC) (2008 - 2015)
Asian Digital Library Conference (ICADL) (2015)
Linked Data on the Web Workshop (LDOW) (2017)
iChallenge - Linked Data Cup (2013)
Linked Data Triplification Challenge (2011)
Networked Knowledge Organization Systems and Services Workshop (NKOS) (2006-2017)
International Workshop on Web Semantics (WebS) (2004-2013)