INISTA2020

Embeddings for texts, graphs, and relations

Abstract: Currently, the most successful machine learning methods are numeric, e.g., deep neural networks or SVMs. On the other hand, many important real-world problems use symbolic representations, e.g., graphs, relations, texts, or electronic health records. To harness the power of successful numeric deep learning approaches for these learning problems, the symbolic data has to be embedded into a numeric vector form suitable for numeric algorithms. The embeddings should preserve the information contained in the original data, in the form of similarities and relations, by encoding it into distances and directions in the numeric space. For example, in graphs, nodes that represent similar entities or are connected to similar nodes should have similar numerical representations.
In the tutorial, we will present embeddings of unstructured data, such as texts, graphs, and relations. We will use text to introduce the main ideas exploited in successful embeddings: transfer learning and unsupervised approaches. More specifically, we will cover matrix-factorization-based LSA and language-model-based word2vec. As these embeddings do not handle the ambiguity of language well, we will present modern contextual embeddings such as ELMo and BERT. For graphs, we will first present random-walk-based embeddings such as node2vec and HINMINE, and also touch on recent graph convolutional networks.
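The random-walk idea behind graph embeddings can be illustrated with a minimal sketch. It assumes uniform random walks over a toy graph; node2vec itself biases the walks with additional return and in-out parameters, which are omitted here. The graph, the function name `random_walks`, and its parameters are hypothetical and chosen only for illustration. The resulting walks play the role of sentences, so a word2vec-style model can then be trained on them to obtain node vectors.

```python
import random


def random_walks(adjacency, walks_per_node, walk_length, seed=0):
    """Generate uniform random walks that start from every node.

    Simplification: node2vec uses biased second-order walks; here each
    step picks a neighbour uniformly at random.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    walks = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical toy undirected graph, stored as adjacency lists.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

walks = random_walks(graph, walks_per_node=2, walk_length=5)
for walk in walks[:3]:
    print(" ".join(walk))  # each walk reads like a short "sentence" of nodes
```

Nodes that share many neighbours tend to appear in similar walk contexts, which is what lets a language-model-style objective place them close together in the embedding space.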
The most general form of embeddings can use any similarity-based function to embed different entities. We will describe the idea behind the StarSpace embedding technique and show how to adapt it to relations.
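As a concrete illustration of the matrix-factorization approach mentioned in the abstract, the following sketch applies a truncated SVD, the factorization underlying LSA, to a tiny term-document count matrix. The vocabulary, the four toy documents, and the choice of two dimensions are assumptions made for this example only; practical LSA pipelines usually also apply a weighting such as TF-IDF first.

```python
import numpy as np

terms = ["cat", "dog", "pet", "stock", "market"]

# Rows are terms, columns are four toy documents (hypothetical data):
# d1 = "cat pet", d2 = "dog pet", d3 = "stock market", d4 = "market stock"
X = np.array([
    [1, 0, 0, 0],  # cat
    [0, 1, 0, 0],  # dog
    [1, 1, 0, 0],  # pet
    [0, 0, 1, 1],  # stock
    [0, 0, 1, 1],  # market
], dtype=float)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Full SVD, then keep only the k largest singular values (truncation).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vectors = U[:, :k] * s[:k]  # one k-dimensional vector per term

raw = dict(zip(terms, X))             # original count rows
vec = dict(zip(terms, term_vectors))  # low-dimensional embeddings

print("raw  cat~dog  :", cosine(raw["cat"], raw["dog"]))
print("LSA  cat~dog  :", cosine(vec["cat"], vec["dog"]))
print("LSA  cat~stock:", cosine(vec["cat"], vec["stock"]))
```

In the raw counts, "cat" and "dog" never share a document, so their similarity is zero. After truncation their vectors point in the same direction because both co-occur with "pet", while "cat" and "stock" remain essentially orthogonal. This is the sense in which similarity in the data is encoded as direction in the numeric space.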

Biographical note: Marko Robnik-Sikonja is Professor of Computer Science and Informatics and Head of the Artificial Intelligence Chair at the University of Ljubljana, Faculty of Computer and Information Science. His research interests span machine learning, data mining, natural language processing, network analytics, and applications of data science techniques. He is (co)author of over 150 scientific publications that have been cited more than 4,500 times. He is the author and maintainer of three open-source R data mining packages.
