The European Association for Data Science organises a

Summer School on Data Science for Social Media

Monday July 19th to Thursday July 22nd 2021 in Kirchberg, Luxembourg.

Social media is ubiquitous in our modern world. Its ubiquity makes it an attractive object of study in various fields. Applications range from social science, economics, and computer science to health science and disaster control.

On the other hand, the validity of such data has been repeatedly brought into question, and given the rich portfolio of approaches and ideas, this field is certainly in danger of overpromising.

Regardless of these concerns, social media analysis will be an important method in future research in many different fields. Discussing its potential and limitations, and the transfer of methods and ideas, is what this summer school is about.

The Summer School is primarily aimed at advanced PhD students, postdoctoral and early-career researchers with an interest and basic grounding in data science, machine learning, and/or statistics.


Conference and Training Centre at the Chambre de Commerce Luxembourg
7, Rue Alcide de Gasperi
L-2981 Luxembourg Kirchberg


  • STATEC National Institute of Statistics and Economic Studies Luxembourg
  • Chambre de Commerce Luxembourg
  • BiCDaS, Bielefeld University

Public Event

The summer school will be preceded by a public event on Monday, July 19th, starting at 12:30 p.m.

At the heart of the public event is the Sabine-Krolak-Schwerdt-Lecture, in memory of the EuADS founding president.

We are delighted to announce that Alexander Pentland from MIT Media Lab will deliver this year’s public lecture.

The public event will be followed by a welcome reception.

Understanding Human Network Behavior: Ideas for reforming social media

Alexander Pentland (MIT Media Lab)

Recent large-scale experiments have given us quantitative models of human decision making that allow predictive modeling of crowd behavior across many situations. These models suggest ways of reforming social media that go beyond suppression of fake news and bots, and have proven track records in other digital platforms. Two innovations in particular appear to be critical: unique, reliable identity, and use-specific reputation mechanisms.

Speaker's Website

Topics and Presenters

The tentative schedule for the event including speakers and topics:

July 20th
  • 9:30 a.m. to 1 p.m.: Static and Dynamic Mapping Method for Uncovering Competitive Positions. Bernd Skiera (Goethe U Frankfurt, D)
  • 2:30 p.m. to 6 p.m.: Social media metrics: definitions and applications. Zohreh Zahedi (U of Leiden, NL)

July 21st
  • 9:30 a.m. to 1 p.m.: The co-evolution of digital behavioral trace and survey data in social networks. Christoph Stadtfeld (ETH Zürich, CH)
  • 2:30 p.m. to 6 p.m.: Qualitative and Quantitative Data Analytics in Data Science, with Correspondence Analysis and Clustering. Fionn Murtagh (U of Huddersfield, UK)

July 22nd
  • 9:30 a.m. to 1 p.m.: Responsible social-media based collective intelligence. Eirini Ntoutsi (U of Hannover, D)

For details see below!

Tuesday, July 20th

9:30 a.m. to 1 p.m.

2:30 p.m. to 6 p.m.

Static and Dynamic Mapping Method for Uncovering Competitive Positions

Prof Dr Bernd Skiera (Goethe University Frankfurt, Germany)

A market map provides managers with a static snapshot of the competitive positions of a market’s participants (such as products or brands). Today, most markets are rather large (e.g., comprising hundreds of products), so that a comprehensive visualization of competitive market structures can be cumbersome and complex. Yet, reduction of the analysis to smaller representative product sets can obscure important information. The first part of the workshop outlines (i) data sources to derive consideration sets of consumers that reflect competition between products and (ii) approaches (e.g., building upon social network analysis) that integrate these data into a modeling and mapping approach to visualize competition in large markets and to identify distinct submarkets.

Yet, as markets tend to be in flux, knowledge about the trajectories of market participants’ competitive positions over time would be more informative than a static snapshot. In contrast to static snapshots, trajectories create a forward-looking perspective on competition, reveal whether positions are converging or diverging, and help managers evaluate the impact of their positioning efforts. Although data for market structure analysis is increasingly available at high frequency (see part 1 of the workshop), extant mapping methods are exclusively static and do not reveal market participants’ trajectories. Therefore, in part 2 of the workshop I focus on dynamic mapping methods that generate a sequence of cohesive maps, enabling market analysts to track the trajectories of competitive positions over time.

Speaker's Website

The co-evolution of digital behavioral trace and survey data in social networks

Prof Dr Christoph Stadtfeld (ETH Zürich, Switzerland)

The increasing availability of digital behavioral trace (DBT) data promises novel social science studies that simultaneously scale up on a large number of study participants and zoom in on fine-grained individual behavioral actions. These data may, for example, stem from social media platforms, social sensor experiments, or wearable technologies such as smart phones or watches. DBT data offer a seemingly objective perspective on how people behave individually and socially – how they eat, sleep, travel, interact, socialise, and date. DBT network data often come in the form of relational events – time-stamped dyadic observations that can be represented as time-ordered edge lists. Several new models for the statistical analysis of relational events have been proposed over the past years.

However, studies that rely solely on DBT data have some obvious blind spots. Individual behavior is to a large extent based on how individuals perceive their environment, their relationships, and themselves. Such perception data can be collected well through traditional surveys. Survey data also have known challenges such as cognitive burdens, necessary time investments by participants, and measurement biases. Traditional social network data often stem from surveys and may represent whom individuals perceive as friends, whom they like or dislike, and whom they trust. Dynamic network data collected through surveys can, for example, be statistically analysed with stochastic actor-oriented models.

In this workshop, I present a new statistical and computational framework for the joint analysis of event sequences from DBT and survey network data. By combining the analysis of the two data types in one framework, I aim at addressing challenges and shortcomings of both data types as discussed above. In a practical session, participants will acquire basic skills in DBT- and survey-based network data analysis.
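As a rough illustration of the relational-event format mentioned in this abstract (time-stamped dyadic observations stored as a time-ordered edge list), the following Python sketch may help. The `RelationalEvent` type, the actor names, and the timestamps are hypothetical toy values chosen for illustration; they are not taken from the workshop material, and the sketch does not implement any of the statistical models the abstract refers to.

```python
from collections import Counter
from datetime import datetime
from typing import NamedTuple


class RelationalEvent(NamedTuple):
    """One time-stamped dyadic observation (hypothetical structure)."""
    time: datetime
    sender: str
    receiver: str


# Hypothetical toy observations, deliberately not in chronological order.
raw_events = [
    RelationalEvent(datetime(2021, 7, 20, 10, 15), "ana", "ben"),
    RelationalEvent(datetime(2021, 7, 20, 9, 30), "ben", "cleo"),
    RelationalEvent(datetime(2021, 7, 20, 11, 0), "ana", "ben"),
    RelationalEvent(datetime(2021, 7, 20, 10, 45), "cleo", "ana"),
]

# A time-ordered edge list: the same events sorted by timestamp.
edge_list = sorted(raw_events, key=lambda event: event.time)

# Collapsing the sequence into dyad counts gives a static, weighted network,
# at the cost of discarding the ordering information.
dyad_counts = Counter((event.sender, event.receiver) for event in edge_list)

for event in edge_list:
    print(event.time.isoformat(), event.sender, "->", event.receiver)
```

Keeping the sorted edge list rather than only the aggregated counts preserves the sequence of interactions, which is the information that event-based models work with.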

Speaker's Website

Wednesday, July 21st

9:30 a.m. to 1 p.m.

2:30 p.m. to 6 p.m.

Social media metrics: definitions and applications

Dr Zohreh Zahedi (University of Leiden)

Social media metrics (altmetrics) are metrics derived from social media platforms (such as Facebook, Twitter, Wikipedia, mainstream news websites, etc.). These metrics offer the possibility of studying the relations and interactions between social media users, scholarly content, and different actors. Altmetrics data aggregators, which provide access to social media metrics, differ in their methodological choices in collecting, updating, tracking, and reporting metrics. This course focuses on defining and interpreting social media metrics, data possibilities and challenges, and social media metrics data analysis, along with its uses and applications.

Speaker's Website

Qualitative and Quantitative Data Analytics in Data Science, with Correspondence Analysis and Clustering.

Prof Dr Fionn Murtagh (University of Huddersfield)

Geometric Data Analysis is another term for Correspondence Analysis. Qualitative data can be categories of data values. To contextualize the analytical work, Correspondence Analysis distinguishes active variables, which form the factor space, from supplementary variables, which are then mapped into that space. In effect, this provides both the focus of the analysis and its context. A range of data sources will be analyzed, with a description of the objectives and the application.

Pierre Bourdieu's use of Correspondence Analysis in social science will be described, along with applications in questionnaire and survey analysis, textual narrative, and Twitter. The factor space can be very good for visualization (for example, starting with the principal factor plane, spanned by the first and second factors as ordered by the eigenvalues determined in the mapping into the factor space). Clustering is then often applied.

Details of using R for the analysis will be provided, together with practical guidance on carrying out the work described, with datasets and, if possible, listings of the R implementation. Everything presented, the datasets used, and quite possibly a listing of the R processing carried out will be available on a website of mine, accessible with a name and password.
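The course itself uses R. Purely as an illustration of the ideas in this abstract (active variables forming the factor space, a supplementary element mapped into it, and the principal factor plane), here is a minimal Python sketch of simple Correspondence Analysis via the singular value decomposition. It assumes NumPy is available. The contingency table and the function name `correspondence_analysis` are hypothetical, and this is not the speaker's implementation.

```python
import numpy as np

# Hypothetical contingency table: 4 row categories by 3 column categories.
N = np.array([
    [20.0, 5.0, 2.0],
    [10.0, 15.0, 5.0],
    [3.0, 8.0, 18.0],
    [6.0, 6.0, 6.0],
])


def correspondence_analysis(table):
    """Return principal row coordinates, standard column coordinates,
    and the principal inertias (eigenvalues) of a contingency table."""
    P = table / table.sum()            # correspondence matrix
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    expected = np.outer(r, c)          # independence model
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    row_principal = (U * sigma) / np.sqrt(r)[:, None]
    col_standard = Vt.T / np.sqrt(c)[:, None]
    return row_principal, col_standard, sigma ** 2


rows, col_standard, inertia = correspondence_analysis(N)

# Principal factor plane: the first two factors, ordered by eigenvalue.
plane = rows[:, :2]

# A supplementary row is mapped into the existing factor space through its
# profile; it does not influence the factors. Here the first active row is
# reused, so its projection should coincide with its active coordinates.
supplementary_counts = N[0]
profile = supplementary_counts / supplementary_counts.sum()
supplementary_coords = profile @ col_standard[:, :2]

print(plane)
print(supplementary_coords)
```

The coordinates in `plane` could then be passed to a clustering routine, in line with the remark that clustering is often applied after the mapping into the factor space.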

Speaker's Website

Thursday, July 22nd

9:30 a.m. to 1 p.m.

Responsible social-media based collective intelligence

Prof Dr Eirini Ntoutsi (FU Berlin, Germany)

The Web offers enormous benefits for information sharing, collective organization and distributed activity, with great impact on all areas of our lives. However, along with the benefits also come negative consequences like hate speech, fake news, surveillance, etc. Ambivalences lie at the heart of the Web, and we must deal responsibly with these ambivalences to amplify the benefits and counter the negative effects. To this end, as data scientists we should work towards responsible analysis of data collected via the Web. While we all agree that the huge amounts of data generated on the Web offer tremendous opportunities for data-science related applications and are the pre-condition for the success of modern machine learning methods, we cannot ignore the fact that data collection comes with assumptions, and that further assumptions made during the analysis pipeline have great impact on the extracted knowledge. In this talk, we will focus on such assumptions (including data sampling, redundancies, proxy-labeling, temporality and bias), their effect on the learning process, and how to build effective models under such assumptions.

Speaker's Website

Fees and Registration

Participation costs 300 € (250 € for EuADS members). This includes participation in the social event.

To ensure an interactive experience, the number of participants is limited, so early registration is strongly recommended. Please register by

1. Sending an email with your personal details to, with reference to EuADS Summer School 2021 on Social Media Analysis.

2. Transferring the amount to the Banque et Caisse d’Epargne de l’Etat, Luxembourg (BIC: BCEELULL; IBAN: LU47 0019 4655 6967 1000).

Once the personal details and registration fee have been received you will receive an email confirming your participation.

You can cancel your participation in the summer school and get your participation fee refunded until May 31st, 2021. Just send an email to

Pandemic Caveat

Many activities have gone digital in the past months to an extent considered impossible a year ago. Some activities have, on the other hand, proven difficult to digitalise. Effective networking and informal exchange of ideas, a central part of the format of a summer school, is one such activity.

For that reason we are aiming to conduct the summer school as an in-person event. We will constantly monitor current developments, travel restrictions and recommendations. We are currently optimistic that events such as the summer school will be possible in July 2021. However, it is still possible that we will need to postpone the summer school to a later date. We will announce any such rescheduling here. We ask all attendees to prepare for this possibility, e.g. by taking out travel cancellation insurance. EuADS can unfortunately not cover cancellation fees or other expenses caused by a possible cancellation of the event.

We consider it important that the speakers of the summer school are present at the venue and available for questions and discussion, e.g. during coffee breaks. Should some speakers, however, be unable to travel to Luxembourg, we will make every effort to arrange a remote talk for the plenary audience at the conference venue.


From the following hotels you can walk to the location:
Meliá Luxembourg
Coque Hôtel
Hôtel Novotel Luxembourg Kirchberg
Sofitel Luxembourg Europe Hotel


  • Serge Allegrezza (STATEC, Luxembourg; EuADS Treasurer)
  • Matthias Böhmer (U Luxembourg, Luxembourg)
  • Reinhold Decker (U Bielefeld, Germany; EuADS Vice-President)
  • Andreas Geyer-Schulz (KIT, Germany)
  • Nils Hachmeister (U Bielefeld, Germany; EuADS Vice-President)
  • Marc Pauly (STATEC, Luxembourg)
  • Denise Schroeder (STATEC, Luxembourg)
  • Myra Spiliopoulou (U Magdeburg, Germany)