"The industries of sport and health are not immune to the use of big data and artificial intelligence"

 Big data and the use of sport and health

Photo: Alexander Sinn/Unsplash

Sílvia Oller
UOC researchers Angel A. Juan, Anna Bach and Milagros Sáinz help to identify emerging trends in big data, health and sport, with a gender perspective


Artificial intelligence (AI) and big data are now an integral part of the worlds of sport and health. In order to publicise the research being undertaken in both fields, the Spanish network Sports and Health Analytics Research with a Gender Perspective (SHARP), funded by the Spanish Ministry of Culture and Sport, in coordination with three research groups at the Universitat Oberta de Catalunya (UOC), organized a workshop with around twenty experts in October. The principal investigator of the Internet Interdisciplinary Institute's Internet Computing & Systems Optimization group (ICSO@IN3), Angel A. Juan; FoodLab researcher, director of the Master's Degree in Food for Physical Exercise and Sport, and professor at the UOC Faculty of Health Sciences, Anna Bach; and the principal investigator from the IN3's Gender and ICT (GenTIC) group, Milagros Sáinz, helped to identify emerging trends and predict the progress of a sector that is no longer in the realm of science fiction.

We have long been accustomed to getting instant statistical data while watching all kinds of sporting events on television. Clubs are increasingly reliant on tools such as big data when it comes to signing players. What other uses does AI have in sports?

ÁNGEL JUAN / ANNA BACH: Indeed, analytical and machine learning techniques are being used more and more by professional football and basketball clubs to attract young talents who mesh well with the team's characteristics. The boom in the use of data and artificial intelligence has affected virtually every aspect of our lives, and the sports industry has not been immune to this. In the field of sport, data-based analyses can be linked not only to the performance and state of health of athletes (injury prevention, etc.) but also to the decisions made by the management of these sports teams.

Is it possible to predict sports injuries with big data? And how can they be prevented?

A.B. / A.J.: In his presentation "Com les dades poden ajudar a prevenir lesions esportives: restriccions actuals i perspectives de futur" (How data can help to prevent sports injuries: current restrictions and future prospects), Javier Peña, a researcher at the University of Vic-Central University of Catalonia, explained that injuries are an extremely complex issue and their occurrence is hard to predict. The cost of sports injuries is high, and it is something that needs to be handled carefully if we want to encourage people to take up physical activity. The main limitations of injury prevention studies in sports settings are that, because of data protection policies, the information on injuries is neither public nor shared, and reporting on injuries is far from flawless. Furthermore, many of the studies that do get published are retrospective, which gives little leeway for making behavioural changes in real time.

The figure of the sports scout is giving way to big data departments in many sports clubs. What are the results of this tool in terms of the benefits or outcomes for these clubs?

A.J. / A.B.: Researcher Martí Casals, also from the University of Vic-Central University of Catalonia, explained that the role of sport biostatistics is to provide information that facilitates communication between athletes, coaches and medical staff. Casals emphasized the importance of the quality and transparency of these data, and the fact that they can be replicated and implemented. To ask the right questions about data, we need the scientific evidence provided by biostatistics, but we should also bear in mind, and in fact this was one of the conclusions of the workshop, that data per se do not add value if they are not of sufficiently high quality, if the research and analysis process is not designed properly, and if the results are not interpreted correctly.

Are big data and AI able to predict whether a child athlete will end up being equally talented as an adult? What other factors are involved?

A.B. / A.J.: In his presentation "Contribució de l'anàlisi de dades en l'adquisició de talents esportius" (The contribution of data analytics to signing sports talent), David López, a researcher at ESADE Business School, explained that the most frequently used techniques are expert opinion informed by data analysis and simulation models. First, a descriptive study gives an overview of the activity and generates candidates; predictive analyses then look for patterns in the data and use them to forecast what might happen. However, one of the obstacles to this work is that the full potential of these techniques is still unknown, as many clubs' departments use statistics only as a source of information rather than for forecasting. Another key factor is that professional and semi-professional clubs do not share their data, and hence the talent they may have on their books.

Is it true that there is gender bias in algorithms and AI? What is the reason behind this bias?

MILAGROS SÁINZ: Gender bias is mainly due to the absence of women and a gender perspective in the design and production of technology and hence algorithms. Algorithms are fed by different types of information or data from different sources. If these data are not of sufficiently high quality, because they only take into account biased information according to gender roles or stereotypes, any decisions that are subsequently made based on these data will have a gender bias. And this is also the case in entertainment platforms such as Netflix and Spotify, which feature gender-biased playlists.

How can we reduce or correct this?

M.S.: We need to make sure that more women have access to these professions, but before that we need to change the culture of educational and professional contexts and ensure that the people educated in the fields of science and technology do so from a gender perspective. In addition, primary, secondary and university teachers also need to be trained from a gender perspective. 

What needs to be done to guarantee high-quality data? What do research teams need to bear in mind when analysing data?

M.S.: Guaranteeing high-quality data means employing experts in data analysis from different disciplines; people with different types of knowledge and skills who can help to ensure that the data taken as a reference do not exclude people from certain social groups. Artificial intelligence is fed by professionals involved in data processing from natural language who help to develop technological tools such as the virtual voice assistants or chatbots used by different companies to provide services for their users. These professionals bring invaluable added value to AI because they have been educated in the different specialist areas of social sciences or humanities, such as philology or translation, disciplines in which there is a high percentage of women. 

Extraordinary situations such as the one we're currently experiencing as a result of the health emergency require the collaboration of people from different scientific disciplines (as well as social sciences, arts and humanities) to feed AI with sufficiently rich and rigorous data to respond to the healthcare and social challenges of the future. 

What is the percentage of women in artificial intelligence teams? 

M.S.: Studies show that only 11% of people who write source code are women. In addition, a recent report confirms the low gender diversity in AI-related research, with just 13.8% of authors being women. Publications on AI that have at least one female co-author tend to focus on social issues like justice, human mobility, health and gender.

At present, fewer than 25% of artificial intelligence researchers in academic institutions and organizations are women. According to various studies, more than 80% of university professors working in AI are men, while just 11.95% of staff researching AI at Microsoft and 15.66% at IBM are women. 

Why do you think there are so few women in this field?

M.S.: Very few women enrol in studies or enter professions associated with artificial intelligence. The professional and employment opportunities in this field are unlimited, and we need to take an interdisciplinary approach so that fields which, at first glance, may not seem technical can also contribute solutions to the different challenges posed by AI. There are still a lot of stereotypes about the type of people who work in this sector. There is a perception that women who work in the field of artificial intelligence are weird or nerdy, that they can't relate to other people or that they spend their life in front of a computer screen writing code.

There was talk about the difficulty women face in finding time to play sport, and the rate of female dropout from sport. Why is this happening? How can the situation be reversed?

M.S.: Generally speaking, more women than men take on the responsibilities of looking after the family and household chores, so they don't have much time left over to engage in sport or recreational activities. According to María Martín, a professor at the Universidad Politécnica de Madrid, future research should include aspects related to motivation for health reasons associated with sport, including walking as a sporting practice, and issues related to people's lifestyles such as bringing up children, work, and the home.

It is crucial to use a methodology that allows the discriminant variable to be identified (why don't they play sports if they really want to?) as a complement to other methodologies. In her presentation, Professor Martín emphasized the need to reconsider the definition of practising sport, as many women walk but this type of exercise is not viewed as being a sport per se.

On a very topical subject, researcher Lídia Arroyo, from the IN3, is coordinating a project on COVID-19 and gender. What does it entail, Lídia?

LÍDIA ARROYO: This project aims to create an open data portal and scientifically analyse the impact of the pandemic in terms of gender inequalities in the labour market. The project will study the impact of COVID according to occupational segregation by gender. The research aims to study the impact of horizontal occupational segregation, related to the concentration of women in certain specific sectors and occupations, such as care jobs. It will also analyse vertical occupational segregation, which concerns the higher proportion of women in lower professional categories, a fact that is also related to other intersectional inequalities such as class and origin. The project was approved by the Public Data Analysis for Health Research and Innovation Program (PADRIS) of the Agency for Health Quality and Assessment of Catalonia (AQuAS), which is part of the Spanish Ministry of Health.

How is health improving along with data analytics? Are big data and AI helping to fine-tune diagnoses?

A.B. / A.J.: They allow a large amount of data to be processed using smart algorithms, leading to the improvement of many processes in the field of health. Groups such as ICSO and FoodLab at the UOC are exploring various lines of research in this respect. For example, we're working on health logistics projects that should improve mass sampling processes and the administration of drugs or vaccines, helping to ensure that health protection teams can reach hospitals as quickly as possible.

We're also working on projects that support medical specialists who have to decide on the best treatment combinations for certain cancers, taking into account aspects such as the severity of the disease, physical and genetic characteristics, the gender or age of the patient, the interdependence between treatments, and the future availability of resources such as operating theatres, specialists, etc. 

We're also working on facial recognition projects that use video cameras and artificial intelligence algorithms to identify patients when they arrive in a hospital emergency room so that all the available information on that patient is automatically loaded in real time on the computers of the doctors who have to treat them. 

Do you think that governments make the most of big data when it comes to implementing efficient policies, or is it still an underused tool?

A.J. / A.B.: It depends a lot on each country. There are some countries where research has better funding and careers in science are promoted, and other countries where it is more difficult to access resources such as personnel, equipment or centres. Investing in cutting-edge research should be a priority for any developed country, but here in Spain we are still a long way from having the facilities that other European and North American countries enjoy. 

M.S.: I think we're still not making the most of many mass database analysis tools, especially when it comes to the gender perspective. They are gradually being developed, which is why it is so important to establish the ethical and legal requirements that shape the use of big data. 

What research projects are currently under way in your respective research groups?

A.B. / A.J.: Natalia Forner, a PhD student and member of FoodLab and ICSO@IN3, is conducting a study on dietary patterns for health and sustainability, for which she will use big data to analyse the available data and predict scenarios.

Along the same lines, Georgina Pujol-Busquets, a PhD student at the University of Cape Town (South Africa) and visiting researcher at FoodLab, is working with data on physical activity and diet, integrating gender aspects, on the Nutrition and Health Programme for Women in Poor Communities project. 

At FoodLab, on the subject of research into food and physical activity, we are also working on aspects of health, sport and performance. For example, researcher Xavi Santabàrbara, a PhD student in Health and Psychology, spoke at the workshop on dietary supplements and doping.

M.S.: At the moment, in addition to working on the COVID-19 and gender project led by Lídia Arroyo, the Gender and ICT (GenTIC) research group has three European projects underway, including one on the promotion of equality plans in higher education institutions, and another on innovation in social economy companies with a gender perspective. We're also working on the creation of a European certification system on equality and gender for research institutions and on a state-wide plan with a series of studies on interventions to promote science and technology careers among young girls. In addition, there is a project underway funded by RecerCaixa on the mechanisms for preventing gender-based violence in secondary education.

"Sports and Health Analytics Research with a Gender Perspective" (SHARP) is a Spanish network of researchers funded by the High Council of Sports (CSD) of the Spanish Ministry of Culture and Sport (Ref. 09/UPR/20)