My dear reader, how are you? Assalamu alaikum (peace be upon you).

The word of wisdom is the lost property of the believer. Wherever he finds it, he is most deserving of it — Sunan al-Tirmidhī 2687 

This post presents insights into the recent developments towards addressing the growing curse of obesity in children using Big Data.


Did you know that in 2016, 39% of all adults worldwide were overweight and 13% were obese? Furthermore, 18% of children across the globe were overweight or obese. Obesity is therefore one of the most important global health issues, with significant cost implications for both the individual and society at large, and strong measures must be taken to intervene in childhood obesity. Children with obesity are prone to a range of health issues, both physical and psycho-social. Obesity also contributes to critical health problems such as Type 2 Diabetes and Coronary Heart Disease, which can begin to develop in childhood.

Figure 1: Global statistics regarding Obesity

The ecological model of obesity, proposed 20 years ago, added structure and form to our understanding of obesity as a complex condition arising from biological, social, and environmental interactions. The model is built around genes, behaviour, and environment. Here, ‘genes’ refers to factors within the body, or one’s internal, homeostatic environment; ‘behaviour’ is an individual’s activity within their environment; and ‘environment’ is an individual’s natural, social, and built surroundings. The three mainstream obesity research methods are biomedical methods, epidemiological methods, and Big Data-driven research methods. Big Data is relatively new and underexploited in obesity research. As shown in the graph, gathering big data on obesity via mobile surveillance promises to link relevant variables at the level of both individuals and populations. This creates synergy between the biomedical and epidemiological approaches and covers a broad spectrum of the Foresight factors, ranging from Individual Psychology to Food Production.

A corollary to the Hadith (saying) of the Holy Prophet of Islam, Muhammad (SAW), stated above is: “Big Data is the lost property of modern-day Muslims, to find and to utilize.” So, let us get back to how we can address the problem of obesity using Big Data analytics.

Developments in behaviour change science, public health, clinical methods, Internet and communication technologies, citizen science, and Big Data analytics can be harnessed in multidisciplinary research projects that address the prevention and treatment of childhood obesity at a population level. The EU H2020 project BigO, which stands for Big Data Against Childhood Obesity, is one example of such a research effort. The BigO project aims to redefine the way strategies that target childhood obesity prevalence are deployed in European societies.

Figure 2: BigO Program

BigO builds a technological platform and uses citizen-scientist data collection methods to gather large-scale data through different mobile technologies such as smartphones, wristbands, and mandometers (see Figure 3). The main goal is to study the effectiveness of specific policies on a community and to monitor the population's response in real time. The platform measures obesogenic behavioural indicators and the environment. It also offers evidence and tools for targeted actions against obesity to stakeholders, including public health authorities, health professionals, and schools. Data is uploaded by citizen scientists to the BigO cloud infrastructure for aggregation, analysis, and visualisation. The large-scale data acquisition in BigO allows researchers to create models for analysing behavioural risk factors and predicting obesity prevalence through associations with community behavioural patterns and the local environment.

We have developed a mobile app called myBigO that works on Android and iOS devices such as smartphones, smartwatches, and tablets. Using the myBigO app, we record the following in real time:

  • When, where and what a child eats.
  • How much, where and when a child moves.
  • When and for how long a child sleeps.

Figure 3: High-level Illustration of BigO Technological Platform

There is also a web-based portal where schools, clinics, and students can register and be set up as groups or as individuals. The aggregated and processed data is stored in the BigO cloud, and the portal makes it possible to study the recorded physical activity, eating, and sleeping per group or per student.

Once the data has been collected, the BigO platform can generate meaningful visualisations. Examples include comparing BMI for male and female children of different age groups, tracking physical activity as the number of steps per day across the days of the week, counting how many students are most active, moderately active, and least active, and plotting activity maps. See Figure 4 for physical activity maps drawn on the portal during a pilot project carried out in one of the cities of Greece. On the map we can spot details of the environment surrounding the children, such as access to local parks, public transport, and playgrounds, and base our analysis on them.
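To make the activity classification above a little more concrete, here is a minimal Python sketch of how students could be grouped by their average daily step count. Everything in it is an assumption for illustration: the thresholds, the function names, and the sample data are hypothetical and are not taken from the BigO platform.

```python
from statistics import mean

# Hypothetical thresholds in steps per day; NOT values used by BigO.
MODERATE_MIN = 6000
ACTIVE_MIN = 10000


def activity_level(daily_steps):
    """Label one student from a non-empty list of daily step counts."""
    avg = mean(daily_steps)
    if avg >= ACTIVE_MIN:
        return "most active"
    if avg >= MODERATE_MIN:
        return "moderately active"
    return "least active"


def count_by_level(steps_by_student):
    """Count how many students fall into each activity level."""
    counts = {"most active": 0, "moderately active": 0, "least active": 0}
    for steps in steps_by_student.values():
        counts[activity_level(steps)] += 1
    return counts


# Made-up sample: three students, three recorded days each.
sample = {
    "student_a": [12000, 11000, 9500],
    "student_b": [7000, 6500, 8000],
    "student_c": [3000, 4200, 5100],
}
print(count_by_level(sample))
```

In a real deployment the cut-offs would need to come from public health guidance for the relevant age group rather than from fixed constants like these.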

Figure 4: Physical activity maps on BigO Portal

Since the BigO platform collects a lot of sensitive personal data, let us now look at the requirements of an ideal healthcare dataset. In terms of privacy, the key idea is to create an environment where private and sensitive data can be analysed without revealing the identity of individuals. We require the datasets to be of high quality, covering different perspectives of the children's behaviour and environment. The privacy protection should be flexible enough for various analysis tasks and mining techniques, and should support the reference system architecture. Appropriate ethics mechanisms and access control should be implemented to allow researchers to revisit the patient data.

Let us take a look at the four main categories of data collected by BigO: (1) personal or behavioural data sources, (2) population data sources, (3) regional data sources, and (4) mapping data sources. Personal or behavioural data is collected via smartphones, smartwatches, and mandometers; population data is collected from statistical authorities’ files; regional data is collected from geospatial data files; and mapping data is collected via web-based application programming interfaces.

The data attributes (or dimensions) in a typical healthcare dataset can be divided into four categories based on their sensitivity and their relationship with the subject: (1) sensitive, (2) non-sensitive, (3) identifiable, and (4) quasi-identifiable. The table here shows that, alongside clearly sensitive and identifiable attributes, there are a number of features called quasi-identifiers that can be combined to identify specific individuals. Preserving the privacy of a piece of data that is sensitive, identifiable, or quasi-identifiable is a critical challenge.
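The risk posed by quasi-identifiers can be illustrated with a small Python sketch. It counts how many records share each combination of quasi-identifier values; if the smallest group has size 1, at least one individual is unique on those attributes alone. This is the idea behind k-anonymity, used here as a generic illustration and not as the method used in BigO. The field names (`age`, `gender`, `postcode`, `bmi`) and the records are made up.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for every quasi-identifier (0 if there are no records)."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values(), default=0)


# Hypothetical records; 'bmi' plays the role of the sensitive attribute.
records = [
    {"age": 10, "gender": "F", "postcode": "D04", "bmi": 17.2},
    {"age": 10, "gender": "F", "postcode": "D04", "bmi": 23.9},
    {"age": 11, "gender": "M", "postcode": "D04", "bmi": 19.5},
    {"age": 11, "gender": "M", "postcode": "D06", "bmi": 21.0},
]

# With postcode included, the two 11-year-old boys become unique.
print(smallest_group_size(records, ["age", "gender", "postcode"]))
# Without postcode, every record shares its values with another one.
print(smallest_group_size(records, ["age", "gender"]))
```

None of the three attributes identifies a child on its own, yet their combination singles out two of the four records, which is exactly why quasi-identifiers need protection.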

Privacy and security are fundamental requirements for any information system. The data security methods implemented in BigO include secure storage, secure communications, and data access control. BigO data privacy protections depend on the type of personal data. Sensor data is stored on personal devices, and only statistical and generalized data is extracted and submitted to the BigO server. All photos are reviewed by BigO admins and automated algorithms. All identifiable attributes, such as username and deviceID, are removed through de-identification before the data is stored in the data warehouse. Quasi-identifiable data is usually dealt with through anonymization, but this degrades data quality, so there is a need for a privacy-aware protocol. We therefore proposed a novel privacy-aware protocol to deal with identifiable and quasi-identifiable attributes when sharing data for analysis. The protocol takes the high dimensionality of the data into account. Further details are available in our recent publication on a Big Data warehouse architecture [1].
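As a rough illustration of the two generic steps mentioned above, the Python sketch below removes identifiable attributes and then generalizes one quasi-identifier into a coarser band. It is not the privacy-aware protocol from our publication [1]. The `username` and `deviceID` field names come from the text, while treating `age` as a quasi-identifier, the five-year band width, and the helper names are assumptions made for this example.

```python
# Attributes treated as directly identifiable (named in the text above).
IDENTIFIABLE = {"username", "deviceID"}


def age_band(age, width=5):
    """Generalize an exact age into a band such as '10-14'.
    The band width is a hypothetical choice."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def de_identify(record):
    """Drop identifiable attributes and coarsen the age, if present."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFIABLE}
    if "age" in cleaned:
        cleaned["age"] = age_band(cleaned["age"])
    return cleaned


# Made-up record for illustration.
raw = {"username": "child01", "deviceID": "abc-123", "age": 12, "steps": 8400}
print(de_identify(raw))  # {'age': '10-14', 'steps': 8400}
```

The trade-off discussed above is visible even here: the wider the age band, the harder a child is to single out, but the less precise any age-based analysis becomes.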

The privacy and security protocols implemented in the BigO system can be applied to any data-driven application, within or beyond the healthcare sector, where there is a risk of losing sensitive personal information during data acquisition, storage, transmission, or access while carrying out analysis tasks. The essence of these protocols is that one should be able to perform mining and analysis tasks on the data without degrading its quality and without the threat of leaking sensitive information.

We believe the BigO platform opens up new research directions for tackling the prevalence of obesity.

We dream of a future where children are actively living a healthy lifestyle away from the curse of Obesity.

—END


References

[1] Shahid, Arsalan, Thien-An N. Nguyen, and M-Tahar Kechadi. 2021. “Big Data Warehouse for Healthcare-Sensitive Data Applications” Sensors 21, no. 7: 2353. https://doi.org/10.3390/s21072353


I hope you find this post useful. If you find any errors or feel any need for improvement, let me know in your comments below.

Signing off for today. Stay tuned! Happy learning.
