Sunday, February 19, 2017

Big Data for Bigger Happiness?

Big data officially hit the happiness movement at the World Government Summit in Dubai in mid-February. This blog post is the third to report on the events at the Dialogue for Global Happiness and the World Government Summit.  The first two posts are What I learned in Dubai and Bhutan's Recipe for Happiness. This post reports how big data made its debut, addresses three main concerns about big data and suggests ways big data can serve the aims and purposes of the happiness movement.

To start, it is important to agree upon a definition of big data. A simple web search yields one of the best answers: extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.

This definition does not touch on the methods employed to obtain the data.  Big data is collected in one of two ways: stealth and overt.  In the stealth method, users are not aware of the extent to which their data is being collected and used. They may have agreed to terms in order to use a service. These terms usually include a clause stating their data will be collected and used. On rare occasions, the user may have read the terms, and on rarer occasions still the terms may specify the way the data will be collected and used. However, it is safe to say that most users are not aware of the extent to which their data is being collected and are only vaguely aware of the purposes to which it is being put. Stealth methods include social media sites (Facebook, Twitter, LinkedIn, Instagram, etc.) and websites, web services, and app services (games, online shopping, most email services, Google Maps, etc.).

Overt methods of collecting big data include those in which the user intentionally provides data. Users know the purpose of data collection and the use of it, as well as the party using it. Overt methods include apps designed to collect data for a specific purpose, and surveys. 


With these distinctions in mind, note that no generally accepted set of guidelines exists for collecting or using big data. The same is not true of survey-based happiness and well-being data. In 2013, the Organisation for Economic Co-operation and Development (OECD) issued its "Guidelines on Measuring Subjective Well-being," conclusively answering the questions of whether happiness and well-being can be measured (yes) and how.


The role of data in the Happiness Movement, also called the Beyond GDP Movement, is crucial.  One of the primary aims of the Happiness Movement is for governments to shift their singular focus from monetary metrics (Gross Domestic Product, consumption, wealth and income) to wider measures of well-being, and their goals from economic growth and wealth generation to the happiness and well-being of people. The definition of happiness metrics, as determined by the OECD Guidelines on Measuring Subjective Well-being, includes the feeling of happiness as well as satisfaction with life and the conditions of life: government, personal economy, environment, community, social support, culture, health, time-balance, work, and other aspects of life.



"Happiness is a Plausible Goal for Our Planet"


At the World Government Summit, big data made its debut in the Happiness Movement with Martin Seligman.  

The author with Seligman in Dubai
Seligman opened his keynote speech with "Happiness is a plausible goal for our planet." He went on to say "Happiness is the most you can expect for yourself, for your nation, and for the world."  He announced his preference for big data for use by governments. Seligman gave four reasons he feels that big data is a better source of information about a population's happiness and well-being than survey-based data. (Survey-based data and subjective well-being indicators are being used by the governments of Bhutan and the United Kingdom to inform public policy, and over 40 countries are now measuring the happiness and well-being of their people using surveys.)

The four reasons Seligman gave are that big data is non-reactive, is unobtrusive, draws on huge samples, and is less gameable than questionnaires.


Non-reactive means that the subject does not know they are being observed or that data is being gathered about them.  The idea is that there is no interaction between the observer and the subject - the person from whom the data is being gathered.  Unobtrusive, an aspect of non-reactivity, means the subject is not being directly asked, unlike in an interview or online questionnaire.  Less gameable means the subject is less likely to lie, misrepresent or distort the truth.  A huge sample size means lots of people provided data. A sample size is generally determined to be large or small relative to the population it represents. 


Non-reactivity and Big Data  


In Dubai, the government is using an app developed to measure satisfaction with services the government provides.  When someone leaves a governmental agency, they are asked to fill out a simple survey. It looks like this:
Smart Dubai's Happiness App at www.smartdubai.ae
The app is called the happiness meter. Dr. Aisha Bin Bishr, Director General of the Smart Dubai Office, outlined the benefits of the happiness meter: the entire population participates in the data collection, cost is limited to a one-time set-up cost, and data is gathered in real time in a non-invasive manner.



Dr. Aisha Bin Bishr at the Dialogue for Global Happiness

Dubai is not the only city to gather data by an overt method. Other areas have also developed apps to gather data about people living in or visiting an area. In some areas, the government works with community organizers to develop and implement an app. In other areas, the efforts are community driven.


Mappiness, a non-profit in the United Kingdom, is one of the first organizations to develop an app to gather happiness data. They have been gathering data since 2011. Mappiness' findings on the environments that lead to the greatest happiness provide important information to urban planners, park and natural resource managers, and other decision makers seeking to create livable and sustainable urban or rural environments.


When big data is collected with explicit permission from the population being observed, it can be a powerful tool for gathering information on which places, situations and times yield feelings of happiness or unhappiness, and for gaining buy-in when implementing an intervention to promote or protect a population's happiness. This information and process can be extremely useful for decision makers and planners of every sort. That said, the fact that explicit permission is granted by the user may nullify the non-reactivity of the data. By definition, the use of this kind of app to gather data is not unobtrusive.


Seligman used Twitter and Facebook for the big data examples given in his talk. He suggests that data collected from Facebook and Twitter are unobtrusive.  It is not clear whether a Facebook, Twitter or other social media user understands that they are being observed by parties other than those they choose (their Facebook friends, Twitter followers, or LinkedIn connections), or whether a user understands that their data is being gathered.  One must also consider whether the prompt "what's on your mind?" that appears at the top of every Facebook user's timeline is a direct question or a rhetorical one.  Thus, the question of whether data gathered via stealth methods from social media and other sites without express and explicit permission is non-reactive and unobtrusive is not settled.


Gameability and Big Data


To review: unobtrusive means the subject is not being directly asked, and less gameable means the subject is less likely to lie, misrepresent or distort the truth.


If you take the Happiness Index, you will be asked questions about how happy you are and your satisfaction with life and other factors in your life: government, personal economy, social support, time balance, etc. The Happiness Index is one example of a happiness and well-being survey. There are many. 


Depending on the party administering the survey, you may want to make out that you are doing better - or worse - than you actually are. This makes sense if you feel you will be punished or treated differently if you fill out a survey at work or elsewhere saying you are very dissatisfied. It also makes sense if you want to make a good impression on people, and make it seem like you are having the time of your life.  When users do this, it is called gaming.  In surveys, analysis of the data can sometimes reveal when someone is gaming, and this data can be eliminated from a data set. For example, gaming is evident if a survey taker responds without variety at the top or bottom of a scale for every question or if text-box questions are answered with impossible responses (such as, in answer to nationality, someone fills in "Klingon").  There is general consensus among pollsters that most people, when asked about their happiness or satisfaction with life and the conditions of life, respond with the truth.  One way to deal with gaming in surveys is to collect data from a sufficient sample size that the gamed data becomes "noise" (i.e. outliers, etc.). 
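As a rough illustration of the two warning signs described above, here is a minimal sketch in Python. It is not the method used by the Happiness Index or any particular pollster. The 1-to-5 scale, the short allow-list of nationalities, the function names and the sample responses are all assumptions invented for this example.

```python
from typing import Sequence

# Hypothetical allow-list for a free-text "nationality" question.
# A real survey would need a complete list; this one is deliberately tiny.
KNOWN_NATIONALITIES = {"american", "british", "bhutanese", "emirati"}


def is_straight_lined(answers: Sequence[int], scale_min: int, scale_max: int) -> bool:
    """True if every answer is the same value at the top or bottom of the scale."""
    if not answers:
        return False
    first = answers[0]
    return first in (scale_min, scale_max) and all(a == first for a in answers)


def has_impossible_text(nationality: str) -> bool:
    """True if the free-text answer is not on the allow-list."""
    return nationality.strip().lower() not in KNOWN_NATIONALITIES


def looks_gamed(answers: Sequence[int], nationality: str,
                scale_min: int = 1, scale_max: int = 5) -> bool:
    """Flag a response that shows either warning sign."""
    return (is_straight_lined(answers, scale_min, scale_max)
            or has_impossible_text(nationality))


# Invented sample responses, for illustration only.
responses = [
    {"answers": [5, 5, 5, 5, 5], "nationality": "British"},    # no variety, top of scale
    {"answers": [4, 2, 5, 3, 4], "nationality": "Klingon"},    # impossible text answer
    {"answers": [4, 3, 5, 2, 4], "nationality": "Bhutanese"},  # nothing suspicious
]

kept = [r for r in responses if not looks_gamed(r["answers"], r["nationality"])]
print(len(kept), "of", len(responses), "responses kept")
```

A flag like this only marks a response for review. Someone may sincerely answer at the top of the scale for every question, so a person, not the script, should decide what to drop.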


The question of whether people misrepresent themselves on Facebook, presenting a persona they wish for that is far from reality, or whether social media sites provide a venue where people can be their true selves, is not settled. Some believe that the relative anonymity of social media sites provides a venue that allows people to display their genuine feelings and nature. Others are more sceptical, seeing social media sites as rampant with manipulated or constructed reflections of a persona, or, at best, reflections of partial truths about the user.


Another issue with big data is whether it is possible to effectively identify sarcasm, irony or insincerity when collecting data. Sarcasm, irony and insincerity could be considered a type of gaming. Adding further complexity to the question of whether big data is less gameable than survey-based data is context. On most social media sites, and many websites, the user is exposed to advertisements, sponsored posts, tweets or other displays designed to persuade. Many of these persuasion methods are based on an emotional appeal. Persuasive content on social media and other sites is designed to have an effect on a person's feelings and sense of self. On this point, whether one is being gamed when using social media and websites is unclear, and thus the data collected from these sites may reflect less than the truth about how a person would feel or what they would think absent a persuasive or manipulative influence.

In his talk, Seligman used examples of big data on happiness gathered from Facebook. Here is his word cloud for women's happiness:



And his word cloud for men's happiness:


In a word cloud, the biggest words are the most frequently used.   Notice that the biggest words for women are shopping, excited and the symbol for a heart (<3). The biggest words for men are Xbox, himself and vs.
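For readers curious about what sits behind a word cloud, the sizes come from a simple word count. The sketch below is only an illustration under stated assumptions: the three posts are invented, the function name is hypothetical, and it does not reproduce the analysis behind Seligman's clouds, which is not described in this post. It also skips steps a real analysis would likely need, such as removing common words like "the" and "to".

```python
import re
from collections import Counter


def word_frequencies(posts):
    """Count how often each word, or the heart symbol <3, appears across posts."""
    counts = Counter()
    for post in posts:
        # Match the heart symbol first, then runs of letters and apostrophes.
        counts.update(re.findall(r"<3|[a-z']+", post.lower()))
    return counts


# Invented posts, for illustration only.
posts = [
    "so excited to go shopping <3",
    "shopping with friends",
    "excited for the weekend",
]

counts = word_frequencies(posts)
for word, n in counts.most_common(3):
    print(word, n)
```

A word cloud tool would then draw each word at a size proportional to its count.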

One of the last questions in the Happiness Alliance's Happiness Index is "In one word, what makes you happy?" Below is a word cloud of responses from over 1,100 people who took the survey in January 2017:

This word cloud is similar to those generated from larger sample sizes. (The Happiness Alliance's data is available to researchers whose purposes are aligned with its mission.)  Notice the biggest words are family, friends, relationships and nature. Happiness, well-being and quality of life research findings tell us that the factors that contribute to individual and societal happiness and well-being include community, sufficient income, social support (feeling loved, cared for, having someone to turn to when in need), trust and participation in government, psychological and physical health, and other factors. They do not tell us that shopping or playing games yields lasting or meaningful gains to our sense of happiness and well-being or quality of life.

This is not to say that big data collected via Facebook or other social media is of no use. It can be very useful for taking the emotional temperature (affect) of a population and getting an idea of what motivates it. This information can be used to understand where a people's momentum lies and the direction needed for raising awareness and for protecting and promoting people's right to the opportunity to pursue happiness.

Sample Size and Big Data


Seligman and Dr. Bishr both cited the sample size for big data.  In Dubai's case, the population is defined as all who use government services. If everybody who uses government services completes the happiness meter, then the data reflects the entire population of people who use government services. This sample does not include people who do not use government services.

In the case of Seligman, and others who use big data collected from social media sites, the sample size is very large. Whether it reflects the entire population of an area or just those using social media is not clear. For many areas, there are entire segments of a population that do not get online: elders, illiterate, internet-averse, people without online access, some of the poor. 


Ali Mohammed Al Muwaiijei was one of the people in the audience when Seligman spoke. He works for Dubai South for the Government of Dubai. Dubai South is "the emirate's flagship urban project that will set benchmarks for the rest of the emirate in terms of manifesting the themes of happiness as set out in Dubai Plan 2021."  When queried as to whether he felt big data reflected the happiness and well-being of a population, he responded that his department was aware of the issues regarding populations that did not use the internet or were averse to apps and other internet and information technology interfaces. This is good news.

Another issue with the sample size for big data has to do with gaming of a different sort. People can have more than one profile, user interface or account, with multiple emails and other credentials. One may even install bots to make posts at regular intervals. This is the case for hackers, but also for people fulfilling different roles. For example, a person may post on Facebook from a personal page, a page for their employer, and another for a cause.

Adding to the complexity of the issue of sample size in big data is the frequency at which someone uses a social media or other website service. Those who are online all day ("always on") are contributing substantially more data than those who tune in once a day, once a week or more occasionally. Thus, even with a large amount of data, big data may reflect a smaller portion of a population than one may expect. The question of what population big data represents is not clear.
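A small worked example may make the "always on" problem concrete. The sketch below is an illustration under stated assumptions, not an analysis of real data: the user names, the 0-to-10 mood scores and both function names are invented. It compares an average taken over every post with an average in which each person counts once.

```python
from collections import defaultdict


def per_post_mean(posts):
    """Average over every post, so frequent posters count many times."""
    return sum(score for _, score in posts) / len(posts)


def per_person_mean(posts):
    """Average each person's posts first, so every person counts once."""
    by_user = defaultdict(list)
    for user, score in posts:
        by_user[user].append(score)
    user_means = [sum(scores) / len(scores) for scores in by_user.values()]
    return sum(user_means) / len(user_means)


# Invented data: one "always on" user posts eight times with a low mood score,
# while two occasional users each post once with a high mood score.
posts = [("always_on", 2)] * 8 + [("weekly_a", 8), ("weekly_b", 8)]

print(round(per_post_mean(posts), 2))    # about 3.2: dominated by one heavy user
print(round(per_person_mean(posts), 2))  # 6.0: each of the three people counts once
```

Ten posts look like a bigger sample than three people, yet the per-post figure mostly describes one person. Counting each person once also depends on knowing which accounts belong to the same person, which multiple profiles and bots make difficult.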

This does not mean big data has no use. It can be a powerful tool when combined with subjective well-being and objective data. 

Three Proposed Uses for Big Data 


Big data can be a great way for a government to take the temperature of a population or get a sense of where it stands on an issue. Big data can reveal what associations people have with happiness or unhappiness, such as shopping and gaming or family and friends. When a population is associating happiness with activities that may lead to greater isolation or environmental degradation, a government knowledgeable about the known influences on happiness understands that raising awareness about these is appropriate.  When a government or researcher frames big data with survey-based data, they are in a position to exploit its full potential.

Big data is useful for understanding how geography, time use or a situation is affecting the affect (feelings) of a population.  This information can be used by urban planners and others to put in place environments such as parks, fields, and other ways to access nature, as well as spaces for community events. Mappiness provides a good example of this use.

Big data can be collected by working with community members and organizations to conceive and create an app or other instrument for collecting the data. This method engages the people about whom the data is being gathered and gains their buy-in for and participation in decisions and interventions based on the data. This is an excellent way to raise awareness and gather data for a specific purpose.

In conclusion, we can expect governments to collect big data when seeking to understand the happiness and well-being of their people for no other reason than its cachet.  Let's hope cachet is not the reason they use it. Let's hope that when collecting big data, governments will understand its limitations and that they will use it alongside data that reliably reflects the population's sense of happiness and well-being. Let's hope that when gathering happiness and well-being data, governments use it for the purpose of securing, promoting and protecting all people's inalienable right to life, liberty, and the pursuit of happiness, and not to encourage consumption and gaming.


Written by Laura Musikanski, executive director of the Happiness Alliance. 





2 comments:

  1. Excellent article that I have shared in multiple ways.

  2. I definitely appreciate your blog. Excellent work!
