
Posts Tagged ‘Statistics’

The context is the recent US elections that belied every prediction made by the media and the ‘elites’. The article by William Davies that appeared recently in The Guardian (here) attempts an interesting analysis and identifies a social phenomenon whose significance, as you’ll see, goes far beyond its immediate context of the US elections. As I am neither knowledgeable about US politics nor interested enough to take a stand on it, I’ve excised the author’s political views from the extract below, except where doing so would severely affect readability, to keep the focus on the subject. Also, the long linear text is now structured with titles for a better understanding of the author’s case. Otherwise, every attempt is made to preserve the fidelity of the author’s thoughts to the extent I understood them. Here we go:

How statistics lost their power…

(the text within “ ” is edited/recast in places for the same reasons; any emphasis is my doing)

  1. The phenomenon – rejection of Statistics:

“In theory, statistics should help settle arguments. They ought to provide stable reference points that everyone – no matter what their politics – can agree on. Yet in recent years, divergent levels of trust in statistics have become one of the key schisms that have opened up in western liberal democracies. Shortly before the November presidential election, a study in the US discovered that 68% of Trump supporters distrusted the economic data published by the federal government. In the UK, a research project by Cambridge University and YouGov looking at conspiracy theories discovered that 55% of the population believes that the government “is hiding the truth about the number of immigrants living here”.

Rather than defusing controversy and polarisation, it seems as if statistics are actually stoking them. Not only are statistics viewed by many as untrustworthy, there appears to be something almost insulting or arrogant about them. Reducing social and economic issues to numerical aggregates and averages blind to local variability seems to violate some people’s sense of political decency.

Nowhere is this more vividly manifest than with immigration. The think-tank British Future has studied how best to win arguments in favour of immigration and multiculturalism. One of its main findings is that people often respond warmly to qualitative evidence, such as the stories of individual migrants and photographs of diverse communities. But statistics – especially regarding alleged benefits of migration to Britain’s economy – elicit quite the opposite reaction. People assume that the numbers are manipulated and dislike the elitism of resorting to quantitative evidence. Presented with official estimates of how many immigrants are in the country illegally, a common response is to scoff. Far from increasing support for immigration, British Future found, pointing to its positive effect on GDP can actually make people more hostile to it. GDP itself has come to seem like a Trojan horse for an elitist liberal agenda. Sensing this, politicians have now largely abandoned discussing immigration in economic terms.

The declining authority of statistics – and the experts who analyse them – is at the heart of the crisis that has become known as “post-truth” politics. And in this uncertain new world, attitudes towards quantitative expertise have become increasingly divided. From one perspective, grounding politics in statistics is elitist, undemocratic and oblivious to people’s emotional investments in their community and nation. It is just one more way that privileged people in London, Washington DC or Brussels seek to impose their worldview on everybody else.

  2. Statistics is not elitist:

From the opposite perspective, statistics are quite the opposite of elitist. They enable journalists, citizens and politicians to discuss society as a whole, not on the basis of anecdote, sentiment or prejudice, but in ways that can be validated. We need to try and see them for what they are: neither unquestionable truths nor elite conspiracies, but rather as tools designed to simplify the job of government, for better or worse. The alternative to quantitative expertise is less likely to be democracy than an unleashing of tabloid editors and demagogues to provide their own “truth” of what is going on across society…”

  3. History and motivation for Statistics as a state tool:

“…In the second half of the 17th century, in the aftermath of prolonged and bloody conflicts, European rulers adopted an entirely new perspective on the task of government, focused upon demographic trends – an approach made possible by the birth of modern statistics. Since ancient times, censuses had been used to track population size, but these were costly and laborious to carry out and focused on citizens who were considered politically important (property-owning men), rather than society as a whole. Statistics offered something quite different, transforming the nature of politics in the process.

The emergence of government advisers claiming scientific authority, rather than political or military acumen, represents the origins of the expert culture now in disrepute. These path-breaking individuals were neither pure scholars nor government officials, but hovered somewhere between the two. They were enthusiastic amateurs who offered a new way of thinking about populations…Thanks to their mathematical prowess, they believed they could calculate what would otherwise require a vast census to discover.

There was initially only one client for this type of expertise…Only centralised nation states had the capacity to collect data across large populations in a standardised fashion and only states had any need for such data in the first place. Over the second half of the 18th century, European states began to collect more statistics of the sort that would appear familiar to us today. Casting an eye over national populations, states became focused upon a range of quantities: births, deaths, baptisms, marriages, harvests, imports, exports, price fluctuations. Things that would previously have been registered locally and variously at parish level became aggregated at a national level.

New techniques were developed to represent these indicators, which exploited both the vertical and horizontal dimensions of the page, laying out data in matrices and tables, just as merchants had done with the development of standardised book-keeping techniques in the late 15th century. Organising numbers into rows and columns offered a powerful new way of displaying the attributes of a given society. Large, complex issues could now be surveyed simply by scanning the data laid out across a single page.

These innovations carried extraordinary potential for governments. By simplifying diverse populations down to specific indicators, and displaying them in suitable tables, governments could circumvent the need to acquire broader detailed local and historical insight.

  4. Criticisms:

Of course, viewed from a different perspective, this blindness to local cultural variability is precisely what makes statistics vulgar and potentially offensive. Regardless of whether a given nation had any common cultural identity, statisticians would assume some standard uniformity or, some might argue, impose that uniformity upon it.

Not every aspect of a given population can be captured by statistics. There is always an implicit choice in what is included and what is excluded, and this choice can become a political issue in its own right. The fact that GDP only captures the value of paid work, thereby excluding the work traditionally done by women in the domestic sphere, has made it a target of feminist critique since the 1960s. In France, it has been illegal to collect census data on ethnicity since 1978, on the basis that such data could be used for racist political purposes. (This has the side-effect of making systemic racism in the labour market much harder to quantify.)

Despite these criticisms, the aspiration to depict a society in its entirety, and to do so in an objective fashion, has meant that various progressive ideals have been attached to statistics – ideals of “evidence-based policy”, rationality, progress and nationhood grounded in facts, rather than in romanticised stories…”

  5. Moving from state to political arena and into private hands:

“…The potential of statistics to reveal the state of the nation was seized in post-revolutionary France. The Jacobin state set about imposing a whole new framework of national measurement and national data collection. The world’s first official bureau of statistics was opened in Paris in 1800. Uniformity of data collection, overseen by a centralised cadre of highly educated experts, was an integral part of the ideal of a centrally governed republic, which sought to establish a unified, egalitarian society… Statistics played an increasingly important role in the public sphere, informing debate in the media, providing social movements with evidence they could use. Over time, the production and analysis of such data became less dominated by the state. Academic social scientists began to analyse data for their own purposes, often entirely unconnected to government policy goals. By the late 19th century, reformers such as Charles Booth in London and WEB Du Bois in Philadelphia were conducting their own surveys to understand urban poverty.


To recognise how statistics have been entangled in notions of national progress, consider the case of GDP. It is fiendishly difficult to get this single number right, and efforts to calculate it began, like so many mathematical techniques, as a matter of marginal, somewhat nerdish interest during the 1930s. It was only elevated to a matter of national political urgency by the Second World War, when governments needed to know whether the national population was producing enough to keep up the war effort. In the decades that followed, this single indicator, though never without its critics, took on a hallowed political status, as the ultimate barometer of a government’s competence. Whether GDP is rising or falling is now virtually a proxy for whether society is moving forwards or backwards.

Or take the example of opinion polling, an early instance of statistical innovation occurring in the private sector. During the 1920s, statisticians developed methods for identifying a representative sample of survey respondents, so as to glean the attitudes of the public as a whole. This breakthrough, which was first seized upon by market researchers, soon led to the birth of opinion polling. This new industry immediately became the object of public and political fascination, as the media reported on what this new science told us about what “women” or “Americans” or “manual labourers” thought about the world.

As indicators of health, prosperity, equality, opinion and quality of life have come to tell us who we are collectively and whether things are getting better or worse, politicians have leaned heavily on statistics to buttress their authority. Often, they lean too heavily, stretching evidence too far, interpreting data too loosely, to serve their cause. But that is an inevitable hazard of the prevalence of numbers in public life, and need not necessarily trigger the type of wholehearted rejections of expertise that we have witnessed recently…”

  6. What has changed now to cause resentment?

“… For roughly 350 years, the great achievement of statisticians has been to reduce the complexity and fluidity of national populations into manageable, comprehensible facts and figures. Yet in recent decades, the world has changed dramatically, thanks to the cultural politics that emerged in the 1960s and the reshaping of the global economy that began soon after. It is not clear that the statisticians have always kept pace with these changes… Efforts to represent demographic, social and economic changes in terms of simple, well-recognised indicators are losing legitimacy.

Holistic view is no longer adequate:

Consider the changing political and economic geography of nation states over the past 40 years. The statistics that dominate political debate are largely national in character: poverty levels, unemployment, GDP, net migration. But the geography of capitalism has been pulling in somewhat different directions. Plainly globalisation has not rendered geography irrelevant. In many cases it has made the location of economic activity far more important, exacerbating the inequality between successful locations (such as London or San Francisco) and less successful locations (such as north-east England or the US rust belt). The key geographic units involved are no longer nation states. Rather, it is cities, regions or individual urban neighbourhoods that are rising and falling.

The ideal of the nation as a single community, bound together by a common measurement framework, is harder and harder to sustain. If you live in one of the towns in the Welsh valleys that was once dependent on steel manufacturing or mining for jobs, politicians talking of how “the economy” is “doing well” are likely to breed additional resentment. From that standpoint, the term “GDP” fails to capture anything meaningful or credible.

When macroeconomics is used to make a political argument, this implies that the losses in one part of the country are offset by gains somewhere else. Headline-grabbing national indicators, such as GDP and inflation, conceal all sorts of localised gains and losses that are less commonly discussed by national politicians. Immigration may be good for the economy overall, but this does not mean that there are no local costs at all. So when politicians use national indicators to make their case, they implicitly assume some spirit of patriotic mutual sacrifice on the part of voters: you might be the loser on this occasion, but next time you might be the beneficiary. But what if the tables are never turned? What if the same city or region wins over and over again, while others always lose? On what principle of give and take is that justified?
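The arithmetic behind this point can be made concrete with a small sketch. The regions and figures below are invented purely for illustration (they do not come from the article); the only purpose is to show how a healthy national number can sit on top of mostly shrinking regions.

```python
# Hypothetical figures, invented for illustration only (not from the article).
# Each region maps to (output last year, output this year) in arbitrary units.
regions = {
    "Capital city": (500.0, 560.0),
    "Region A": (200.0, 192.0),
    "Region B": (150.0, 144.0),
    "Region C": (150.0, 147.0),
}


def growth(before, after):
    """Percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


# The headline figure: growth of the summed, national output.
total_before = sum(before for before, _ in regions.values())
total_after = sum(after for _, after in regions.values())
national_growth = growth(total_before, total_after)

# The local picture: growth computed region by region.
regional_growth = {name: growth(b, a) for name, (b, a) in regions.items()}
losers = [name for name, g in regional_growth.items() if g < 0]

print(f"National growth: {national_growth:+.1f}%")
for name, g in regional_growth.items():
    print(f"  {name}: {g:+.1f}%")
print(f"Regions shrinking despite national growth: {len(losers)} of {len(regions)}")
```

In this made-up case one large, fast-growing location outweighs three declining ones, so the national indicator rises even though most places are worse off.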

In Europe, the currency union has exacerbated this problem. The indicators that matter to the European Central Bank (ECB), for example, are those representing half a billion people. The ECB is concerned with the inflation or unemployment rate across the eurozone as if it were a single homogeneous territory, at the same time as the economic fate of European citizens is splintering in different directions, depending on which region, city or neighbourhood they happen to live in. Official knowledge becomes ever more abstracted from lived experience, until that knowledge simply ceases to be relevant or credible.

The privileging of the nation as the natural scale of analysis is one of the inbuilt biases of statistics that years of economic change have eaten away.

Classification is not simple: ‘Boxes’ oversimplify

Another inbuilt bias that is coming under increasing strain is classification. Part of the job of statisticians is to classify people by putting them into a range of boxes created by the statistician (not by the respondents): employed or unemployed, married or unmarried, pro-Europe or anti-Europe. So long as people can be placed into categories in this way, it becomes possible to discern how far a given classification extends across the population.

This can involve somewhat reductive choices. To count as unemployed, for example, a person has to report to a survey that they are involuntarily out of work, even if it may be more complicated than that in reality. Many people move in and out of work all the time, for reasons that might have as much to do with health and family needs as labour market conditions. But thanks to this simplification, it becomes possible to identify the rate of unemployment across the population as a whole.

Classification is not simple: Often ‘Boxes’ do not capture intensity

“…Unemployment is one example. The fact that Britain got through the Great Recession of 2008-13 without unemployment rising substantially is generally viewed as a positive achievement. But the focus on “unemployment” masked the rise of underemployment, that is, people not getting a sufficient amount of work or being employed at a level below that which they are qualified for. That is, the intensity of employment is not captured. Underemployment currently accounts for around 6% of the “employed” labour force…This is not a criticism of bodies such as the Office for National Statistics (ONS), which does now produce data on underemployment. But so long as politicians continue to deflect criticism by pointing to the unemployment rate, the experiences of those struggling to get enough work or to live on their wages go unrepresented in public debate. It wouldn’t be all that surprising if these same people became suspicious of policy experts and the use of statistics in political debate, given the mismatch between what politicians say about the labour market and the lived reality… Opinion polling may be suffering for similar reasons.
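A rough sketch of how much the “employed” box can hide. Only the 6% share is taken from the text above; the size of the labour force and the 5% unemployment rate are assumptions made up for the example, and the combined rate is an illustrative measure of my own, not an official statistic.

```python
# Hypothetical labour force, invented for illustration only.
labour_force = 1_000_000
unemployed = 50_000  # assumed: a 5% headline unemployment rate
employed = labour_force - unemployed

# "Around 6%" of the employed are underemployed, as quoted in the text.
underemployed_share_of_employed = 0.06
underemployed = round(employed * underemployed_share_of_employed)

# The headline number only counts people in the "unemployed" box.
unemployment_rate = 100 * unemployed / labour_force

# An illustrative combined measure: unemployed plus underemployed.
underutilised_rate = 100 * (unemployed + underemployed) / labour_force

print(f"Headline unemployment rate:   {unemployment_rate:.1f}%")
print(f"Unemployed or underemployed:  {underutilised_rate:.1f}%")
```

Under these assumed numbers, the people struggling for work are roughly twice as many as the headline rate suggests, which is the mismatch the author is pointing at.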

Classification is not simple: Respondents create their own ‘boxes’

The rise of identity politics since the 1960s has put additional strain on such systems of classification. Statistical data is only credible if people will accept the limited range of demographic categories that are on offer, which are selected by the expert not the respondent. But where identity becomes a political issue, people demand to define themselves on their own terms, where gender, sexuality, race or class is concerned.

Classification is not simple: Other problems not discussed in the article

Example: ‘Boxes’ may not be mutually exclusive.

 

  7. Big Data threatens to damage the ideal of quantitative expertise and its role in political debate

 In recent years, a new way of quantifying and visualising populations has emerged that potentially pushes statistics to the margins, ushering in a different era altogether. Statistics, collected and compiled by technical experts, are giving way to data that accumulates by default, as a consequence of sweeping digitisation. Traditionally, statisticians have known which questions they wanted to ask regarding which population, then set out to answer them. By contrast, data is automatically produced whenever we swipe a loyalty card, comment on Facebook or search for something on Google. As our cities, cars, homes and household objects become digitally connected, the amount of data we leave in our trail will grow even greater. In this new world, data is captured first and research questions come later.

In the long term, the implications of this will probably be as profound as the invention of statistics was in the late 17th century. The rise of “big data” provides far greater opportunities for quantitative analysis than any amount of polling or statistical modelling. But it is not just the quantity of data that is different. It represents an entirely different type of knowledge, accompanied by a new mode of expertise.

Concern about the alignment of the analytics with the broader interests of society:

“… The majority of us are entirely oblivious to what all this data says about us, either individually or collectively. There is no equivalent of an Office for National Statistics for commercially collected big data. We live in an age in which our feelings, identities and affiliations can be tracked and analysed with unprecedented speed and sensitivity – but there is very little to help anchor it in any shared reality in the public interest or public debate… It will fall to the new digital elite to identify the facts, projections and truth amid the rushing stream of data that results… less well suited to making the kinds of unambiguous, objective, potentially consensus-forming claims about society that statisticians and economists are paid for… Whether indicators such as GDP and unemployment continue to carry political clout remains to be seen.

With the authority of statistics waning, and nothing stepping into the public sphere to replace it, people can live in whatever imagined community they feel most aligned to and willing to believe in, and bestow upon themselves whatever identity they choose, free of any classification imposed on them; not everything can be reliably referred back to some enlightened ideal of the nation state as guardian of the public interest. Where statistics can be used to correct faulty claims about the economy or society or population, in an age of data analytics there are few mechanisms to prevent people from giving way to their instinctive reactions or emotional prejudices…”

Concern about the data and the analytics not being in the public domain:

What is less clear is how the benefits of digital analytics might ever be offered to the public, in the way that many statistical data sets are. Bodies such as the Open Data Institute, co-founded by Tim Berners-Lee, campaign to make data publicly available, but have little leverage over the corporations where so much of our data now accumulates. Statistics began life as a tool through which the state could view society, but gradually developed into something that academics, civic reformers and businesses had a stake in. But for many data analytics firms, secrecy surrounding methods and sources of data is a competitive advantage that they will not give up voluntarily… The anonymity and secrecy of the new analytics potentially make them, or whoever has access to them, far more politically powerful than any social scientist.

A company such as Facebook has the capacity to carry out quantitative social science on hundreds of millions of people, at very low cost. But it has very little incentive to reveal the results. In 2014, when Facebook researchers published results of a study of “emotional contagion” that they had carried out on their users – in which they altered news feeds to see how it affected the content that users then shared in response – there was an outcry that people were being unwittingly experimented on. So, from Facebook’s point of view, why go to all the hassle of publishing? Why not just do the study and keep quiet…”

Concern about the potential use of new capabilities to advantageously produce partial truths/untruths for public consumption:

“…What is most politically significant about this shift from a logic of statistics to one of data is how comfortably it sits with the rise of populism. Populist leaders can heap scorn upon traditional experts, such as economists and pollsters, while trusting in a different form of numerical analysis altogether. Such politicians rely on a new, less visible elite, who seek out patterns from vast data banks, but rarely make any public pronouncements, let alone publish any evidence. These data analysts are often physicists or mathematicians, whose skills are not developed for the study of society at all. This, for example, is the worldview propagated by Dominic Cummings, former adviser to Michael Gove and campaign director of Vote Leave. “Physics, mathematics and computer science are domains in which there are real experts, unlike macro-economic forecasting,” Cummings has argued.

Figures close to Donald Trump, such as his chief strategist Steve Bannon and the Silicon Valley billionaire Peter Thiel, are closely acquainted with cutting-edge data analytics techniques, via companies such as Cambridge Analytica, on whose board Bannon sits. During the presidential election campaign, Cambridge Analytica drew on various data sources to develop psychological profiles of millions of Americans, which it then used to help Trump target voters with tailored messaging.

This ability to develop and refine psychological insights across large populations is one of the most innovative and controversial features of the new data analysis… techniques of “sentiment analysis”, which detect the mood of large numbers of people by tracking indicators such as word usage on social media, become incorporated into political campaigns… In a world where the political feelings of the general public are becoming this traceable, who needs pollsters?

  8. Conclusion:

“…privacy and human rights law represents a potential obstacle to the extension of data analytics…

A post-statistical society is a potentially frightening proposition, not because it would lack any forms of truth or expertise altogether, but because it would drastically privatise them. Statistics are one of many pillars of liberalism… The experts who produce and use them have become painted as arrogant and oblivious to the emotional and local dimensions of politics. No doubt there are ways in which data collection could be adapted to reflect lived experiences better. But the battle that will need to be waged in the long term is not between an elite-led politics of facts versus a populist politics of feeling. It is between those still committed to public knowledge and public argument and those who profit from the ongoing disintegration of those things…”

End


Well, only if you look at it the right way. And, whoever said maths guys don’t make a living?

Read on – this short, amazing piece – no maths in it, I assure you – is from Dan Lewis, whose posts cover varied topics and are interesting and easy to read (here).

Seeing is Disbelieving

[Image: before-and-after diagram of plane damage, via the National World War II Museum]

During World War II, the UK and U.S. focused their air warfare plans on the use of strategic bombing, employing long- and short-range aircraft to lead the way and provide ground infantry with an upper hand. Much of the industrial war complex of both these nations was focused on producing planes, and ensuring the safe return of an expensive, slow-to-produce bomber was a priority. After all, a plane that could make five or perhaps ten runs was worth much more than one which failed to return after a mission or two.

Of course, planes which came back often did so damaged. It made sense to repair those planes. The typical repair job came with additional armor added to the bullet hole-riddled areas of the plane, reinforcing the areas which took the most damage. And, in theory, it would also make sense to add additional armor in those places.

Until a statistician named Abraham Wald stepped in.

Wald earned a Ph.D. in mathematics from the University of Vienna in 1931, but, because he was Jewish, was unable to find a job in Austria. He managed to emigrate to the United States shortly after the Nazi annexation of Austria in 1938, and ended up studying econometrics for the Cowles Commission for Research in Economics, then based in Chicago. Either while at that post or shortly thereafter, he ended up on a data gathering project for the U.S. military. He was charged with looking at planes which had returned from battle, and recording where they had taken the most damage. As seen above (via the National World War II Museum), he put together a crude before-and-after diagram. The “after” image — the plane on the right — showed where the majority of the damage was, as indicated by the shaded regions. Wald determined that most of the plane — the wings, nose, and fuselage — had taken the worst beating, while the cockpit and tail were generally unharmed. Wald’s superiors suggested that the shaded areas receive additional armor.

Wald, though, objected. If planes were returning with damage to the shaded areas, then, Wald argued, the shaded areas needed the least reinforcement. After all, the planes were able to take significant damage to those areas yet still return. Wald theorized (and mathematically explored, in this pdf) that the fact that the planes lacked damage in the cockpit and tail was more telling. Certainly, the Axis’ targeting of Allies’ planes was both indiscriminate and imprecise; there was little reason to believe that the Axis forces were aiming for, say, the nose, and intentionally avoiding striking the tail. Some planes had to have taken significant damage to the tail and cockpit, and all of those planes had something in common: they, unlike the ones in Wald’s data set, did not return to base.

On Wald’s advice, the U.S. military leadership reinforced the cockpits and tails on its planes.
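Wald’s reasoning can be tried out with a small simulation. This is only a sketch under assumptions of my own: hits land uniformly at random across five sections (the “indiscriminate and imprecise” targeting described above), and the chance that a single hit brings a plane down is invented for each section. It is not Wald’s actual method or data.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

SECTIONS = ["wings", "nose", "fuselage", "cockpit", "tail"]

# Assumed chance that one hit to a section brings the plane down.
# These numbers are invented for illustration, not historical estimates.
LOSS_PROB = {"wings": 0.05, "nose": 0.05, "fuselage": 0.05,
             "cockpit": 0.6, "tail": 0.6}


def fly_mission(hits=3):
    """Simulate one plane: return (list of sections hit, whether it returned)."""
    taken = [random.choice(SECTIONS) for _ in range(hits)]
    survived = all(random.random() > LOSS_PROB[s] for s in taken)
    return taken, survived


all_hits = {s: 0 for s in SECTIONS}       # what really happened in the air
returned_hits = {s: 0 for s in SECTIONS}  # what the ground crew gets to count

for _ in range(10_000):
    taken, survived = fly_mission()
    for s in taken:
        all_hits[s] += 1
        if survived:
            returned_hits[s] += 1

for s in SECTIONS:
    print(f"{s:9s} hits on all planes: {all_hits[s]:5d}   "
          f"hits seen on returned planes: {returned_hits[s]:5d}")
```

Every section is hit about equally often, yet the planes that make it back show far fewer cockpit and tail hits, simply because planes hit there tend not to return. The missing damage is the evidence.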

 

End
