Posts Tagged ‘Problem Solving’

And no maths, no equations.

This is from an article that appeared some time ago, edited for readability by deleting (irritating, light-weight and thoroughly avoidable) references to some IT-specific code snippets and operations. Here you go:

How the Circle Line rogue train was caught with data

The MRT Circle Line (Singapore) was hit by a spate of mysterious disruptions in recent months, causing much confusion and distress to thousands of commuters.

Like most of my colleagues, I take a train on the Circle Line to my office at one-north every morning. So on November 5, when my team was given the chance to investigate the cause, I volunteered without hesitation.

From prior investigations by train operator SMRT and the Land Transport Authority (LTA), we already knew that the incidents were caused by some form of signal interference, which led to loss of signals in some trains. The signal loss would trigger the emergency brake safety feature in those trains and cause them to stop randomly along the tracks.

But the incidents — which first happened in August — seemed to occur at random, making it difficult for the investigation team to pinpoint the exact cause.

We were given a dataset compiled by SMRT that contained the following information:

  • Date and time of each incident
  • Location of incident
  • ID of train involved
  • Direction of train

We started by cleaning the data…

This gave us:

Screenshot 1: Output from initial processing

No clear answers from initial visualisations

We could not find any obvious answers in our initial exploratory analysis, as seen in the following charts:

  1. The incidents were spread throughout a day, and the number of incidents across the day mirrored peak and off-peak travel times.

Figure 1: Number of occurrences mirror peak and off-peak travel times.

  2. The incidents happened at various locations on the Circle Line, with slightly more occurrences on the west side.

Figure 2: The cause of the interference did not seem to be location-based.

  3. The signal interferences did not affect just one or two trains, but many of the trains on the Circle Line. “PV” is short for “Passenger Vehicle”.

Figure 3: 60 different trains were hit by signal interference.


The Marey Chart: Visualising time, location and direction

Our next step was to incorporate multiple dimensions into the exploratory analysis.

We were inspired by the Marey Chart, which was featured in Edward Tufte’s vaunted 1983 classic The Visual Display of Quantitative Information. More recently, it was used by Mike Barry and Brian Card for their extensive visualisation project on the Boston subway system.

In this chart, the vertical axis represents time — chronologically from top to bottom — while the horizontal axis represents stations along a train line. The diagonal lines represent train movement.

Under normal circumstances, a train that runs between HarbourFront and Dhoby Ghaut would move in a line similar to this, with each one-way trip taking just over an hour:

Figure 5: Stylised representation of train movement on Circle Line
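The zigzag line in a chart like this can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name, the 60-minute one-way trip time (the article says "just over an hour") and the decision to ignore turnaround time at each terminus are all assumptions.

```python
def stylised_position(minutes_since_start: float, trip_minutes: float = 60.0) -> float:
    """Return the train's position along the line as a fraction in [0, 1].

    0.0 is one terminus and 1.0 is the other (for example HarbourFront and
    Dhoby Ghaut). trip_minutes is an assumed one-way trip time, and the
    turnaround time at each terminus is ignored.
    """
    # One full cycle is a trip out plus a trip back.
    phase = (minutes_since_start % (2 * trip_minutes)) / trip_minutes  # in [0, 2)
    # First half of the cycle: moving away. Second half: coming back.
    return phase if phase <= 1.0 else 2.0 - phase


# Position at a few points in time (minutes after leaving the first terminus).
for minute in (0, 30, 60, 90, 120):
    print(minute, stylised_position(minute))
```

Plotting time on the vertical axis against this position on the horizontal axis gives the diagonal back-and-forth line described above.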

Our intention was to plot the incidents — which are points instead of lines — on this chart.

Preparing the data for visualisation

With the data processed, we were able to create a scatterplot of all the emergency braking incidents. Each dot here represents an incident. Once again, we were unable to spot any clear pattern of incidents.

Figure 6: Signal interference incidents represented as a scatterplot

Next, we added train direction to the chart by representing each incident as a triangle pointing to the left or right, instead of a dot:

Figure 7: Direction is represented by arrows and colour.

It looked fairly random, but when we zoomed into the chart, a pattern seemed to surface:

Figure 8: Incidents between 6am and 10am

If you read the chart carefully, you would notice that the breakdowns seem to happen in sequence. When a train got hit by interference, another train behind moving in the same direction got hit soon after.

What we’d established was that there seemed to be a pattern over time and location: Incidents were happening one after another, in the opposite direction of the previous incident. It seemed almost like there was a “trail of destruction”…

Could the cause of the interference be a train on the opposite track?

Figure 9: Could it be a train moving in the opposite direction?

We decided to test this “rogue train” hypothesis.

We knew that the travel time between stations along the Circle Line ranges between two and four minutes. This means we could group all emergency braking incidents together if they occur up to four minutes apart.

We found all incident pairs that satisfied this condition. We then grouped all related pairs of incidents into larger sets. This allowed us to group incidents that could be linked to the same “rogue train”. These were some of the clusters that we identified:

[{0, 1},
{2, 4},
{5, 6, 7},
{8, 9},
{18, 19, 20},
{21, 22, 24, 26, 27},
{28, 29, 30, 31, 32, 33, 34},
{42, 44, 45},
{47, 48},
{51, 52, 53, 56}]
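The grouping step described above can be sketched roughly as follows. This is not the authors' code: it assumes incident times are given as minutes since midnight, it pairs incidents on time alone (the real analysis may have used further conditions), and the function name and toy timestamps are made up for illustration.

```python
def cluster_incidents(times_in_minutes, max_gap=4):
    """Group incident indices whose times chain together within max_gap minutes."""
    parent = list(range(len(times_in_minutes)))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        # Merge the groups containing i and j.
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[max(root_i, root_j)] = min(root_i, root_j)

    # Step 1: find every pair of incidents at most max_gap minutes apart.
    for i in range(len(times_in_minutes)):
        for j in range(i + 1, len(times_in_minutes)):
            if abs(times_in_minutes[j] - times_in_minutes[i]) <= max_gap:
                union(i, j)

    # Step 2: collect linked pairs into larger sets, dropping isolated incidents.
    groups = {}
    for i in range(len(times_in_minutes)):
        groups.setdefault(find(i), set()).add(i)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical incident times (minutes since midnight), not real data.
toy_times = [420, 423, 440, 443, 446, 470]
clusters = cluster_incidents(toy_times)
print(clusters)
```

In this toy example the first two incidents form one cluster, the next three chain into a second, and the last incident is left on its own.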

Next, we calculated the percentage of the incidents that could be explained by our clustering algorithm. The result was:

(189, 259, 0.7297297297297297)

What it means: Of the 259 emergency braking incidents in our dataset, 189 cases — or 73% of them — could be explained by the “rogue train” hypothesis. We felt we were on the right track.
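The arithmetic behind that figure is simple enough to sketch. The function below is an assumed illustration rather than the authors' code: it counts the incidents that fall inside any cluster and divides by the total, returning a tuple shaped like the result shown above. The toy clusters are hypothetical.

```python
def coverage(clusters, total_incidents):
    """Return (explained, total, fraction) for a list of incident clusters."""
    explained = sum(len(cluster) for cluster in clusters)
    return explained, total_incidents, explained / total_incidents


# Toy input: two clusters covering 5 of 6 incidents (hypothetical).
print(coverage([{0, 1}, {2, 3, 4}], 6))

# The article's own figures: 189 explained incidents out of 259.
print(round(189 / 259, 4))  # 0.7297
```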

We coloured the incident chart based on the clustering results. Triangles with the same colour are in the same cluster.

Figure 10: Incidents clustered by our algorithm

How many rogue trains are there?

As we showed in Figure 5, each end-to-end trip on the Circle Line takes about 1 hour. We drew best-fit lines through the incident plots and the lines closely matched the line in Figure 5. This strongly implied that there was only one “rogue train”.

Figure 11: Time of clustered incidents strongly implies that the interference could be linked to a single train
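A best-fit line through one cluster of incidents can be sketched with ordinary least squares. This is an assumed illustration, not the authors' method: station positions are represented as simple indices along the line, times are in minutes, and the toy points are made up. The slope (minutes per station) is what one would compare against a train's normal pace of two to four minutes between stations.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical cluster: incidents at consecutive stations, two minutes apart.
station_index = [0, 1, 2, 3, 4]
minutes = [0.0, 2.0, 4.0, 6.0, 8.0]
slope, intercept = fit_line(station_index, minutes)
print(slope, intercept)
```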

We also observed that the unidentified “rogue train” itself did not seem to encounter any signalling issues, as it did not appear on our scatter plots.

Convinced that we had a good case, we decided to investigate further.

Catching the rogue train

After sundown, we went to Kim Chuan Depot to identify the “rogue train”. We could not inspect the detailed train logs that day because SMRT needed more time to extract the data. So we decided to identify the train the old school way — by reviewing video records of trains arriving at and leaving each station at the times of the incidents.

By 3am, the team had found the prime suspect: PV46, a train that has been in service since 2015.

Testing the hypothesis

On November 6 (Sunday), LTA and SMRT tested if PV46 was the source of the problem by running the train during off-peak hours. We were right — PV46 indeed caused a loss of communications between nearby trains and activated the emergency brakes on those trains. No such incident happened before PV46 was put into service on that day.

On November 7 (Monday), my team processed the historical location data of PV46 and concluded that more than 95% of all incidents from August to November could be explained by our hypothesis. The remaining incidents were likely due to signal losses that happen occasionally under normal conditions.

The pattern was especially clear on certain days, like September 1. You can easily see that interference incidents happened during or around the time belts when PV46 was in service.

LTA and SMRT eventually published a joint press release on November 11 to share the findings with the public.

Final thoughts

When we first started, my colleagues and I were hoping to find patterns that may be of interest to the cross-agency investigation team, which included many officers at LTA, SMRT and DSTA. The tidy incident logs provided by SMRT and LTA were instrumental in getting us off to a good start, as minimal cleaning up was required before we could import and analyse the data. We were also gratified by the effective follow-up investigations by LTA and DSTA that confirmed the hardware problems on PV46.

From the data science perspective, we were lucky that incidents happened so close to one another. That allowed us to identify both the problem and the culprit in such a short time. If the incidents were more isolated, the zigzag pattern would have been less apparent, and it would have taken us more time — and data — to solve the mystery.

Of course, we were most pleased that all of us can now take the Circle Line to work with confidence again.

Daniel Sim, Lee Shangqian and Clarence Ng are data scientists at GovTech’s Data Science Division.



Source: https://blog.data.gov.sg/how-we-caught-the-circle-line-rogue-train-with-data-79405c86ab6a


mothers are not far behind with their digital leash!











Source: www


Every interaction with a customer, including a complaint, is an opportunity to build or strengthen our bridges with our customers. Very often we find our customer-facing staff blowing this opportunity that lands in our lap for free. To better understand this gift, recall what we go through when we go out to engage a customer unsolicited.

And how do we blow it? Usually by keeping our interaction down to the crisp, minimal response demanded by the context. Technically flawless, business-wise not so wise. Of course, at the other extreme, we might have a loquacious rep overdoing it and pushing the customer to annoyance.

What then do we do with this opportunity? Well, there are several avenues to be explored: we could gain useful insights into his decision making process (why or how did he settle on our product?), his experience with competitors, his post-purchase impressions, what else would he like to see as features, does he see enough of our brand publicity… If it is a complaint, information about events leading to the failure could be collected.  Did he have other issues/signals before the failure occurred?  Does he have thoughts on how this failure could have been possibly averted? Of course what would work depends on the temperature of the call.

All of these cannot happen without orienting our customer-facing staff adequately, constructing different possible scenarios and outlining avenues for enriching the interaction.  

Note that outsourced call-centers are optimized to maximize the number of calls handled in a day rather than the quality of engagement with the caller, which eliminates this opportunity altogether.

Incidentally all of the above apply to our interactions with prospects too.

Here’s a short, well-written piece from Art Petty on the same theme, exhorting us to have transformational interactions instead of transactional ones. A personal experience is included. So why settle for less when the potential benefits could be dramatic?



At least to me, it’s new. Never thought the joke could be on us, not about someone from south-of-boondocks as I had imagined.

A policeman sees a drunk staring at the ground beneath a streetlight. “What are you doing?” the cop asks.

“Looking for my keys.” says the drunk. “I dropped them in the dark alley over there.”

“Then why are you over here?” asks the policeman, confused.

“Because the light’s so much better over here.”

The streetlights are our controlled environments where we look for answers: labs, classrooms, fixed timetables, and clear metrics. But things are more fluid in the real world. For that we need to rely more on tacit knowledge from our experience.




Source: conversationagent.com/2016/07/striving-for-conciseness-and-clarity.html while talking about ‘Streetlights and Shadows: Searching for the Keys to Adaptive Decision Making’, a book by research psychologist Gary Klein, a pioneer in naturalistic decision making.




for the local farmers to save their cattle!

Reminds me of the reported use some time ago of a pre-recorded tiger’s roar to drive stray elephants back into the forests and save the crops from damage.


A British scientist, Dr Neil Jordan, believes that if he can stop African lions killing farmers’ cattle, then farmers will stop killing the endangered lions. ‘Farmers currently have very few effective tools to prevent this devastating lion-livestock conflict. Unfortunately shooting or poisoning predators is not only used as a last resort, farmers often feel it is their only resort,’ Dr Jordan said.

The conservation biologist, who works with the University of New South Wales and Taronga Zoo in Sydney, is trialling his theory in the Okavango Delta in Botswana.


‘Lions are supreme ambush predators, they rely on stealth. When seen they lose this element of surprise and abandon their hunt,’ he said.

The idea is to trick the big cats into thinking they have been seen by drawing eyes on the back of the cows, so that they are intimidated and do not attack. The scientist has already carried out a small 3-month sample test of his theory, which gave promising results. ‘While 3 out of 39 unpainted cows were killed by lions, none of the 23 painted cows from the same herd were killed,’ he said.

Dr Jordan is now fundraising to be able to buy more of the equipment needed to carry out further tests. He hopes his idea will provide local farmers with a low-cost and non-lethal tool to reduce livestock losses without having to kill lions.

African lion populations are in decline throughout most of the continent. ‘In 1975 there was an estimated 250,000 lions in Africa, yet today the continent wide population stands at a mere 25 – 30,000 individuals. This staggering 80-90 per cent decline combines with the fragmentation and isolation of those remaining sub-populations with little long-term viability,’ World Lion Day reports.





Source: dailymail.co.uk/news/article-3675414/British-scientist-stops-lions-hunting-cattle-Botswana-painting-faces-behinds.html dated 5 July 2016 vide sampspeak.in.



Well, only if you look at it the right way. And, whoever said maths guys don’t make a living?

Read on. This short, amazing piece (no maths in it, I assure you) is from Dan Lewis, whose posts cover varied topics and are interesting and easy to read (here).

Seeing is Disbelieving


During World War II, the UK and U.S. focused their air warfare plans on the use of strategic bombing, employing long- and short-range aircraft to lead the way and provide ground infantry with an upper hand. Much of the industrial war complexes of both these nations were focused on producing planes, and ensuring the safe return of an expensive, slow-to-produce bomber was a priority. After all, a plane that can make five or perhaps ten runs was worth much more than one which failed to return after a mission or two.

Of course, planes which came back often did so damaged. It made sense to repair those planes. The typical repair job came with additional armor added to the bullet hole-riddled areas of the plane, reinforcing the areas which took the most damage. And, in theory, it would also make sense to add additional armor in those places.

Until a statistician named Abraham Wald stepped in.

Wald earned a Ph.D. in mathematics from the University of Vienna in 1931, but, because he was Jewish, was unable to find a job in Austria. He managed to emigrate to the United States shortly after the Nazi annexation of Austria in 1938, and ended up studying econometrics for the Cowles Commission for Research in Economics, then based in Chicago. Either while at that post or shortly thereafter, he ended up on a data gathering project for the U.S. military. He was charged with looking at planes which had returned from battle, and recording where they had taken the most damage. As seen above (via the National World War II Museum), he put together a crude before-and-after diagram. The “after” image — the plane on the right — showed where the majority of the damage was, as indicated by the shaded regions. Wald determined that most of the plane — the wings, nose, and fuselage — had taken the worst beating, while the cockpit and tail were generally unharmed. Wald’s superiors suggested that the shaded areas receive additional armor.

Wald, though, objected. If planes were returning with damage to the shaded areas, then, Wald argued, the shaded areas needed the least reinforcement. After all, the planes were able to take significant damage to those areas yet still return. Wald theorized (and mathematically explored, in this pdf) that the fact that the planes lacked damage in the cockpit and tail was more telling. Certainly, the Axis’ targeting of Allies’ planes was both indiscriminate and imprecise; there was little reason to believe that the Axis forces were aiming for, say, the nose, and intentionally avoiding striking the tail. Some planes had to have taken significant damage to the tail and cockpit, and all of those planes had something in common: they, unlike the ones in Wald’s data set, did not return to base.

On Wald’s advice, the U.S. military leadership reinforced the cockpits and tails on its planes.
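Wald’s reasoning can be illustrated with a toy calculation. Everything in it is an assumption made for illustration, not Wald’s actual model or data: each plane takes one hit in a uniformly random section, and the survival probabilities per section are invented. The point is only that the damage seen on returning planes is skewed towards the sections where a hit is survivable.

```python
def observed_damage_share(survival_prob):
    """Share of hits per section as seen on planes that made it back.

    Assumes every section is equally likely to be hit, so the share seen on
    returning planes is proportional to the chance of surviving that hit.
    """
    total = sum(survival_prob.values())
    return {section: p / total for section, p in survival_prob.items()}


# Hypothetical survival probabilities after a hit to each section.
assumed_survival = {
    "wings": 0.9,
    "nose": 0.9,
    "fuselage": 0.9,
    "cockpit": 0.3,
    "tail": 0.3,
}

shares = observed_damage_share(assumed_survival)
for section, share in shares.items():
    print(section, round(share, 3))
```

Although hits are spread evenly in this sketch, the returning planes show far fewer cockpit and tail hits, which is exactly why the undamaged areas are the ones to reinforce.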



