
Mini review: “Invisible Women” by Caroline Criado Perez

Now that this exists, there is no excuse – everyone should read this book. As a white male, I never had to consider the many, many points raised. As a father to daughters, I believe they should not have to grow up and live in a world that is hostile to their gender – and neither should any woman. Women in other parts of the world suffer even more.

One particularly poignant issue raised is how women are left in a worse position than men after a pandemic – in this book from 2019 the author was talking about SARS and Ebola. The effects of Covid-19 are going to dwarf the effects of those outbreaks. Everybody will be impacted but – although it won’t necessarily be obvious or even studied in depth (it should be) – I suspect women will be impacted far more than men, especially in the long term.

It doesn’t have to be this way.

Humans Need Not Apply…

This great video explains who is under threat from automation in the workplace. Watch it and, like me, try to think what you can do about it…

See also:

• Spare Cycles: Article: Better Than Human (Wired)

• Spare Cycles: Mini review: “The Second Machine Age” by Erik Brynjolfsson and Andrew McAfee (audiobook version)

• Spare Cycles: Mini Review: Race Against The Machine

• Spare Cycles: Mini review: The Lights in the Tunnel: Automation, Accelerating Technology and the Economy of the Future

• Spare Cycles: Article: Migrant Workers in China Face Competition from Robots (Technology Review)

• BuzzMachine.com: The jobless future

• Douglas Rushkoff : Are Jobs Obsolete?

• Wired: Raging Bulls: How Wall Street Got Addicted to Light-Speed Trading

Discussing Artificial Intelligence on “Triangulation” with Leo Laporte

This week Leo was discussing some of the threats that might materialise as a result of improvements in the field of Artificial Intelligence.

Thinking back, the subject of AI has come up a few times before. These episodes give some different viewpoints on the subject and are well worth checking out.


Mini review: “The Second Machine Age” by Erik Brynjolfsson and Andrew McAfee (audiobook version)



This is essentially an updated and re-written version of the authors’ earlier book Race Against The Machine, although it has a much more positive outlook on technology and the future it will shape.

The book highlights three things that are contributing to the increasing influence of machines:

  • exponential growth in the capabilities of computer hardware (and the subsequent lower cost of providing a digital service)
  • more and more aspects of life are becoming digital and this will continue
  • combinatorial forces (taking existing technologies and putting them together in new ways).

The authors excel when they look at how existing technological change has economically affected different people and how the situation has altered over time. Amongst other things, they explain why some people can become incredibly (obscenely) rich whilst others will no longer be able to escape their original economic class and improve their economic situation. They successfully give the impression that they have carefully looked at the evidence and come to reasonable conclusions.

They also offer some tips on how to “race with the machines”.  Sometimes a human working in conjunction with a machine can achieve a better result than a computer simply replacing a human.

A comment about the narration on the audiobook version: what you have here is a bog-standard American narrator doing a distinctly average job of conveying the content of the book. It achieves nothing more. Any attempt to try a different accent or pronounce foreign words totally fails. It seems to me that this is the de facto voice used to appeal to American businessmen, but I’m sure that other narrators could do a much better job. Good narration can add so much. They should have gone for a narrator who is used to doing fiction to bring this book alive. There are many valuable points and arguments made here, so the publishers should make it as accessible as possible.

Next steps towards Data Science

After my last Open University course (Analysing Data) ended in June, I gave myself the summer off. I read a number of books and enjoyed spending the time that was previously taken up by study with my family. I decided that the time demands of another OU course (along with work commitments) would be too much at the moment.

But that does not mean that I can’t carry on towards my goal of moving into the field of data science – I just need to approach it at my own pace.

There are a few new books that are starting to get to grips with what is a new and largely undefined subject, especially “Big Data”, “Doing Data Science” and “Learning R”. I feel that people can now become a lot more informed about what is involved. There is now more flesh on the bones. From here on in I’ll know what is actually involved and what is needed to get there. The more I do and the more I find out, the more challenging it seems. Daunting even. But I firmly believe this is a foundation for the future and I fully intend to take part, even if there is a long way to go.

So, what’s next? These are things that can be done concurrently. I intend to do more practical work.

• Continue to learn Python – I’m getting to grips with the basics

• Read “Doing Data Science” – I’ve started, and I think it will be a real education on what is really involved

• Start “Learning R” once “Doing Data Science” is finished, although Python will be the focus for the near future.

That should keep me busy for a while…

Review: Big Data – A Revolution That Will Transform How We Live, Work and Think

The bottom line first: if you are interested in a broad and non-technical introduction to the subject of “Big Data” then you should read this book. It is short and highlights a number of points (some that aren’t necessarily clear from reading elsewhere).

Point 1:

Importantly, the first chapter says that a “big data” project does not have to involve millions of data points. There may be far fewer, but the point is that you should be working with all the data that is available to you rather than just a sample. With all the data, it is possible to analyze it in different ways. With just a sample you will likely be limited in what you can discover after the sample has been taken. The authors discuss the very first article I read about this subject, Wired’s The End of Theory. It’s very interesting to read how the article is now regarded.
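
A minimal Python sketch of the sample-versus-everything point. It is not taken from the book: the data set, the "region" field and the 1% sample size are all hypothetical, chosen only to show how a rare subgroup that is easy to study in the full data can all but vanish once a sample has been drawn.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical data set: 10,000 records, only 20 of which come from a rare region.
records = [
    {"id": i, "region": "rare" if i % 500 == 0 else "common"}
    for i in range(10_000)
]


def count_region(rows, region):
    """Count how many rows belong to the given region."""
    return sum(1 for row in rows if row["region"] == region)


# A 1% simple random sample, taken before anyone knew the rare region mattered.
sample = random.sample(records, 100)

full_rare = count_region(records, "rare")
sample_rare = count_region(sample, "rare")

print(f"rare-region rows in all the data: {full_rare}")
print(f"rare-region rows in the 1% sample: {sample_rare}")
```

With all the data there are 20 rare-region rows to analyze; a 1% sample would be expected to hold only a fraction of one, so a question about that region that comes up later may simply not be answerable from the sample.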

Point 2:

People may have to get used to the data revealing what is happening without actually revealing why it is happening. In some areas we will have to let go somewhat of the (natural) desire to understand the reasons behind the results.

Point 3:

The authors deal with the subject of data getting “messier” (becoming more imprecise) as you increase the amount you are collecting:

However in many new situations that are cropping up today allowing for imprecision – for messiness – may be a positive feature not a shortcoming. It is a tradeoff. In return for relaxing the standards of allowable errors, one can get a hold of much more data. It isn’t just that “more trumps some” but that, in fact, sometimes “more trumps better”.

Because this data set consists of more data points, it offers far greater value that likely offsets its messiness.

Big Data transforms figures into something more probabilistic than precise.

So more trumps less. And sometimes more trumps smarter.

“Simple models and a lot of data trump more elaborate models based on less data.” (quote from Peter Norvig, Google)

… treating data as something imperfect and imprecise lets us make superior forecasts and thus understand our world better
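
The “more trumps better” tradeoff can be sketched with a little arithmetic. This is an illustration under stated assumptions rather than anything from the book: it assumes the messiness is independent random noise around the true value (not a systematic bias), in which case the uncertainty in an average shrinks with the square root of the number of readings. The noise levels and data set sizes below are made up.

```python
import math


def standard_error(noise_sd, n):
    """Standard error of the mean of n independent readings with the given noise."""
    return noise_sd / math.sqrt(n)


# Hypothetical small, carefully measured data set: low noise, few readings.
precise_small = standard_error(noise_sd=1.0, n=100)

# Hypothetical large, messy data set: ten times the noise, 10,000 times the readings.
messy_large = standard_error(noise_sd=10.0, n=1_000_000)

print(f"small and precise: {precise_small:.3f}")
print(f"large and messy:   {messy_large:.3f}")
```

Under these assumptions the messy data set pins down the average about ten times more tightly than the precise one. If the messiness were a consistent bias rather than random noise, collecting more of it would not help in the same way.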

Point 4:

The chapter on “Datafication” of just about everything is a good balance of history and the insights that can be gleaned from today’s social media giants. Location is particularly important:

The point is that these indirect uses of location data have nothing to do with the routine of mobile communications, the purpose for which the information was initially generated. Rather, once location is datafied new uses crop up and new value can be created.

Datafication is only just starting, but now it is under way it will continue, with many benefits:

Once the world has been datafied, the potential uses of the information are basically limited only by one’s ingenuity.

Seeing the world as information, as oceans of data that can be explored at ever greater breadth and depth offers us a perspective on reality that we did not have before.

Point 5:

Another important point is that humans will have to get used to the fact that their opinion is not always the best:

… the biggest impact of big data will be that data-driven decisions are poised to augment or overrule human judgement.

This is likely to mean a change in the requirements needed to do a specific job. The importance of experience will diminish as insight from data can dwarf the experience of one person.

Mathematics and statistics, perhaps with a sprinkle of programming and network science, will be as foundational to the modern workplace as numeracy was a century ago and literacy before that.

… the winners will be found among large and small firms, squeezing out the mass in the middle.

Big data squeezes the middle of an industry, pushing firms to be very large, or small and quick, or dead.

Point 6:

Re-use of data is looked at – old data can be combined with new in different ways to discover or exploit new opportunities.  So what is the value of data? A company may have relatively few assets but a massive company valuation – therefore is the difference between the two the value of the data the company controls? That could mean billions of pounds / dollars / etc.

And finally…

A number of times there were names of sites or companies that led me to put the book down, check out a website or install an app. The chapter called “Implications” is particularly good for that, but it does slow down the reading somewhat. Even when a book is this recent some of the examples are now out of date (for example, Decide.com shut its doors as its staff joined eBay). This is a fast moving field.

There is a lot more to this book, impressive given that it is only 200 pages long. I’m glad I read this book – it puts so much into focus.

Do we need data scientists?

There are two opposing points of view…

• GigaOm: We don’t need more data scientists — just make big data easier to use:

Virtually any article today about big data inevitably turns to the notion that the country is suffering from a crucial shortage of data scientists.

What seems to be missing from all of these discussions, though, is a dialogue about how to steer around this bottleneck and make big data directly accessible to business leaders.

While difficult to generalize, there are three main roles served by the data scientist: data architecture, machine learning, and analytics.

The solution then lies in creating fit-to-purpose products and solutions that abstract away as much of the technical complexity as possible, so that the power of big data can be put into the hands of business users.

• GigaOm:  Data scientists matter because data science is the future of IT:

Data scientists are changing the way decisions happen by making better use of big data. Rather than finding ways around them, we need to make data science more accessible as a profession and need to provide easier tools for data scientists.

We build new systems that are flexible and dynamic and create more new jobs — such as data scientists — to analyze and build models for these new systems. It is obvious that in such a world, where static models cannot keep up, data scientists will be indispensable.

…data scientists are the designers and the content creators of today, not the software engineers or the IT bottleneck.

We need data scientists, and we need hundreds of thousands of them. They will do their magic, create new ways of experiencing life, products and services…

New, simpler tools will no doubt come along over time and it is something to look forward to.

I’m choosing to concentrate on the analytics angle of a Data Science role – know the right questions to ask, know how to phrase them so that you get the answers you need, and then be able to interpret those answers correctly so that relevant decisions can be made. That is the ultimate goal.