Storytelling with AI and machine learning

In the 1970s, Marvin Minsky, the father of frames and, some say, of neural nets, told a press conference that 50 years on, computers would read and understand Shakespeare.

Today, computers can indeed read Shakespeare, but understand it? Not really. Even so, they have been used to explore Shakespeare’s plays in a few ways:

  1. Computers are being used to establish which bits Shakespeare didn’t write; it seems John Fletcher wrote parts of Henry VIII. I’ve always loved this conversation about who wrote what, especially the Christopher Marlowe and Shakespeare conspiracy theories. Was Marlowe really Shakespeare? Etc.
  2. Machine learning can categorise whether a Shakespeare play is a comedy or a tragedy based on the structure of how the characters interact. In a comedy, simply put, the characters come together a lot. In a tragedy, they don’t – and ain’t that the truth in real life?
  3. Anyone can generate their own Shakespearean play with machine learning.

No. 3 seems mind-blowing, but to be honest, and I love me some Shakespeare, the results truly make no sense. However, it is hard to see that at first, because Shakespearean English is like another language. I have attended some brilliant performances from Shakespeare School over the last couple of years, watching my children on stage, and for the first time I realised that it is only the context and the acting which, for me, give the words their meaning – rather like when you watch a film on TV in a language you don’t quite understand, but the story is still universal. It has emotional resonance.

I learnt Macbeth’s first soliloquy in sixth form: Is this a dagger which I see before me? It is when Macbeth contemplates his wife’s horrifying idea of killing Duncan the king. I can still recite it. It is meaningful because I studied it in depth and ruminated on what Macbeth must have been feeling: filled with ambition, excited but horrified, all the while feeling that this isn’t going to end well.

However, machine learning cannot understand what Macbeth is saying; it hasn’t semantically soaked up the words and felt the emotional horror of contemplating murder in the name of ambition. All it has done is read the words and categorise them, and then write more words, using probability to infer which word is statistically most likely to come next as it constructs each sentence, rather like predictive text does. It’s good and works to a certain extent, but none of us think that our predictive text is thinking and understanding. It is, in effect, guessing.
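To make that concrete, here is a tiny sketch of the general idea – my own toy example, nothing like the scale of the models behind the Shakespeare generators – which counts which word follows which in a snippet of text and then suggests the most likely next word, just as predictive text does:

from collections import Counter, defaultdict

# Count, for each word in a tiny corpus, which words follow it and how often.
text = 'to be or not to be that is the question'
words = text.split()
followers = defaultdict(Counter)
for current_word, next_word in zip(words, words[1:]):
    followers[current_word][next_word] += 1

def predict_next(word):
    # Return the most frequently seen follower of `word`, or None if unseen.
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]

print(predict_next('to'))  # 'be' – the only word ever seen after 'to' here
print(predict_next('be'))  # 'or' or 'that' – essentially a coin toss with so little data

No understanding anywhere, just counting and picking the likeliest continuation.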

We can see this more easily when looking at Harry Potter. The text is much simpler than Shakespeare, so when a computer reads all the books and writes a new one – which is what the cool people at Botnik got a computer to do – it’s easier to see that the result, Harry Potter and the Portrait of What Looked Like a Large Pile of Ash, is interesting for sure, but doesn’t make a great deal of sense.

“Leathery sheets of rain lashed at Harry’s ghost as he walked across the grounds towards the castle. Ron was standing there and doing a kind of frenzied tap dance. He saw Harry and immediately began to eat Hermione’s family.”

“Harry tore his eyes from his head and threw them into the forest.” 

Very dramatic – I love the leathery sheets of rain – but it doesn’t mean anything. Well, it does in a way, but it hasn’t been designed the way a human would design a story, even unknowingly, and it doesn’t have the semantic layers which give text meaning. We need to encode each piece of data and link it to other pieces of data in order to enrich it and make it more meaningful. We need context and constraints around our data; that is how we create meaning. Making this a standard is difficult, but the World Wide Web Consortium (W3C) is working on it, in part to create a web of data, especially as all our devices go online – not that I think that is a good idea; my boiler does not need to be online.

And this, my friends, is where we are with machine learning. The singularity, the moment when computers surpass human intelligence, is not coming anytime soon, I promise you. Currently, it is a big jumble of machines, data sets, and mathematics. We have lots of data but very little insight, and very little wisdom. And, that is what we are looking for. We are looking to light the fire, we are looking for wisdom.

The prospect of thinking machines has excited me since I first began studying artificial intelligence – or in my case l’intelligence artificielle – and heard that a guy from Stanford, one Doug Lenat, had written a LISP program that discovered mathematical things. It started simply with 1+1 as a rule and went on to rediscover Goldbach’s conjecture, which asserts that every even counting number greater than two is equal to the sum of two prime numbers.

The way the story was told to me was that Lenat would come in every morning and see what the computer had been learning overnight. I was captivated. So, imagine my excitement the day I was in the EPFL main library researching my own PhD and stumbled across Lenat’s thesis. I read the whole thing on microfiche there and then. Enthralled, I rushed back to the lab to look him up on the WWW – imagine that, I had to wait until I got to a computer – and saw that after his PhD he had gone off to create a universal reasoning machine: Cyc.

Lenat recently wrapped up the Cyc project after 35 years. It is an amazing accomplishment. It contains thousands of heuristics, or rules of thumb, that create meaning out of facts which we humans have already learnt by the age of three, and which computers need in order to emulate reasoning. This is because computers must reason in a closed world: if a fact or idea is not modelled explicitly in a computer, it doesn’t exist. There is so much knowledge we take for granted even before we begin to reason.

When asked about it, Marvin Minsky said that Cyc had had promise but had ultimately failed. Minsky said that we should be stereotyping problems and getting computers to recognise the stereotype – basically the generic pattern of a problem – in order to apply a stereotypical solution. I am thinking archetypes, potentially, with some instantiation, so that we can interpret the solution pattern and create new solutions, not just stereotypes.

In this talk about Cyc, Lenat outlines how it uses both inductive learning (learning from data) and deductive learning (applying heuristics or rules). Lenat presents some interesting projects, especially problems where data is hard to find. However, it is these sorts of problems which need to be looked at in depth. Lenat uses the example of container spillages and how to prevent them.

Someone said to me the other day that a neuroscientist had told them that we have all the data we will ever need. I have thought about this and hope the neuroscientist meant: we have so much data we could never process it all, because to say we have all the data we need is just wrong. A lot of the data we produce is biased, inaccurate and useless. So why are we keeping it and still using it? Just read Invisible Women to see what I am talking about. Moreover, as Lenat says, there are many difficult problems which don’t have good data with which to reason.

Cyc uses a universal approach to reasoning, which is what we need robots to do in order to make them seem human, and which is what the Vicarious project is about. Vicarious is trying to discover the friction of intelligence without using massive data sets to train a computer – and I guess it is not about heuristics either; it’s hard to tell from the website. As I have said before, what we are really trying to do is encapsulate human experience, which is difficult to measure, let alone encapsulate, because experience is different for each person, and a lot goes on in our subconscious.

Usually, artificial intelligence learning methods take one of two opposite approaches: either a deductive, rule-based approach (if x then do y, using lots of heuristics) or an inductive approach (look at something long enough and find the pattern in it – a sort of “I’ve seen this 100 times now, so if x, y follows”). As we saw above, Cyc used both.

Machine learning (ML) uses an empirical approach of induction. After all, that is how we learn as humans: we look for patterns. We look in the stars and the sky for astrology and astronomy, we look at patterns in nature when we are designing things, and at patterns in our towns, especially in people’s behaviour – nowadays, especially online on social media.

Broadly speaking, ML takes lots of data and looks at each data point, deciding yes or no when categorising it: the point is either in or out, rather like the little NAND and NOR gates in a computer, and in fact rather like what the neurons in our brains do too. And this is how we make sense of stories: day/night, good/bad, as we look for transformation. Poor to rich is a success story; rich to poor is a tragedy. Neuroscience has shown that technology really is an extension of us, which is so satisfying because it is, ultimately, logical.
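As a toy illustration of that in/out decision – my own sketch, not any particular library’s method – a single artificial neuron just weighs its inputs against a threshold and answers yes or no; with the right weights it behaves like one of those logic gates:

# A toy artificial neuron: weigh the inputs, compare to a threshold,
# and answer a binary in/out decision, conceptually like a logic gate.
def neuron(inputs, weights, threshold):
    activation = sum(i * w for i, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

# With these hand-picked weights and threshold it behaves like a NAND gate.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, '->', neuron([a, b], weights=[-2, -2], threshold=-3))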

In my last blog, I looked at how to get up and running as a data scientist using Python and pulling data from Twitter. In another blog, another time, I may look in detail at the various ML methods, under the two main categories of supervised and unsupervised learning, as well as deep learning and reinforcement learning, which uses rewards – a human or the environment stepping in to say yes, this was right, or no, it was not – because ultimately a computer cannot do it alone.

I don’t believe a computer can find something brand spanking new – off the chain, never discovered, seen or heard of before – without a human being helping, which is why I believe in human-computer interaction. I have said it so many times: in the human-computer interaction series, in our love affair with big data, and all over this blog. But honestly, I wouldn’t mind if I was wrong, if something new could be discovered, a new way of thinking to solve problems which have always seemed to be without solution.

Computing is such an exciting field, constantly changing and growing. It still delights and surprises me as much as it did over 20 years ago when I first heard of Doug Lenat and read his thesis in the library. I remain as enthralled as I was back then, and I know that is a great gift. Lucky me!

Tutorial: A quick guide to data mining on Twitter

Photo: USA Today

Data mining and sentiment analysis – measuring and interpreting what people are saying about a particular subject on Twitter – are fascinating things to do, but be warned: you may lose a lot of time once you get started. I know I am finding it slightly addictive.

There are so many examples online, but here is my very basic guide which will get you up and running in no time at all.

The four main steps are:

  1. Anaconda: Install the Anaconda platform.
  2. Twitter developer: Register yourself as a Twitter developer.
  3. Install tweepy: Connect from Python to Twitter.
  4. Hello World!: Experiment.

Let’s dive into more detail:

1. Anaconda

  1. Go to anaconda.com/distribution and click on the download button to install the latest version.
  2. Once the .exe file is downloaded, double click on it, and step through the installation process, clicking next when prompted.

The reason we are using Anaconda and not just Python from python.org is that Anaconda contains all the packages we want to access (apart from tweepy, which is the one for Twitter). Had we installed just Python, we would have had to install each package separately, as Python was not originally designed to support mathematical manipulations.

The main ones we will be using to get started, and which we will call using the ‘import’ command at the beginning of each session, are as follows:

  • numpy is short for numerical Python; it contains mathematical functions for manipulating arrays and matrices of numbers.
  • pandas provides easy-to-use data structures and data analysis tools.
  • matplotlib is how we plot our data as histograms, bar charts, scatterplots, etc., with just a few lines of code.
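
For example, the top of a typical script (or the first lines of a session) might look like this; the short aliases np, pd and plt are just widely used conventions:

import numpy as np               # numerical arrays and maths functions
import pandas as pd              # data structures and data analysis
import matplotlib.pyplot as plt  # plotting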

We can now practice using the software.

Launch Spyder (Anaconda 3) from the Windows menu. On the left-hand side there is the script editor, and on the right-hand side is the console. Delete whatever is already in the script editor, then cut and paste in this short script, which uses matplotlib to create a scatterplot. Press the run button (it looks like a play button) and you will see the results in the console window at the bottom right-hand side.

import matplotlib.pyplot as plt

# Sample data: ten grades for each group, plotted against a grade range.
girls_grades = [89, 90, 70, 89, 100, 80, 90, 100, 80, 34]
boys_grades = [30, 29, 49, 48, 100, 48, 38, 45, 20, 30]
grades_range = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])
ax.scatter(grades_range, girls_grades, color='r')  # girls in red
ax.scatter(grades_range, boys_grades, color='b')   # boys in blue
ax.set_xlabel('Grades Range')
ax.set_ylabel('Grades Scored')
ax.set_title('scatter plot')
plt.show()

You will need to look up some of the commands over on the matplotlib website, and if you don’t know what a command is doing, or indeed why you would want a scatterplot, then Google that too. But already we can see how easy it is to get hold of some data and how quick it is to visualise it.
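
If you want to experiment, the same data can be drawn as a different kind of chart with only a small change – for example, this sketch draws the girls’ grades as a bar chart instead (same variables as above):

import matplotlib.pyplot as plt

girls_grades = [89, 90, 70, 89, 100, 80, 90, 100, 80, 34]
grades_range = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# One bar per grade-range bucket, in red to match the scatterplot.
fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])
ax.bar(grades_range, girls_grades, width=8, color='r')
ax.set_xlabel('Grades Range')
ax.set_ylabel('Grades Scored')
ax.set_title('bar chart')
plt.show()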

For bigger sets of data, instead of declaring the arrays inline as we did above:

girls_grades = [89, 90, 70, 89, 100, 80, 90, 100, 80, 34]
boys_grades = [30, 29, 49, 48, 100, 48, 38, 45, 20, 30]
grades_range = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

We would put those lines in a file called, for example, all_grades.py and import what we needed like this:

from all_grades import girls_grades

That way, if we add or delete data, we still keep a copy of the original. Make sure you save all your files with useful names you can recognise when you come back to them.
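
Putting that together, a hypothetical layout (the file and variable names are only examples) could look like this:

# all_grades.py – keeps the raw data in one place
girls_grades = [89, 90, 70, 89, 100, 80, 90, 100, 80, 34]
boys_grades = [30, 29, 49, 48, 100, 48, 38, 45, 20, 30]
grades_range = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# plot_grades.py – a separate script in the same folder that imports the data
from all_grades import girls_grades, grades_range
print(girls_grades)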

Also make sure that your python PATH is set up correctly. (Google this.)

It is worth following some online Python tutorials over on python.org to get a feel for simple Python commands and for how to read and write files, whether in the .py format we used above or in other formats such as CSV (often used with Excel) or JSON (often used in web apps), because we may want to use other people’s datasets, or create our own from Twitter and store them in files.
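
As a small taste of that – assuming a hypothetical file called grades.csv sitting next to your script – pandas makes reading and writing such files a one-liner each:

import pandas as pd

# Read a hypothetical CSV file into a DataFrame and peek at the first rows.
grades = pd.read_csv('grades.csv')
print(grades.head())

# Write a copy back out, without the DataFrame's row index.
grades.to_csv('grades_copy.csv', index=False)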

2. Twitter developer

  1. Log into Twitter, or create an account if you don’t have one.
  2. Go to the apps section using this link http://apps.twitter.com, and register a new application by clicking on the create an app button.
  3. There’s a bit of form filling to explain what you want to do:
    1. I chose hobbyist, exploring the API, put in my phone number, and verified the account using the text they sent me.
    2. Next page: How will you use the Twitter API or Twitter data? I said that I am using the account for Python practice, not sharing it with anyone, but that I will be analysing Twitter data to practise manipulating data in real time.
    3. They send you an email and/or a SMS text with a code. After confirming a couple of times, you will get a Congratulations screen. Ta-daa!!
    4. Go to the dropdown menu on the top right hand corner and choose the Apps menu, which will take you to the Apps webpage.
    5. Click the Create an App button. Give your app a name and description e.g. I said: Stalker’s Python Practice, and give a description to the Twitter team about how you will just be using this app for practice. (You won’t need Callbacks or enable Twitter Sign-in.) Click Create at the bottom.
    6. The page which appears is your app’s page. Go to Keys and Permissions and you will see your Consumer API keys, called consumer key and consumer secret. These keys should always be kept private, otherwise people will be able to pull your data from your account, and your account may become compromised and potentially suspended. Underneath them it says Access token & access token secret; click Create, and you will receive an access token and an access token secret. As with the consumer keys, this information must also be kept private.

Stay logged into Twitter but now we move onto Anaconda.

3. Install Tweepy

Launch the Anaconda Prompt (Anaconda 3), which you will find in the Windows menu, and then type:

pip install tweepy

Theoretically, we could do everything in this console, but the Spyder setup makes it so much easier. Close this console and we are now ready to begin!

4. Hello World!

Cut and paste this script into the left-hand side, and replace each xxxxxxxxx with your consumer_key, consumer_secret, access_token and access_secret, keeping the quote marks around them:

import tweepy
from tweepy import OAuthHandler

# Credentials from your Twitter app's Keys and Permissions page.
consumer_key = 'xxxxxxxxxxxxxxxxx'
consumer_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

# Authenticate and create the API object.
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

# Process each status (tweet) in turn: here we simply print its text.
for status in tweepy.Cursor(api.home_timeline).items(10):
    print(status.text)

The top section of the code gives you access to Twitter, and the loop at the end prints out 10 of the latest tweets which normally appear in your home timeline.

Again save this script in a file so that you can reuse it.
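
Once that works, you can point the same Cursor at other parts of the API. For example, this sketch reuses the authenticated api object from the script above to pull ten recent tweets containing a hashtag. Note that in the tweepy releases current at the time of writing the method is api.search; later versions have renamed it, so check the tweepy documentation for yours:

# Reuse the `api` object created above to search for recent tweets
# containing a hashtag, and print who said what.
for tweet in tweepy.Cursor(api.search, q='#python', lang='en').items(10):
    print(tweet.user.screen_name, ':', tweet.text)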

And there you have it.

Next steps

Now you are set up, you are ready to begin manipulating the data you read from Twitter and here are three tutorials of varying complexity:

You may want to use datasets from Kaggle.com or perhaps ones which are geolocated such as the ones at www.followthehashtag.com or Github, or from Twitter itself. There really is a world of data out there.

And, as part of the Anaconda framework, there is the Jupyter notebook, which can create webpages on the fly so that you can share your findings really easily. And then there is TensorFlow, which can be used for machine learning, in particular neural nets, because it contains all sorts of statistical techniques to help you manipulate data in a powerful yet straightforward way.

The possibilities with Anaconda really are endless.

Troubleshooting

I am writing this tutorial as timestamped and running everything on Windows 10. All the apps I mention are updated frequently, so these instructions may not represent what you have to do in the future. You may need to explore, but don’t worry, you won’t break anything. The worst-case scenario is that you delete what you have done and start again, which is always great practice.

If you get an error message, check that you cut and pasted the whole script correctly, and that your PATH is pointing in the right direction. If that doesn’t help, read the message carefully and see what it says. If you still don’t know, cut and paste the message into Google; someone somewhere will have found the solution to the problem.

This is the gift of the World Wide Web, someone somewhere can always help you, you can find whatever you are looking for, and someone is always creating something new and amazing to use. It really is magic.

Good luck and happy hunting.