ML CharRNN training

Timmy Zhou
3 min read · Oct 25, 2020
  • How do we get through this whole list of concerns and still build AI that is fun, respectful, tender, pleasurable, kind?

I’ve thought of a couple ways that AI could be changed, but even then there’s a flipside to each solution, which is what makes this challenge so difficult to start with.

  1. Build into an AI’s algorithm a catch for moments when ‘unrecognized’ input is fed into the machine, e.g. a model trained on Western/English texts is given input in another language. If the model detects this, it sends an alert to its developers with a message like “This concept was not considered during production”. Assuming that there are people willing to change the paradigm, teams could be hired to maintain the AI system and continuously bring in new people to deal with each situation that was not considered by its original creators. The downside is that people could disguise information they want the AI to learn, incrementally shifting it in a direction that could be detrimental.
  2. Start treating AI with the same standards that humans are subject to. If AI is capable of altering aspects of society on a far bigger scale than any one person, it should be bound by strict technological laws that teach systems what is “right” and “wrong”. Now of course, whoever writes these laws is also the one deciding what is “right” and “wrong…

I decided to train the text model on my own computer for the assignment.

  • The repo instructions should be updated, as TensorFlow 1.x can support all the way up to Python 3.8 now!

Once everything was installed in my venv, I sourced A Christmas Carol from Project Gutenberg and copied the text into a .txt file. The GitHub tutorial seemed to recommend that the text file be 2 MB or larger, but even after copying a decent amount of text, my file only came in at 158 KB.

Naturally, this caused the error below to come up a few times before I tweaked train.py to something the script liked.

I later found out that I had messed up the first few times I trained the model. I kept the same text file name but updated the text inside. (I initially used the very small 2 KB text file from class on Wednesday.) The program didn’t like that very much and ended up not actually creating a new models folder. I had assumed the old files were simply overwritten.

In train.py I changed Batch_Size to 1 with that text file, and the output was almost word for word what I had put in.

After that, I created a new text file and found that my Models folder now had a new model, which I passed on the command line. Here I found that I could increase Batch_Size to 50 with no problem. When I trained with a smaller batch size I got some pretty crazy numbers:

My question is: what exactly does the definition of batch size above mean, what is a sample, and how come a smaller number means more rounds of training?
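To reason about this, here is a rough sketch of how a character model's training text is commonly cut up. It is not the repo's actual loader: make_batches, seq_length, and the sample sentence are names and values I made up for illustration, and real loaders usually arrange the data differently. The assumption is that a "sample" is one run of seq_length consecutive characters, a batch is batch_size samples, and the model updates its weights once per batch.

```python
def make_batches(text, batch_size, seq_length):
    """Cut text into full batches (hypothetical helper, not from the repo).

    Each sample is seq_length consecutive characters.
    Each batch is a list of batch_size samples.
    Leftover characters that do not fill a whole batch are dropped.
    """
    chars_per_batch = batch_size * seq_length
    num_batches = len(text) // chars_per_batch
    batches = []
    for b in range(num_batches):
        start = b * chars_per_batch
        batch = [
            text[start + i * seq_length : start + (i + 1) * seq_length]
            for i in range(batch_size)
        ]
        batches.append(batch)
    return batches


# Stand-in text: a 32-character sentence repeated 100 times (3200 characters).
text = "Marley was dead: to begin with. " * 100

for batch_size in (1, 8):
    batches = make_batches(text, batch_size, seq_length=50)
    print(f"batch_size={batch_size}: {len(batches)} weight updates per pass")
```

Under these assumptions, one pass over the same 3200 characters gives 64 updates with a batch size of 1 but only 8 with a batch size of 8, which would explain why a smaller number means more rounds of training. The same arithmetic also shows how a tiny file can yield no full batch at all: 2000 characters with a batch size of 50 and a seq_length of 50 would need 2500 characters for a single batch. I can't say whether that is the check that tripped my earlier runs.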

Below is my final result!
