Plain and Simple Estimators



Machine learning is amazing, and these days it doesn't force you to do advanced math. The tools for machine learning have improved dramatically, and training your own model has never been easier.


As we code up our model, we draw on our understanding of our dataset, rather than on an understanding of raw mathematics, to gain insights.


In this episode, we're going to train a simple classifier in just a handful of lines of code. Here is all the code we'll look at today.


TensorFlow Estimators for machine learning

We use TensorFlow, Google's open-source machine learning library, to train our classifier. TensorFlow has a very large API surface, but we are going to focus on its high-level APIs, called Estimators.


Printing the loading results shows that we can now access the training data and the associated labels, or targets, using named attributes.


Build a model


Next we will build the model. To do this, we first set up the feature columns. Feature columns define the types of data coming into the model. We use a numeric feature column to represent our features, and call it "flower_features".
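To make the role of a feature column concrete, here is a minimal sketch in plain Python (no TensorFlow needed): a feature column is essentially a named, typed description of one model input, and its key is what later ties it to the data. The key name `flower_features` follows the article; the dict layout is purely illustrative.

```python
# A feature column, sketched as a plain record: a key, a dtype, and a shape.
feature_name = "flower_features"
feature_column = {"key": feature_name, "dtype": "float32", "shape": (4,)}

# A batch of data must arrive under the same key — the name is the link
# between the column definition and the actual values.
batch = {feature_name: [[5.1, 3.5, 1.4, 0.2]]}  # one illustrative iris row

assert feature_column["key"] in batch
```

The model never sees column order or CSV positions; it looks up inputs by this key, which is why the same name must be used again later in the input function.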

Estimators package up the training loop for us, so that we can train the model by configuring it rather than by coding the loop by hand. They take care of a lot of boilerplate, allowing us to think at a higher level of abstraction. This means we get to play with the fun parts of machine learning without diving into too many low-level details.


Because we've only covered linear models so far, we'll stick with one here. We will return to this example in the future to expand its capabilities.



Flower classification: as interesting as wine vs. beer?


This week we will create a model to differentiate between three different species of the same flower. I find it a little more fun than the beer-and-wine example from the previous episode; these flowers are harder to tell apart, which makes it all the more interesting.


In particular, we are classifying different species of iris flowers. Now, I'm not sure I could tell these irises apart by eye, but our model aims to distinguish Iris setosa, Iris versicolor, and Iris virginica.


Making our model using Estimators is quite simple. Using `tf.estimator.LinearClassifier`, we can set up the model by passing in the feature columns we just created; the number of different outputs the model predicts, three in this case; and a directory for storing the model's progress and output files. That directory helps TensorFlow pick up training where it left off, if necessary.
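The constructor call described above has roughly this shape. Treat this as an API-shape sketch rather than runnable code: it reflects the TF 1.x-era `tf.estimator` API this article is based on, which is absent from current TensorFlow releases, and the model directory path is illustrative.

```
feature_columns = [tf.feature_column.numeric_column("flower_features", shape=[4])]

classifier = tf.estimator.LinearClassifier(
    feature_columns=feature_columns,  # what the inputs look like
    n_classes=3,                      # three iris species
    model_dir="/tmp/iris_model")      # checkpoints let training resume
```

The `model_dir` is where checkpoints land, which is what allows a later run to resume training instead of starting over.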



Input functions


The classifier object will keep track of our state, and we are now almost ready to move on to training. There is one last piece needed to connect our model to its training data, and that is the input function. The input function's job is to create the TensorFlow operations that generate data for the model.


We have a dataset of measurements of the length and width of the petals and sepals of these flowers. These columns serve as our features.


Load data


After importing TensorFlow and NumPy, we load our dataset using TensorFlow's `load_csv_with_header` function. The data, or features, come in as floating-point numbers, and the labels, or targets, are recorded as an integer for each row of data: 0, 1, or 2, corresponding to the three species of flowers.
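To show the shape of what such loading produces, here is a stand-in sketch using only Python's `csv` module (the article's `load_csv_with_header` helper shipped with old TensorFlow): each row yields float features plus a trailing integer label. The three sample rows are illustrative measurements, not the real dataset.

```python
import csv
import io

# Inline stand-in for a CSV file: four float measurements, then a label.
raw = "5.1,3.5,1.4,0.2,0\n7.0,3.2,4.7,1.4,1\n6.3,3.3,6.0,2.5,2\n"

data, target = [], []
for row in csv.reader(io.StringIO(raw)):
    data.append([float(v) for v in row[:-1]])  # features as floats
    target.append(int(row[-1]))                # label as 0, 1, or 2

assert data[0] == [5.1, 3.5, 1.4, 0.2]
assert target == [0, 1, 2]
```

The end result is the same split the article describes: a block of floating-point features and a parallel list of integer targets, one per row.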



So we move from the raw data to the input function, which passes the data, as mapped by the feature columns, into the model. Note that we use the same name for the features as we did when defining the feature columns. This is how the data gets connected.
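A minimal sketch of that wiring, in NumPy rather than TensorFlow ops (the real input function would build tensor operations): it returns the `(features, labels)` pair an estimator expects, with the features dict keyed by the same name as the feature column. The function name and key are illustrative.

```python
import numpy as np

feature_name = "flower_features"  # must match the feature column's key

def input_fn(data, targets):
    # In TensorFlow this would create tensor ops; here we just show the
    # shape of what gets returned: a features dict and a labels array.
    features = {feature_name: np.asarray(data, dtype=np.float32)}
    labels = np.asarray(targets, dtype=np.int64)
    return features, labels

features, labels = input_fn([[5.1, 3.5, 1.4, 0.2]], [0])
assert set(features) == {feature_name}
assert labels[0] == 0
```

If the dict key didn't match the feature column's name, the model would have no way to find its inputs, which is exactly the connection the article describes.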



Run the training


Now it's time to run our training. To train our model, we run `classifier.train()`, with our input function as its argument. This is how we connect our dataset to our model.


The train function handles the training loop, repeatedly passing over the dataset and improving the model's performance with each step. And just like that, we've completed 1,000 training steps! Our dataset is not large, so it finished rather quickly.
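To show in miniature what "1,000 training steps" means, here is a toy loop, unrelated to the iris model itself: each step nudges a parameter to reduce a loss, so performance improves step by step. This fits `w` so that `w * x` approximates `y = 2x` by gradient descent; all numbers are illustrative.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x          # the target relationship the loop should discover
w, lr = 0.0, 0.05    # initial weight and learning rate

losses = []
for step in range(1000):
    pred = w * x
    losses.append(float(np.mean((pred - y) ** 2)))  # mean squared error
    grad = np.mean(2 * (pred - y) * x)              # d(loss)/dw
    w -= lr * grad                                  # one training step

assert losses[-1] < losses[0]   # later steps have lower loss than early ones
assert abs(w - 2.0) < 1e-3      # the fit converged to the true slope
```

An estimator's train loop does the same kind of iterate-and-improve work, just over real features, a real model, and checkpointed state, without us writing the loop.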


Evaluation time


Now it's time to evaluate our results. We can do this using the same classifier object as before, since it holds the trained state of the model. To determine how good our model is, we call `classifier.evaluate()`, pass in our test dataset, and extract the accuracy from the metrics it returns.


We got an accuracy of 96.66%! Not bad!
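Accuracy here is just the fraction of test rows the model labels correctly. A quick NumPy sketch with illustrative numbers: 29 correct out of a 30-row test set gives roughly 96.7%, close to the figure above (the test-set size and labels below are assumptions, not the article's actual data).

```python
import numpy as np

# Hypothetical test labels: ten rows per species (0, 1, 2).
y_true = np.array([0] * 10 + [1] * 10 + [2] * 10)
y_pred = y_true.copy()
y_pred[7] = 2                     # the model gets one row wrong

accuracy = float(np.mean(y_pred == y_true))  # fraction correct
assert abs(accuracy - 29 / 30) < 1e-9        # 29 of 30 ≈ 96.7%
```

The metrics dict returned by evaluation packages this same fraction (plus other metrics such as the loss) under named keys.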


Estimators: A straightforward workflow


Let’s pause here for this week, and review what we’ve achieved so far using Estimators.


The Estimators API gives us a clear workflow: retrieving our raw data, passing it through input functions, setting up our feature columns and model structure, running our training, and running our evaluation. This easy-to-understand framework lets us think about our data and its properties, rather than the underlying mathematics, which is a great place to be!

