Posts about:

Training (2)

Retuning the Heavens: Machine Learning and Ancient Astronomy

What can we learn about machine learning from ancient astronomy?

When thinking about Machine Learning it is easy to be model-centric and get caught up in the details of getting a new model up and running: preparing a dataset for machine learning, partitioning the training and test data, engineering features, selecting features, finding an appropriate metric, choosing a model, tuning the hyper-parameters. Being model-centric is reinforced by the fact that we don’t always have control of the data or how it was collected. In most cases, we are presented with a dataset collected by someone else and are asked what we can make of it. As a result, it is easy to just accept the data and over-fit your thinking about machine learning to the specifics of your modeling process and experience. Sometimes it is a good idea to step away from these details and remind yourself of the basic components of a model and its data, how they interact with each other, and how they evolve.

Read More

Extracting Target Labels from Deep Learning Classification Models

In the blog post Configuring a Neural Network Output Layer, we highlighted how to correctly set up an output layer for deep learning models. Here, we discuss how to make sense of what a neural network actually returns from its output layer. If you are like me, you may have been surprised when you first encountered the output of a simple classification neural net.
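
As a rough illustration of the idea, and not the method used in the full post, the sketch below assumes the network returns one raw score per class. The scores and the class names are hypothetical. It converts the scores to probabilities with a softmax and picks the label with the highest probability.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    # Subtract the maximum first for numerical stability.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class names and raw output for a single sample.
class_names = ["cat", "dog", "bird"]
raw_output = [1.2, 3.1, 0.3]

probabilities = softmax(raw_output)

# The predicted label is the class with the highest probability.
predicted_index = max(range(len(probabilities)), key=probabilities.__getitem__)
predicted_label = class_names[predicted_index]

print(predicted_label, round(probabilities[predicted_index], 3))
```

The point of the sketch is that the network output is a vector of numbers, not a label; the label has to be recovered by mapping the position of the largest value back to a class name.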

Read More

Exploring Python Objects

Introduction

When we teach our foundational Python class, one of the things we do is make sure that our students know how to explore Python from the command line. This has several advantages. First, it reduces context switching – to figure out new stuff, students don’t constantly have to toggle between writing Python code and searching for documentation on the web or in a book. Second, it encourages an experimental mindset – students can use a set of simple tools to examine unfamiliar Python objects, figure out what they do, and how to correctly use them, or find new possibilities for what they could do.
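
A minimal sketch of this kind of exploration, using only built-in tools. The choice of a string as the object to examine is ours, for illustration; the full post may use different objects and tools.

```python
import inspect

mystery = "hello"  # an object we pretend to know nothing about

# What kind of object is it?
kind = type(mystery).__name__

# Which public attributes and methods does it offer?
public_names = [name for name in dir(mystery) if not name.startswith("_")]

# What does one of those methods do? getdoc() returns its docstring.
upper_doc = inspect.getdoc(mystery.upper)

print(kind)
print(public_names[:5])
print(upper_doc)
```

In an interactive session, help(mystery.upper) shows the same documentation in a pager, which is often all that is needed to try a method out.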

Read More

Choosing the Right Number of Clusters

Introduction

When I first started my machine learning journey, K-means clustering was one of the first algorithms I was introduced to – and it is still one of my favorites to this day. I was amazed at how elegant yet comprehensible the procedure was. There is something oddly satisfying about watching the cluster assignments and centroids being updated with each iteration. While K-means clustering has been tried and true since its inception in the 1950s, there is still one foundational requirement for employing this method: choosing the correct number of clusters – the K in K-means. In this month’s newsletter, we’ll explore a technique known as the elbow method to help determine the ideal number of clusters that should be chosen for a given clustering task. To conclude, we will explore another type of clustering algorithm (Affinity Propagation clustering) that does not require a predetermined number of clusters for execution. 
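
To make the elbow method concrete, here is a small sketch under stated assumptions: the data are nine made-up one-dimensional points in three obvious groups, and K-means is written out by hand with a simple deterministic start so the example has no dependencies. A real analysis would more likely use a library implementation. The function name kmeans_inertia is ours.

```python
def kmeans_inertia(points, k, n_iter=20):
    """Run a basic 1-D K-means and return the within-cluster sum of squares."""
    ordered = sorted(points)
    n = len(ordered)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in ordered:
            nearest = min(range(k), key=lambda i: (p - centroids[i]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return sum(min((p - c) ** 2 for c in centroids) for p in ordered)


# Made-up data: three tight groups near 1, 5, and 9.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.0, 9.1, 8.8]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 5)}
for k, inertia in inertias.items():
    print(k, round(inertia, 3))
```

The inertia falls sharply up to K = 3 and only slightly afterwards. That bend in the curve is the "elbow", and it suggests three clusters for this data.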

Read More

Prospecting for Data on the Web

Introduction

At Enthought we teach a lot of scientists and engineers about using Python and the ecosystem of scientific Python packages for processing, analyzing, and visualizing data. Most of what we teach involves nice, clean data sets – collections of data that have been carefully collected, scrubbed, and prepared for analysis. While we also mention in passing the idea of collecting data from the web, work a few examples of general data cleanup, and at least show our students each of the tools needed, we seldom have enough time in class to follow a complete, practical example of web data prospecting from end to end. This newsletter should help remedy that.

The Problem

While the internet is a great resource for many things, including data, the web’s wild and tangled nature presents a few problems:

Read More

No Zero Padding with strftime()


One of the best features of Python is that it is platform independent. You can write code on Linux, Windows, and macOS and it works on all three platforms with no problems…mostly.

Admittedly, there are some issues. Most of these come from known operating system differences when accessing system subprocesses or dealing with various local quirks of file systems and security schemes. These kinds of problems are expected. However, there are some lesser-known problems that only emerge in very specific circumstances. One of these is controlling zero padding with the strftime() function in Python’s datetime module.
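
A short sketch of the problem and one portable workaround. The standard directives such as %d and %H always zero-pad. Flags that suppress the padding are passed through to the platform's C library and are not portable: %-d is commonly accepted on Linux and macOS, while %#d is the Windows counterpart. The workaround shown here, building the string from the datetime attributes, is our assumption about a reasonable fix and may differ from the one in the full post.

```python
from datetime import datetime

moment = datetime(2021, 3, 7, 9, 5)

# Standard directives zero-pad the month, day, and hour.
padded = moment.strftime("%m/%d/%Y %H:%M")

# Portable alternative: format the attributes directly, padding only
# the fields that should be padded (here, the minutes).
unpadded = (
    f"{moment.month}/{moment.day}/{moment.year} "
    f"{moment.hour}:{moment.minute:02d}"
)

print(padded)    # 03/07/2021 09:05
print(unpadded)  # 3/7/2021 9:05
```

Because the second form never hands a padding flag to the operating system, it behaves the same on every platform.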

Read More

Got Data?

Introduction

So, you have data and want to get started with machine learning. You’ve heard that machine learning will help you make sense of that data; that it will help you find the hidden gold within.

Read More

A Beginner’s Guide to Deep Learning

Deep learning. By this point, we’ve all heard of it. It’s the magic silver bullet that can fix any complex problem. It’s the special ingredient that can take any bland or rudimentary analysis and create an immense five-course meal of actionable insights. But what is at the core of this machine learning technique? Is it truly something that all companies need to invest in? Will implementing it bring immediate business value and create forever-increasing ROI? Or is this some type of elusive marriage masked by marketing hype? Before even considering these questions, we need to take a step back and accurately define what deep learning is.

Read More

Sorting Out .sort() and sorted()

Sometimes sorting a Python list can make it mysteriously disappear. This happens even to experienced Python programmers who use .sort() when they should have used sorted() instead. This post presents the differences between these two ways of sorting a list.
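
The "disappearing" list can be reproduced in a few lines. In this sketch (the variable names are ours), .sort() reorders the list in place and returns None, so assigning its result loses the list, while sorted() leaves the original untouched and returns a new sorted list.

```python
numbers = [3, 1, 2]

# .sort() sorts in place and returns None.
result = numbers.sort()
print(result)   # None
print(numbers)  # [1, 2, 3]

# sorted() returns a new list and does not modify its argument.
original = [3, 1, 2]
ordered = sorted(original)
print(original)  # [3, 1, 2]
print(ordered)   # [1, 2, 3]
```

The usual mistake is writing numbers = numbers.sort(), which rebinds the name to None.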

Read More

Scientists Who Code

Digital skills personas for success in digital transformation

The digital skills mix varies widely across companies, from those just starting to invest in digital transformation initiatives, to ones well into their journey. Building a community of people who think digitally and are able to innovate and quickly prototype ideas is key to delivering results. 

Read More