Exploring Python Objects

Introduction

When we teach our foundational Python class, one of the things we do is make sure that our students know how to explore Python from the command line. This has several advantages. First, it reduces context switching – to figure out new stuff, students don’t constantly have to toggle between writing Python code and searching for documentation on the web or in a book. Second, it encourages an experimental mindset – students can use a set of simple tools to examine unfamiliar Python objects, figure out what they do and how to use them correctly, and discover new possibilities for what they could do.
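As a minimal sketch of this kind of exploration, the built-in functions type() and dir(), together with an object's docstring, are often enough to get oriented. The string object and the name public_names below are illustrative assumptions, not material taken from the class itself.

```python
# Explore an unfamiliar object using only built-in tools.
obj = "hello world"  # illustrative object; any Python object works

# type() reports what kind of object this is.
print(type(obj))

# dir() lists attribute names; drop the underscore-prefixed ones
# to see the public methods.
public_names = [name for name in dir(obj) if not name.startswith("_")]
print(public_names[:5])

# A method's docstring explains what it does. At the interactive
# prompt, help(obj.title) shows the same text in a pager.
print(obj.title.__doc__)

# Once a method looks promising, try it out.
print(obj.title())
```

The same steps work on modules, classes, and instances, so no documentation lookup is needed to start experimenting.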

Read More

Choosing the Right Number of Clusters

Introduction

When I first started my machine learning journey, K-means clustering was one of the first algorithms I was introduced to – and it is still one of my favorites to this day. I was amazed at how elegant yet comprehensible the procedure was. There is something oddly satisfying about watching the cluster assignments and centroids being updated with each iteration. While K-means clustering has been tried and true since its inception in the 1950s, there is still one foundational requirement for employing this method: choosing the correct number of clusters – the K in K-means. In this month’s newsletter, we’ll explore a technique known as the elbow method to help determine the ideal number of clusters that should be chosen for a given clustering task. To conclude, we will explore another type of clustering algorithm (Affinity Propagation clustering) that does not require a predetermined number of clusters for execution. 
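The idea behind the elbow method can be sketched in plain Python: run K-means for several values of K, record the within-cluster sum of squared distances (the inertia), and look for the K where the decrease levels off. Everything in this sketch is an assumption made for illustration: the synthetic three-blob data, the helper names, and the deterministic farthest-point initialization, which is used only to keep the result reproducible and is not necessarily what the newsletter uses.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def farthest_point_init(points, k):
    """Deterministic initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans_inertia(points, k, n_iter=100):
    """Run Lloyd's algorithm; return the within-cluster sum of squares."""
    centroids = farthest_point_init(points, k)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid if the cluster is empty.
        new_centroids = [
            mean_point(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Synthetic data: three well-separated blobs of 30 points each.
rng = random.Random(42)
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [
    (rng.gauss(cx, 0.6), rng.gauss(cy, 0.6))
    for cx, cy in centers
    for _ in range(30)
]

# Inertia for K = 1..6; the "elbow" is where the drop flattens out.
inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(k, round(value, 1))
```

For data like this, the inertia should fall steeply up to K = 3 and only slightly afterwards, which is the bend the method looks for. Real data rarely produces such a clean elbow.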

Read More

Prospecting for Data on the Web

Introduction

At Enthought we teach a lot of scientists and engineers about using Python and the ecosystem of scientific Python packages for processing, analyzing, and visualizing data. Most of what we teach involves nice, clean data sets – collections of data that have been carefully collected, scrubbed, and prepared for analysis. While we also mention in passing the idea of collecting data from the web, work a few examples of general data cleanup, and at least show our students each of the tools needed, we seldom have enough time in class to follow a complete, practical example of web data prospecting from end to end. This newsletter should help remedy that.

The Problem

While the internet is a great resource for many things, including data, the web’s wild and tangled nature presents a few problems:

Read More

No Zero Padding with strftime()


One of the best features of Python is that it is platform independent. You can write code on Linux, Windows, and macOS and it works on all three platforms with no problems…mostly.

Admittedly there are some issues. Most of these are from known operating system differences when accessing system subprocesses or dealing with various local quirks of file systems and security schemes. These kinds of problems are expected. However, there are some lesser known problems that only emerge in very specific circumstances. One of these is controlling zero padding with the strftime() function in Python’s datetime module.
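To illustrate the issue: the standard strftime() directives such as %m and %d always zero-pad, and the flags that suppress padding are platform extensions (commonly %-d on Linux and macOS, %#d on Windows) rather than portable Python. One way to sidestep the difference, shown here as a sketch and not necessarily the approach the full post takes, is to format the integer attributes of the datetime directly. The date and variable names are illustrative.

```python
from datetime import datetime

stamp = datetime(2021, 3, 5)  # illustrative date

# Standard directives zero-pad on every platform.
padded = stamp.strftime("%m/%d/%Y")

# Portable alternative: build the string from the integer attributes,
# which carry no padding at all.
unpadded = f"{stamp.month}/{stamp.day}/{stamp.year}"

print(padded)    # 03/05/2021
print(unpadded)  # 3/5/2021
```

Because the f-string never passes through the platform's C library, it behaves the same on all three operating systems.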

Read More

Digital Transformation of the Materials Science R&D Lab

“Digital transformation”, “machine learning”, and “artificial intelligence” are buzzwords heard in every industry, from the boardroom to the lab.

We asked Dr. Michael Heiber, lead of Enthought’s Materials Informatics solutions, about what these technology trends mean for the future of materials and chemical labs and product development.

Read More

Got Data?

Introduction

So, you have data and want to get started with machine learning. You’ve heard that machine learning will help you make sense of that data; that it will help you find the hidden gold within.

Read More

A Beginner’s Guide to Deep Learning

Deep learning. By this point, we’ve all heard of it. It’s the magic silver bullet that can fix any complex problem. It’s the special ingredient that can take any bland or rudimentary analysis and create an immense five-course meal of actionable insights. But what is at the core of this machine learning technique? Is it truly something that all companies need to invest in? Will implementing it bring immediate business value and create forever-increasing ROI? Or is this some type of elusive mirage masked by marketing hype? Before even considering these questions, we need to take a step back and accurately define what deep learning is.

Read More

Sorting Out .sort() and sorted()

Sometimes sorting a Python list can make it mysteriously disappear. This happens even to experienced Python programmers who use .sort() when they should have used sorted() instead. This post presents the differences between these two ways of sorting a list.
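The "disappearing" list can be reproduced in a few lines. This is a small sketch with made-up variable names: list.sort() sorts in place and returns None, while sorted() returns a new list and leaves the original alone.

```python
numbers = [3, 1, 2]

# sorted() returns a new sorted list; the original is untouched.
ordered = sorted(numbers)
print(ordered)   # [1, 2, 3]
print(numbers)   # [3, 1, 2]

# .sort() sorts the list in place and returns None.
result = numbers.sort()
print(result)    # None
print(numbers)   # [1, 2, 3]

# The classic mistake: assigning the return value of .sort() back
# to the same name replaces the list with None.
values = [3, 1, 2]
values = values.sort()
print(values)    # None
```

A rule of thumb that follows: call .sort() as a statement on its own line, and use sorted() whenever the result is being assigned or passed along.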

Read More