Machine Learning and MindsDB
Patricio Cerda Mardini
October 5, 2020

Patricio Cerda Mardini is a Machine Learning Engineer, currently helping democratize ML at MindsDB.

What is Machine Learning?

Machine Learning (ML) is a subset of the computer science field known as artificial intelligence (AI). Formally, ML is the study of computer programs that learn to improve their performance in some given task through experience. The general idea is to develop agents that will behave "intelligently" in novel situations, applying previously acquired knowledge to complete the task successfully.

Progress in ML research is incredibly fast. Ever since the "deep learning" revolution began in 2012, after the ImageNet classification challenge was comfortably won using a deep convolutional neural network, there has been quick progress towards intelligent computational systems. These systems can complete tasks previously thought to be very hard or outright impossible for a computer agent.

Examples include algorithms capable of beating world champions at the game of Go, generating convincing synthetic free-form text, achieving super-human performance on Atari games, and driving cars autonomously.

Why is it so valuable?

Modern society generates data at astounding rates. On YouTube, for instance, users upload over 500 hours of video every minute, and people watch one billion hours of content every day. How can anyone make sense of such an immense amount of data?

It turns out that a family of ML algorithms called deep neural networks (DNNs) is incredibly powerful when trained on vast amounts of data. The type of data matters little: images, sound, and video can all be modeled with DNNs, so it's no surprise that this tool has revolutionized how people get actionable insights from their data. And the value of machine learning is not limited to that scale: even if you do not have enormous amounts of information, a neural network can still get you pretty far.

And it shows. ML is now applied across industries, with use cases ranging from time series forecasting in finance and meteorology, to recommendation systems, to visual understanding in smartphones and robots, to natural language understanding in translators, “chatbots”, and text generators. ML is a fundamental tool for the development and application of artificial intelligence.

How can we frame an ML problem?

We can broadly distinguish three different learning paradigms, where the algorithm receives different types of feedback from which to learn.

1) Supervised learning

This is the most common type of learning task. We present the algorithm with pairs of problem instances and their correct answers, and it adjusts itself so that its outputs match those answers as closely as possible.

For example, imagine a system that can discern whether an image contains a hot dog (or not). To train this algorithm, we need to provide a set of images along with their correct answers.
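To make the hot dog example concrete, here is a minimal sketch of supervised learning in plain Python. It is only an illustration under strong assumptions: each image is reduced to two made-up features (`elongation` and `redness`, both hypothetical), the data is invented, and the model is a simple logistic regression rather than the deep network a real image classifier would use.

```python
import math

# Hypothetical toy data: each "image" is summarized by two made-up features,
# (elongation, redness), both in [0, 1]. Label 1 = hot dog, 0 = not hot dog.
FEATURES = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.7), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
LABELS = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Estimated probability that the example x is a hot dog."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(features, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression with batch gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Feedback signal: how far the prediction is from the correct answer.
            error = predict_proba(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Move the parameters in the direction that reduces the average error.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


weights, bias = train(FEATURES, LABELS)

# Classify two examples the model has not seen during training.
is_hot_dog = predict_proba(weights, bias, (0.88, 0.75)) > 0.5
is_other = predict_proba(weights, bias, (0.15, 0.2)) > 0.5
print(is_hot_dog, is_other)
```

The important part is the training loop: the only feedback the model receives is the difference between its prediction and the provided answer, which is what makes the task supervised.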

2) Unsupervised learning

In this paradigm, the designers of the ML system do not know the label or answer for each problem instance. The system’s task is to make sense of the input on its own, finding structure and patterns that might be valuable for the user.

A typical example (used in medicine and marketing, among others) is the clustering task. We train the model to partition the data into groups whose members are similar in some sense, not necessarily known to the user.
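As a rough illustration of clustering, the sketch below implements k-means, one common clustering algorithm, in plain Python. The data points are invented, and the initialization (taking the first k points as starting centroids) is a simplification chosen to keep the example deterministic; practical implementations usually pick starting centroids more carefully.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, k, iterations=10):
    """Partition points into k groups; returns (labels, centroids)."""
    centroids = list(points[:k])  # simplistic, deterministic initialization
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the group with the closest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its current members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = mean_point(members)
    return labels, centroids


# Invented 2D data with two visibly separate groups. No labels are provided.
POINTS = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.5)]

labels, centroids = kmeans(POINTS, k=2)
print(labels)
print(centroids)
```

Note that the algorithm never sees a correct answer. The groups emerge only from the distances between points.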

3) Reinforcement learning

Finally, in reinforcement learning the algorithm plays a much more active role. Our agent interacts with an environment and learns to maximize the rewards it receives for its actions, balancing exploitation of its current knowledge against exploration of new ways of acting.

Typical use cases for reinforcement learning include robots, counterfactual systems (like recommenders or financial trading agents), and self-driving cars.
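The trade-off between exploiting and exploring can be sketched with a multi-armed bandit, one of the simplest reinforcement learning settings. The example below uses an epsilon-greedy strategy in plain Python; the reward probabilities, step count, and epsilon value are all arbitrary assumptions made for illustration, and real applications such as robotics involve far richer environments.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit whose arms pay 1 with a fixed probability."""
    rng = random.Random(seed)  # local generator so runs are reproducible
    counts = [0] * len(reward_probs)    # how often each action was tried
    values = [0.0] * len(reward_probs)  # running average reward per action
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action to learn more about it.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(values)), key=values.__getitem__)
        # The environment responds with a reward of 1 or 0.
        reward = 1 if rng.random() < reward_probs[action] else 0
        counts[action] += 1
        # Incremental update of the average reward for the chosen action.
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward
    return counts, values, total_reward


# Hypothetical environment: three actions with hidden success probabilities.
counts, values, total_reward = run_bandit([0.1, 0.4, 0.9])
best_action = max(range(len(values)), key=values.__getitem__)
print(counts, best_action, total_reward)
```

Unlike the supervised case, the agent is never told which action is correct. It only observes the reward for the action it actually took, so it has to keep exploring to avoid settling on a poor choice.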

How MindsDB ties into this

MindsDB helps bring state-of-the-art machine learning as close as possible to where the data lives: production databases. It currently supports supervised learning tasks and is customizable depending on each user's needs.

The focus is on simplifying the definition, training, and deployment of an ML model that will seem like just another table you can query on your database. Additionally, features like confidence estimation and explainability can help you decide whether you trust a prediction.

MindsDB lets newcomers and experts alike quickly obtain insights from their data, which can be analyzed directly as a database table or through our simple, easy-to-use graphical interface, Scout.

For anyone who wants that extra bit of control over the machine learning model, our “Lightwood” backend is intuitively customizable and very powerful. Additionally, if you want to deploy a pre-existing algorithm, you can bring your own model into MindsDB.

Final thoughts

In this blog post, we have seen what Machine Learning is, why it is a powerful and valuable tool to have, and how MindsDB can be used to dive into supervised ML directly from your database. We are on a mission to democratize Machine Learning for everyone, so I hope you’ve learned something new from this article. Until next time!