An Overview of How Software Works
On a basic level, every software system manipulates data. In software development, we create a system where someone or something (the user) provides some data (the input). Software manipulates that input and translates it into different data, the output. Our fundamental goal for developing software is to have a user provide input that meets certain criteria and have software create an output that makes sense to them and helps them solve some problem.
This process can be seen in the technology we use every day and often take for granted. For example, when you click on something with a mouse or tap it with your finger, software translates that input into the output you want, such as opening an app. When you enter data into a spreadsheet, you can use software to reorganize all the data based on your chosen criteria, such as sorting a list alphabetically.
But with exponentially growing processing power, software has also become much more powerful and complex. Self-driving cars can analyze an enormous number of data points from multiple sensors in real time to perform all of the normal functions of a human driving a vehicle. Even more, the machine learning algorithms within self-driving cars can improve their driving skills with experience. The more input they receive, the better they can distinguish between different types of objects, predict how different objects will move, etc.
To a non-technical user, the processing that happens between the input and output can seem like magic. (Imagine sending a self-driving car back in time to 1800.) As developers and architects, however, we understand the effort that goes into creating software that not only works, but works the way it is intended.
We write new code and use existing code to make that magic between inputs and outputs happen. Within the customized software we create, a function maps the user’s input to the output. To create that function, we need to fully understand the specific problem or issue that our client needs to have addressed. Here’s how:
- We carefully analyze all of the possible input data.
- We collaborate with stakeholders to learn what the desired output should be.
- We interview subject matter experts to improve our understanding of the problem.
- We distill what we’ve learned into business logic that serves as the foundation for the rest of our software development.
The result of this careful analysis and planning is a system that seamlessly collects input and produces useful output. How well the system performs is entirely under our control and dependent on our skills and ability.
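As a small illustration of a function whose behavior is entirely under our control, here is a sketch of the spreadsheet-style alphabetical sort mentioned earlier. The function name alphabetize and its validation rule are assumptions made for this example, not part of any particular product.

```python
def alphabetize(names):
    """Hand-written function: input is a list of names, output is the same names in A-Z order."""
    # The input must meet a criterion we chose ourselves: every entry is a non-empty string.
    if not all(isinstance(n, str) and n.strip() for n in names):
        raise ValueError("every entry must be a non-empty string")
    # casefold() makes the ordering ignore upper/lower case.
    return sorted((n.strip() for n in names), key=str.casefold)


print(alphabetize(["delta", "Alpha", " charlie", "Bravo"]))
# ['Alpha', 'Bravo', 'charlie', 'delta']
```

Every rule here was written by hand because we already know exactly how input should map to output.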
But what happens if we are missing inputs, outputs, or other vital information?
That’s where machine learning comes in.
What Is Machine Learning, and Why Is It Important?
If users can provide examples of the input and corresponding output, we can use the techniques of machine learning to approximate a function that creates the desired output. The “learning” part of machine learning is derived from the fact that a computer examines the existing data and “learns” how to build a working function for new inputs.
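To make the idea of "learning a function from examples" concrete, here is a minimal sketch that fits a straight line to a handful of input/output pairs using ordinary least squares. The example pairs are invented for illustration, and a straight line is only one of many possible function shapes; real projects typically rely on a library and a richer model.

```python
def fit_line(examples):
    """Least-squares fit of y = slope * x + intercept to (x, y) examples."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in examples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The returned function is the learned approximation.
    return lambda x: slope * x + intercept


# Invented example pairs (input, expected output).
examples = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8), (5, 11.0)]
learned = fit_line(examples)

# Apply the learned function to an input that was not among the examples.
print(round(learned(6), 2))
```

Nobody wrote the rule "multiply by about 2 and add about 1" by hand; the computer derived an approximation of it from the examples.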
Taking advantage of machine learning makes the most sense when the following conditions are true:
A function cannot be implemented based on existing data and insights.
If we know how to implement the function directly, doing so will always produce a better solution because machine learning can, at best, provide only an approximation. Machine learning, then, is only deployed in cases where a function is unknown.
Examples of the input exist.
The machine learning algorithm needs examples (the more, the better) in order to hone a function. Having the expected output for each input example is also helpful.
Discernible patterns exist in the data.
Since machine learning develops a function by analyzing examples, discernible patterns need to exist in the data for the approximation to be accurate. If the examples are indistinguishable from random noise, a machine learning algorithm will struggle to develop an accurate function.
Unlike building a working function (where success is dependent on our skills and ability), the success of using machine learning is dependent on the quality of the example data, which is usually out of our control.
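The dependence on patterns in the example data can be sketched with a deliberately simple learner. The code below trains the same one-threshold rule twice: once on labels that follow a clear rule and once on labels that are coin flips. The learner, the data generator, and all names are assumptions for this illustration.

```python
import random


def train_stump(examples):
    """Pick the threshold t whose rule 'predict True when x > t' best fits the examples."""
    best_t, best_correct = None, -1
    for t, _ in examples:
        correct = sum((x > t) == label for x, label in examples)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return lambda x: x > best_t


def accuracy(predict, examples):
    """Fraction of examples for which the prediction matches the label."""
    return sum(predict(x) == label for x, label in examples) / len(examples)


rng = random.Random(0)  # fixed seed so the run is repeatable


def make_examples(n, labeler):
    return [(x, labeler(x)) for x in (rng.uniform(0, 100) for _ in range(n))]


patterned = lambda x: x > 50           # labels follow a clear rule
noise = lambda x: rng.random() < 0.5   # labels are coin flips, unrelated to x

results = {}
for name, labeler in [("patterned", patterned), ("noise", noise)]:
    train, test = make_examples(200, labeler), make_examples(200, labeler)
    model = train_stump(train)
    results[name] = accuracy(model, test)
    print(name, round(results[name], 2))
```

With patterned labels, the learned rule should score close to 1.0 on fresh examples; with coin-flip labels, it should hover around 0.5, no matter how carefully the learner is written.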
An Example of Machine Learning: Would You Survive the Titanic Sinking?
Let’s apply this to an interesting example: predicting whether or not you would survive the sinking of the Titanic. We have piles of data about the passengers who were on the Titanic: their names; their room numbers; the amount they paid for their fares; their sex; whether they survived or perished; their age; which class they were in; if there were siblings, spouses, parents, or children on board with them (and how many); where they embarked on the vessel; and more.
We could make some educated guesses about who survived: the wealthy, females, children, etc. But you or I couldn’t create a function that translates demographic inputs into a chance of survival for any person. Instead, we would use machine learning. The algorithm would analyze known, relevant inputs (such as sex, class, age, etc.) and their corresponding outputs (whether people with those attributes survived or not). Using that known data, it would create a function to predict what would happen if someone had different attributes.
Using that function, we could create software that allows users to input certain characteristics. The software would then provide users with the output: their chance of survival. In fact, you could even input data about fictional characters like Jack and Rose from the movie Titanic. This is exactly what Jennifer Marsman from Microsoft did in a two-part series on Azure Machine Learning. The model predicted that Rose would survive (with over 98% confidence) and that Jack would not (with over 28% accuracy, even when the user misleadingly entered that he survived as an input).
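A rough sketch of the idea follows. It is not the method used in the Azure series; it simply counts how often each combination of sex and class survived in the examples and reports that fraction as the predicted chance. The rows are invented for illustration and are not real passenger records, and the function names are hypothetical.

```python
from collections import defaultdict

# Invented rows for illustration only; NOT real Titanic passenger records.
# Each row: (sex, passenger_class, survived)
examples = [
    ("female", 1, True), ("female", 1, True), ("female", 3, True),
    ("female", 3, False), ("male", 1, True), ("male", 1, False),
    ("male", 3, False), ("male", 3, False), ("male", 3, False),
]


def learn_survival_rates(rows):
    """Return a function mapping (sex, passenger_class) to the observed survival fraction."""
    counts = defaultdict(lambda: [0, 0])  # key -> [survivors, total]
    for sex, pclass, survived in rows:
        counts[(sex, pclass)][0] += int(survived)
        counts[(sex, pclass)][1] += 1
    overall = sum(s for s, _ in counts.values()) / sum(t for _, t in counts.values())

    def predict(sex, pclass):
        survivors, total = counts.get((sex, pclass), (0, 0))
        # Fall back to the overall rate for combinations never seen in the examples.
        return survivors / total if total else overall

    return predict


predict = learn_survival_rates(examples)
print(predict("female", 1))  # 1.0
print(predict("male", 3))    # 0.0
print(predict("female", 3))  # 0.5
```

A production model would use far more examples and more inputs (age, fare, family members on board), and would generalize better than a lookup table, but the shape is the same: known inputs and outputs go in, a predictive function comes out.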
The Applications and Challenges of Machine Learning
Now imagine that instead of a small handful of data points (like the demographic information about Titanic passengers), you have thousands of data points. And instead of a somewhat whimsical historical project, you are trying to make predictions or draw conclusions about your business, and the decisions you make based on that information have real-world implications.
In cases like these, it’s nearly impossible to build an effective function from scratch. Machine learning is almost always the right choice for developing new functions that apply to your unique business needs and goals. Still, machine learning isn’t perfect because it is dependent on the quality of the original data. When using machine learning to approximate a function, there are three possible outcomes:
- We will know that the function is a good one.
- We won’t know if the function is any good without further testing.
- Right away, we will know that the function is not working.
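One common way to find out which of these outcomes applies is to hold back some examples during training and then compare the learned function against a trivial baseline on those unseen examples. The sketch below assumes a toy learner and invented data; the names holdout_split, majority_baseline, and midpoint_learner are hypothetical.

```python
import random


def holdout_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and set aside a fraction that training never sees."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, examples):
    """Fraction of examples for which the prediction matches the label."""
    return sum(predict(x) == y for x, y in examples) / len(examples)


def majority_baseline(train):
    """A 'function' that ignores its input and always answers with the most common label."""
    labels = [y for _, y in train]
    most_common = max(set(labels), key=labels.count)
    return lambda x: most_common


def midpoint_learner(train):
    """Toy learner: threshold halfway between the mean input of each class."""
    pos = [x for x, y in train if y]
    neg = [x for x, y in train if not y]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: x > threshold


# Invented examples: the hidden rule is "label is True when x >= 60".
examples = [(x, x >= 60) for x in range(0, 100, 5)]
train, test = holdout_split(examples)

model_acc = accuracy(midpoint_learner(train), test)
baseline_acc = accuracy(majority_baseline(train), test)
print("model:", model_acc, "baseline:", baseline_acc)
```

A learned function that cannot beat the baseline on held-out examples is clearly not working; one that beats it comfortably is promising, although a small test set like this one leaves plenty of room for doubt.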
The quality of a machine-learned function can often be improved by providing better data and by adjusting how the algorithm uses that data. Developers can:
- Remove inputs that are not relevant to the desired output.
- Provide additional data points that inform the output.
- Add further data to the set, especially if that data includes both inputs and the resulting output.
- Change the way that the algorithm treats missing and outlier values.
- Tune the weights of different inputs as they relate to the eventual output.
- Apply multiple types of algorithms to the same data.
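A few of the data-preparation steps above can be sketched in a handful of lines. The values and column names are invented, and the specific choices (median fill, fixed clipping bounds) are assumptions for this example; the right treatment depends on the data and the algorithm.

```python
from statistics import median


def impute_missing(values):
    """Replace None with the median of the values that are present."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def clip_outliers(values, low, high):
    """Clamp values to [low, high] so extreme entries cannot dominate."""
    return [min(max(v, low), high) for v in values]


def drop_columns(rows, irrelevant):
    """Remove inputs judged irrelevant to the desired output."""
    return [{k: v for k, v in row.items() if k not in irrelevant} for row in rows]


# Invented values; 410 stands in for a data-entry error.
ages = [22, None, 38, 26, None, 35, 410]
cleaned = clip_outliers(impute_missing(ages), low=0, high=100)
print(cleaned)  # [22, 35, 38, 26, 35, 35, 100]

rows = [{"name": "A. Example", "age": 22, "ticket_id": "X1"}]
print(drop_columns(rows, {"name", "ticket_id"}))  # [{'age': 22}]
```

The median is used for filling gaps because, unlike the mean, it is barely affected by the outlier that has not yet been clipped.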
Our deeply experienced professionals at Stratus Innovations Group are some of the foremost experts in developing customized cloud-based solutions utilizing machine learning and other leading-edge data science techniques. We can bring the power of the cloud and machine learning to your business to drive measurable, immediate value.
Whether you are interested in our featured solutions or would like to develop a completely customized solution for your business needs and goals, call us toll-free at 844-561-6721 or fill out a simple contact form to start reaping the benefits of cloud computing today.
Still Have Questions About Machine Learning? Stay Tuned to Stratus Innovations Group’s Blog
Machine learning is far too complex to discuss in a single article, so you may have more questions:
- How do we know that machine learning works?
- Why does machine learning work?
- How do algorithms and data scientists analyze the example data?
- How can an algorithm produce an approximate function, and how do we know if the approximation is effective?
- How can we know if our initial examples are useful in the first place?
We’ll continue to cover these topics — and more — on our blog, so be sure to check back often!
Marsman, J. (2016, February 18). Using Azure machine learning to predict who will survive the Titanic – Part 2. Channel 9. Retrieved from https://channel9.msdn.com/Blogs/raw-tech/Using-Azure-Machine-Learning-to-Predict-Who-Will-Survive-the-Titanic-Part-2