Ever wonder how Facebook identifies you and your friends in photos? How Gmail automatically detects and filters out spam? Or how Autosuggest on your smartphone works? The answers all lie in Machine Learning! Read ahead to find out! ;)
Machine Learning is the science of getting computers to act without being explicitly programmed to do so. The machine learns from data and runs algorithms to figure out the output on its own.
Gmail is able to classify mails into spam and not spam. How? It figures it out from the content of the mail. Every time you hit "Mark as spam", you supply a labelled example that helps the classifier improve. Any ML algorithm learns from data. Classifying handwritten numbers as digits is another common application of ML. As a very basic example, one way to identify a digit is to look at features such as how many straight lines it has.
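To make this concrete, here's a tiny sketch of how such a spam filter could be trained (a toy illustration assuming scikit-learn is available; the mails and labels are made up purely for illustration, not Gmail's actual system):

```python
# A minimal spam-classification sketch, assuming scikit-learn is installed.
# The tiny toy dataset below is invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

mails = [
    "win a free lottery prize now",       # spam
    "claim your free cash reward",        # spam
    "meeting rescheduled to friday",      # not spam
    "please review the attached report",  # not spam
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam (what "Mark as spam" provides)

# Turn mail content into word-count features the algorithm can learn from
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(mails)

# Learn which words are associated with spam
classifier = MultinomialNB()
classifier.fit(X, labels)

# Classify a new, unseen mail
new_mail = vectorizer.transform(["free prize waiting for you"])
print(classifier.predict(new_mail))  # expected: [1] -> flagged as spam
```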
ML is of two common types. Show the machine four different red apples, then give it a red fruit and ask whether it's an apple or not: the machine learns from a predefined set of labelled data. This is known as Supervised Learning.
Suppose we give the machine data but no labelled output; this is an example of Unsupervised Learning. In this case, the machine tries to categorise the data into new groups on its own, based on the features of the examples.
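Here's a toy sketch of the supervised apple example (assuming scikit-learn; the features and fruits are invented for illustration). A sketch for the unsupervised side appears further below.

```python
# A toy supervised-learning sketch, assuming scikit-learn is installed.
# Each fruit is described by two invented features: [redness, roundness].
from sklearn.tree import DecisionTreeClassifier

fruits = [
    [0.9, 0.8],  # red apple
    [0.8, 0.9],  # red apple
    [0.9, 0.9],  # red apple
    [0.8, 0.8],  # red apple
    [0.9, 0.3],  # chilli: red but not round
    [0.2, 0.9],  # green melon: round but not red
]
is_apple = [1, 1, 1, 1, 0, 0]  # the predefined set of labelled data

model = DecisionTreeClassifier()
model.fit(fruits, is_apple)

# A new red fruit: is it an apple?
print(model.predict([[0.85, 0.85]]))  # expected: [1]
```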
Machine Learning can only solve problems that are actually solvable from the given data. We cannot find the cost of a house from the number of potted plants in it; any model trained on such a feature will obviously give wrong results.
Gmail's spam filter is partly supervised learning (it learns from the mails you mark) and partly unsupervised (it recognises new kinds of spam automatically). It's a dynamic process: the way of filtering keeps changing along with the changing ways of writing spam. Another example is Captcha. There's an image-processing algorithm trying to learn whether the letters it reads in the captcha are correct. So, the next time you see a captcha, remember, you are helping a Machine Learning algorithm get better with your identification of the letters there. Basically, Captcha uses human confirmation to check whether the letters/numbers the human provides match the output of the Machine Learning algorithm that's used to identify them!
So, how does Machine Learning actually work? The first and most important step is estimating the weights of the various parameters on the basis of which the output is to be decided. Regression is concerned with modelling the relationship between variables, iteratively refined using a measure of error in the predictions made by the model. A classic example of Linear Regression is the best-fit line, say, estimating the cost of a house from its size. Taking the line to be y = mx + c, the slope m and intercept c are the weights to be determined. One caution: enough features are required for the model to work. An error function is used to measure the discrepancy between the model's output and the data; the squared-difference error is normally used to correct the hypothesis and find the weights. Regularisation is used in almost all algorithms: it adds to the error a term equal to the sum of the moduli of all the parameters, which discourages overly complex models. The hypothesis can be a polynomial of any degree; it's not restricted to a straight line.
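Here's a minimal sketch of linear regression fitted by gradient descent, with the sum-of-moduli (L1) regularisation term added to the error (plain NumPy; the house sizes, costs, learning rate, and regularisation strength are all made-up illustrative values):

```python
# A minimal sketch: fit y = m*x + c by gradient descent on the squared
# error, with an L1 regularisation term (sum of |parameters|). All the
# numbers below are invented toy values.
import numpy as np

x = np.array([50.0, 80.0, 120.0, 160.0, 200.0])  # house size (sq. m)
y = np.array([60.0, 95.0, 140.0, 185.0, 230.0])  # cost (lakhs)

m, c = 0.0, 0.0   # the weights to be estimated
lr = 1e-5         # learning rate
lam = 0.1         # regularisation strength

for _ in range(10000):
    pred = m * x + c
    err = pred - y
    # Gradient of the squared error, plus the L1 penalty's subgradient
    grad_m = 2 * np.mean(err * x) + lam * np.sign(m)
    grad_c = 2 * np.mean(err)
    m -= lr * grad_m
    c -= lr * grad_c

print(f"m = {m:.3f}, c = {c:.3f}")  # the learned weights of the best-fit line
```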
The broadest category of tools is neural networks. Since we design the logic behind them, we shouldn't treat them as a black box.
Common problems solved using ML: Classification problems, where a new object is classified into one of the given groups; and Clustering problems, an example of unsupervised learning. Clustering is the assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense. Unsupervised learning problems are mostly clustering problems.
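As promised, a quick clustering sketch (again assuming scikit-learn; the observations are invented points, and no labels are given to the algorithm):

```python
# A toy clustering sketch, assuming scikit-learn is installed.
# Six unlabelled 2-D observations; the algorithm groups them itself.
import numpy as np
from sklearn.cluster import KMeans

observations = np.array([
    [1.0, 1.1], [0.9, 1.0], [1.1, 0.9],   # one natural group
    [8.0, 8.2], [7.9, 8.1], [8.1, 7.9],   # another natural group
])

# Ask for two clusters; similar observations end up in the same one
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
print(kmeans.fit_predict(observations))  # e.g. [0 0 0 1 1 1]
```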
Neural Networks are a tool used in ML, an attempt to mimic the brain. Neurons work on electrical signals: if a neuron receives a signal above its threshold it fires, else it remains grounded. In the human neural system, one neuron firing passes signals on to other neurons, with the synapse providing the chemical triggering. In an artificial neuron, the weights on the inputs decide the output: the output is a function of the weighted combination of the input vector, plus a constant bias term that decides the threshold. The network learns by itself and decides the threshold parameter. The weights are arranged into vectors and matrices, and the computation happens by matrix multiplication. Most often, neural networks are used in classification problems. Based on the data, the network develops the weights automatically; we just provide the structure: layered networks of neurons, where each layer computes a different function. The parameters are normally not one-dimensional, because we need to incorporate all the features together, and there might be constraints on the parameters; hence we need to use gradient descent. To know more about stochastic gradient descent, go to https://en.wikipedia.org/wiki/Stochastic_gradient_descent
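To see the matrix-multiplication view concretely, here's a sketch of a two-layer forward pass (plain NumPy; the weights and sizes are fixed made-up numbers rather than learned ones):

```python
# A minimal sketch of how a layered network computes its output: each
# layer is a matrix multiplication of inputs with weights, plus a bias
# term, followed by a threshold-like activation. Values are arbitrary.
import numpy as np

def sigmoid(z):
    # Smooth version of "fire above the threshold, stay grounded below it"
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.8, 0.1])        # input features

# Layer 1: 3 inputs -> 2 neurons (weights normally learned, here fixed)
W1 = np.array([[0.2, -0.4, 0.7],
               [0.5, 0.1, -0.3]])
b1 = np.array([0.1, -0.2])           # bias sets each neuron's threshold

# Layer 2: 2 neurons -> 1 output neuron
W2 = np.array([[0.6, -0.8]])
b2 = np.array([0.05])

h = sigmoid(W1 @ x + b1)             # first layer: weighted sums, then fire
out = sigmoid(W2 @ h + b2)           # second layer combines the outputs
print(out)
```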
The square error function is the sum of the squares of the errors; we minimise it to minimise the error. In Stochastic Gradient Descent, the mean error function can be thought of as a paraboloid-like surface, and following the decreasing gradient takes you towards the minimum; it's similar in spirit to the Newton-Raphson method. However, for a neural network the square error function can only be minimised numerically, not analytically. The weights are updated by minimising the error, working backwards through the layers; this is called backpropagation. Neural networks can approximate almost anything, yet no one actually knows the exact mechanisms inside a trained network; we can't get an intuition for what the individual weights mean. Initially, the weights are set to small random values (you face problems if the weights all start at zero, since every neuron then learns the same thing).
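And a minimal sketch of the stochastic flavour of gradient descent, where the weights are updated one example at a time and initialised to small random values (the data, learning rate, and epoch count are illustrative choices only):

```python
# Stochastic gradient descent on the squared error for y = m*x + c:
# weights updated per example, initialised to small random values.
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 5.0, 6.9, 9.2])    # roughly y = 2x + 1

m, c = rng.normal(scale=0.1, size=2)  # small random initial weights

lr = 0.01
for epoch in range(200):
    for i in rng.permutation(len(x)):  # visit examples in random order
        err = (m * x[i] + c) - y[i]    # error on this single example
        m -= lr * 2 * err * x[i]       # step down the gradient...
        c -= lr * 2 * err              # ...one observation at a time

print(f"m = {m:.2f}, c = {c:.2f}")  # should approach m ≈ 2, c ≈ 1
```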
Genetic Algorithms are a related technique, often used in combination with Neural Networks, for example, to evolve a network's weights or structure.
Courtesy (Speakers in the GD): Siddhant Garg, Karan Chadha, Arka Sadhu