Machine Learning Methods – Computerphile

Well, today I want to talk about data mining, which is what I'm really interested in. I want to explain a little about the inner workings of data mining, some of the terms you might have heard in a first lecture or a first book. I want to talk about supervised learning and unsupervised learning, what exactly these things are, and then I want to get on to something newer, semi-supervised learning, and also what the research in this area looks like at the moment. It's called machine learning; that's the sort of applied artificial intelligence. In machine learning you get data, you want to mine the data, and broadly there are two categories of methods for how this works. So if I could pull up my prop... yes, I've carefully prepared. Here are some items of data that I have brought along. The first method I should perhaps explain is unsupervised learning, because it's the easier way. It's called unsupervised learning because we don't have any examples that are labelled; it's learning from unlabelled data. I guess the idea is that a supervisor would know the answer, and here we don't have anybody who knows the answer. So we get the data to begin with and we don't really know anything about it. We know the attributes, obviously, we know the values, but we don't know what categories they are; that's the problem. Unsupervised learning very often is just a sorting of the data. So you get your first data item and you put it somewhere, and then comes another data item and you basically go (let's do colours): is this similar or is this different? And this one is quite different, so we put it over there. And then comes another data item.
Oh, is this similar or is it different? It's a little bit similar to the yellow ones, so we'll put it a little bit closer to the yellow one. And then comes another data item, and this one is obviously quite similar to the yellow one, so we put it closer still. Over time you get all these data items in, and they might end up looking something like that. So what have I done? I've done a sorting of the data, and the approach I've used is based on a similarity measure; these unsupervised methods all use a similarity measure, and in this case I've sorted by colour. The other way these methods usually work is to start out by asking: how many groups would you like your data to be in? How many clusters? Let's say you want three clusters; then the solution might look like this, clustered by colour into three. If it had been four clusters, maybe the solution would have looked like this, and with two clusters it might have looked like this. So you might ask: what's the data mining in this sorting of the data? Well, once we've sorted the data in this way, we can of course look at what ended up together. Maybe these things have ended up together, and now we can say: ah, these are the light colours.
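(Editor's note: the "is this similar or is it different?" sorting described above can be sketched in a few lines of Python. The RGB colour values and the similarity threshold here are made up for illustration; real methods would use a more principled similarity measure.)

```python
import math

def distance(a, b):
    """Euclidean distance between two RGB colour tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def incremental_sort(items, threshold):
    """Place each item in its most similar existing group, or start a
    new group if nothing is similar enough -- the 'is this similar or
    is this different?' step from the demonstration."""
    groups = []
    for item in items:
        best, best_d = None, float("inf")
        for group in groups:
            d = distance(item, group[0])  # compare with the group's first member
            if d < best_d:
                best, best_d = group, d
        if best is not None and best_d <= threshold:
            best.append(item)
        else:
            groups.append([item])
    return groups

colours = [(250, 240, 20), (255, 255, 0),   # two yellows
           (20, 20, 30), (0, 0, 0)]          # two dark colours
groups = incremental_sort(colours, threshold=100)
```

With these values the two yellows end up in one group and the two dark colours in another, purely by similarity, with no labels involved.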
This is the dark colours, and we clearly have two groups. Now, we wouldn't normally sort colour cubes; you would sometimes sort patients, say: are they mildly ill or are they very ill, that sort of thing. Most unsupervised methods work exactly as I described: they work by sorting. The differences lie in how they measure the difference between things. Is it a statistical similarity? An algebraic similarity? Some other metric? There are many ways you can measure the difference between things. Unsupervised learning is quite a simple way of doing it, and the algorithms are quite quick, but it's not as powerful as other methods. What's the problem with it? One of the problems is actually quite straightforward. Let's say we end up with this solution. Is this a good solution or not? It's really hard to evaluate, because we obviously don't know anything about the data. We're looking at it going: that looks okay... but maybe not. And very often what happens is that if you look at the data from one angle it looks like a good solution, but, and now I do my reveal, if we turn the data a bit, suddenly we have another angle on it, and actually now it's a mess. They're not really sorted very well, are they?
That's often what happens with unsupervised learning: you sort the data one way and it looks quite good, but then you look at it differently and actually it hasn't quite worked; it's not so great. The other downside of unsupervised learning is that the algorithms really only work when you tell them how many groups you want the data to be in: two groups, three groups, four groups. For some problems you might know; maybe you have, as I said, ill patients and healthy patients, and you know there are two groups. But very often how many groups you have is the whole question, so you can't really use these methods that well. If you want a technical term: k-means, for example, is a classic unsupervised method that's very popular, so if you look it up you'll learn a bit more about it. Now, the second way of doing learning is the supervised way; we said unsupervised, so there must be a supervised way. Here the difference is that you have some data which already has answers attached to it, so you can really learn from that data. A classic way of doing this is neural networks, one of the best-known methods. How does that work?
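(Editor's note: the k-means method mentioned above can be sketched minimally like this. The 2-D points are invented for illustration; real implementations, such as the one in scikit-learn, add smarter initialisation and convergence checks.)

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: pick k starting centres, then alternate between
    assigning every point to its nearest centre and moving each centre
    to the mean of the points assigned to it."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda j: sum((a - b) ** 2
                                            for a, b in zip(p, centres[j])))
            clusters[nearest].append(p)
        for j, cluster in enumerate(clusters):
            if cluster:  # move the centre to the mean of its members
                centres[j] = tuple(sum(dim) / len(cluster)
                                   for dim in zip(*cluster))
    return clusters

# two visually obvious groups of 2-D points
points = [(0, 0), (1, 1), (0, 1), (10, 10), (11, 11), (10, 11)]
clusters = kmeans(points, k=2)
```

Note that, exactly as the video says, you must tell the algorithm `k` up front; choosing the wrong number of clusters is the method's main weakness.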
So I have some data again, and this time let's say we want to do something a bit different: we want to sort it into light colours and dark colours. What happens is that I get my data in, and somebody has already labelled it for me: these are light colours, these are dark colours. So we already know the answer for this data. We don't know it for other data, but we know it for this; this is our training data. And now I'm going to do some learning with a neural network. The first data item comes in and goes here, the next data item comes in and goes here, and I keep doing this, and maybe I end up with something that looks like this. Now, of course, I can assess the quality of the solution and go: well, algorithm, you've done okay, but not really well, because these two should be over there and this one should really be there. So we fix the function a bit and do it again, and we might end up like this. Okay, that was better, but it still got one wrong, so we fix the function again and do it again; in a neural network this is called backpropagation. And if you do this long enough, eventually the algorithm learns the perfect function for sorting things. Then the idea is that when a new data item comes along, it goes through the same function, and because the function is now perfect, it ends up in exactly the right place. No problem. It's supervised because we have labels, and because of the labels we can assess the quality; in neural networks this is the classic way of doing it. In general, supervised learning is very powerful, because as long as we have enough data with enough labels we can always learn the function, and then it should work really well. But there wouldn't be research if we were finished with it, so there's obviously a problem here as well: it can lead to overfitting. What does overfitting mean?
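(Editor's note: the "fix the function and do it again" loop can be sketched with the simplest possible supervised learner, a single perceptron, rather than a full neural network. The brightness values and labels below are invented; 1 means light, 0 means dark.)

```python
def train_perceptron(data, epochs=50, lr=0.1):
    """Tiny supervised learner: one weight and a bias.  Each pass
    compares the prediction with the known label and nudges the
    parameters to reduce the error -- the 'fix the function and do
    it again' loop."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, label in data:          # label: 1 = light, 0 = dark
            pred = 1 if w * x + b > 0 else 0
            err = label - pred         # -1, 0 or +1
            w += lr * err * x          # nudge towards the right answer
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if w * x + b > 0 else 0

# labelled training data: (brightness in [0, 1], label)
training = [(0.9, 1), (0.8, 1), (0.7, 1), (0.2, 0), (0.1, 0), (0.3, 0)]
w, b = train_perceptron(training)
```

After training, the learned function classifies the training data perfectly and places new brightness values on the correct side, which is exactly the promise of supervised learning when the labels are good.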
It sounds like tight jeans, you know. No, not that. It means that you put too much emphasis on getting the function right; you make it too right. The function is absolutely perfect, and in fact it's so perfect that it's brittle; it's just not good anymore. So what happens is that a new data item comes along, one you've not seen before. An unsupervised method wouldn't have a problem with this, because it just goes by similarity: this is kind of a light colour, so it would probably end up here. But the supervised method has never seen this colour before, and the function goes: what do I do with this? And it breaks, or it puts the item in essentially a random place, maybe here. So supervised learning is really good, but if you overdo it you've overfitted, and the problem is that you've actually made the system worse again; you've made it brittle. The other downside of supervised learning is that you must actually have enough data with labels. For some problems you have that, which is fine, but for some problems you don't. Let me talk about a practical problem I was working on. I was working with doctors in a hospital, clinicians who look after colon cancer patients, and over many years they had collected data on about 500 patients. Classic medical data: age, clinical history, genetic values, blood values, and so on. The patients get diagnosed into different categories of illness, some more serious, some less serious, and the doctors wanted some help with this categorisation. The most serious cases and the least serious cases are quite clear, but there's this whole group in the middle, and they wanted to see whether we could split those a bit better. So we were working on this with them, and it's a classic problem. In that case there were 500 patients already categorised by what category of illness they were in, so a supervised approach was really good, because we could learn from those
500 and build up a picture, and as long as we're careful not to overdo it we'll be fine. But then, and this leads me on to what my research is at the moment, it turned out that they didn't have all the labels for all 500 patients, because the technology has been changing over the years. There are more modern tests now that didn't exist ten years ago, so for the last 50 patients they had some additional labels that they didn't have for all the others. So we were talking about what to do with this, and there's a method called semi-supervised learning, which is roughly what the research is about. Can we take the best of both worlds and combine them a bit? What if you've got a few labels, not enough to learn perfectly, but maybe enough to do something? So what we did is a semi-supervised method, and it's a mixture of the two. You get your data and, let's say, we want to split it into light and dark colours; that's basically our more serious patients and our less serious patients. You might end up sorting the data something like that, and because it's an unsupervised approach first of all, we don't know exactly how good this is. But for some of the data items we have a label, and we can look up what's written on them, and because some of them have a label we can now check: are all the ones with the same label in the same group? Suddenly we can assess the quality of this. We don't have a label for all of these, but we have a label for some of them. Are they in the same group? Yes, and items with the same label are in the same group. That looks like a good solution. Semi-supervised learning is probably the future, because as data sets get bigger and bigger you don't have labels for everything anymore; nobody has time to label everything, and computers can't really label things very well. So you'll have the experts labelling a few things, and semi-supervised learning will
be where this is going. But then the next step would be to make it interactive; that would be even better. That's what we're working on right now. It's called man-in-the-loop, or human-in-the-loop, learning: you maybe have no labels at all, or just very few, and you do some sorting of the data, and then you ask the expert: has the sorting gone well or not? What about this one item, what label would you give it? So it's a bit interactive, and I think it will be much better, because it's more in real time, and the latest developments can come in, tacit knowledge that you might not even have in the data. "So that's like spot checking?" Yeah, exactly, it's like spot checking, but then putting that knowledge back into the algorithm, so the algorithm can learn from it again, and it reinforces it a bit. "Where are we going now?" I'll show you the big machine. That's it.
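(Editor's note: the human-in-the-loop "spot checking" idea can be sketched as follows. The `expert` function stands in for the clinician being asked; the brightness values, the query strategy, and the nearest-labelled-example classifier are all simplifications invented for illustration.)

```python
def human_in_the_loop(items, expert, rounds=3):
    """Human-in-the-loop sketch: repeatedly ask the expert for the label
    of the item we know least about (the one furthest from anything
    already labelled), and fold each answer back into the model."""
    labels = {}                                # labels gathered so far
    for _ in range(rounds):
        unlabelled = [x for x in items if x not in labels]
        if not unlabelled:
            break
        if labels:
            # spot-check the item furthest from every labelled example
            query = max(unlabelled,
                        key=lambda x: min(abs(x - l) for l in labels))
        else:
            query = unlabelled[0]
        labels[query] = expert(query)          # ask the human
    # classify everything by its nearest labelled example
    return {x: labels[min(labels, key=lambda l: abs(l - x))]
            for x in items}

# a stand-in expert: in reality this is a person answering queries
expert = lambda x: "light" if x > 0.5 else "dark"
result = human_in_the_loop([0.9, 0.8, 0.2, 0.1], expert)
```

A handful of well-chosen expert answers is enough to label the whole data set sensibly, which is the appeal the transcript describes: the expert's knowledge flows back into the algorithm instead of the expert labelling everything.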


  • John King

    Please go into algorithms in future videos! Especially neural networks. I've found most explanations of NNs lacking in intuition and I would love to see your shot at explaining it!

  • Martin Savc

    Many viewers seem to find this video informative, yet I find it lacking a good representation of what machine learning actually is. Being quite familiar with it, I thought that the process was somewhat misrepresented. Does anyone share this view?

    Machine learning algorithms usually do not place blocks (representing data) on the table (representing space). The blocks are already there, sorted as they are.

    Machine learning algorithms draw lines and curves on the table. The curves represent borders, dividing the table into areas. Areas then represent different labels.

    When a new block is placed on its place on the table, the area it resides in represents its label.

    When a machine learning algorithm overfits, it might have drawn borders too tightly around the given blocks.

  • allenamenbesetzt

    According to this video, isn't semi-supervised learning the same as supervised learning except that there are multiple "perfect" solutions because you do not have labels for everything?

  • bwack

    Very interesting. I did a master's thesis on comparing two learning algorithms for artificial neural networks (ANN) 9 years ago: TD-Lambda versus learning by search (supervised learning). It was very interesting to see the ANN learning to make better and better moves in the game of backgammon 🙂

  • 2LegHumanist

    The overfitting problem is overstated. It is easy to test for: just put unseen labelled test data through the system. It is easy to fix: just increase your lambda value in the cost function.

  • purplecoathanger

    First attempt at explaining machine learning I've seen without a picture of a scatter graph with a decision boundary drawn on it.

  • k24bfan

    This reminded me of my AI class, good times. We only covered unsupervised and supervised learning. Insightful to know where the future of this field is going.

  • Biomirth

    I find it depressing how primitive these approaches are in terms of "learning". If you have an interactive method it shouldn't just add the new knowledge to the algorithm, but discover the reason the knowledge was missed in the first place and pro-actively test algorithms along the same "missing method" axes. I know that's asking a lot, but come on, it's no longer 1973

  • dork

    I'm slightly disappointed at the immediate reach to neural networks. While NNs are proving to be very fruitful as modern learning methods, they aren't really a good example for introducing basic machine learning concepts unless you start mischaracterizing or really oversimplifying what nets do. Maybe start with something way simpler and build your way up? I mean you can even just go with perceptrons or some other linear classifier and that's way easier to understand.

  • Salafrance

    I love these videos in general, but I found the audio very indistinct on this one, which made Prof. Uwe's presentation difficult to follow. I might try running the audio through an equaliser. Thanks for a fantastic resource.

  • Ben Harber

    I find it very interesting that all this work has been done to automate things (take humans out of the loop), only to find that the 'probably' best solution is to actually have a human in the loop. 🙂

  • djah

    Humans learn everything in a semi-supervised way, am I right? We get some advice from "experts" (our parents, our teachers, etc) but we also experiment ourselves based on observation (in a sense, we build some "metrics" for the quality of everything ourselves, similar to the similarity metrics in unsupervised learning). Does this make any sense?
    BTW the video is extremely interesting! Amazing stuff! 🙂

  • Muli Yulzary

    I find unsupervised learning much more powerful than semi/supervised learning. You are not restricted to the number of classes which is kind of the whole point of data mining. They are just being used for different purposes…

  • LovSven2011

    Great explanation. I can see that he knows much more, and he has a way to simplify enough so you can understand.
    I already have some questions about how some token examples to reevaluate the categorization in semi-supervised machine learning would work. Some confidence rating or sth…

  • loLler781229

    Wow, I've been thinking about this for months. I am really glad to know that there are other people who think about the same thing and are even researching it.

  • FreeHomeBrew

    I find artificial evolution of neural networks to be a very interesting way of finding solutions to problems. I might actually make one for a current problem I'm having, which is sort of related to voice recognition.

  • TheBandScanner

    The computer aspect is interesting.  However I am appalled with how easily you slip into use as a medical tool.  What if your child was a genius, but because he didn't fit a prescribed pattern got labeled as a dullard.  Please don't turn doctors into robots without interest in the human patient.

  • Peter Walker

    My cardiac surgeon ( Dr Nick Hendel ) sat with me and entered data from tests and factors and ran the program. He said it informed him of what to do and avoid and whether I'd survive.
    Lying in the cardio ward watching the surgeon with an HP or TI programmable calc is pretty different.


  • Daniel Amaya

    "…I wouldn't be researcher if we'd finished with it!" lol loved that phrase, the very nature of curiosity and the willing of discovering the world!

  • Zachary Quinn

    So the "man in the loop" is 'considering' the human's input? Contrasted with supervised, where the user overrides the program's decision, man in the loop treats the human input as a weighted factor that is part of the overall algorithm? The human factor influences, but doesn't effect a change?

  • Elliebarbs

    The only video that helped me with understanding ML for my university assignments 😅👌🏾😍😍! Thanks a lot

  • foreverteuk

    This guy finally allowed me to understand the concept. it reminds me of that time when my lecturer talked about toilet seats. You want it to be hard, but if too hard, it becomes brittle and it breaks when you sit on it.

  • Theo Pana

