PK #iAh 3 Unit_4.1_-_Pruning_the_Input/01-Introduction.en.srt1
00:00:00,000 --> 00:00:05,080
How can you deal with an NP complete problem if you can't avoid it
2
00:00:05,080 --> 00:00:07,570
and a simple algorithm doesn't work.
3
00:00:07,570 --> 00:00:12,880
Well, unless P=NP, you know there's not going to be a polynomial time algorithm for your problem.
4
00:00:12,880 --> 00:00:16,180
Now, that statement actually allows for some loopholes, which we'll investigate
5
00:00:16,180 --> 00:00:18,380
in this unit and the following units.
6
00:00:18,380 --> 00:00:24,280
The first loophole is to try and find intelligent algorithms that allow you to search
7
00:00:24,280 --> 00:00:29,300
through an exponential number of solutions still on exponential time, but so efficiently
8
00:00:29,300 --> 00:00:32,470
that sometimes you actually get a perfect solution.
9
00:00:32,470 --> 00:00:37,330
The second case we'll be looking at is if you accept that your solution is not going to be
10
00:00:37,330 --> 00:00:42,020
the best possible one but maybe within a certain range of the best possible one
11
00:00:42,020 --> 00:00:45,770
if you can then find faster algorithms to solve your problem.
12
00:00:45,770 --> 00:00:48,640
For each of these approaches, we'll be discussing the main ideas.
13
00:00:48,640 --> 99:59:59,000
I'll give you some examples and of course, we'll also be discussing a bit of the theory behind them.
PK #iAZ<5 5 7 Unit_4.1_-_Pruning_the_Input/02-Counter_Measures.en.srt1
00:00:00,000 --> 00:00:04,200
So far, our units have focused on showing that problems are hard
2
00:00:04,200 --> 00:00:07,010
or NP complete to be more precise.
3
00:00:07,010 --> 00:00:10,110
Actually, what I have very often observed in practice,
4
00:00:10,110 --> 00:00:13,190
and even in scientific papers where the office did not have a
5
00:00:13,190 --> 00:00:16,020
solid background in theoretical computer sciences is this.
6
00:00:16,020 --> 00:00:18,880
Once a problem is shown to be NP complete,
7
00:00:18,880 --> 00:00:23,290
many people would just tend to give up or come up with some,
8
00:00:23,290 --> 00:00:26,600
well honestly terrible algorithm that they've justified with.
9
00:00:26,600 --> 00:00:28,660
Well, the problem was NP complete,
10
00:00:28,660 --> 00:00:31,760
and that is why we're now going to turn our attention
11
00:00:31,760 --> 00:00:35,250
to count the measures against NP completeness.
12
00:00:35,250 --> 00:00:42,260
So, this unit and the following units are your first aid kit to help you overcome NP completeness
13
00:00:42,260 --> 00:00:44,090
or at least take a bit of the sting out of it
14
00:00:44,090 --> 00:00:47,720
because imagine you find yourself in the situation like the four computer scientist
15
00:00:47,720 --> 00:00:52,690
that we have gotten to know throughout this course like Alice, Bob, Carol, and Dave.
16
00:00:52,690 --> 00:00:54,970
Solving tough problems was part of their job.
17
00:00:54,970 --> 00:00:58,470
Alice was solving vertex cover for telecommunications;
18
00:00:58,470 --> 00:01:03,210
Bob was working in biotech developing new medicines, analyzing genes;
19
00:01:03,210 --> 00:01:07,690
Carol was working in finance designing secure investments;
20
00:01:07,690 --> 00:01:12,050
and Dave was working in logistics where he had to solve the shortest tool problem
21
00:01:12,050 --> 00:01:15,810
and actually there are logistics companies who developed algorithms
22
00:01:15,810 --> 00:01:19,880
with which they save millions everyday and also fuel.
23
00:01:19,880 --> 00:01:24,180
So, those problems are practically relevant although they are NP complete.
24
00:01:24,180 --> 00:01:28,340
So, when your company or your boss asked you to solve an NP complete problem,
25
00:01:28,340 --> 00:01:31,850
what should you do--should you say, "I want another problem,"
26
00:01:31,850 --> 00:01:37,120
should you make her or him take this course so that they finally understand NP completeness,
27
00:01:37,120 --> 00:01:39,570
should you say "It's impossible to do that,"
28
00:01:39,570 --> 00:01:42,460
should you say "It's proven impossible,"
29
00:01:42,460 --> 99:59:59,000
or should you say "It's proven difficult but we can try."
PK #iAKo[zu u @ Unit_4.1_-_Pruning_the_Input/03-Counter_Measures_Solution.en.srt1
00:00:00,000 --> 00:00:05,990
Of course, this was a bit of a simple inter quiz here. Yeah, probably that's not a good idea.
2
00:00:05,990 --> 00:00:09,770
Making anybody take this course is a very good idea, but of course, actually you're
3
00:00:09,770 --> 00:00:13,640
the computer scientist so you're supposed to be the one knowing about NP completeness
4
00:00:13,640 --> 00:00:15,390
especially after taken this course.
5
00:00:15,390 --> 00:00:17,180
You'll probably shouldn't say it's impossible.
6
00:00:17,180 --> 00:00:21,430
A better answer would be to say, "It's been proven to be impossible,"
7
00:00:21,430 --> 00:00:27,070
but I think that after this unit, you will agree that the best possible answer is actually to say,
8
00:00:27,070 --> 00:00:31,380
"Well, it's been proven difficult or extremely difficult," but we can actually try
9
00:00:31,380 --> 00:00:36,100
to solve this problem even though it is NP complete and I will show you how to approach that.
10
00:00:36,100 --> 00:00:40,750
The thing is this--there's so many NP complete problems that are practically relevant
11
00:00:40,750 --> 00:00:45,390
that over the past years extremely clever algorithms have been developed for these problems.
12
00:00:45,390 --> 00:00:50,810
They have been developed for vertex cover, for clique, for independent set,
13
00:00:50,810 --> 00:00:54,380
for the shortest tour problem, for SAT, and so on.
14
00:00:54,380 --> 00:00:58,670
Knowing about these problems can really make you standout as a problem solver
15
00:00:58,670 --> 00:01:03,070
because many people, as I said, will give up once they know their problem to be NP complete.
16
00:01:03,070 --> 00:01:08,870
You should respect NP completeness, so maybe not be all smiling when you encounter
17
00:01:08,870 --> 00:01:13,650
NP complete problems, although it's nice to know that they are happy but I don't know.
18
00:01:13,650 --> 00:01:16,770
When they're solving their problem, be respectful but there's no need
19
00:01:16,770 --> 99:59:59,000
to be unhappy or scared of NP completeness.
PK #iAdȲN
N
> Unit_4.1_-_Pruning_the_Input/04-Laws_Of_NP-Completeness.en.srt1
00:00:00,000 --> 00:00:04,280
How can you deal with an NP complete problem other than avoiding it.
2
00:00:04,280 --> 00:00:09,590
Well, I think you should think of yourself as kind of a lawyer who has to work with the
3
00:00:09,590 --> 00:00:16,470
laws of NP completeness and the main law of NP completeness is this, unless P=NP,
4
00:00:16,470 --> 00:00:20,940
there will be no polynomial time algorithm for your NP complete problems.
5
00:00:20,940 --> 00:00:23,590
Now, what do good lawyers do with the law.
6
00:00:23,590 --> 00:00:30,910
They find loopholes and that is exactly what we are going to do for this law of NP completeness,
7
00:00:30,910 --> 00:00:34,040
and we're actually be looking at two loopholes.
8
00:00:34,040 --> 00:00:39,530
The loophole that we will be exploring in this unit is using intelligent exponential time algorithms.
9
00:00:39,530 --> 00:00:44,320
We're going to work in exponential time but we're going to do it in a smart way.
10
00:00:44,320 --> 00:00:51,240
We're going to avoid things like 2^n or 3^n, but rather we're going to have algorithm set
11
00:00:51,240 --> 00:00:55,240
for example I have something like 1.3^n, which is already much better
12
00:00:55,240 --> 00:00:57,120
and works for many practical cases.
13
00:00:57,120 --> 00:01:02,600
What we are then going to look at in the next units is what I like to call sloppy solutions.
14
00:01:02,600 --> 00:01:06,310
For all proofs of NP completeness, we always require the computer
15
00:01:06,310 --> 00:01:09,670
to provide the exact answer to our problem.
16
00:01:09,670 --> 00:01:13,800
One chance to circumvent NP completeness might be to say,
17
00:01:13,800 --> 00:01:18,570
"Well, we do not want the best possible solution, but we're happy with say
18
00:01:18,570 --> 00:01:23,490
one that is close to optimal" or for a decision problem for example, we might be saying,
19
00:01:23,490 --> 00:01:27,250
"Well, it's okay if we get the right answer in about 80% of the time."
20
00:01:27,250 --> 00:01:30,360
For each of these approaches, we will discuss the main ideas and also of course,
21
00:01:30,360 --> 00:01:33,530
give you some examples to make them more clear and talk a bit about the theory
22
00:01:33,530 --> 00:01:36,170
behind these approaches because these are techniques where you really have
23
00:01:36,170 --> 00:01:39,300
to understand the theory if you want to be successful in practice.
24
00:01:39,300 --> 99:59:59,000
So let's start out with smart exponential time algorithms.
PK #iA8G 7 Unit_4.1_-_Pruning_the_Input/05-Ten_Times_Faster.en.srt1
00:00:00,000 --> 00:00:02,790
And we are going to kick this off with a little quiz.
2
00:00:02,790 --> 00:00:10,510
So let's assume that we have an algorithm with a running time of 2^n n², not O(2^n)n²,
3
00:00:10,510 --> 00:00:12,850
but really 2^n n².
4
00:00:12,850 --> 00:00:17,170
You can add any constant you wan if that makes you happy but not so the point of this quiz here.
5
00:00:17,170 --> 00:00:21,090
And now we're going to look at an algorithm that is 10 times faster,
6
00:00:21,090 --> 00:00:26,630
so it has a running time of 2^nn²/10.
7
00:00:26,630 --> 00:00:32,910
And if you think about it, finding an algorithm that is 10 times faster is actually huge in practice.
8
00:00:32,910 --> 00:00:36,060
So imagine for example that whenever you start your computer,
9
00:00:36,060 --> 00:00:39,760
it would start 10 times faster than it does now--a huge improvement,
10
00:00:39,760 --> 00:00:43,070
but what I now want to show you is that such a huge improvement
11
00:00:43,070 --> 00:00:47,330
is virtually useless when you're dealing with exponential time.
12
00:00:47,330 --> 00:00:51,960
So assume for example that for this algorithm here, we have a computer
13
00:00:51,960 --> 00:00:58,280
where the maximum input size that it can handle is 30, what I would like to know from you now
14
00:00:58,280 --> 00:01:04,900
is if we run this algorithm which is 10 times faster, huge improvement on the same machine
15
00:01:04,900 --> 00:01:07,960
what is the maximum input size we can handle then.
16
00:01:07,960 --> 99:59:59,000
And I would like you to give me your answer here in this box.
PK #iAQo @ Unit_4.1_-_Pruning_the_Input/06-Ten_Times_Faster_Solution.en.srt1
00:00:00,000 --> 00:00:02,990
And the answer here is in some way surprising.
2
00:00:02,990 --> 00:00:09,000
Although we have a 10 times improvement, we actually cannot handle an input that is much larger.
3
00:00:09,000 --> 00:00:12,600
We can only handle an input where n is 33 or smaller,
4
00:00:12,600 --> 00:00:14,730
and the way you can figure this out is as follows--
5
00:00:14,730 --> 00:00:19,710
If the maximum input size we can handle is 30, then the number of time steps
6
00:00:19,710 --> 00:00:35,320
this algorithm over here takes is 2³⁰ 30², which is 966,367,641,600 time steps.
7
00:00:35,320 --> 00:00:38,410
So already a rather powerful computer at work here I would presume.
8
00:00:38,410 --> 00:00:40,920
What if we have an algorithm that is 10 times faster.
9
00:00:40,920 --> 00:00:52,480
Now, 2³¹31²/10 is about 206 billion, 2³²32²/10=440 billions--that is still smaller than this
10
00:00:52,480 --> 00:01:02,700
and 2³³33² is about 93 billion, so slightly smaller still than this one here.
11
00:01:02,700 --> 99:59:59,000
If you increase it to 34, then this number here will become larger than this one over here.
PK #iA?n ; Unit_4.1_-_Pruning_the_Input/07-Hundred_Times_Faster.en.srt1
00:00:00,000 --> 00:00:03,980
We've put lots of effort into our algorithm to make it 10 times faster
2
00:00:03,980 --> 00:00:07,710
and we have gained only very little in terms of input size that we can handle.
3
00:00:07,710 --> 00:00:13,650
Now, what if we were even bolder and had an algorithm that was a 100 times faster.
4
00:00:13,650 --> 00:00:19,010
Imagine a 100 times--imagine your computer booting up a 100 times faster that it does today
5
00:00:19,010 --> 00:00:25,610
or your computer game not running at 30 frames per second but at 3000 frames per second,
6
00:00:25,610 --> 00:00:27,810
so 100 times faster--shoot speedup.
7
00:00:27,810 --> 99:59:59,000
What input size can we handle now?
PK #iA;? D Unit_4.1_-_Pruning_the_Input/08-Hundred_Times_Faster_Solution.en.srt1
00:00:00,000 --> 00:00:02,250
The answer here is almost depressing.
2
00:00:02,250 --> 00:00:08,119
Now we can handle a maximum input size of 36 even though we have a 100 times speedup,
3
00:00:08,119 --> 00:00:10,260
and the argument is basically the same as before.
4
00:00:10,260 --> 00:00:14,720
Again, you compare this number her for n=30 to this number here
5
00:00:14,720 --> 00:00:18,160
and then see when this one gets larger than this one over here.
6
00:00:18,160 --> 00:00:24,050
And that is why it's so important to understand algorithms and improve algorithms
7
00:00:24,050 --> 00:00:30,620
because if you do any optimizations that are a constant factor even if it's a huge constant factor,
8
00:00:30,620 --> 00:00:33,460
it doesn't really increase the performance of your algorithm
9
00:00:33,460 --> 00:00:35,930
if you're dealing with exponential time algorithms.
10
00:00:35,930 --> 99:59:59,000
So you really have to dive down and understand your algorithm and then improve it.
PK #iAzGb
2 Unit_4.1_-_Pruning_the_Input/09-Brute_Force.en.srt1
00:00:00,000 --> 00:00:02,780
Now, before dug deeper into the more sophisticated algorithms,
2
00:00:02,780 --> 00:00:08,039
I would like to briefly tell you a little about a technique known as brute force.
3
00:00:08,039 --> 00:00:10,520
So what does brute force mean? Does it mean that we take a hammer
4
00:00:10,520 --> 00:00:13,220
and beat the answer out of our computer?
5
00:00:13,220 --> 00:00:19,840
Of course not. Brute force is the algorithm that Alice, Bob, and Carol were using initially.
6
00:00:19,840 --> 00:00:22,520
So, you remember, for example, that for Bob,
7
00:00:22,520 --> 00:00:27,070
we were using an algorithm like this or not using but proposing an algorithm like this at least
8
00:00:27,070 --> 00:00:33,020
that simply went through all possible assignments of zero and one to the vertices,
9
00:00:33,020 --> 00:00:36,670
or at that time, we steal common genes and weakness of this algorithm was that
10
00:00:36,670 --> 00:00:39,020
it had a pretty unacceptable running time
11
00:00:39,020 --> 00:00:42,680
as soon as you got to more than 20 or 30 vertices in the network.
12
00:00:42,680 --> 00:00:46,000
But this algorithm actually also has two strength.
13
00:00:46,000 --> 00:00:49,370
One strength from it is that it's very simple, and the other thing is that,
14
00:00:49,370 --> 00:00:53,130
because it's so simple, you can actually also implement it rather fast.
15
00:00:53,130 --> 00:00:55,800
Under certain conditions, using an algorithm like this,
16
00:00:55,800 --> 00:00:57,710
actually might not be that bad an idea
17
00:00:57,710 --> 00:01:00,970
in this method of using all possible assignments.
18
00:01:00,970 --> 00:01:03,310
First of all, we're not confused the hammer here.
19
00:01:03,310 --> 00:01:08,490
This method if going through all possible assignments without any more sophisticated
20
00:01:08,490 --> 00:01:13,330
techniques or sophisticated algorithmic techniques is known as brute force.
21
00:01:13,330 --> 00:01:17,820
And so, as I've explained to you, there are certain conditions for brute force is actually desirable.
22
00:01:17,820 --> 00:01:20,430
So, let's have you figure out what those conditions could be,
23
00:01:20,430 --> 00:01:24,700
and by brute force, of course, we try all possibilities.
24
00:01:24,700 --> 00:01:29,970
Could it be that when you don't have much time to design and program a more sophisticated algorithm.
25
00:01:29,970 --> 00:01:34,440
Could it be when the instances where the inputs for which you're trying to solve your problem are small.
26
00:01:34,440 --> 00:01:40,550
Say n is the size of the problem smaller than 20
27
00:01:40,550 --> 00:01:43,420
or could it be when you need to solve the problem only once
28
00:01:43,420 --> 00:01:45,560
and the input might be a little bit bigger there.
29
00:01:45,560 --> 00:01:47,880
For each one, I'm going to give you two choices
30
00:01:47,880 --> 00:01:51,170
whether under these conditions using brute force could be okay
31
00:01:51,170 --> 00:01:53,790
and the first one is a maybe and second one is a definitive no.
32
00:01:53,790 --> 00:01:56,070
I'm not going to give you a clear yes choice because,
33
00:01:56,070 --> 00:01:59,030
obviously, since we're dealing with NP complete problems here,
34
00:01:59,030 --> 99:59:59,000
brute force is never generally advisable of course.
PK #iAI 2 Unit_4.1_-_Pruning_the_Input/10-Brute_Force.en.srt1
00:00:00,000 --> 00:00:05,620
And, well, it's a bit of an opinion here but I would say that--actually for each three of these,
2
00:00:05,620 --> 00:00:11,190
you could say maybe, so if you don't have much time to design and program an algorithm,
3
00:00:11,190 --> 00:00:14,280
then brute force might be okay under certain conditions.
4
00:00:14,280 --> 00:00:19,140
For example when your instances are small and you have proper time to run the algorithm,
5
00:00:19,140 --> 00:00:23,620
then brute force might actually be a good idea because it's your time that you invest
6
00:00:23,620 --> 00:00:26,490
into programming a more sophisticated algorithm.
7
00:00:26,490 --> 00:00:30,600
If it's your time that counts and not computer time, doing brute force
8
00:00:30,600 --> 00:00:32,369
is sometimes actually a good solution.
9
00:00:32,369 --> 00:00:36,020
Whenever the instances are small, there's two reasons to us brute force.
10
00:00:36,020 --> 00:00:40,520
One is because you can implement it very fast, but the other good reason to us brute force
11
00:00:40,520 --> 00:00:46,720
in these cases here is that because brute force algorithms are so simple they can be programmed
12
00:00:46,720 --> 00:00:51,960
with very little overhead, so what I mean by that is that the constants that are involved
13
00:00:51,960 --> 00:00:57,720
in the O notation are rather small and for small instances, such speedups actually play a role.
14
00:00:57,720 --> 00:01:01,160
For small instances, it could be a good idea to use brute force,
15
00:01:01,160 --> 00:01:04,300
but again even for the small instances, it might not be fast enough.
16
00:01:04,300 --> 00:01:08,850
What you need to solve a problem only once--that might also be a condition where a brute force
17
00:01:08,850 --> 00:01:13,070
could be okay--that might mean that you have a lot of computational time available.
18
00:01:13,070 --> 00:01:19,160
Say you need to have this problem solved within the next 3 months or 6 months,
19
00:01:19,160 --> 00:01:21,960
maybe then brute force is kind of fast enough.
20
00:01:21,960 --> 00:01:24,860
Of course, in this case down here, I think that's the most debatable.
21
00:01:24,860 --> 00:01:26,780
You might want to try some of the more sophisticated
22
00:01:26,780 --> 99:59:59,000
techniques that I'm about to show you.
PK #iA! : Unit_4.1_-_Pruning_the_Input/11-Number_Of_Instances.en.srt1
00:00:00,000 --> 00:00:02,940
But first of all, let's play a little bit more with the numbers.
2
00:00:02,940 --> 00:00:05,450
Let's say I make the following table here.
3
00:00:05,450 --> 00:00:11,860
This is the number of instances you need to solve--say 1, 100, 10,000 and this is
4
00:00:11,860 --> 00:00:18,400
the size of the instances, and this is the size of the instances, so smaller than 10, smaller than 20,
5
00:00:18,400 --> 00:00:21,120
smaller than 30 or larger than 100.
6
00:00:21,120 --> 00:00:24,800
And now let's say you have algorithm that has running time 2^n.
7
00:00:24,800 --> 00:00:31,090
We'll simplify this here, so as the running time is 2^n, no O off, no polynomial involved.
8
00:00:31,090 --> 00:00:33,540
All I want you to get here is a filling full of numbers.
9
00:00:33,540 --> 00:00:37,950
Now, what I would like you to think about is in which case you see a brute force
10
00:00:37,950 --> 00:00:39,580
would be something to try out.
11
00:00:39,580 --> 00:00:44,400
If you think that you should try brute force either you're sure about it
12
00:00:44,400 --> 00:00:47,480
or it's at least something to think about, then I would like you to make a check mark
13
00:00:47,480 --> 00:00:52,220
in these boxes here and otherwise, leave the box blank, and of course, this is very subjective.
14
00:00:52,220 --> 99:59:59,000
If we don't agree, just click next and let's discuss where we don't agree.
PK #iAQ
C Unit_4.1_-_Pruning_the_Input/12-Number_Of_Instances_Solution.en.srt1
00:00:00,000 --> 00:00:06,660
So 2^n, if instances are smaller than 10 that means that the running time will be
2
00:00:06,660 --> 00:00:11,200
around 1000 time units or 1024 time units to be precise.
3
00:00:11,200 --> 00:00:15,370
If it's 20, you will have about 1 million time units.
4
00:00:15,370 --> 00:00:21,670
If it's 30, you will have about 1 billion time units. So 1 thousand times 1 thousand times 1 thousand.
5
00:00:21,670 --> 00:00:25,390
If it's larger than 100, then of course you have a huge running time here.
6
00:00:25,390 --> 00:00:29,230
You will have a running time that is about 10³⁰.
7
00:00:29,230 --> 00:00:36,600
It's clear that when your instances get larger than 100, you would not be able to use brute force.
8
00:00:36,600 --> 00:00:41,830
And on the other hand, if you just need 1000 time units, I think you don't even care if you have
9
00:00:41,830 --> 00:00:47,960
1 instance or 100 or 10,000 because if you think about it a current desktop computer
10
00:00:47,960 --> 00:00:52,630
is running on gigahertz so it can perform several billion operations per second.
11
00:00:52,630 --> 00:00:58,000
Even solving 10,000 instances where each one is small is usually not an issue unless
12
00:00:58,000 --> 00:01:00,170
you're working in a very special environment.
13
00:01:00,170 --> 00:01:04,940
What about a million time units? I would clearly say well solving that once is okay.
14
00:01:04,940 --> 00:01:08,790
Again, your computer can perform several billion operations per second.
15
00:01:08,790 --> 00:01:13,280
Solving 100 instances is usually okay and even solving 10,000 instances.
16
00:01:13,280 --> 00:01:18,870
So 10,000 times 1 million that's 10 billion, which sounds like a lot but actually again
17
00:01:18,870 --> 00:01:21,660
desktop computers can handle this in most cases.
18
00:01:21,660 --> 00:01:26,480
Now what about this case here? Well, 1 billion is okay. We just said that over here.
19
00:01:26,480 --> 00:01:29,830
And we said even 10 billion was okay. So what about 100 billion?
20
00:01:29,830 --> 00:01:34,840
Well, this I think here is kind of debatable but I would still say it's worth a try probably.
21
00:01:34,840 --> 00:01:39,990
Then you have to probably wait for two or three or four days until all instances are solved
22
00:01:39,990 --> 00:01:43,190
or maybe even one month but in most cases that should be okay.
23
00:01:43,190 --> 00:01:48,340
You wouldn't even attempt to solve NP-commplete problem in real time for example.
24
00:01:48,340 --> 00:01:50,360
This one here I think is the most debatable.
25
00:01:50,360 --> 00:01:55,190
So 10,000 billion I would say in a general case you wouldn't want to do that.
26
00:01:55,190 --> 00:01:58,950
I mean, there might be special cases if you work for the secret service
27
00:01:58,950 --> 00:02:02,440
and you have a huge cluster of computers available but here you're kind of
28
00:02:02,440 --> 00:02:07,400
reaching the limits of brute force and here of course as we just said you've already reached that.
29
00:02:07,400 --> 00:02:13,250
So again, brute force can work if the instances are small or maybe even moderate
30
00:02:13,250 --> 00:02:19,660
but this approach is bound to fail at some point and that's when to apply intelligence
31
00:02:19,660 --> 99:59:59,000
instead of brute force using intelligent force.
PK #iA
pI 1 Unit_4.1_-_Pruning_the_Input/13-Loophole_1.en.srt1
00:00:00,000 --> 00:00:03,770
So let's have a look back at the laws of NP completeness,
2
00:00:03,770 --> 00:00:09,200
and the laws of NP completeness, of course, is a book that only holds unless P=NP.
3
00:00:09,200 --> 00:00:14,490
If anybody ever proves P=NP, then we can recycle this book here or delete the e-book.
4
00:00:14,490 --> 00:00:18,880
So what do the laws of NP completeness say--they say, "No polynomial algorithm."
5
00:00:18,880 --> 00:00:24,970
So now it's time to think like lawyers and try to put loopholes into the statement here
6
00:00:24,970 --> 00:00:27,390
and there's two loopholes that you can poke into this.
7
00:00:27,390 --> 00:00:33,320
One is that we only say, "No polynomial time." We do not say, "Exponential time."
8
00:00:33,320 --> 00:00:39,390
What do I mean by that? Well, what about a running time like 2^√n.
9
00:00:39,390 --> 00:00:44,860
It's not exponential time--it's called sub-exponential because you don't have the n as it is
10
00:00:44,860 --> 00:00:48,100
appear in the exponent, but it's also not polynomial time because you have
11
00:00:48,100 --> 00:00:50,060
some function of n up here in the exponent.
12
00:00:50,060 --> 00:00:55,360
A running time like this actually conforms with the law and it has an excellent speedup.
13
00:00:55,360 --> 00:00:59,150
So let's think about this for a minute. Let's say n=50.
14
00:00:59,150 --> 00:01:09,490
If n=50, what is the speedup of 2^√n versus 2^n. Of course, we're going to do this as a quiz.
15
00:01:09,490 --> 00:01:14,140
Is it over a million, is it over a billion, or is it over a trillion.
16
00:01:14,140 --> 99:59:59,000
Of course, I only want you to select one answer here, the best possible one.
PK #iAgv 1 Unit_4.1_-_Pruning_the_Input/14-Loophole_1.en.srt1
00:00:00,000 --> 00:00:03,320
The speedup here is traumatic, just because we write the square root here.
2
00:00:03,320 --> 00:00:15,290
It's over a trillion and the reason is that 2⁵⁰ is about 10¹⁵ whereas 2^√⁵⁰, so the ^√⁵⁰ is about 7,
3
00:00:15,290 --> 00:00:20,390
2^√ is about 134.
4
00:00:20,390 --> 00:00:26,320
You have 100 and 34 versus 10¹⁵, which is a speedup of over a trillion,
5
00:00:26,320 --> 00:00:30,560
and we're still within the laws of NP completeness using this function here.
6
00:00:30,560 --> 99:59:59,000
We can call this loophole 1 in the laws of NP completeness.
PK #iA 1 Unit_4.1_-_Pruning_the_Input/15-Loophole_2.en.srt1
00:00:00,000 --> 00:00:05,190
Now, I'll show you another function that is actually exponential time but also isn't that bad.
2
00:00:05,190 --> 00:00:08,000
What about 1.1^n?
3
00:00:08,000 --> 00:00:13,290
Absolutely not a polynomial time algorithm but still a huge speedup compared to 2^n.
4
00:00:13,290 --> 00:00:20,160
Let's do another quiz here--speedup of 1.1^n versus 2^n, and again,
5
00:00:20,160 --> 00:00:24,740
we're going to say that n=50 then you don't have to recalculate this here.
6
00:00:24,740 --> 00:00:29,970
You already know that it's about 10¹⁵ and I'll give you the same choices here.
7
00:00:29,970 --> 99:59:59,000
Is the speedup bigger than a million, bigger than a billion, or bigger than a trillion.
PK #iAVB : Unit_4.1_-_Pruning_the_Input/16-Loophole_2_Solution.en.srt1
00:00:00,000 --> 00:00:08,530
So 1.1⁵⁰ is about 117 which again is a speedup of over a trillion, so what we now have found
2
00:00:08,530 --> 00:00:15,300
is a second loophole, where the first loophole was sub-exponential time and the second loophole
3
00:00:15,300 --> 00:00:17,920
is small bases for the exponent.
4
00:00:17,920 --> 00:00:24,070
Not having a 2 here but a 1.1 or maybe a 1.2, so now of course the question is--
5
00:00:24,070 --> 00:00:27,610
can we actually exposed those loopholes, so can we find algorithms
6
00:00:27,610 --> 00:00:29,260
that have these running times here.
7
00:00:29,260 --> 00:00:33,380
The answer is--yes, we can, and that is exactly the point of using
8
00:00:33,380 --> 00:00:39,890
intelligent force versus brute force, and I'm actually going to show you how this loophole here works.
9
00:00:39,890 --> 99:59:59,000
And the problem for which we're going to show this is vertex cover.
PK #iAJ5 5 B Unit_4.1_-_Pruning_the_Input/17-Brute_Force_On_Vertex_Cover.en.srt1
00:00:00,000 --> 00:00:04,240
It's been awhile since we last met Alice. It was working on vertex cover.
2
00:00:04,240 --> 00:00:07,310
And actually, the last time we left off, Alice wasn't happy like that.
3
00:00:07,310 --> 00:00:12,410
Actually, Alice was quite unhappy because we proved vertex cover to be NP complete.
4
00:00:12,410 --> 00:00:16,730
The last time we left off, there was little hope for her to ever solve
5
00:00:16,730 --> 00:00:18,790
her vertex cover problem that she was working on.
6
00:00:18,790 --> 00:00:22,630
Now, I promised to show you an intelligent algorithm for vertex cover.
7
00:00:22,630 --> 00:00:24,870
At least one that's more intelligent than brute force.
8
00:00:24,870 --> 00:00:28,480
But in order to do that, let's first start out with a brute force algorithm.
9
00:00:28,480 --> 00:00:33,570
Let's say Alice is running her algorithm on a very simple graph like this.
10
00:00:33,570 --> 00:00:35,670
Just four vertices and five edges.
11
00:00:35,670 --> 00:00:39,480
What her simple algorithm or brute force algorithm was doing was it was considering
12
00:00:39,480 --> 00:00:43,320
all possible assignments of 0 and 1 to the vertices.
13
00:00:43,320 --> 00:00:48,830
For four vertices, we had a total of 16 possibilities that her algorithm needed to look at.
14
00:00:48,830 --> 00:00:51,970
Her algorithm starts out, for example, by saying,
15
00:00:51,970 --> 00:00:55,560
"Oh, let's put no vertices into the vertex cover."
16
00:00:55,560 --> 00:01:00,310
And of course this leaves all edges uncovered so it's not even a valid solution.
17
00:01:00,310 --> 00:01:04,959
Then, it would maybe start out by saying, "Oh, let's put one vertex into the vertex cover."
18
00:01:04,959 --> 00:01:09,650
There's actually four different ways of doing this and at least this time we have some edges covered.
19
00:01:09,650 --> 00:01:13,780
None of these solutions is the valid solution because we still have uncovered edges
20
00:01:13,780 --> 00:01:16,450
and I've drawn a red here so you can better see them.
21
00:01:16,450 --> 00:01:18,540
No valid solution for any of those.
22
00:01:18,540 --> 00:01:24,030
The algorithm continues and says, "Oh then, let's put two vertices into the vertex cover."
23
00:01:24,030 --> 00:01:28,850
Finally, the algorithm has found a valid vertex cover for our small network
24
00:01:28,850 --> 00:01:32,920
but of course it has also found many, many vertex covers that are actually invalid
25
00:01:32,920 --> 00:01:37,270
because not all edges are covered so again redundant, redundant, redundant.
26
00:01:37,270 --> 00:01:41,890
And all the algorithm proceeds and although it has already found a solution where
27
00:01:41,890 --> 00:01:46,890
two vertices are enough for the vertex cover, it now puts three vertices into the vertex cover
28
00:01:46,890 --> 00:01:51,940
which has the advantage that it will always find a valid vertex cover but the disadvantage that
29
00:01:51,940 --> 00:01:55,500
it's totally redundant and unnecessary work.
30
00:01:55,500 --> 00:02:00,400
Same thing if you put all four into the vertex cover. Valid solutions but not the best possible ones.
31
00:02:00,400 --> 00:02:05,810
This is what your brute force algorithm is doing and you can see why brute force is so stupid
32
00:02:05,810 --> 00:02:10,949
because first of all it considers many solutions where you could have immediately said,
33
00:02:10,949 --> 00:02:15,940
"Look, this is not valid." You've taken an edge and none of the two end points is in the vertex cover.
34
00:02:15,940 --> 00:02:18,070
Why even consider all the other possibilities?
35
00:02:18,070 --> 00:02:22,150
As soon as you have an edge that is not covered, you can basically scrap that solution
36
00:02:22,150 --> 00:02:27,150
and go looking for another one and the other thing that her algorithm did that wasn't very smart was
37
00:02:27,150 --> 00:02:31,190
once it had found a solution with two vertices it still tried all the solutions
38
00:02:31,190 --> 00:02:33,360
with three vertices and four vertices.
39
00:02:33,360 --> 00:02:37,190
It could've stopped right here because it knew that it would not be able to find
40
00:02:37,190 --> 00:02:39,120
any vertex cover that is smaller.
41
00:02:39,120 --> 00:02:41,880
And that is the difference between brute force and intelligent force.
42
00:02:41,880 --> 00:02:46,670
In brute force, we go through all of the possible assignments of the vertices into the vertex cover
43
00:02:46,670 --> 00:02:52,260
out of the vertex cover whereas what we're now going to do is we're going to avoid stupid choices.
44
00:02:52,260 --> 99:59:59,000
And that is the basis for intelligent force.
PK #iA"hA8 8 2 Unit_4.1_-_Pruning_the_Input/18-Search_Tree.en.srt1
00:00:00,000 --> 00:00:04,990
The way to avoid stupid choices in the algorithm and actually we can make Alice
2
00:00:04,990 --> 00:00:08,700
a little bit more happy because we're soon going to show her that we're not going to make her
3
00:00:08,700 --> 00:00:11,770
totally happy but at least neutral because we're now going to show her
4
00:00:11,770 --> 00:00:13,400
a better technique than brute force.
5
00:00:13,400 --> 00:00:18,200
And that better technique is known as a search tree and you would soon see why it's called a tree.
6
00:00:18,200 --> 00:00:22,860
If you were solving vertex cover by yourself, you would probably take the following approach.
7
00:00:22,860 --> 00:00:27,270
You would not see for all vertices together if they are in the vertex cover or not
8
00:00:27,270 --> 00:00:32,439
but rather you would look at a single vertex, say this one here, and then split it into two possibilities.
9
00:00:32,439 --> 00:00:36,790
One, you're going to say, "Yes, let's make this vertex part of the vertex cover."
10
00:00:36,790 --> 00:00:41,010
And the other one, you would say, "No, I do not want this vertex to be part of the vertex cover."
11
00:00:41,010 --> 00:00:46,660
And then, you're going to split down further so on this side here and on this side here.
12
00:00:46,660 --> 00:00:48,870
Let's look at this vertex here next.
13
00:00:48,870 --> 00:00:52,260
We're just going to draw it a little bit smaller so that we don't run out of space.
14
00:00:52,260 --> 00:00:57,060
We already know that this vertex here is in the vertex cover
15
00:00:57,060 --> 00:00:59,630
and we're now deciding for this vertex here on the left.
16
00:00:59,630 --> 00:01:04,110
So we'll have one case where we do put it into the vertex cover and another one where
17
00:01:04,110 --> 00:01:06,130
we don't put it into the vertex cover.
18
00:01:06,130 --> 00:01:09,780
And then for each of these cases again we can split into two possibilities.
19
00:01:09,780 --> 00:01:13,380
And this time we're going to look at the vertex down here. Same here.
20
00:01:13,380 --> 00:01:19,050
One case where we put it into the vertex cover and another case where we don't.
21
00:01:19,050 --> 00:01:25,180
Now here for this case already, we can stop. We have found a solution of size 3.
22
00:01:25,180 --> 00:01:30,450
All edges are covered. We do not even need to consider the possibilities for this vertex here.
23
00:01:30,450 --> 00:01:34,680
Over here, we still have to consider two possibilities so we still need to make a decision
24
00:01:34,680 --> 00:01:39,770
here on the right side so we can either put it into the vertex cover or not put it into the vertex cover.
25
00:01:39,770 --> 00:01:43,570
And so this solution here or it's not solution because it's not valid.
26
00:01:43,570 --> 00:01:47,610
And here again we have found a solution of size 3 or a vertex cover of size 3.
27
00:01:47,610 --> 00:01:51,180
So I'm just going to leave this check mark here and have this account for both.
28
00:01:51,180 --> 00:01:56,320
So now we still have to continue here for these four points where we left off earlier
29
00:01:56,320 --> 00:02:01,020
and again we're going to do the same thing so we're going to consider one possibility where
30
00:02:01,020 --> 00:02:04,470
this vertex down here is not in the vertex cover and again we see.
31
00:02:04,470 --> 00:02:07,650
Oops! This is invalid so we do not need to continue any further.
32
00:02:07,650 --> 00:02:10,199
The next time we're going to put this vertex into the vertex cover.
33
00:02:10,199 --> 00:02:12,990
And yes, we have a valid solution of size 2.
34
00:02:12,990 --> 00:02:16,590
Continuing over here. We're going to consider this vertex here again.
35
00:02:16,590 --> 00:02:23,210
So one possibility would be to not put it into the vertex cover but that is already invalid.
36
00:02:23,210 --> 00:02:28,220
And another one would be to put it into the vertex cover, which again leads us to further
37
00:02:28,220 --> 00:02:31,970
possibilities to check and we're going to check them for this one down here.
38
00:02:31,970 --> 00:02:37,050
So same game. Invalid solution over here. Still valid over here.
39
00:02:37,050 --> 00:02:39,450
And we need to consider two more cases.
40
00:02:39,450 --> 00:02:42,460
And you already know that this case here is going to be invalid.
41
00:02:42,460 --> 00:02:46,760
And over here we're finding a solution of size 3. So that's all that.
42
00:02:46,760 --> 00:02:52,960
Originally, we had with brute force 16 assignments of 0 and 1 to the vertices that we needed to check
43
00:02:52,960 --> 00:02:56,990
and now this algorithm here only considered 9 assignments.
44
00:02:56,990 --> 00:03:00,640
So I told you at the beginning of this course that constant factor speedups
45
00:03:00,640 --> 00:03:04,580
probably don't matter that much when we're dealing with exponential time algorithms.
46
00:03:04,580 --> 00:03:07,990
So we will have to do some further analysis on this one.
47
00:03:07,990 --> 00:03:12,950
And you also noticed that there could be certain other techniques that we can use for speedups.
48
00:03:12,950 --> 00:03:17,490
So for example here, we had already found a solution of size 2.
49
00:03:17,490 --> 00:03:20,760
So we probably could've even stopped our algorithm a little bit earlier
50
00:03:20,760 --> 00:03:22,550
over here but we're going to get into this.
51
00:03:22,550 --> 00:03:26,460
The main thing that I would like you to understand is the technique of the search tree
52
00:03:26,460 --> 00:03:31,800
and we're now going to analyze this search tree further and try to improve it.
53
00:03:31,800 --> 00:03:34,980
And of course, I should also tell you why it is called a search tree.
54
00:03:34,980 --> 00:03:39,370
And to show you this, draw lines along where the algorithm was searching
55
00:03:39,370 --> 00:03:44,790
and then we're going to rotate this picture by 180 degrees and now here you can see
56
00:03:44,790 --> 00:03:50,550
a beautiful tree with the solution as the leaves or the fruit or whatever you will.
57
00:03:50,550 --> 00:03:53,180
But this is why it's called a search tree.
58
00:03:53,180 --> 00:03:55,760
It's basically a tree that hangs upside down.
59
00:03:55,760 --> 00:03:58,540
I don't know what the terminology tells you about computer scientists
60
00:03:58,540 --> 99:59:59,000
but that's not my concern here.
PK #iAϥN N @ Unit_4.1_-_Pruning_the_Input/19-Improving_The_Search_Tree.en.srt1
00:00:00,000 --> 00:00:05,320
Using the search tree for practical purposes, we may have already gained some efficiency.
2
00:00:05,320 --> 00:00:09,720
But now the question is, does it really affect the running time of the algorithm when we use
3
00:00:09,720 --> 00:00:11,960
O notation and worst-case running time?
4
00:00:11,960 --> 00:00:16,480
Or is it just something that gives us some constant factor saving some practice?
5
00:00:16,480 --> 00:00:20,810
It's hard to tell right now because this tree as we've considered it is highly dependent
6
00:00:20,810 --> 00:00:25,910
on the structure of the graph that we're analyzing and we need to think a little bit more about this.
7
00:00:25,910 --> 00:00:28,460
One interesting thing to consider is the following:
8
00:00:28,460 --> 00:00:33,430
There were only two cases when we stopped further exploration in this tree here.
9
00:00:33,430 --> 00:00:36,570
One case was that we had found a valid vertex cover.
10
00:00:36,570 --> 00:00:41,720
The other one was when we considered an edge that could not be covered any more.
11
00:00:41,720 --> 00:00:45,810
Actually doing it this way is not the smartest possible way because for vertex cover
12
00:00:45,810 --> 00:00:47,100
we already know one thing.
13
00:00:47,100 --> 00:00:51,620
If you look at an edge such as this one here, then this edge here has two endpoints
14
00:00:51,620 --> 00:00:56,940
and now we want to assign these endpoints to be in the vertex cover or to not be in the vertex cover.
15
00:00:56,940 --> 00:00:59,390
So far we have looked at the vertices individually,
16
00:00:59,390 --> 00:01:06,440
but we could also look at both vertices at the same time so not go into two different possibilities
17
00:01:06,440 --> 00:01:09,040
but actually go in three different possibilities.
18
00:01:09,040 --> 00:01:13,350
There's three cases that makes sense here of assigning the vertices
19
00:01:13,350 --> 00:01:15,090
to be on the vertex cover or not.
20
00:01:15,090 --> 00:01:19,780
So we know this edge needs to be covered somehow and there's actually just
21
00:01:19,780 --> 00:01:21,740
three different possibilities of doing that.
22
00:01:21,740 --> 00:01:25,390
One is you take this endpoint here into the vertex cover.
23
00:01:25,390 --> 00:01:29,070
The other one is you take this vertex here into the vertex cover.
24
00:01:29,070 --> 00:01:31,600
Or of course you can also take both.
25
00:01:31,600 --> 00:01:36,390
But you can ignore the case where you would put none of the two endpoints into the vertex cover
26
00:01:36,390 --> 00:01:39,050
because then you already know that your solution doesn't make sense.
27
00:01:39,050 --> 00:01:45,280
This of course also covers other edges as well and now our algorithm can actually very quickly
28
00:01:45,280 --> 00:01:49,260
come to a solution because when we look at this edge here, there's only one possible choice
29
00:01:49,260 --> 00:01:52,730
that remains and that is taking this vertex here into the vertex cover.
30
00:01:52,730 --> 00:01:54,990
All edges are covered. We have a solution of size 2.
31
00:01:54,990 --> 00:02:01,330
So now let's do a worst-case analysis and say that this is the edge that the algorithm considers next.
32
00:02:01,330 --> 00:02:06,700
In all of the three cases, the edge is uncovered so it again goes into three possibilities here,
33
00:02:06,700 --> 00:02:09,960
three possibilities here, and three possibilities here.
34
00:02:09,960 --> 00:02:13,700
Of course if we had been lucky and had chosen this edge here,
35
00:02:13,700 --> 00:02:18,040
the algorithm would have had a much easier choice and actually we're later going to look at
36
00:02:18,040 --> 00:02:23,300
such optimizations for the algorithm and again we're going to branch into the three cases that
37
00:02:23,300 --> 00:02:27,960
makes sense for that edge so this one here would make sense for that edge
38
00:02:27,960 --> 00:02:30,040
but of course it leads to an invalid solution.
39
00:02:30,040 --> 00:02:32,730
This one here is actually the best possible solution.
40
00:02:32,730 --> 00:02:35,960
And this one here is also a solution but it's a larger one,
41
00:02:35,960 --> 00:02:37,760
and we're going to do the same thing over here.
42
00:02:37,760 --> 00:02:42,050
Now, what you might be thinking is, "Oh no! Now we're trying nine assignments."
43
00:02:42,050 --> 99:59:59,000
But the good thing is that, we can now do a worst-case analysis of this.
PK #iAc? ? M Unit_4.1_-_Pruning_the_Input/20-Properties_Of_Search_Tree_Construction.en.srt1
00:00:00,000 --> 00:00:04,370
I would like you to think about a few properties of constructing the search tree this way,
2
00:00:04,370 --> 00:00:07,460
so taking an edge that is not yet covered--
3
00:00:07,460 --> 00:00:10,540
so where both endpoints are not part of the vertex cover--
4
00:00:10,540 --> 00:00:15,250
and then branching into 3 different possibilities, putting 1 endpoint into the vertex cover,
5
00:00:15,250 --> 00:00:18,030
putting the other point into the vertex cover, or both.
6
00:00:18,030 --> 00:00:21,650
So will the search tree always find the smallest vertex cover?
7
00:00:21,650 --> 00:00:26,090
Of course it can also find larger ones, but will it always find the smallest possible one
8
00:00:26,090 --> 00:00:27,740
for any given graph?
9
00:00:27,740 --> 00:00:30,330
Or could there be special cases where the search tree is wrong
10
00:00:30,330 --> 00:00:32,380
and we still need to fix that?
11
00:00:32,380 --> 00:00:39,260
Is it that the search tree at each level--and by level I basically mean if this is the tree here,
12
00:00:39,260 --> 00:00:43,840
then this would be a level and this would be the next level--
13
00:00:43,840 --> 00:00:48,300
is it that at each level the algorithm determines the assignment?
14
00:00:48,300 --> 00:00:53,690
So is a vertex in the vertex cover or not for at least 1 vertex?
15
00:00:53,690 --> 00:00:57,990
And this will require some thought, so this is a challenging one down here.
16
00:00:57,990 --> 00:01:02,720
Can we construct the search tree so that at each level of the search tree
17
00:01:02,720 --> 00:01:07,350
we determine the assignment of at least 2 vertices?
18
00:01:07,350 --> 00:01:09,270
And of course you could just guess here,
19
00:01:09,270 --> 00:01:13,260
but I would really like you to spend some time and think about the answers
20
00:01:13,260 --> 00:01:16,910
because this will be very important for you to really understand
21
00:01:16,910 --> 99:59:59,000
how you can build better search trees.
PK #iAT T V Unit_4.1_-_Pruning_the_Input/21-Properties_Of_Search_Tree_Construction_Solution.en.srt1
00:00:00,000 --> 00:00:02,960
The algorithm is correct, or the search tree is correct,
2
00:00:02,960 --> 00:00:06,190
because it still tries all possibilities;
3
00:00:06,190 --> 00:00:10,360
it's just smarter about avoiding stupid possibilities.
4
00:00:10,360 --> 00:00:15,120
And what I mean by this is basically that it deliberately avoids this case here
5
00:00:15,120 --> 00:00:18,240
where you have 2 endpoints of an edge not being in the vertex cover,
6
00:00:18,240 --> 00:00:19,690
so it tries to avoid those.
7
00:00:19,690 --> 00:00:23,330
Are there special cases where the algorithm could be wrong? No.
8
00:00:23,330 --> 00:00:25,710
It still searches through all possibilities.
9
00:00:25,710 --> 00:00:30,070
Again, it just does so in an intelligent way to avoid making any stupid choices.
10
00:00:30,070 --> 00:00:35,270
Does the search tree at each level determine the assignment of at least 1 vertex? Yes.
11
00:00:35,270 --> 00:00:38,740
That one is definitely true because if all edges would be covered,
12
00:00:38,740 --> 00:00:42,400
then the algorithm would be done; the search tree would not go much further.
13
00:00:42,400 --> 00:00:47,200
The usual case would be that we actually do an assignment for 2 vertices,
14
00:00:47,200 --> 00:00:50,330
and this will also tell you something about the answer down here.
15
00:00:50,330 --> 00:00:52,760
So usually when we have an edge that is not covered,
16
00:00:52,760 --> 00:00:56,970
we will have the case that we actually determine the assignment for 2 vertices.
17
00:00:56,970 --> 00:01:00,550
There is 1 special case, and if you have found that special case yourself,
18
00:01:00,550 --> 00:01:02,680
then you have truly understood search trees.
19
00:01:02,680 --> 00:01:06,970
But if you haven't, then just listen closely and don't worry too much about it.
20
00:01:06,970 --> 00:01:10,730
There might be the following case, where you have an edge that is not yet covered
21
00:01:10,730 --> 00:01:13,430
but you've already made the decision for 1 of the endpoints.
22
00:01:13,430 --> 00:01:15,750
The question is now what to do with this one here.
23
00:01:15,750 --> 00:01:18,400
But this is actually a case that is rather easy
24
00:01:18,400 --> 00:01:21,500
because here we don't really have to make a choice anymore
25
00:01:21,500 --> 00:01:23,670
because we know this edge must be covered.
26
00:01:23,670 --> 00:01:28,860
So whenever we decide for 1 vertex that it is not going to be part of the vertex cover,
27
00:01:28,860 --> 00:01:32,330
then we can put all of its neighbors into the vertex cover automatically.
28
00:01:32,330 --> 00:01:34,160
We do not need to make a choice here anymore,
29
00:01:34,160 --> 00:01:37,940
and this is why we can construct the search tree so that at each level
30
00:01:37,940 --> 00:01:40,780
we determine the assignment of at least 2 vertices.
31
00:01:40,780 --> 00:01:44,240
Just to make this very clear, let me show you this again here in this example.
32
00:01:44,240 --> 00:01:46,050
Let's go back 1 level.
33
00:01:46,050 --> 00:01:49,940
When we made the choice here to not put this vertex into the vertex cover,
34
00:01:49,940 --> 00:01:55,100
we could immediately also have said that this vertex here must be part of the vertex cover
35
00:01:55,100 --> 00:01:59,290
because otherwise this edge here cannot be covered and we would already have been done.
36
00:01:59,290 --> 00:02:01,180
And the same thing up here.
37
00:02:01,180 --> 00:02:04,810
So if we decide that this vertex is not part of the vertex cover,
38
00:02:04,810 --> 00:02:08,130
then we immediately know that this vertex must come into the vertex cover
39
00:02:08,130 --> 00:02:09,840
because otherwise this edge here would not be covered,
40
00:02:09,840 --> 00:02:13,670
and we also know because of this edge here that we have to put this vertex here
41
00:02:13,670 --> 00:02:18,200
into the vertex cover, so we are also done, having found a slightly larger solution.
42
00:02:18,200 --> 00:02:21,810
The only thing where we still need to make a decision is this part down here.
43
00:02:21,810 --> 00:02:27,030
There we can still go into 3 cases, either putting that one into the vertex cover, that one, or both.
44
00:02:27,030 --> 00:02:29,700
And this is of course something that an algorithm can do as well.
45
00:02:29,700 --> 00:02:32,000
So actually, we get down to 5 assignments here,
46
00:02:32,000 --> 00:02:36,530
which already makes this search tree better than the one we started out with.
47
00:02:36,530 --> 00:02:38,800
But now comes the really cool part
48
00:02:38,800 --> 00:02:42,320
because once we know that we can construct the search tree
49
00:02:42,320 --> 00:02:47,650
so that at each level we determine the assignment of at least 2 vertices,
50
00:02:47,650 --> 00:02:52,340
we can do a pretty, pretty cool worst case analysis of that search tree.
51
00:02:52,340 --> 99:59:59,000
So let me show you.
PK #iA- 8 Unit_4.1_-_Pruning_the_Input/22-Smart_Search_Tree.en.srt1
00:00:00,000 --> 00:00:03,740
We started out with a brute force algorithm,
2
00:00:03,740 --> 00:00:10,640
and the brute force algorithm basically considered 2 to the power of N assignments.
3
00:00:10,640 --> 00:00:14,510
Now let's look at our smart search tree or intelligent search tree
4
00:00:14,510 --> 00:00:16,379
and see how that is structured.
5
00:00:16,379 --> 00:00:19,320
We're going to do a worst case analysis here.
6
00:00:19,320 --> 00:00:22,360
We start out with a graph where we don't have any assignments,
7
00:00:22,360 --> 00:00:26,760
and then we branch into 3 different possibilities.
8
00:00:26,760 --> 00:00:30,660
Up here we don't know anything, so we have 0 vertices assigned.
9
00:00:30,660 --> 00:00:32,340
Now, since we know that we can construct it
10
00:00:32,340 --> 00:00:36,150
so that in the next level we will have at least 2 more vertices assigned,
11
00:00:36,150 --> 00:00:39,240
here we know that we have at least 2 vertices assigned.
12
00:00:39,240 --> 00:00:43,970
We can go 1 level deeper and again we will branch into 3 possibilities all of the time.
13
00:00:43,970 --> 00:00:47,110
Because we're doing a worst case analysis, of course we could already be done.
14
00:00:47,110 --> 00:00:48,860
Or let's say we just continue.
15
00:00:48,860 --> 00:00:52,480
In the next level we have again at least 2 more,
16
00:00:52,480 --> 00:00:56,150
so we have at least 4 vertices assigned and so on.
17
00:00:56,150 --> 00:00:59,250
So let's say we continue the search tree in this manner,
18
00:00:59,250 --> 00:01:01,290
so we will always get wider and wider and wider.
19
00:01:01,290 --> 00:01:03,010
Then we know 2 things.
20
00:01:03,010 --> 00:01:08,220
One is that the number of levels that we have can be, at most, N/2
21
00:01:08,220 --> 00:01:11,190
because every time we assign at least 2 vertices.
22
00:01:11,190 --> 00:01:14,940
So N and a half levels is the maximum number of levels that we can have
23
00:01:14,940 --> 00:01:17,700
because after that, all vertices have been assigned.
24
00:01:17,700 --> 00:01:23,030
And another thing that we know is that while we start out with 1 possibility
25
00:01:23,030 --> 00:01:27,760
and each time we go 1 level deeper, again doing a worse case analysis here,
26
00:01:27,760 --> 00:01:31,120
the number of possibilities that we consider triples.
27
00:01:31,120 --> 00:01:34,220
The tree basically gets wider by a factor of 3.
28
00:01:34,220 --> 00:01:36,460
The width is times 3 at each level.
29
00:01:36,460 --> 00:01:41,610
My question for you is if you look at the lowest level of the search tree,
30
00:01:41,610 --> 00:01:46,930
each of these ones down here is an assignment of 0 and 1 to the vertices.
31
00:01:46,930 --> 00:01:51,070
What I would like you to think about is how many different assignments
32
00:01:51,070 --> 00:01:54,500
do we have at this level down here, level N/2?
33
00:01:54,500 --> 00:01:59,050
You can assume for simplicity that N is an even number, so N/2 will be some integer.
34
00:01:59,050 --> 00:02:00,650
I'll give you a number of choices.
35
00:02:00,650 --> 00:02:04,430
Is it 2 to the power of N assignments that we have down here?
36
00:02:04,430 --> 00:02:09,330
Is it 3 to the power of N/2 assignments that we have down here?
37
00:02:09,330 --> 00:02:14,540
Is it 3 to the power of N? Is it 2 to the power of 3 times N?
38
00:02:14,540 --> 00:02:19,010
Or is it 2 to the power of N/2 times 3?
39
00:02:19,010 --> 00:02:21,000
You might have to think about this for a little bit,
40
00:02:21,000 --> 00:02:23,070
but just keep in mind the facts.
41
00:02:23,070 --> 00:02:27,220
The number of assignments that we are considering triples at each level,
42
00:02:27,220 --> 99:59:59,000
and we have N and a half levels.
PK #iAPq q A Unit_4.1_-_Pruning_the_Input/23-Smart_Search_Tree_Solution.en.srt1
00:00:00,000 --> 00:00:04,520
The correct answer here is 3 to the power of N/2
2
00:00:04,520 --> 00:00:08,600
because if you look at this tree, it starts out here at this level, at the first level,
3
00:00:08,600 --> 00:00:11,760
and it has 0 assignments that it's considering.
4
00:00:11,760 --> 00:00:17,910
At level 1 it's considering 3 possible assignments, so it triples at each time.
5
00:00:17,910 --> 00:00:20,380
At level 2 it's considering 9.
6
00:00:20,380 --> 00:00:24,060
At level 3 it will be considering 27 and so on and so on and so on.
7
00:00:24,060 --> 00:00:29,590
For level X it's considering 3 to the power of X different possibilities.
8
00:00:29,590 --> 00:00:33,960
The maximum level is N/2, so it's 3 to the power of N/2.
9
00:00:33,960 --> 99:59:59,000
So the correct solution here is 3 to the power of N/2.
PK #iA8/ҝ 4 Unit_4.1_-_Pruning_the_Input/24-2^N_Or_3^(N2).en.srt1
00:00:00,000 --> 00:00:02,060
So, wait.
2
00:00:02,060 --> 00:00:06,200
Brute force, 2 to the power of N, and now we have a 3 in the base of the exponent?
3
00:00:06,200 --> 00:00:08,700
Well, we probably need to do a little calculation here.
4
00:00:08,700 --> 00:00:12,950
So what I would like you to tell me is what 3 to the power of N/2
5
00:00:12,950 --> 00:00:15,140
can actually also be expressed as.
6
00:00:15,140 --> 00:00:21,430
So 3 to the power of N/2 is a certain number to the power of N,
7
00:00:21,430 --> 99:59:59,000
and I would like you to enter this number with 3 digits after the decimal and please round it up.
PK #iA8Km 4 Unit_4.1_-_Pruning_the_Input/25-2^N_Or_3^(N2).en.srt1
00:00:00,000 --> 00:00:02,600
The answer here is 3 to the power of N/2
2
00:00:02,600 --> 00:00:07,160
is the same as 3 to the power of ½ to the power of N,
3
00:00:07,160 --> 00:00:11,290
which is the same as the root of 3 to the power of N,
4
00:00:11,290 --> 00:00:17,070
which, if you use a calculator, is 1.733 to the power of N.
5
00:00:17,070 --> 00:00:20,540
Instead of 2 to the power of N assignments,
6
00:00:20,540 --> 00:00:23,830
we now only have to check 1.733 to the power of N.
7
00:00:23,830 --> 00:00:26,700
We have just squashed the exponent of our algorithm
8
00:00:26,700 --> 99:59:59,000
using a smart search tree instead of brute force.
PK #iAP
/ Unit_4.1_-_Pruning_the_Input/26-Speed_Up.en.srt1
00:00:00,000 --> 00:00:01,910
You might not be too impressed here.
2
00:00:01,910 --> 00:00:07,530
So 2 to the power of N, 1.733 to the power of N, hmm, doesn't sound too good.
3
00:00:07,530 --> 00:00:09,610
But there's 2 things to keep in mind here.
4
00:00:09,610 --> 00:00:12,010
First of all, this is a worst case analysis,
5
00:00:12,010 --> 00:00:14,760
so in practice we might be doing actually better.
6
00:00:14,760 --> 00:00:16,800
And the second thing is even in a worst case analysis,
7
00:00:16,800 --> 00:00:19,370
because we are dealing with exponential time,
8
00:00:19,370 --> 00:00:21,890
this improvement here is actually huge.
9
00:00:21,890 --> 00:00:25,290
So let's say that we're looking at an input of size 50.
10
00:00:25,290 --> 00:00:29,580
Then 2 to the power of N assignments would be 10 to the power of 15,
11
00:00:29,580 --> 00:00:33,450
about 1.13 times 10 to the power of 15.
12
00:00:33,450 --> 00:00:35,930
So this is almost infeasible.
13
00:00:35,930 --> 00:00:39,890
Maybe with a very, very, very, very, very powerful computer,
14
00:00:39,890 --> 00:00:41,430
but you probably wouldn't want to do this
15
00:00:41,430 --> 00:00:45,330
and probably also wouldn't want to do this for more than a single graph.
16
00:00:45,330 --> 00:00:50,700
If, however, you have the smarter search tree, then you have 1.733 to the power of N,
17
00:00:50,700 --> 00:00:57,440
which is 1.733 to the power of 50, and that is about 8.71 times 10 to 11.
18
00:00:57,440 --> 00:01:01,750
And the difference here is a speed-up of over 1000.
19
00:01:01,750 --> 00:01:04,790
That speed-up will become larger as N grows larger.
20
00:01:04,790 --> 00:01:08,780
And the other thing is again that this down here is only a worst case analysis.
21
00:01:08,780 --> 00:01:12,400
In many practical cases, the search tree will be much smaller
22
00:01:12,400 --> 00:01:18,440
because first of all, as we saw, it's often the case that you assign more than just 2 vertices,
23
00:01:18,440 --> 00:01:22,110
and in that case the tree doesn't become as high.
24
00:01:22,110 --> 00:01:24,810
And also you can apply various other optimizations,
25
00:01:24,810 --> 00:01:28,600
some of which we're soon going to talk about, to make the search tree even smaller,
26
00:01:28,600 --> 00:01:31,350
both in practical cases and in the worst case.
27
00:01:31,350 --> 00:01:33,530
Now, should Alice be happy about this?
28
00:01:33,530 --> 00:01:37,020
Probably not yet because already for N equal to 50,
29
00:01:37,020 --> 00:01:40,540
the running time of the algorithm here gets rather large.
30
00:01:40,540 --> 00:01:44,350
So if she is solving the problem for a telecommunications network
31
00:01:44,350 --> 00:01:48,850
with, say, 500 vertices, it will still be out of the question to find an optimal solution
32
00:01:48,850 --> 00:01:50,180
using the search tree.
33
00:01:50,180 --> 00:01:52,480
But of course there's a little hope already
34
00:01:52,480 --> 00:01:55,930
because we haven't really fully explored search trees yet.
35
00:01:55,930 --> 00:01:58,230
Maybe it's going to get even better for her.
36
00:01:58,230 --> 00:02:02,060
But before we continue helping Alice, let's have a look at Bob and Carol
37
00:02:02,060 --> 99:59:59,000
and see if can apply smart search trees also for their problems: independent set and clique.
PK #iAO& <