[NOISE] We've got two good seats,
no three good seats up here in the front for
those people standing in the back.
You might as well come now.
You'll get tired.
So I wanna explain something about it first of all.
I'm gonna play three roles today.
The first role is Emeritus Professor Don Knuth from the 21st century.
And that role I'm gonna be standing right here.
Then, I'm gonna briefly play the role of George Forsythe,
who was chair of the Computer Science Department in 1969.
And when I'm George Forsythe, I'm gonna be standing right here, okay?
George is gonna introduce tonight's speaker.
A young 31 year old guy named Don Kanuth and
that young guy is going to be when I stand here, that's who I am.
Okay, so I hope you can understand it.
All right now.
[COUGH] So this whole thing
started with the Stanford has a project going on for some years called Class X.
And the people, the idea is to make
interesting video stuff to put on YouTube for the future.
And so that things that are going on these days at Stanford
can be appreciated by people in future generations.
So they asked me, would I
be able to tape some kind of a presentation for the future?
And I said, well I think it would be fun to do something different and
that would be to try to repeat the first lecture that I gave
after I'd become a professor at Stanford.
This was, I had just become my title changed from
mathematician to computer scientist.
And at that time, as you can see, I wanted to explain what I
expected I might be doing the rest of my life to the people around.
So, that's the beginning of this talk and so
it was November 4th, 1969 that was a Tuesday and
in order to be a little bit authentic I went over to the gap and
I found out they are selling 1969 blue jeans.
>> [LAUGH] >> So I bought myself.
>> [LAUGH] >> A blue jean.
Okay.
Yeah.
And also, believe it or not I'm wearing a wig.
>> [LAUGH] >> And
the reason is just to help me
remember that when I do give the lecture that I speak as much as I can,
as I would have spoken in the 60's.
Now if you want to see what I really looked like
in the lobby of Gates building the first floor there's a really nice historical
display that shows that pictures that were taken of John McCarthy,
Ed McClowsky, Ed Flaginbaum so on.
A lot of Bob Floyd and also you can see some of the students who are writing
and trying tear down the computer center and so on, but then you.
This was the 60's but
actually most of the action of the 60's took place in the 70's, early 70's.
Anyways, so 1969.
And I also put online
some home movies that my wife and I took in the 60's in case
you're really interested in seeing what our cat looked like in 1960.
>> [LAUGH] >> So now, in order to prepare for this I,
by the way I came to Stanford because
it was the worlds greatest computer class department by far, by any measure.
And I mentioned George Forsythe, the chair, well he had worked
very hard all through the sixties to open up this department.
This department actually opened up in 1965 and this was
the place where I could come and be, have all the great
professors in one place instead of having to be isolated somewhere else.
So now, in order to get ready for today,
I went to Stanford archives, where George's appointment calendar was, and
I could look at faculty, minutes of faculty meetings,
and so on, and figure out what's going on.
I want to mention by the way also that,
if anybody deserves the name father of computer science, it's George Forsythe.
So this is also, everybody loved him, and he was wonder.
Everything that's good about Stanford computer science now is
pretty much due to him.
So then I also look at the Stanford daily for November 4th,
and a few of the other days.
And so what was the big news on November 4th?
That morning, it said essentially, the previous night,
Monday night, the campus was quiet for half an hour while all the students were
listening to President Nixon giving a special address about Vietnam.
And evidently what Nixon said was that he had a secret plan which was for
winning the war in Vietnam.
And this was not too well received on campus because vast
majority of the students wanted that to be over with and
that professors Sandy Dorenbush and Martin Kor were
leading a movement called the Vietnam Movement.
So there's lot of stuff going on, but
that still hadn't heated up very much at this time.
We were still going to classes and thinking about so
that's kind of sets the scene from then and
I guess then my next thing is then to be George and
introduce tonight's speaker.
Okay people in the back, come up in front, have a seat.
Welcome to our weekly colloquium.
I'm happy to announce that we'll be able to meet
again in Polya Hall starting at the end of this month.
Now tonight, this afternoon speaker is one of our own.
I'm really happy to introduce Dan Connolly to join our faculty few weeks ago.
And I know some of that he is teaching
our introductory class of 136.
This is gonna become 106 [LAUGH].
And so you probably also know that he wrote a book called
The Art of Computer Programming that took the first volume came out last year and
the second volume came out this summer and
I've been told that his books have set a new record at the Stanford library.
That no other book in the history of the library has been stolen so often,
>> [LAUGH]
>> From Stanford's library.
So now he's gonna give a talk about a kind of a strange
topic called the analysis of algorithms, so I'm looking forward to what he says.
Don?
>> [APPLAUSE]
>> Thanks.
Well, I'm really happy to be here, and at last,
I can officially Call myself a computer scientist.
And this is my, solves my identity crisis.
And I can become part of this terrific faculty that we have here at Stanford.
And in fact, even more important than the wonderful faculty.
That Stanford has is our students.
I think it's quite clear that we've got the absolute, best students.
In fact in this room this afternoon, we've got a substantial
fraction of the best computer science students in the whole world,
and I'm especially happy to be also here to,
to work with my long time colleague Bob Floyd.
Although previously we've done all our work by mail.
And writing letters back and forth frantically.
So at last we're going to be able to work together as colleagues and face to face.
George asked me to give a talk that sort of introduces what I do,
since this is the first time I'm talking here in public
at Stanford after getting this appointment.
And it reminds me that earlier this year, my colleague, Claus Worth.
I'm sort of Claus's
replacement here because Claus was called to be a professor at ETH in Zurich.
And so Europeans have a wonderful tradition of the inaugural lecture.
It's kind of a big deal that the professor's first lecture in Europe
is reported in the newspapers and so on.
Now, I'm not going to give a talk that's going to get reported in any newspaper.
In fact, it's kind of depressing to read the newspapers these days anyway.
But I do want to give a talk that could be considered in some way an inaugural thing.
The subject is really what do I sort of
view as what I want to work on while I'm at Stanford.
Is there some general theme or something that underlies it all?
So [SOUND] so much for that.
Actually, now I have to actually speak again from here.
Unfortunately Emily's
boss decided that Class X was such a classy project.
It wouldn't be good for me to wear a wig throughout this whole
presentation because Stanford is a very serious institution.
>> [LAUGH] >> And so
I don't want to get Emily in trouble with her boss, so.
>> [LAUGH] >> It's too bad, however,
I can't take the sideburns off, they're sort of.
>> [LAUGH] >> So
maybe that will be enough to remind me that I'm speaking from the 70s.
Okay now.
[LAUGH] >> Let's go back then.
So what is this strange title, The Analysis of Algorithms?
Well I can explain that, but actually it goes back to a story.
A couple years ago I was at a conference of mathematicians in Santa Barbara.
And somebody asked me at dinner, what do you do?
And I said I'm a computer scientist.
And computer scientist.
Are you a numerical analyst?
And I said well, no, I like numbers and
things like this but numerical analysis isn't quite the description of what I do.
So, I said, artificial intelligence?
And, I said, well, no, not really that either.
And, so, the guy says, then you must be a language man.
You see, in those days, computer science consisted of three possible things.
One was numerical analysis.
One was artificial intelligence and third was programming languages.
Study of languages, formal languages that are used to describe algorithms.
Okay, well.
But it occurred to me that really I wasn't in any of those categories and so
I had no way to explain what I do, or what I did.
And so that night I started thinking about it and said,
well wait a minute, I gotta have a name for what I like to do.
And so I came up with the, to say I'm okay,
the next time somebody asks me what I do, I'm gonna say an analysis of algorithms.
I didn't really know exactly what it meant at the time but
I knew that I needed names so I pictured it, you know,
Richard Bellman made up the name dynamic programming and everybody gots
started using it and so I figured if I started talking about it long enough,
it would become a well known display.
But anyway, I did know that what I liked to do was analyze algorithms.
I mean I liked algorithms.
I don't have to explain to you guys what an algorithm is.
But when I went to quantitatively studying algorithms and not just qualitatively.
So I'm gonna explain to you by example.
I still haven't really come up with a perfect definition of what
it means The Analysis of Algorithms.
But by example, I mean the working definition is,
whatever I'm interested in is analysis of value.
>> [LAUGH] >> I tend to refine this over the years,
but by giving an example, I think I can explain to you what is.
>> What is sort of special about it.
So, let's take an example.
I want to, I'm gonna work a simple algorithm
so that we can sort of analyze it to the hilt and here we go.
So the algorithm is going to be defined as
the maximum of n number.
So one through xn, got to find the maximum.
So here we go.
Start out.
>> Come in here and the first step is
I set m to x sub n, and I set k to n- 1.
All right, that's the very first step.
After that I make a test.
Is K=0 or not, okay.
Sorry, I haven't developed my skills at writing good
zeroes on the blackboard here.
So I got two variables here, M and K,
and I start out and I, that way, and
so it might be that K is equal to zero.
And then I'm gonna come do that, but if K is not zero.
Then I've got to make another test and I will write that down so
that is Xk bigger than M and again, it's yes or no.
So, I'm gonna write, I'm to put this logarithm down, and we'll play it,
through it.
And, you'll understand exactly what it does.
So, the S is saying, m equates by x k,
and no says k is set to k minus 1.
And these come out here and go all the way back.
And so I got the algorithm out here except for what happens under yes,
and at the end here it says output m.
And there we are.
Okay, so this is, I think,
if anybody here has written this half a dozen programs they probably came
up with a loop somewhere in one of those programs that's very much like, like this.
So it should be fairly familiar.
But let's take an example and play through.
So suppose that n is 4 and
the x1 is say 2.3 and
x2 is 7.1, x3 is 4.0.
X4 is 5.9, by the way if anybody has a question on what
I'm saying stop me right away.x I want to know unless
you don't understand anything and then you can't ask anything.
>> [LAUGHTER] But anyway, ask me and
I'll try to answer [INAUDIBLE] your chance okay?
So, we start out and the first part of, and
we set m is going to be,
is going to be set to 5.9 which is x4, xn?
Is 5.9 and k is set to n-1 which is 3.
All right, so that's this step.
We come down here and we say is k 0?
Nope.
So then we check if xk is greater than m.
So xk is 4.0, is 4.0 bigger than 5.9, nope, it's not.
So, we down here we change K to 2 and go back,
now again we ask is K through 0 it's not, so we come through and
now we check X equals.
7.1 X2 against M.
Is that bigger than M?
Yeah it is.
So you come through here and M gets changed to 7.1.
And then that takes us over here, K goes down to one.
That's not zero yet so we still, I'll check 2.3 against m and
it's not bigger, so that k moves to zero.
Yes, output m answer this is the maximum all right so
I don't think there was really a doubt but
I want to show you that I understand what a flow chart is.
>> [LAUGH] >> So,
now, the main thing that we teach students, is how to.
I mean, okay we went through a period where we didn't even teach this, but
the main thing we teach Student says, how do you know this algorithm is correct?
We ran it through one time, but how do we know there's not a bug,
something wrong?
And so for more than ten years people who wrote programs they just looked at it and
said yeah, I know it's right.
But they didn't have any particular reason, they just,
they just said, I can't make it fail, it must be okay, and so you,
but then, about Floyd, thank you Bob,
showed that actually we could, we could reason and prove that programs are correct
I am using something called invariant assertions and so
here we put a little statement saying this point in the program,
it turns out that N is equal to the mass
Of X k + 1 through X n.
And we verify that this is true the first
time we come through and then it's true again every time we go through the loop.
So if this stays true, until it's invariant.
And so then we know that this program not only gives an answer, but
it gives the right answer.
Okay.
So that's the way it usually is.
But as I started writing.
The books on the art of computer program and I found out that you know
there is almost always more than one algorithm for a problem.
In fact I'm right now working on buying three which
is about sorting and I've got more than 20 algorithm for sorting and
so you need some kind of way to know which algorithm to use.
Which one is gonna be faster?
Which is gonna use less resources?
Less memories something like that.
And so that's were analysis comes in of algorithm before we go beyond proving,
then the algorithm works to saying how well does it work,
how fast does it work, and how much does it cost you when it works?
So this is the underlying theme of
quantitative study of relatively performance of an algorithm,
not just the yes or no that it's correct, not correct.
Any questions on that?
I'm gonna analyze this algorithm for you and you will see.
Now this algorithm doesn't have,
there's no issue about how much memory it takes here.
It's gonna take n cells of memory to store the x's and
it's gonna take two for m and k and so that's fixed.
Nothing interesting to analyze there but we do have the running time to talk about.
So how long is this gonna do?
So I'm going to decorate this flow chart with the number of times we come through
a particular branch.
[COUGH] At the beginning,
just once, we start once.
And at the end we exit once, so we're coming One thing here,
how many times am I checking k equals zero?
Test a lot you say, and yeah I'm going to make this test n times.
If n is one, I make the test once, if n is two, I make the test twice so
this must be n times.
Now, so that means n- 1 is coming through here, okay.
But then there's a little bit of a problem here,
how many times is it coming into this step?
So I'm gonna call that capital A, that's some quantity that I've got to figure out.
And on the other hand it's going to come n minus one minus a times from here.
Coming in n minus one go out here a times
the other time has to be n minus one minus a.
And so there we are.
You get this idea that we have some quantity A that we have to know if we're
gonna know how long this program is gonna take to run.
All right, well now, so this is the idea.
A is the number of times
M has changed.
That's whoops, hey.
Fortunately I didn't need these.
Our job is to analyze A, what can we say about this quantity A.
Usually when I analyze something I set as my goal to find four things about A.
The min,
the max, the average, and the standard deviation.
Sometimes I can do less, sometimes I can do more, but anyway,
this is sort of a starting goal.
So in other words, the min is this is for optimistic people.
What's the fewest times I'll ever need to do it.
And so
in this case, how small can A be?
Okay, what.
Give me an example when it could be 0.
>> [INAUDIBLE] >> In other words, if you start out
already with Xn is the max, then we're never going to have to change the max.
Okay, so the min is 0, okay.
What's the max?
How big can they ever be?
>> n-1.
>> n-1 right.
Don't have any off by one errors here again.
>> [LAUGH] >> So, and
what's an example where you can need n-1?
Exactly, when they're all in reverse order they're all,
each one is X1 is biggest and x2 is next biggest and so on.
Okay, so that's n-1, okay.
So then [COUGH] the average is somewhere between 0,
so the max is for the pessimistic people, min is for the optimistic people.
Averages for the probabilistic people.
And we wanna know sort of realistically,
most of the time, or something like that, what can we say about typical value of A?
It's somewhere between zero and n minus one.
But is it maybe a half n, is it maybe squared n, or cubed root n,
something like that?
What is actually the average if we do it?
Well, in order define an average, we have to know what something about our input.
We have to make some assumptions as to what these x's are.
If the x's are always the same, then there's no, then the min and
max are also fixed as well as the n.
So, for most cases, it's reasonable
to assume that the x's are different.
That is, we're never going to have to worry about equality of xi and xj.
And that they're in random order.
That means that each ordering of the x's is equally likely.
So if I have n things that are different, and in
a different order than I have n factorial ways to in which they might occur.
So let's take a look when n equals 3 at what happens.
So there are, 3 factorial is 3 times 2 times 1, that's 6 ways.
And the x's can be like 1, 2, 3.
1, 3, 2.
2, 1, 3.
2, 3, 1.
3, 1, 2.
3, 2, 1.
And so let's figure out what A is in each case.
We got these six possibilities to work on, each one,
each of these six is supposed to be equally probable.
One sixth of the time we're gonna be faced with them in this order,
this order, and so on.
[COUGH] Okay well, so what's a, okay so
I'm gonna put a circle around whenever I have to change m.
So start here at m equals 3 it never has to change.
If I start here with m equals 2 gotta change it here.
3 again never had a change.
1 had to change here to 3.
This 2 had to change to a 3, and here a 1 got changed to a 2 and
then that got changed to a 3.
So a is equal to 0.
The number of circled elements here.
010112.
All right.
Well that's the so in other words
the if we had the probability that
a = 0, that's 2 out of 6 and
probability of a = 1 is 3 out of 6.
And probability A = 2 is 1/6.
Now from that, of course, we can figure out the average.
I might as well do that all.
I want to do it slow and easy because.
If we're gonna be working a little bit harder later on.
So here we have different
probabilities of p0 + p1+ p2.
You need the sum of these probabilities.
I call this P0, P1, P2, and this equal probability always equals 1.
But then the average is 0 times P0
plus (1 * p1) + (2 * p2) and so
this is 0/6 + 3/6 + (2 * 1/6
= 2/6) and that comes out to 5/6.
So n = 3, we find out the average value of A is 5/6.
Now, let's try to get more general and
go to higher values of n.
So, But
I might as well Tabulate what we've done so far.
I got 2, 3, 1 here, and the,
and we got an average of 56.
Well, I'm not gonna worry about that right now.
Okay, so what about n equals 4?
N equals 4, we got 24 cases, I'm not gonna write them all out,
but I'll show you schematically what they look like.
Six of the cases start with 1, six start with 2, six start with 3,
six start with 4.
And so, [COUGH] well, okay, so
here it goes 234, 243,
324, 342, 423, 432.
And same way with 2, I've got six things for the guys I've left out.
That's 134, 143, 341, 314,
I did, I should've done it that way.
341, 413, 431, and 3,
it's gonna be same kind of thing, 124, 142.
And for 4, it's gonna be the same as I had before.
123, 132, 213, 231, 312, 321.
So, it's easy to see then if we look at what gets circled that,
here it's just the same as we had before.
We circle this guy, this guy, this guy, and these two guys.
The same pattern, just add a 1 to everything.
The same deal here.
[COUGH] I'm sorry.
3,1 Circled the wrong 1.
1,3,4, yeah, this 4 gets circled, and
then this 4 and these two 4s and this 3.
And so I'm gonna get a circle here and a circle here and a circle here and
a circle here and a circle here, and so that carries over.
Just as we have before.
But also this 1, in circle.
So now, I Pattern
is clear then, the value of A is 010112,
as before, that gets repeated three times.
[COUGH] And then down here, it gets increased by 1,
it's 121223, [COUGH] okay?
So then the idea is then that if we add a new element,
the whole algorithm behaves just as it did before,
except if the maximum happened to occur in the first time,
then we have to bump A by one more.
And, so let's write down then,
P of nk is the number of cases
of within elements where A = k.
And, a p of nk is the probability
that A = k is then Pnk / n factorial.
And I wrote the Pnk here as 231, if I go to n = 4,
how many times do I have a 0 in here?
Well, 2 times 3 is 6, so I get 6 times of getting a 0 in here.
How many times do I get a 1 here?
Well, I get 3 times what I had before, plus these two 1s here.
And so that's coming out to be 11 and I get 6 here and 1.
And so actually you can see the rule that I used to get this line.
I multiplied by 3 and then I add the guy to the left.
So 3 times 3 plus 2 is 11, 1 times 3 plus 3 is six, 0 times 3 plus 1 is 1.
Okay, there we go.
Now, when I work on problems like this,
we can usually go up now higher, we can multiply by four and
add the guy to the left and it looks like this, 24, 35, 10, 1.
I always like to also go backward and
come back to, oops, 11, and 1.
Consider the case n equals 1, and n equals 2.
I don't know, always I catch more when I think of the small k.
Generalizing downward has always been an important part
of my problem strategy, problem solving strategy.
And so actually I started looking at different algorithms,
analyzing them, and fairly often came up with this same pattern of numbers.
These numbers started to become familiar to me.
And then I learned, yeah, a guy named James Sterling had
come up with these numbers already in the 1700s and they're called Sterling Numbers.
And so now,
this was something that didn't appear anywhere in my mathematical education.
I mean it was there and few people knew about Sterling numbers but
it wasn't taught.
At one point people saw some use for it but
then it wasn't used much in engineering during the next century and so on.
So people started to forget about this stuff.
And it was true in general of all this stuff that was going on as I'm analysing
algorithms.
I kept finding that I was focusing in on a kind of
mathematics that I had hardly ever been taught.
But instead I had seen it a few times in puzzles or something like that.
Where the concepts came up but there was
this little part of my math education that had to be amplified, amplified, amplified.
Because it was, I was working on topics that arose,
the discrete kind of topics that arose in studying algorithms were
different from the kind of topics that have been typical in calculus courses and
things that had become mainstream mathematics.
So I found essentially that my math education was
great but it didn't prepare me for analyzing algorithms exactly.
And so as time went on,
I started realizing that there was this other literature out there and
I could start learning more about Sterling and other people.
George Boole actually worked
a lot on this kind of problems.
And so well, I decided to add a new course
to Stanford's curriculum next fall.
So next fall I'm gonna introduce a class called concrete mathematics.
And it should be great, so stick around.
Now, okay, the other thing I wanna mention is that of course,
this a very simple algorithm that we're analyzing, but
the techniques that we're using here are techniques that I use over and
over again in different ways.
And so I realized actually, rather early,
that I would never run out of good problems if I was analyzing algorithms.
I knew that there was a whole
field called queuing theory that a lot of people worked on,
and this was just the analysis of a small, a certain special kind of algorithm.
So if I broaden that and say okay, well, queuing theory was one kind of algorithm,
let me take all the important algorithms of the world and analyze those.
I'll never run out of challenging and fulfilling problems to work on.
And in fact, there's an extra.
Not only is it fun to work on these problems, but also
when I get the answer, there are people out there who appreciate the answer.
So what could be a better way to spend my life?
Okay now, let's then proceed further and take a look at these numbers.
Well.
One of the next steps is to [COUGH] is to think
about these numbers as coefficients of a polynomial.
So let's consider a polynomial, and
I call that it 1 + z, 2 + 3z + z squared,
6 + 11z = 6z squared + z cubed and so on.
Now.
The reason I do this is because I know that the way that I got from one line to
the next was very much like something I do when I'm multiplying polynomials.
Because, if I look at this polynomial,
this turns out to be (1 + z)(2 + z).
And this one turns out to be (1 + z)(2 + z)(3 + z).
Because the rule that I used to get from one line to the next
was multiply this by 3 and shift over.
That's exactly the same like as saying multiply by (3 + z).
So you could guess what this is gonna factor out.
And so now we get a formula then.
Okay, as they said, I've found out that
these numbers were identified,
I mean called Stirling numbers, so it turned out that p of n,
on a k, is called a Stirling number,
and upstairs n, k + 1 downstairs.
So actually, I had a closed formula saying exactly what's the number of
cases that I'm gonna have to change the maximum that many time.
Okay, and I also found out that statisticians knew about his stuff,
but I think the first statistician who first studied this were like 1954,
something like that.
But anyway, they did study what they call a record breakers in sports events.
And so you wanted to say how many times does the lead change, or
something like that.
So starting numbers came up this way and
then the probability pnk is then I have this
formula and this divided by n factorial, somehow explaining it.
Okay now I've got a picture here that you can see to show just
n gets a little bit bigger.
I prepared, oops.
So here's n equals 12.
That's as far as I can compute at the moment
without getting numbers that are too big for my computer, but it
gives typical results that we can imagine.
N equals 12, you see that these are the capital p's here.
And so they get really tiny up here, and close to here looks
like this most of the time, we changed the maximum two times.
So what about when n is a million?
If I'm finding the maximum of a million numbers, I'd like to figure out
how many times the max was gonna change,
but it doesn't look like it's gonna be anywhere near half of a million.
But our next goal is try to figure out the behavior of this thing
as n gets larger and see more about what it is.
So Here
comes a great idea called generating function.
So instead of having the formula written out as a Stirling number and
everything like that, it turns out to be better to have the generating function,
which I want to explain.
But it was (1 + z)(2 + z),
up to n- 1 + z divided by n factorial.
So this is n factorial.
And this is equal to the summation of p of k.
I'm sorry, n, k times z to the k.
So the coefficient of z to the k is this probability that a is equal to k.
And it didn't take long when I was working on it, analyzing algorithms,
to realize that Generating functions were a wonderful way to approach it.
Why is that?
In fact, this was of course discovered 200 years earlier by Lagrange and
so on and Mejean.
But I didn't know it at the time.
From what I had been taught in school, I had my statistics teacher had talked
about something called the characteristic function, which turned out to be the same,
but z was called e to the it or something like that.
I didn't recognize it.
So [COUGH] why are generating functions great?
Well, most times when we're trying to find average values,
we don't know the exact value for p of nk.
We don't have a closed formula for it.
Stirling only invented a certain number of numbers, and
there are many more problems than are known numbers.
But often from the generating function, we can still find out what we need to
know about those numbers because a generating function is one object.
g n of z, while the numbers here are sort of n objects.
And so we have, the generating function is one thing that we can work with and
learn about, and deduce the p.
So, here's the point.
Forget n for a moment,
as long as g of z is equal to p of 0 + p 1 z Into
the square at 0.
Then we know that g of 1
is equal to, we plug in z = 1, and you get the sum of all probabilities.
So this is equal to 1.
Next, let's look at the derivative.
So I take the derivative with respect to z and
I get p1 + 2p2z- 3p3z squared and so on.
That's nice.
Because g'(1) is equal to the mean
of the average of the expected value of this probability.
It's the sum of you know P1+2P2+3P3 that's exactly what we said, that's the average.
So that's the number we want to get.
And while I'm at it I may as well show you the next step.
Because I did say I wanted to get the standard eviation as well.
So this is 2p2- plus 3 times 2 p3 z
and so on, 4 times 3 p4 z.
And so g double prime is gonna
be the sum of k (k- 1) p k, and if I add g double prime.
G double prime, plus g prime,
this is the sum of k squared times Ek.
So, and I might as well go all the way to the next step, then.
And I get the variance of g which is,
the standard deviation squared is
by definition equal to the variance, and this is then g double prime.
1 plus g prime of 1 minus g prime of 1 squared, right?
Either subtract it but this turns out to be the one on quantity
by which we measure deviation form, it's the square of the,
If the average value of the square of difference of a minus its mean.
So that's the indication of how stable
the thing is about the mean value.
Now, the point is it's really easy to calculate the average of
this particular generating function and the reason is that
when we have a product of two generating functions
then the average is just the sum of the two averages.
So here I'm going to show you that, it's so nice.
If we have a generating function g(z) and another generating function h(z).
And so the product of these two is it gives you another set of probabilities and
so if I can simply call this capital G, what the heck.
No, let me take another word.
So capital G(z) is equal to g1(z).
Well g(z) that's okay.
Just like So
It works out that the G(1) = 1.
This is values to 1, this is values to 1.
1 * 1 = 1. So all these probabilities of one
makes some kind of probability.
And furthermore, G prime of z is just G prime of
z h and z not plus G z h file z.
And so G prime of 1 is equal to little g prime of 1.
And h1 is 1, and this one is 1, so
this comes out to h1.
Okay? So the average, so
if we can factor a generating function as a product of other, simpler generating
functions, then to get the mean, we just have to add up the individual ones.
And it turned out the same thing for the variance.
That is, if I take G double prime
+ G prime- G prime squared,
it turns out to be little g prime delta
[INAUDIBLE] + [INAUDIBLE] [SOUND] So
the some of the two variance [SOUND]
it's the sum of the two variance.
So when you factorized it, everything becomes simple.
Now, these are then
give us the answer to our original question and so,
let's see, where was I?
Yeah, so, what do I get for the total my
generating function was 1+z/ 1*2+z/3 and
ended up n-1=z/n each of this is
the generating function by itself and
the average of this one is one-half,
this one is one-third, this one is 1/n.
Right, this is 1.
I mean you could
take the derivative and valuate it at 1, until, one-half, one-third, so on.
And so, the sum of these is the average, except that okay.
So now, here, this turns out,
then, to be the harmonic number minus 1.
The harmonic number, a sub n is defined to be the sum
of the first reciprocal starting with 1 instead of with a hat.
So the answer is then.
Is Hn-1 here.
And the standard deviation works out to be the square
root of Hn minus the second order harmonic number,
which is the sum of the squares of,
like one-half squared plus one-third squared.
1 over n square, so these are the, so there's answers,
the original question saying that the average turns
out to be this harmonic number which then we get to know
is Is very much like the logarithm and if n is a million,
it turns out to be about 13 and a half, so okay.
So that's the, it tells you that we're not going to be
changing the next one very much often in the meantime.
And once again this quantity a sub n,
harmonic number was appearing in zillions of
different algorithms that I was studying.
So even though as I'd been a math student I'd run into some of these reciprocals,
maybe half a dozen times somewhere in my education.
All of a sudden I realized that if I was gonna be analyzing algorithms
I'd not only have to know kind of about hn but
I'd have to know how to calculate the sum of k times h sub k and
I'd have to know how to calculate the sum of hn squared and I had to know all
kinds of things about these numbers hn that I hadn't been taught before.
And that was because the computer applications were telling me this
was a part of mathematics that I really could have learned if it
hadn't faded out of history.
So that's why I'm going to try to teach this course,
conquering math next fall, that brings us into the curriculum.
From a practical point of view now I've analyzed this algorithm.
What does that tell me about?
It wasn't riding a compiler.
Does that tell me something about how to arrange the code?
Well, in the first place, I actually I
already made the first simplification when I presented the algorithm I started out
with k equals n minus 1 instead of starting out with k equals 0.
I could have started out k equals 1 and go up and say k equals n here.
And while I knew that a computer it was easier to test for 0 than to test for n.
And since I'm doing this every time,
I wanted this test to be as simple as I could.
So I already biased that when I wrote the program by
counting downward instead of going up.
But then it also the analysis tells us that almost always we're gonna
come down this branch and hardly ever are we gonna be tripping around in here, so
what the computer can be spending its time going zip, zip,
zip like this all the time.
And so, this will tell us not only that, well we should arrange the programs,
the instructions in memory so that this path is nice, and
the computer goes fast as it can zip on these instructions.
But then I wouldn't have to make a side trip but then we don't really care so
much about what it does to fix up, and even more so,
we can realize that we could double up on this loop, and we could combine
several values of k, we only have to test to see if k is getting small.
Say is k bigger than five or not, and if not the we can do five of these tests
right away without making separate tests each time, and so next summer I'm
gonna be running a project where I want about a dozen volunteers to go through and
see what the programs people are writing at Stanford.
And we're going to scour the waste baskets and look at where we can
get wherever we can and see what people are actually doing in their programs.
And we're gonna analyze how much a compiler could do by we'll
take a look at the bottlenecks in these programs that we find around and
we're going to try to empirically figure out how often we
can make a big improvement over what the compilers do now.
So I'm looking for volunteers for this summer project.
Okay, now, how are we doing on time?
Okay, good.
I, then I guess it would be time for me now to say what's the take away?
Except in 1969, I wouldn't have said that because
that's a modern phrase but anyway, what do we get out of this?
First of all I wanted to show you that by trying to
use mathematics to find quantitative things
about computer programs we add a new dimension to
what we thought we knew before about a program.
Secondly I wanted to show you that the techniques are,
they have some kind of a system to them.
They feed on each other.
You can, each program doesn't have its own special trick.
But these things, it turns out that there's,
that we begin to learn the discrete calculus.
That that was analogous to the infinite
calculus that we've been taught, and
I wanted to show you that it's enjoyable
to psyche out these problems, and so
I believe then that I can safely say that I've got more than enough for
one lifetime's career ahead of me of pleasant
applications that combine traditional mathematics
with efficient use of computers, okay?
Thanks a lot for listening. [LAUGH]
>> [APPLAUSE]
>> Are there any questions?
>> [LAUGH]
Không có nhận xét nào:
Đăng nhận xét