Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming | Lex Fridman Podcast #224

3 hours 5 minutes 24 seconds

🇬🇧 English

S1

Speaker 1

00:00

The following is a conversation with Travis Oliphant, one of the most impactful programmers and data scientists ever. He created NumPy, SciPy, and Anaconda. NumPy formed the foundation of tensor-based machine learning in Python. SciPy formed the foundation of scientific programming in Python, and Anaconda, specifically with Conda, made Python more accessible to a much larger audience.

S1

Speaker 1

00:27

Travis's life work across a large number of programming and entrepreneurial efforts has and will continue to have immeasurable impact on millions of lives by empowering scientists and engineers in big companies, small companies, and open source communities to take on difficult problems and solve them with the power of programming. Plus, he's a truly kind human being, which is something that when combined with vision and ambition makes for a great leader and a great person to chat with. To support this podcast, please check out our sponsors in the description. This is the Lex Fridman Podcast, and here is my conversation with Travis Oliphant.

S2

Speaker 2

01:11

What was the first computer program you've ever written?

S3

Speaker 3

01:14

Do you remember? Whoa, that's a good question. I think it was in fourth grade.

S3

Speaker 3

01:18

Just a simple loop in BASIC. BASIC. BASIC, yeah, on an Atari 800, Atari 400, I think, or maybe it was an Atari 800. It was a part of a class, and we were just doing basic loops to print things out.

S2

Speaker 2

01:32

Did you use goto statements?

S3

Speaker 3

01:34

Yes, yes, we used goto statements. I remember in

S2

Speaker 2

01:38

the early days, that's when I first realized there's like principles to programming when I was told that don't use goto statements. Those are bad software engineering principles. Like it goes against what great, beautiful code is.

S2

Speaker 2

01:51

I was like, oh, okay, there's rules to this game.

S3

Speaker 3

01:54

I didn't see that until high school when I took an AP Computer Science course. I did a lot of other kinds of just low-level programming in TI, but finally when I took an AP Computer Science course in Pascal. Wow.

S3

Speaker 3

02:06

Yeah, it was Pascal. That's when I, oh, there are these principles. Not C or C++? No, I didn't take C until the next year in college.

S3

Speaker 3

02:14

I had a course in C, but I haven't done much in Pascal, just that AP Computer Science course.

S2

Speaker 2

02:21

Now, sorry for the romanticized question, but when did you first fall in love with programming?

S3

Speaker 3

02:26

Oh man, good question. I think actually when I was 10. You know, my dad got us a Timex Sinclair, and he was excited about the spreadsheet capability, but I made him get the BASIC, the add-ons, so we could actually program in BASIC.

S3

Speaker 3

02:41

And just being able to write instructions and have the computer do something. And we got a TI-99, a TI-99/4A, when I was about 12. And it had sprites and graphics and music. You could actually program it to do music.

S3

Speaker 3

02:55

That's when I really sort of fell in love with programming.

S2

Speaker 2

02:58

So this is a full, like a real computer with like, with memory and storage and processors. So we're not, because you say TI,

S3

Speaker 3

03:06

it's not. Yeah, the Timex Sinclair was one of the very first. It was a cheap, cheap, I think it was, well, it was still expensive, but it was 2K of memory.

S3

Speaker 3

03:14

We got the 16K add-on pack. But yeah, it had memory and you could program it. In order to store your programs, you had to attach a tape drive. Remember that old sound that would play when modems would convert digital bits to audio? It was saved on a tape drive.

S3

Speaker 3

03:31

Still remember that sound, but that was the storage.

S2

Speaker 2

03:34

And what was the programming language, do

S3

Speaker 3

03:36

you remember? It was BASIC. It was BASIC.

S3

Speaker 3

03:37

And then they had VisiCalc. And so a little bit of spreadsheet programming in VisiCalc, but mostly just some BASIC.

S2

Speaker 2

03:42

Do you remember what kind of things drew you to programming? Was it working with data? Was it video games?

S2

Speaker 2

03:50

Games?

S3

Speaker 3

03:51

Math. Mathy stuff? Yeah, I've always loved math. And a lot of people think they don't like math because I think when they're exposed to it early, it's about memory.

S3

Speaker 3

04:01

You know, when you're exposed to math early, if you have a good short-term memory, you can remember those times tables. And I do have a reasonably, I mean, not perfect, but a reasonably long little short-term memory buffer. And so I did great at times tables. I said, oh, I'm good at math.

S3

Speaker 3

04:15

But I started to really like math, just the problem solving aspect. And so computing was problem solving applied. And so that's always kind of been the draw, kind of coupled with the mathematics.

S2

Speaker 2

04:30

Did you ever see the computer as like an extension of your mind, like something able to achieve? Not till later.

S3

Speaker 3

04:37

Okay. Yeah, not then.

S2

Speaker 2

04:39

It's just like a little set of puzzles that you can play with, and you can play with math puzzles.

S3

Speaker 3

04:43

Yeah, it was too rudimentary early on. Like it was sort of, yeah, it was too, it was a lot of work to actually take a thought you'd have and actually get it implemented and that's still work, but it's getting easier. And so, yeah, I would say that's definitely what's attracting me to Python is that that was more real, right?

S3

Speaker 3

05:02

I could think in Python. Speaking of foreign language, I only speak another language fluently besides English, which is Spanish, and I remember the day when I would dream in Spanish. And you start to think in that language, and then you actually, I do definitely believe that language limits or expands your thinking. There's some languages that actually lead you to certain thought processes.

S2

Speaker 2

05:23

Yeah, like, so I speak Russian fluently, and that's certainly a language that leads you down certain thought processes. Well, yeah, I mean, there's a history of the two world wars, of millions of people starving to death or near to death throughout its history of suffering, of injustice, like this promise sold to the people and then the carpet or whatever is swept from under them. It's like broken promises, and all of that pain and melancholy is in the language.

S2

Speaker 2

05:59

The sad songs, the sad hopeful songs, the over-romanticized, like, I love you, I hate you, the sort of swings between all the various spectrums of emotion. So that's all within the language, the way it's twisted. There's a strong culture of rhyming poetry. So like the bards, like the sync, there's a musicality to the language too.

S3

Speaker 3

06:24

Did Dostoevsky write in Russian? Yeah, so

S2

Speaker 2

06:27

like Dostoevsky, Tolstoy, all the,

S3

Speaker 3

06:32

all the- The ones that I know about, which are translated, and I'm curious how the translations.

S2

Speaker 2

06:36

So Dostoevsky did not use the musicality of the language too much, so it actually translates pretty well because it's so philosophically dense that the story does a lot of the work, but there's a bunch of things that are untranslatable. Certainly the poetry is not translatable. I actually have a few conversations coming up offline and also in this podcast with people who've translated Dostoevsky.

S2

Speaker 2

07:01

And that's for people who worked in this field, know how difficult that is. Sometimes you can spend months thinking about a single sentence in context, because there's just a magic captured by that sentence. And how do you translate just in the right way? Because those words can be really powerful.

S2

Speaker 2

07:22

There's a famous line, "beauty will save the world," from Dostoevsky. There's so many ways to translate that. And you're right, the language gives you the tools with which to tell the story, but it also leads your mind down certain trajectories and paths to where over time, as you think in that language, you become a different human being.

S3

Speaker 3

07:42

Yes. Yeah, that's a fascinating reality, I think. I know people have explored that, but it's, I guess, rediscovered.

S2

Speaker 2

07:49

Well, we don't, we live in our own little pockets. Like, this is the sad thing, is I feel like, unfortunately, given time and given getting older, I'll never know China, the Chinese world, because I don't truly know the language. Same with Japanese, I don't truly know Japanese and Portuguese in Brazil, that whole South American continent.

S2

Speaker 2

08:12

Like, yeah, I'll go to Brazil and Argentina, but will I truly understand the people if I don't understand the language? It's sad because I wonder how many geniuses we're missing because so much of the scientific world, so much of the technical world is in English, and so much of it might be lost because we don't have the common language.

S3

Speaker 3

08:35

I completely agree. I'm very much in that vein of there's a lot of genius out there that we miss, and we're sort of fortunate when it bubbles up into something that we can understand or process, there's a lot we miss. So I tend to lean towards really loving democratization or things that empower people or, you know, very resistant to sort of authoritarian structures.

S3

Speaker 3

09:00

Fundamentally for that reason, well, several reasons, but it just hurts us.

S2

Speaker 2

09:04

We're worse off. So speaking of languages that empower you, so Python was the first language for me that I really enjoyed thinking in, as you said.

S3

Speaker 3

09:16

Sounds like you've shared my experience too.

S2

Speaker 2

09:18

So when did you first, do you remember when you first kind of connected with Python, maybe even fell in love with Python?

S3

Speaker 3

09:23

It's a good question. It was a process, it took about a year. I first encountered Python in 1997. I was a graduate student studying biomedical engineering at the Mayo Clinic. I had previously been involved in taking information from satellites. I was an electrical engineering student, used to taking information and trying to get something out of it, doing some data processing to get information out of it.

S3

Speaker 3

09:46

I'd done that in MATLAB. I'd done that in Perl. I'd done that in scripting on a VMS. There's actually a VAX VMS system, and they had their own little scripting tools around Fortran.

S3

Speaker 3

09:57

Done a lot of that. And then as a graduate student, I was looking for something and encountered Python. And because Python had an array, it had 2 things that made me not filter it away. Because I was filtering a bunch of stuff.

S3

Speaker 3

10:10

I looked at Yorick, I looked at a few other languages that were out there at the time in 1997. But it had arrays. There's a library called Numeric that had just been written in '95, not too much earlier, by an MIT alum, Jim Hugunin. I went back and read the mailing list to see the history of how it grew, and it was very interesting. It's fascinating to do that, actually, to see how this emergent cooperation, unstructured cooperation happens in the open source world that led to a lot of this collective programming, which is something maybe we might get into a little later, about what that looks like.

S2

Speaker 2

10:46

What gap did Numeric fill? Numeric filled

S3

Speaker 3

10:48

the gap of having an array object. So instead

S2

Speaker 2

10:50

of- There was no array object.

S3

Speaker 3

10:51

There was no array. There was a one-dimensional bytes concept, but there was no n-dimensional, 2-, 3-, 4-dimensional tensor, as they call it now. I'm still in the camp that a tensor is another thing, and it's just an n-d array, we should call it, but I kind of lost that battle.

S2

Speaker 2

11:08

There's many battles in this world, some of which we win, some we

S3

Speaker 3

11:11

lose. That's exactly right. But it had no math to it. So Numeric had math and a basic way to think in arrays.

S3

Speaker 3

11:20

So I was looking for that, and it had complex numbers. A lot of programming languages don't, and you can see why, because, you know, if you're just a computer scientist, you think, ah, complex numbers are just two floats, so people can build that on. But in practice, complex numbers are one of the significant algebras that help connect a lot of physical and mathematical ideas, particularly the FFT for an electrical engineer. It's a really important concept, and not having it means you have to develop it several times, and those times may not share an approach.
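To make the point about complex numbers and the FFT concrete, here is a minimal sketch using today's NumPy (not the Numeric library of 1997, and not code from the conversation). The signal parameters and variable names are illustrative assumptions. It shows that a Fourier transform of real samples naturally comes back as an array of complex values, each one carrying magnitude and phase together.

```python
import numpy as np

# Sample one second of a 5 Hz cosine at 64 samples per second
# (both numbers are arbitrary choices for this illustration).
sample_rate = 64
t = np.arange(sample_rate) / sample_rate
signal = np.cos(2 * np.pi * 5 * t)

# The real-input FFT returns a complex array: each frequency bin
# holds magnitude and phase in a single complex number.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(sample_rate, d=1 / sample_rate)

# Locate the strongest frequency component.
peak_bin = int(np.argmax(np.abs(spectrum)))
peak_freq = float(freqs[peak_bin])
print(spectrum.dtype, peak_freq)
```

Without a shared complex type, every library would need its own convention for packing real and imaginary parts, which is the duplication described above.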

S3

Speaker 3

11:54

One of the common things in programming, one of the things programming enables, is abstractions. But when you have shared abstractions, it's even better. It sort of gets to the level of language, of actually we all think of this the same way, which is both powerful and dangerous, right? Powerful in that we now can quickly make bigger and higher-level things on top of those abstractions; dangerous because it also limits us as to the things we maybe left behind in producing an abstraction, which is at the heart of programming today and actually building around the programming world.

S3

Speaker 3

12:24

I think it's a fascinating philosophical topic.

S2

Speaker 2

12:26

Yeah, they will continue for many years, I think. They'll continue for many years. As we build

S3

Speaker 3

12:29

more and more and more abstractions. Yes, I often think about, you know, we have a world that's built on these abstractions that, were they the only ones possible? Certainly not, but they led to, now it's very hard to do it differently.

S3

Speaker 3

12:42

Like there's an inertia that's very hard to, you know, push out, push away from. There's, it has implications for things like, you know, the Julia language, which you have heard of, I'm sure. And I've met the creators and I like Julia. It's a really cool language, but they've struggled to kind of, against the, just the tide of like this inertia of people using Python.

S3

Speaker 3

13:03

And there's strategies to approach that, but nonetheless, it's a phenomenon. And so, I love complex numbers, and I love arrays, so I looked at Python. And then I had the experience, I did some stuff in Python, and I was just doing my PhD, so my focus was on, I was actually doing a combination of MRI and ultrasound and looking at a phenomenon called elastography, which is you push waves into the body and observe those waves, like you can actually measure them, and then you do mathematical inversion to see what the elasticity is. And so that's the problem I was solving, how to do that with both ultrasound and MRI.

S3

Speaker 3

13:39

I needed some tool to do that with. So I started using Python in '97. In '98, I went back, looked at what I'd written, and realized I could still understand it, which is not the experience I'd had when doing Perl in '95. I'd done the same thing, and then I looked back and I'd forgotten what I was even saying. Now, I'm not saying it, so hey, this may work.

S3

Speaker 3

14:01

I like this. This is something I can retain without becoming an expert, per se. And so that led me to go, I'm gonna push more into this. And then that 98 was kind of when I started to fall in love with Python, I would say.

S2

Speaker 2

14:18

A few peculiar things about Python, so maybe compare it to Perl, compare it to some of the other languages, so there's no braces. Yeah. So space is used, indentation, I should say, is used as part of the language.

S3

Speaker 3

14:34

Yeah, right.

S2

Speaker 2

14:35

So did you, I mean, that's quite a leap. Were you comfortable with that leap or were you just very open-minded?

S3

Speaker 3

14:42

It's a good question. I was open-minded, So I was cognizant of the concern. And it definitely has specific challenges.

S3

Speaker 3

14:52

Cut and pasting, for example, you're cutting and pasting code. And if your editors aren't supportive of that, if you're putting it into a terminal, and particularly in the past when terminals didn't necessarily have the intelligence to manage it. Now IPython and Jupyter notebooks handle it just fine, so there's really no problem, but in the past it created some challenges, formatting challenges. Also mixed tabs and spaces.

S3

Speaker 3

15:12

If editors weren't clear on what was happening, you would have these issues. So there were really concrete reasons about it that I heard and understood. I never really encountered a problem with it personally. Like it was occasional annoyances, but I really liked the fact that it didn't have all these extra characters, right?

S3

Speaker 3

15:31

That these extra characters didn't show up in my visual field when I was just trying to process understanding a snippet of code.

S2

Speaker 2

15:37

Yeah, there's a cleanness to it. But I mean, the idea is supposed to be that Perl also has a cleanness to it because of the minimalism of how many characters it takes to express a certain thing. So it's very compact.

S2

Speaker 2

15:49

But what you realize with that compactness comes, there's a culture that prizes compactness, and so the code gets more and more compact and less and less readable to a point where it's like, like to be a good programmer in Perl, you write code that's basically unreadable. There's a culture like that.

S3

Speaker 3

16:09

Correct, and you're proud of it. Yeah, you're proud of it. Right, exactly, and it feels good, and it's really selective.

S3

Speaker 3

16:16

It means you have to be an expert in Perl to understand it. Whereas Python allowed you not to have to be an expert. You didn't have to take all this brain energy. You could leverage what I say.

S3

Speaker 3

16:25

You could leverage your English language center, which you're using all the time. I've wondered about other languages, particularly non-Latin-based languages. Latin-based languages with the characters are at least similar. I think people have an easier time, but I don't know what it's like to be a Japanese or a Chinese person trying to learn a different syntax.

S3

Speaker 3

16:46

Like, what would computer programming look like in that? I haven't looked at that at all, but it certainly doesn't, you know, leveraging your Chinese language center, I'm not sure Python or any programming language does that. But that was a big deal. The fact that it was accessible, I could be a scientist.

S3

Speaker 3

17:00

What I really liked is many programming languages really demand a lot of you, and you can get a lot, you know, you can do a lot if you learn it. But Python enables you to do a lot without demanding a lot of you. There's nuance to that statement, but it certainly is more accessible. So more people could actually, as a scientist, or an engineer, somebody who was trying to solve another problem besides pure programming, I could still use this language and get things done and be happy about it.

S3

Speaker 3

17:27

Now I was also comfortable in C

S2

Speaker 2

17:28

at that time. And MATLAB you did a little bit.

S3

Speaker 3

17:31

And MATLAB I did a lot before that, exactly. So I was comfortable in those 3 languages were really the tools I used during my studies and schooling. But to your point about language helping you think, 1 of the big things about MATLAB was it was, and APL before it, I don't know if you remember APL.

S3

Speaker 3

17:47

Nope. APL is actually the predecessor of array-based programming, which I think is really underappreciated. If I talk to people who are just steeped in computer programming, computer science, like most of the people that Microsoft has hired in the past, for example, Microsoft as a company generally did not understand array-based programming. Culturally, they didn't understand it.

S3

Speaker 3

18:06

So they kept missing the boat, kept missing the understanding of what this was. They've gotten better, but there's still a whole culture of folks that doesn't. Programming, that's systems programming or web programming or lists and maps. And what about an n-dimensional array?

S3

Speaker 3

18:22

Oh, yeah, that's just an implementation detail. Well, you can think that, but then actually if you have that as a construct, you actually think differently. APL was the first language to understand that, and it was in the 60s. The challenge of APL is APL had very dense, not only glyphs, like new characters, new glyphs, they even had a new keyboard, because to produce those glyphs, this was back in the early days of computing, when the QWERTY keyboard maybe wasn't as established.

S3

Speaker 3

18:48

Like, well, we can have a new keyboard, no big deal. But it was a big deal, and it didn't catch on, and the language, APL, very much like Perl, people would pride themselves on how much they could do: could they write the Game of Life in 30 characters of APL? APL has characters that mean summation, and they have adverbs, you know, they have adjectives and these things called adverbs, which are like methods, like reduction. Reduction would be an adverb on an add operator, right? So using these tools, you could construct, and then you start to think at that level.

S3

Speaker 3

19:20

You think in n dimensions, is something I like to say, and you start to think differently about data at that point.

S2

Speaker 2

19:25

Now you're, it really helps. Yeah, I mean, outside of programming, if you really internalize linear algebra as a course, I mean, it philosophically allows you to think of the world differently. It's almost like liberating.

S2

Speaker 2

19:39

You don't have to think about the individual numbers in the n-dimensional array. You could think of it as an object in itself, and all of a sudden this world can open up. You're saying MATLAB and APL were the early, I don't know if many languages got that right ever.

S3

Speaker 3

19:55

No, no, no they didn't.

S2

Speaker 2

19:56

Even still.

S3

Speaker 3

19:57

Even still, I would say.

S2

Speaker 2

19:59

I

S3

Speaker 3

19:59

mean, NumPy is an inheritor of the traditions, I would say, of APL. J was another version that, what it did is not have the glyphs, just have short characters, but still a Latin keyboard could type them. And then Numeric inherited from that, in terms of let's add arrays plus broadcasting, plus methods, reduction. Even some of the language, like rank, is a concept that was in Python, it's still in Python, for the number of dimensions. That's different than, say, the rank of a matrix, which people think of as well.
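The two meanings of "rank" mentioned here can be seen side by side in a short sketch with current NumPy. This is an illustration, not code from the conversation: the array is an arbitrary example, and `ndim` is the present-day spelling of the APL-style "rank" (to my knowledge, older NumPy releases also exposed a `rank` function for this, later deprecated in favor of `ndim`).

```python
import numpy as np

# A 3x4 array in which every row is identical.
a = np.ones((3, 4))

# "Rank" in the APL/Numeric sense: the number of dimensions (axes).
num_dims = a.ndim

# "Rank" in the linear-algebra sense: the number of linearly
# independent rows or columns of the matrix.
lin_rank = int(np.linalg.matrix_rank(a))

print(num_dims, lin_rank)
```

The array has two axes, but because all its rows are copies of one another, its linear-algebra rank is only one.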

S3

Speaker 3

20:31

So it came from that tradition, but NumPy is a very pragmatic, practical tool. NumPy inherited from Numeric, and we can get to where NumPy came from, which is the current array, at least current as of 2015, 2017. Now there's a ton of them over the past 2 or 3 years. We can get into that too.

S2

Speaker 2

20:50

So if we just sort of linger on the early days of what was your favorite feature of Python? Do you remember like what? It's so interesting to linger on like the, what really makes you connect with a language.

S2

Speaker 2

21:06

I'm not sure it's obvious to introspect that.

S3

Speaker 3

21:09

No, it isn't, and I've thought about that to some length. I think definitely the fact that I could read it later, that I could use it productively without becoming an expert. Like other languages I had to put more effort into.

S3

Speaker 3

21:21

Right,

S2

Speaker 2

21:22

that's like an empirical observation. Like you're not analyzing any 1 aspect of the language, it just seems time after time, you look back, it's somehow readable.

S3

Speaker 3

21:30

It's somehow readable. Then it was sort of, I could take executable English and translate it to Python more easily. Like I didn't have to go, there was no translation layer. As an engineer or as a scientist, I could think about what I wanted to do.

S3

Speaker 3

21:43

And then the syntax wasn't that far behind it.

S2

Speaker 2

21:45

Yeah.

S3

Speaker 3

21:46

Right. Now there was some, there are some, there's some warts there still. It wasn't perfect. There were some areas where I'm like, ah, it would be better if this were different or if this were different.

S3

Speaker 3

21:54

Some of those things got out of the language too. I was really grateful for some of the early pioneers in the Python ecosystem back then, because Python got written in '91, that's when the first version came out. But Guido was very open to users. And one of the sets of users were people like Jim Hugunin and David Ascher and Paul Dubois and Konrad Hinsen.

S3

Speaker 3

22:13

These were people that were on the mailing list, and they were just asking for things like, hey, we really should have complex numbers in this language. So, you know, there's a j, there's a 1j, right? And the fact they went the engineering route of j is interesting. I don't think that's entirely favoring engineers.

S3

Speaker 3

22:28

I think it's because i is so often used as the index of a for loop. I think that's actually why. Probably.
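The `j` suffix being discussed is plain Python syntax, so no library is needed to try it. The sketch below is only an illustration with arbitrary values; it also shows `i` being used as an ordinary loop index alongside complex literals, which is the naming clash speculated about above.

```python
# Complex literals use the engineering-style "j" suffix.
z = 3 + 4j
magnitude = abs(z)      # modulus of 3 + 4j
product = 1j * 1j       # j squared

# "i" stays free for its usual job as a loop index.
powers = []
for i in range(4):
    powers.append(1j ** i)

print(z.real, z.imag, magnitude, product)
```

Because the type is built in, `z.real`, `z.imag`, `abs(z)`, and arithmetic all behave the same way in every program.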

S2

Speaker 2

22:35

I mean there's a pragmatic aspect.

S3

Speaker 3

22:36

The fact that complex numbers were there, I love that. The fact that I could write ndarray constructs and that reduction was there. Very simple to write summations and broadcasting was there.

S3

Speaker 3

22:46

I could do addition of whole arrays. So that was cool. Those are some things I loved about it.
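For readers who have not used these features, here is a small sketch of whole-array addition, broadcasting, and reduction. It uses present-day NumPy rather than the Numeric library being described, and the arrays are made-up examples.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
b = np.array([10, 20, 30])       # one row of three values

# Addition of whole arrays: no explicit loop over elements.
doubled = a + a

# Broadcasting: the single row b is applied to every row of a.
shifted = a + b

# Reduction: collapse an axis with an operator, here addition.
column_sums = np.add.reduce(a, axis=0)
total = a.sum()

print(doubled, shifted, column_sums, total, sep="\n")
```

`np.add.reduce` is the direct descendant of the APL idea of applying a reduction to an add operator; `a.sum()` is the everyday shorthand for the same thing over all axes.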

S2

Speaker 2

22:52

I don't know what to start talking to you about because you've created so many incredible projects that basically change the whole landscape of programming. But okay, let's start with, let's go chronologically with SciPy. You created SciPy over 2 decades ago now,

S3

Speaker 3

23:09

right? Yes, yes, I love to talk about SciPy. SciPy was really my baby.

S2

Speaker 2

23:12

What is it? What was its goal? What is its goal?

S2

Speaker 2

23:16

How does it work?

S3

Speaker 3

23:17

Yeah, fantastic. So SciPy was effectively, here I'm using Python to do stuff that I previously used MATLAB to do. And I was using Numeric, which is an array library that made a lot of it possible.

S3

Speaker 3

23:28

But there's things that were missing. Like I didn't have an ordinary differential equation solver I could just call, right? I didn't have integration. I wanted to integrate this function, okay, well, I don't have just a function I can call to do that.

S3

Speaker 3

23:40

These are things I remember being critical things that I was missing. Optimization, I just want to pass a function to an optimizer and have it tell me what the optimum value is. Those are things like, well, why don't we just write a library that adds these tools? And I started to post to the mailing list, and there had previously been, you know, people had discussed it. I remember Konrad Hinsen saying, wouldn't it be great if we had this optimizer library?
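The three missing pieces named here (integration, an ODE solver, and an optimizer you can hand a function to) map onto calls in today's SciPy. The sketch below uses the current `scipy.integrate` and `scipy.optimize` interfaces, which are not the 1998 packages being described; the test functions and tolerances are arbitrary choices for illustration.

```python
import numpy as np
from scipy.integrate import quad, solve_ivp
from scipy.optimize import minimize

# Integration: area under sin(x) from 0 to pi (exact answer is 2).
area, abs_error_estimate = quad(np.sin, 0.0, np.pi)

# ODE solver: dy/dt = -y with y(0) = 1, integrated to t = 1.
# The exact solution is y(t) = exp(-t).
solution = solve_ivp(lambda t, y: -y, (0.0, 1.0), [1.0],
                     rtol=1e-8, atol=1e-10)
y_end = solution.y[0, -1]

# Optimization: pass a function, get back the minimizing input.
result = minimize(lambda x: (x[0] - 3.0) ** 2, x0=[0.0])

print(area, y_end, result.x[0])
```

In each case the user supplies only a Python function and a starting point or interval, which is the "just call it" experience described above.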

S2

Speaker 2

24:01

Or

S3

Speaker 3

24:01

David Ascher would say this stuff. And I'm, you know, I'm an ambitious, ambitious is the wrong word, an eager, and I probably had more time than sense. I was a poor graduate student.

S3

Speaker 3

24:13

My wife thinks I'm working on my PhD, and I am, but part of a PhD that I loved was the fact that it's exploratory. You're not just taking orders, fulfilling a list of things to do, you're trying to figure out what to do. And so I thought, well, I'm writing tools for my own use in a PhD, so I'll just start this project. And so in '99, '98 was when I first started to write libraries for Python.

S3

Speaker 3

24:36

But really when I fell in love with Python in '98, I thought, oh, well, there's just a few things missing. Like, oh, I need a reader to read DICOM files. I was in medical imaging and DICOM was a format that I wanted to be able to load into Python. Okay, how do I write a reader for that?

S3

Speaker 3

24:48

So I wrote something called, it was an IO package, right? And that was my very first extension module, which is C. So I wrote C code to extend Python so that in Python I could write things more easily. That combination kind of hooked me.

S3

Speaker 3

25:02

It was the idea that I could, here's this powerful tool I can use as a scripting language and a high-level language to think about, but that I can extend easily. Easily in C. Easily for me because I knew enough C. And then Guido had written a link.

S3

Speaker 3

25:15

I mean, the only, the hard part of extending Python was the way memory management works: you have to do reference counting. And so there's a tracking of reference counting you have to do manually. And if you don't, you have memory leaks. And so that's hard.
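The reference counts that a C extension author has to maintain by hand (with the `Py_INCREF` and `Py_DECREF` macros of the CPython C API) can be observed from Python itself. This is a small sketch assuming the standard CPython interpreter; `sys.getrefcount` reports one extra reference for its own argument, so only the differences between readings are meaningful.

```python
import sys

data = [1, 2, 3]

# Baseline count (includes the temporary reference held by the call).
before = sys.getrefcount(data)

# Binding another name to the same object adds one reference.
alias = data
with_alias = sys.getrefcount(data)

# Removing that name drops the count again. The object is freed
# only when the count reaches zero.
del alias
after_delete = sys.getrefcount(data)

print(before, with_alias, after_delete)
```

In pure Python the interpreter does this bookkeeping automatically. In C, a forgotten decrement means the count never reaches zero, which is the memory leak described above.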

S3

Speaker 3

25:29

Plus then C, you know, it's much more, you have to put more effort into it. It's not just I have to now think about pointers and I have to think about stuff that is different. I have to kind of, you're like putting a new cartridge in your brain. Like you're, okay, I'm thinking about MRI, now I'm thinking about programming.

S3

Speaker 3

25:43

And there are distinct modules you end up having to think about. So it's harder. When I was just in Python, I could just think about MRI and high-level writing. But I could do that, and I liked it.

S3

Speaker 3

25:53

I found that to be enjoyable and fun, and so I ended up, oh, well, let me just add a bunch of stuff to Python to do integration. And the cool thing is that, you know, with the power of the internet, I was just looking around and I found, oh, there's this Netlib, which has hundreds of Fortran routines that people had written in the 60s and the 70s and the 80s. And in Fortran 77, fortunately, it wasn't Fortran 66, it had been ported to Fortran 77. And Fortran 77 is actually a really great language. Fortran 90 probably is my favorite Fortran because it's got complex numbers, got arrays, and it's pretty high level. Now, the problem with it is you'd never want to write a program in Fortran 90 or Fortran 77, but it's totally fine to write a subroutine in.

S3

Speaker 3

26:34

Right, and so, and then Fortran kind of got a little off course when they tried to compete with C++. But at the time, I just wanted libraries that do something. Like, oh, here's an ordinary differential equation. Here's integration, here's Runge-Kutta integration.

S3

Speaker 3

26:46

Already done, I don't have to think about that algorithm. I mean, you could, but it's nice to have somebody who's already done one and tested it. And so I sort of started this journey in '98, really. If you look back at the mailing list, there's sort of this productive era of me writing an extension module to connect Runge-Kutta integration to Python and making an ordinary differential equation solver. And then releasing that as a package, so ODEPACK, I think I called it then, QUADPACK, and then I just made these packages.

S3

Speaker 3

27:14

Eventually that became Multipack, because they were originally modular, you could install them separately. But a massive problem in Python was actually just getting your stuff installed. At the time, releasing software for me, like today, people think, what does that mean? Well, then it meant some poorly written webpage, I had some bad webpage up and I put a tarball, just a gzipped tarball of source code.

S3

Speaker 3

27:35

That was the release.

S2

Speaker 2

27:37

But, okay, can we just end that? Because the community aspect of creating the package and sharing that, that's rare. To have, to both have the, at that time, so like the

S3

Speaker 3

27:51

raw stuff. Yeah, it

S2

Speaker 2

27:51

was pretty early, yeah. So, oh, well, not rare. Maybe you can correct me on this, but it seems like in the scientific community, so many people, you were basically solving the problems you needed to solve to process the particular application, the data that you need.

S2

Speaker 2

28:08

And to also have the mind that I'm going to make this usable for others, that's...

S3

Speaker 3

28:15

I would say I was inspired. I'd been inspired by Linux. I'd been inspired by Linus and him making his code available and I was starting to use Linux at the time.

S3

Speaker 3

28:23

I went, this is cool. So I'd kind of been previously primed that way. And generally I was into science because I liked the sharing notion. I liked the idea of, hey, let's, if collectively we build knowledge and share it, we can all be better off.

S2

Speaker 2

28:35

Okay, so you were energized by that idea.

S3

Speaker 3

28:37

So I was energized by that idea already, right? And I can't deny that, I was. I'm sort of, I had this very, I liked that part of science, that part of sharing.

S3

Speaker 3

28:45

And then all of a sudden, oh wait, here's something, and here's something I could do. And then I slowly over years learned how to share better so that you could actually engage more people faster. 1 of the key things was actually giving people a binary they could install, right? So that it wasn't just, here's source code, good luck.

S3

Speaker 3

29:01

Compile this and then. It's compiled, ready to install, just you know, so in fact a lot of the journey from 98, even through 2012 when I started Anaconda, was about that. Like it's why, you know, it's really the key as to why a scientist with dreams of doing MRI research ended up starting a software company that

S2

Speaker 2

29:19

installs software. I work with a few folks now that don't program like on the creative side, the video side, the audio side. And because my whole life is run on scripts, I have to try to get them, I'm having now the task of teaching them how to do Python enough to run the scripts.

S2

Speaker 2

29:39

And so I've been actually facing this, whether it's with Anaconda, with the task of how do I minimally explain, basically to my mom, how to write a Python script. And it's an interesting challenge. It's a to-do item for me to figure out, what is the minimal amount of information I have to teach? What are the tools you use?

S2

Speaker 2

29:57

That's one, you enjoy it. Two, you're effective at it.

S3

Speaker 3

30:00

And they're related. Those are 2 related questions.

S2

Speaker 2

30:02

And then the debugging, like the iterative process of running the script to figure out what the error is, maybe even for some people to do the fix yourself. So do you compile it, do you distribute, like how do you distribute that code to them? And it's interesting because I think it's exactly what you're talking about.

S2

Speaker 2

30:20

If you increase the circle of empathy, the circle of people that are able to use your programs, you increase its effectiveness and its power. And so you have to think, you know, can I write scripts? Can I write programs that can be used by medical engineers, by all kinds of people that don't know programming and actually maybe plant a seed, have them catch the bug of programming so that they start on their journey? That's a huge responsibility and ultimately has to do with the Amazon one-click buy.

S2

Speaker 2

30:55

Like how frictionless can you make

S3

Speaker 3

30:57

the early steps? Frictionless is actually really key. To grow in any community is, any friction point, you're just gonna lose some people.

S3

Speaker 3

31:05

Now sometimes you may wanna intentionally do that. If you're early enough on, you need a lot of help, you need people who have the skills. You might actually, it's helpful, you don't necessarily have too many users as opposed to contributors if you're early on. Anyway, SciPy started in 98, but it really emerged as this collection of modules that I was just putting on the net, people were downloading, and I think I got 100 users, right, by the end of that year.

S3

Speaker 3

31:32

But the fact that I got 100 users and more than that, people started to email me with fixes. And that was actually intoxicating. That was the, here I'm writing papers, I'm giving conferences, and I get people to say hello, but yeah, good job. But mostly it was you're viewed with, it's competitive.

S3

Speaker 3

31:51

You publish a paper and people are like, oh, it wasn't my paper. I was starting to see that sense of academic life where it was so much, I thought there was this cooperative effort, but it sounds like we're here just to one-up each other. Right. And, you know, that's not true across the board, but a lot of that's there.

S3

Speaker 3

32:08

But here in this world, I was getting responses from people all over the world. You know, I remember Pearu Peterson in Estonia, right, was one of the first people. And he sent me back this makefile because, you know, the first thing it is, yeah, your build thing stinks and here's a better makefile. Now, it was a complex makefile.

S3

Speaker 3

32:24

I think I never understood that makefile actually, but it worked and it did a lot more. And so I was like, thanks, this is cool. And that was my first kind of engagement with community development. But you know, the process was he sent me a patch file, I had to upload a new tarball.

S3

Speaker 3

32:39

And I just found I really love that. And the style back then was, here's a mailing list. There certainly weren't the tools that are available today, it was very early on. But I really started to, that whole year, I think I did about 7 packages that year, right? And then by the end of the year, I collected them into a thing called Multipack.

S2

Speaker 2

32:57

So in

S1

Speaker 1

32:57

99,

S3

Speaker 3

32:58

there was this thing called Multipack, and that's when a high school student, I know he was a high school student at the time, a guy named Robert Kern, took that package and made a Windows installer. And then of course a massive increase of usage.

S2

Speaker 2

33:12

So by the way, most of this development was under Linux.

S3

Speaker 3

33:15

Yes, yes, it was on Linux. I was a Linux developer doing it on a Unix box. I mean, at the time I was actually getting into, I had a new hard drive, did some kernel programming to make the hard drive work.

S3

Speaker 3

33:26

I mean, not programming, but modification to the kernel so I could actually get a hard drive working. I love that aspect of it. I was also, at school, I was building a cluster. I took Mac computers and you put Yellow Dog Linux on them.

S3

Speaker 3

33:40

At the Mayo Clinic, they were just, all these Macs that were older, they were just getting rid of, and so I kind of got permission to go grab them together. I put about 24 of them together in a cluster, in a cabinet, and put Yellow Dog Linux on them all, and I wrote a C++ program to do MRI simulation. That was what I was doing at the same time for my day job, so to speak. So I was loving the whole process.

S3

Speaker 3

34:03

At the same time, I was, oh, I need an ordinary differential equation solver. That's why ordinary differential equations were key, was because that's the heart of a Bloch equation for simulating MRI, is an ODE solver. And so that's, but I actually did that, those happened at the same time. That's why, you know, kind of what you're working on and what you're interested in, they're coinciding.
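
[Editor's illustration, not part of the conversation] The link between MRI simulation and an ODE solver mentioned above can be sketched in a few lines. This is a minimal sketch only: it assumes a static field along z, uses a hand-written classical Runge-Kutta (RK4) step rather than the Fortran routines discussed in the interview, and every name and parameter value (`bloch_rhs`, `rk4_step`, `simulate`, `gamma_b0`, `t1`, `t2`) is hypothetical.

```python
import math


def bloch_rhs(m, gamma_b0, t1, t2, m0=1.0):
    """Right-hand side of the Bloch equations for a static field along z.

    m is the magnetization (mx, my, mz); gamma_b0 is the precession rate in
    rad/s; t1 and t2 are the longitudinal and transverse relaxation times.
    """
    mx, my, mz = m
    return (
        gamma_b0 * my - mx / t2,   # precession plus transverse decay
        -gamma_b0 * mx - my / t2,  # precession plus transverse decay
        (m0 - mz) / t1,            # longitudinal recovery toward m0
    )


def rk4_step(f, y, dt):
    """Advance y by one classical fourth-order Runge-Kutta step of size dt."""
    k1 = f(y)
    k2 = f(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k1)))
    k3 = f(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k2)))
    k4 = f(tuple(yi + dt * ki for yi, ki in zip(y, k3)))
    return tuple(
        yi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    )


def simulate(m_start, gamma_b0, t1, t2, dt, steps):
    """Integrate the Bloch equations for `steps` steps of size `dt`."""
    def rhs(m):
        return bloch_rhs(m, gamma_b0, t1, t2)

    m = m_start
    for _ in range(steps):
        m = rk4_step(rhs, m, dt)
    return m


# Magnetization tipped fully into the transverse plane, then left to relax.
final_m = simulate((1.0, 0.0, 0.0), gamma_b0=2 * math.pi * 10.0,
                   t1=1.0, t2=0.1, dt=1e-4, steps=1000)
```

In this simplified case the transverse magnitude should decay as exp(-t/T2) while mz recovers as 1 - exp(-t/T1), which gives an easy check on the integrator. A production code would more likely call a tested library solver than hand-roll the stepping, which is the point being made in the interview.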

S3

Speaker 3

34:20

I was definitely scratching my own itch

S2

Speaker 2

34:22

in

S3

Speaker 3

34:22

terms of building stuff, and which helped in the sense that I was using it for me, so at least I had 1 user. I had 1 person who was like, well, no, this is better. I like this interface better.

S3

Speaker 3

34:31

I had the experience of MATLAB to guide some of what those APIs might look like. But you're just doing yourself. You're building all this stuff. But with the Windows installer, it was the first time I realized, oh, yeah, the binary installer really helps people.

S3

Speaker 3

34:43

And so that led to spending more time on that side of things. So around 2000, so I graduated my PhD in

S1

Speaker 1

34:51

2000,

S3

Speaker 3

34:53

end of 2000. So 99 doing a lot of work there, 98 doing a lot of work there, 99 kind of spending more time on my PhD, you know, helping people use the tools, thinking about where do I want to go from here. There was a company, there was a guy actually, Eric Jones and Travis Vaught, they were two friends who founded a company called Enthought, it's here in Austin, still here.

S3

Speaker 3

35:13

And they, Eric contacted me at the time when I was a graduate student still. And he said, hey, why don't you come down? We want to build a company. We're thinking of a scientific company, and we want to take what you're doing and kind of add it to some stuff that he'd done.

S3

Speaker 3

35:29

He'd written some tools. And then Pearu Peterson had done f2py, let's come together and build, pull this all together and call it SciPy. So that's the origin of the SciPy brand. It came from, you know, Multipack and a whole bunch of modules I'd written, plus a few things from some other folks, and then pulled together in a single installer.

S3

Speaker 3

35:47

SciPy was really a distribution of Python masquerading as a library.

S2

Speaker 2

35:51

How did you think about SciPy in context of Python, in context of Numeric? Like what? So we

S3

Speaker 3

35:56

saw SciPy as a way to make an R&D environment for Python, like use Python, depended on Numeric. So Numeric was the array library we depended on. And then from there, extend it with a bunch of modules that allowed for, and at the time, the original vision of SciPy was to have plotting, was to have, you know, REPL environment, and kind of a whole, really a whole data environment that you could then install and get going with.

S3

Speaker 3

36:20

And that was kind of the thinking. It didn't really evolve that way, right? It sort of had a, but one, it's really hard to do massive scale projects with open source collectives. Actually, there's sort of an intrinsic cooperation limit as to which, you know, too many cooks in the kitchen, you know, you can do amazing infrastructure work.

S3

Speaker 3

36:42

When it comes down to bringing it all together into a single deliverable, that actually requires a little more, a little more product management that doesn't really emerge from the same dynamic. So it's struggle, struggle to get, almost too many voices, it's hard to have everybody agree, consensus doesn't really work at that scale. You end up with politics, with the same kind of things that's happening in large organizations trying to decide on what to do together. So consensus building was still, was challenging at scale as more people came in, right?

S3

Speaker 3

37:13

Early on it's fine because there's nobody there, and so it works. But then as you get more successful and more people use it, all of a sudden, oh, there's this scale at which this doesn't work anymore and we have to come up with different approaches. So SciPy came out officially in 2001, was the first release. Most of the time, I remember the days of getting that release ready.

S3

Speaker 3

37:30

It was a Windows installer and there were bugs on how the Windows compiler handled complex numbers, and you were chasing segmentation faults. And it's a lot of work. There was a lot of effort that had nothing to do with my area of study. And at the same time, I had just gotten an offer.

S3

Speaker 3

37:47

So he wondered if I wanted to come down and help him start that company with his friend. And at the time, I was like, I was intrigued, but I was pursuing a path, an academic path. And I had just got an offer to go and teach at my alma mater. So I took that tenure track position.

S3

Speaker 3

38:02

And SciPy, and kind of, then I started working on SciPy as a professor too. So that's, I left, I got to Mayo Clinic, graduated, wrote my thesis using SciPy, wrote, you know, there's images that were created. Now the plotting tool I used was something from Yorick actually. It was a plotting PLT, kind of a plotting language that I used.

S3

Speaker 3

38:22

Yorick is a programming language? It was a programming language, it had a plotting tool, DISLIN, it had integration to DISLIN. I ended up using DISLIN plus some of the plotting from Yorick linked to from Python. Anyway, it was, people don't plot that way now, but this was before, and SciPy was trying to add plotting.

S2

Speaker 2

38:40

Yeah.

S3

Speaker 3

38:40

Right? It didn't have much success. Really, the success of plotting came from John Hunter, who had a similar experience to my experience, my kind of maverick experience as a person just trying to get stuff done and kind of having more time than money maybe, right?

S2

Speaker 2

38:53

And John Hunter created what? Matplotlib. He's the creator of Matplotlib?

S3

Speaker 3

38:57

Yeah, so John Hunter was, you know, he wasn't a student at the time, but he was an actor, he was working in quant field, and he said, we need better plotting. So he just went out and said, cool, I'll make a new project, and we'll call it Matplotlib, and he released it in 2001, about the same time that SciPy came out. And it was separate library, separate install, used numeric, SciPy used numeric.

S3

Speaker 3

39:15

And so SciPy, you know, in 2001 we released SciPy and then Enthought created a conference called SciPy, which brought people together to talk about the space. And that conference is still ongoing, it's one of the favorite conferences of a lot of people because it's, you know, it's changed over the years, but early on it was a collection of 50 people who care about, scientists mostly, practicing scientists who want to care about coding and doing it well and not using MATLAB. I remember being driven by, I like MATLAB, but I didn't like the fact that, so I'm not opposed to proprietary software. I'm actually not an open source zealot.

S3

Speaker 3

39:50

I love open source for what it brings, but I also see the role for proprietary software. But what I didn't like was the fact that I would develop code and publish it, and then effectively telling somebody here, to run my code, you have to have this proprietary software.

S2

Speaker 2

40:02

Right, and there's also culture around MATLAB, as much, because I've talked to a few folks, MathWorks creates MATLAB. I mean, there's just a culture, they try really hard, but it just is this corporate IBM style culture that's like, or whatever, I don't want to say negative things about IBM

S3

Speaker 3

40:20

or whatever, but there's a... No, it's really that connection. It's something I'm in the middle of right now, is the business of open source, and how do you connect the ethos of cooperative development with the necessity of creating profits.

S3

Speaker 3

40:35

Right now today, I'm still in the middle of that. That's actually the early days of me exploring this question. Because I was writing SciPy, as an aside. I also had 3 kids at the time.

S3

Speaker 3

40:46

I have 6 kids now. I got married early, wanted a family. I had 3 kids and I remember reading, I remember, read Richard Stallman's post and I was, I was a fan of Stallman. I would read his work.

S3

Speaker 3

40:56

I liked these collective ideas he would have. Certainly the ideas on IP law, I read a lot of stuff, but then he said, you know, okay, well, I remember that. Well, how do I make money with this? How do I make a living?

S3

Speaker 3

41:06

How do I pay for my kids? All this stuff was in my mind. Young graduate student making no money, thinking I gotta get a job. And he said, well, you know, I think just be like me and don't have kids, right?

S3

Speaker 3

41:15

That's just don't. That's his take on it, that's his chat. That was what he said in that moment, right? That's the thing I read and I went, okay, this is a train I can't get on.

S2

Speaker 2

41:24

There has to be a way to preserve the culture of open source and still be able to make sufficient money to feed your kids.

S3

Speaker 3

41:30

Yes, exactly, there's gotta be. Well, so that actually led me to a study of economics because at the time I was ignorant and I really was. I'm actually, I'm embarrassed for the educational system that they could let me, and I was valedictorian in my high school class and I did super well in college.

S3

Speaker 3

41:44

Academically I did great, but the fact that I could do that and then be clueless about this key part of life, it led me to go, there's a problem. Like, I should have learned this in fifth grade. I should have learned this in eighth grade. Like, everybody should come out with a basic knowledge of economics.

S2

Speaker 2

42:01

You're an interesting example because you've created tools that changed the lives of probably millions of people and the fact that you didn't understand, at the time of the creation of those tools, the basic economics of how to build up a giant system is a problem.

S3

Speaker 3

42:15

Yeah, it's a problem. And so during my PhD at the same time, this is back in 98, 99, at the same time, I was in a library, I was reading books on capitalism, I was reading books on Marxism, I was reading books on, you know, what is this thing? What does it mean?

S3

Speaker 3

42:29

And I encountered a, basically, I encountered a set of writings from people that said they were the inheritors of Adam Smith.

S2

Speaker 2

42:35

Read

S3

Speaker 3

42:35

Adam Smith for the first time, which is The Wealth of Nations and this notion of emergent societies and realized, oh, there's this whole world out here of people. The challenge of economics is also political. Because economics, people, different parties running for office, they want their economic friends.

S3

Speaker 3

42:57

They want their economists to back them up, right? Or to be their magicians, like the magicians in Pharaoh's court, right? The people that are going to say, hey, you should listen to me because I've got the expert who says this. And so it gets really muddled, right?

S3

Speaker 3

43:11

But I was looking at it as a scientist going, what is this space? What does this mean? How does Paris get fed? What is money?

S3

Speaker 3

43:18

How does it work? I found a lot of writings I really loved. I found some things that I really loved, and I learned from that. It was writings from people like Von Mises.

S3

Speaker 3

43:26

He wrote a paper in 1920 that still should be read more than it is. It was the economic calculation problem of the socialist commonwealth. It was basically in response to the Bolshevik Revolution in 1917. And his basic argument was, it's not going to work to not have private property.

S3

Speaker 3

43:41

You're not going to be able to come up with prices. The bureaucrats aren't going to be able to determine how to allocate resources without a price system. And a price system emerges from people making trades. And they can only make trades if they have authority over the thing they're trading.

S3

Speaker 3

43:55

And that creates information flow that you just don't have if you try to top down it. Right. It's like, huh, that's a really good point.

S2

Speaker 2

44:04

Yeah, the prices have a signal that's used, and it's important to have that signal when you're trying to build a community of productive people like you would in the software engineering space.

S3

Speaker 3

44:13

Yeah, the prices are actually an important signaling mechanism, right? And that money is just a bartering tool,

S2

Speaker 2

44:20

right?

S3

Speaker 3

44:20

So this is the first time I've encountered any of this concept, right? And the fact that, oh, this is actually really critical. Like it's so critical to our prosperity and that we're dangerously not learning about this, not teaching our children about this.

S3

Speaker 3

44:35

So you

S2

Speaker 2

44:36

had the 3 kids, you had to make some hard decisions.

S3

Speaker 3

44:37

Had to make some money, right, had to figure it out. But I didn't really care. I mean, I've never been driven by money, just need it,

S2

Speaker 2

44:43

in fact. Right, to eat. So how did that resolve itself in terms of SciPy?

S3

Speaker 3

44:48

So I would say it didn't really resolve itself. It sort of started a journey that I'm continuing on. I'm still on, I would say.

S3

Speaker 3

44:54

I don't think it resolved itself. But I will say I went in eyes wide open. Like I knew that there were problems with giving stuff away and creating the market externalities, that the fact that, yeah, people might use it and I might not get paid for it and I'll have to figure something else out to get paid. Like at least I can say I'm not bitter that a lot of people have used stuff that I've written and I haven't necessarily benefited economically from it.

S3

Speaker 3

45:20

I've heard other people be bitter about that when they write or they talk, like, oh, I should've got more value out of this. And I'm also, I want to create systems that let people like me, who might have these desires to do things, let them benefit. So it actually creates more of the same.

S2

Speaker 2

45:34

Not to turn on your bitterness module, but there's some aspect, I wish there was mechanisms for me to reward whoever created SciPy and NumPy, because it brought so much joy to my life.

S3

Speaker 3

45:45

I appreciate that. You know

S2

Speaker 2

45:46

what I mean?

S3

Speaker 3

45:46

The tip jar notion was there. I appreciate that.

S2

Speaker 2

45:49

But there should be a very frictionless mechanism.

S3

Speaker 3

45:50

There should be a frictionless mechanism. I totally agree. I would love to talk about some of the ideas I have because I actually came across, I think I've come up with some interesting notions that could work, but they'll require, anything that will work takes time to emerge.

S3

Speaker 3

46:03

Things don't just turn overnight. That's definitely one thing I've also understood and learned is any fixes, that's why it's kind of funny, we often give credit to, oh, this president gets elected, and oh, look how great things have done. And I saw that when I had a transition at Anaconda when a new CEO came in, right? And it's like the success that's happening, there's an inertia there.

S2

Speaker 2

46:23

Yeah. Right? And sometimes the decision you made like 10 years before is the reason why the success is there.

S3

Speaker 3

46:28

Right, exactly. So we're sort of just running around taking credit for stuff.

S2

Speaker 2

46:32

The credit assignment has like a delay to it. Yes. That makes the credit assignment basically wrong more than right.

S3

Speaker 3

46:39

Wrong more than right, exactly. And so I'm like, oh, this is, you know, that's the stuff I would read a ton about, you know, early on. So I don't, I feel like I'm with you.

S3

Speaker 3

46:47

Like I want the same thing. I want to be able to, and honestly, not for personally, I've been happy. I've been, I've been happy. I feel like I don't have any, I mean, we've done reasonably okay,

S2

Speaker 2

46:55

but

S3

Speaker 3

46:55

I've had to pursue it. Like that's, that's really what started my trajectory from academia is reading that stuff. It led me to say, oh, entrepreneurship matters.

S3

Speaker 3

47:05

I love software, but we need more entrepreneurs and I wanna understand that better. So once I kind of had that virus infect my brain, even though I was on a trajectory to go to a tenure track position at a university and I was there for 6 years, I was kind of already out the door when I started. And we can get into that. What, can

S2

Speaker 2

47:28

I just ask a quick question on, is there some design principles that were in your mind around SciPy? Like, is there some key ideas that were just like sticking to you that this is the fundamental ideas? Yeah, I

S3

Speaker 3

47:40

would say so. I would think it's basically accessibility to scientists. Like give them, give scientists and engineers tools that they don't have to think a lot about programming.

S3

Speaker 3

47:48

So give them really good building blocks. Give them functions that they want to call and sort of just the right length of spelling. You know, there's a 1 tradition in programming where it's like, you know, make very, very long names, right? And you can see it in some programming languages where the names get, you know, take half the screen.

S3

Speaker 3

48:08

And in the Fortran world, characters would have to be 6 letters early on, right? And that's way too much, too little. But I was like, I liked to have names that were informative, but short.

S2

Speaker 2

48:18

So even though Python, well this is a different conversation, but documentation is doing some work there. So when you look at great scientific libraries and functions, there's a richness of documentation that helps you get into the details. The first glance at a function gives you the intuition of all it needs to do by looking at the headers and so on.

S2

Speaker 2

48:40

But to get the depths of all the complexities involved, all the options involved, documentation does some of the work.

S3

Speaker 3

48:45

Documentation is essential. Yeah, so that was actually, so we thought about several things. 1 is we wanted plotting, we wanted interactive environment, we wanted good documentation.

S3

Speaker 3

48:54

These were things we knew we wanted. The reality is those took about 10 years to evolve, right? Given the fact that we didn't have a big budget, it was all volunteer labor. It was sort of, when Enthought got created and they started to try to find projects, people would pay for pieces and they were able to fund some of it, not nearly enough to keep up with what was necessary.

S3

Speaker 3

49:15

And I'm no, no criticism, just simply the reality. I mean, it's hard to start a business and then do consulting and then also promote an open source project that's still fairly new. SciPy was fairly niche. We stayed connected all while I was a student, sorry, a professor.

S3

Speaker 3

49:30

I went to BYU and started to teach electrical engineering, all the applied math courses. I loved teaching signal processing, probability theory, electromagnetism, I was, if you look at Rate My Professor, which my kids love to do, I wasn't, I got some bad reviews because people.

S2

Speaker 2

49:46

What was the criticism?

S3

Speaker 3

49:48

I would speak too high of a level. Like I definitely had a calibration problem coming out of graduate work where I hate to be condescending to people. Like I really have a ton of respect for people fundamentally.

S3

Speaker 3

49:59

Like my fundamental thing is I respect people. Sometimes that can lead to a, I was thinking they were, they had more knowledge than they did. And so I would just speak at a very high level, assume they got it.

S2

Speaker 2

50:10

But they need to rise to the standard that you set. I mean, that's 1 of the, Some of the greatest teachers do that.

S3

Speaker 3

50:16

And I agree. And that was kind of what was inspiring me. But you also have to, I cannot say I was an articulate of some of the greatest teachers.

S3

Speaker 3

50:25

Right? Like 1 classic example, when I first taught at BYU, My very first class, it was overheads, transparencies, overheads. Before projectors were really that common, I put transparencies, I'm writing my notes out. I go in, room's half dark, I just blaring through these transparencies.

S3

Speaker 3

50:42

Here it is, here it is, here it is. And I gave a quiz after 2 weeks, nobody knew anything. Nothing had gotten anywhere. And I realized, okay, this is not working.

S3

Speaker 3

50:54

So I put away the transparencies and I turned around and just started using the chalkboard. And what it did is it slowed me down. The chalkboard just slowed me down and gave people time to process and to think and then that made me focus. My writing wasn't great on the chalkboard, but I really loved that part of the teaching.

S3

Speaker 3

51:10

So that entered SciPy's world in terms of we always understood that there's a didactic aspect of SciPy, kind of how do you take the knowledge and then produce it? The challenge we had was the scope. Like ultimately SciPy was everything, right? And so 2001 when it first came out, people were starting to use it.

S3

Speaker 3

51:26

No, this is cool. This is a tool we actually use. At the same time, 2001 timeframe, there was a little bit of like the Hubble Space Telescope, the folks at Hubble started to say, hey, Python, we're going to use Python for processing images from Hubble. And so Perry Greenfield was a good friend and running that program.

S3

Speaker 3

51:42

And he had called me before I left for BYU and said, you know, we want to do this, but Numeric actually has some challenges in terms of, you know, the array doesn't have enough types. We need more operations. You know, broadcast needs to be a little more settled. They wanted record arrays.

S3

Speaker 3

51:57

They wanted, you know, record arrays are like a data frame, but a little bit different. But they wanted more structured data. So he had called me even early on then and he said, you know, would you want to work on something to make this work? And I said, yeah, I'm interested, but I'm going here and we'll see if I have time.

S3

Speaker 3

52:11

So in the meantime, while I was teaching and SciPy was emerging and I had a student, I was constantly, while I was teaching, trying to figure a way to fund this stuff. So I had a graduate student, my only graduate student, a Chinese fellow, Lou Hongsa is his name, great guy. He wrote a bunch of stuff for iterative linear algebra, like got into writing some of the iterative linear algebra tools that are currently there in SciPy, and they've gotten better since. But this is in 2005, kept working on SciPy, but Perry had started working on a replacement to Numeric called Numarray.

S2

Speaker 2

52:45

And in

S1

Speaker 1

52:45

2004,

S3

Speaker 3

52:46

a package called NDImage, it was an image processing library

S2

Speaker 2

52:50

that

S3

Speaker 3

52:50

was written for NumArray. And it had in it a morphology tool. I don't know if you know what morphology is.

S3

Speaker 3

52:56

It's opening, dilation, you know, there was sort of this. As a medical imaging student, I knew what it was because it was used in segmentation a lot. And in fact, I had wanted to do something like that in Python, in SciPy, but just had never gotten around to it. So when it came out that it worked only on Numarray, and SciPy needed Numeric, and so we effectively had the beginning of this split,
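
[Editor's illustration, not part of the conversation] For readers unfamiliar with the morphology operations mentioned above, here is a minimal pure-Python sketch of binary erosion, dilation, and opening with a 3x3 square structuring element. It is not the image processing library discussed in the interview; the function names and the treatment of out-of-bounds pixels as background are assumptions made for the example.

```python
def _window(image, r, c):
    """Return the 3x3 neighborhood of (r, c), treating out-of-bounds as 0."""
    rows, cols = len(image), len(image[0])
    values = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < rows and 0 <= cc < cols
            values.append(image[rr][cc] if inside else 0)
    return values


def erode(image):
    """A pixel survives only if its whole 3x3 neighborhood is foreground."""
    return [
        [1 if all(_window(image, r, c)) else 0 for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def dilate(image):
    """A pixel becomes foreground if any pixel in its 3x3 neighborhood is."""
    return [
        [1 if any(_window(image, r, c)) else 0 for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def opening(image):
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(image))


# A 3x3 blob plus one isolated noise pixel at row 2, column 5.
noisy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
cleaned = opening(noisy)
```

Opening keeps the blob and drops the isolated pixel, which is roughly why the operation shows up in segmentation pipelines: small false detections vanish while larger regions keep their shape.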

S2

Speaker 2

53:19

and

S3

Speaker 3

53:19

Numeric and Numarray didn't share data. They were just two, so you could have a gigabyte of Numeric data and a gigabyte of Numarray data, and they wouldn't share it.

S2

Speaker 2

53:27

And

S3

Speaker 3

53:27

so you had these scientific libraries written on top. I got really bugged by that. Yeah.

S2

Speaker 2

53:32

I

S3

Speaker 3

53:32

got really like, oh man, this is not good. We're not cooperating now. We're sort of redoing each other's work and we're just this young community.

S3

Speaker 3

53:40

So that's what led me, even though I knew it was risky because I was on a tenure track position, 2004 I got reviewed. They said, hey, things are going okay, you're doing well, paper's coming out, but you're kind of spending a lot of time on this open source stuff, maybe do a little less of that, and a little more of the paper writing and grant writing, which was naive, but it was definitely the thinking. Still goes on. Still goes on.

S2

Speaker 2

54:02

You're basically creating a thing which enables science in the 21st century. Right. Maybe don't emphasize that so much in your 4 year tenure.

S3

Speaker 3

54:12

Right. It illustrates some of the challenges. Yes. It does, and it's, people mean well, but we've gotten broken in a bunch of ways.

S2

Speaker 2

54:22

Certain things, programming, understanding the role of software engineering and programming in society is a little bit lagging, yes.

S3

Speaker 3

54:28

Now, I was in an electrical engineering position.

S2

Speaker 2

54:29

Right, that's even worse there.

S3

Speaker 3

54:32

Yeah, they were very focused. And so, good people, and I had a great time. I loved my time, I loved my teaching, I loved all the things I did there. The problem was this split was happening in this community that I loved.

S3

Speaker 3

54:43

I saw people and I went, oh my gosh, this is gonna be, this is not great. And so, I happened, you know, fate, I had a class I had signed up for, I was trying to build an MRI system. So I had a kind of a radio, instead of a radio, a digital radio class, it was a digital MRI class. And I had people sign up, 2 people signed up, then they dropped, and so I had nobody in this class.

S3

Speaker 3

55:07

And I didn't have any other courses to teach, and I thought, oh, I've got some time, and I'll just write a merger of Numeric and Numarray. Like, I'll basically take the Numeric code base, add the features Numarray was adding, and then kind of come up with a single array library that everybody can use. So that's where NumPy came from, was my thinking, hey, I can do this, and who else is going to? Because at that point, I'd been around the community long enough, and I'd written enough C code, I knew the structures.

S3

Speaker 3

55:33

In fact, my first contribution to Numeric had been writing the C API documentation that went in the first documentation for NumPy, for Numeric, sorry. This is Paul Dubois, David Ascher, Konrad Hinsen, and myself. I got credit because I wrote this chapter, which is all the C API of Numeric, all the C stuff. So I said, I'm probably the one to do it. Nobody else is gonna do this.

S3

Speaker 3

55:54

So it was sort of out of a sense of duty and passion, knowing that, I don't think my academic, I don't think the department here is gonna appreciate this, but it's the right thing to do.

S2

Speaker 2

56:07

Can we just linger on that moment? Because the importance of the way you thought and the action you took, I feel, is understated and is rare. And I would love to see so much more of it, because what happens as the tools become more popular, there's a split that happens. And it's a truly heroic and impactful action, in that early split, to step up.

S2

Speaker 2

56:34

And it's like great leaders throughout history, like, what is it, Braveheart, get on a horse and rally the troops, because I think that can make a big difference. We have TensorFlow versus PyTorch in the machine learning community. We have the same problem today.

S3

Speaker 3

56:49

Yeah, it's actually bigger.

S2

Speaker 2

56:52

I wonder if it's possible in the early days to rally the troops.

S3

Speaker 3

56:58

It is possible, especially in the early days. The longer it goes, the harder, right? And the more energy in the factions, the harder.

S3

Speaker 3

57:03

But in the early days, it is possible, and it's extremely helpful. And there's a willingness there, but the challenge is there's usually not a willingness to fund it. There's not a willingness to, you know, like I was literally walking into a field saying, I'm gonna do this and here I am, I have 5 kids at home now. Yeah.

S3

Speaker 3

57:23

Pressure builds. Sometimes my wife hears these stories and she's like, you did what? I thought we were gonna, I thought you were actually on a path to make sure we had resources and money. But again, there's an aspect.

S3

Speaker 3

57:36

I'm a very hopeful person. I'm an optimistic person by nature. I love people. I learned that about myself later on.

S3

Speaker 3

57:44

Part of my religious beliefs actually lead to that. And it's why I hold them dear, because it's actually how I feel about, it's what leads me to these attitudes, sort of this hopefulness and this sense of, yeah, it may not work out for me financially, or maybe, but that's not the ultimate gain. Like, that's a thing, but it's not, you know, that's not the scorecard for me. And so I just wanted to be helpful, and I knew, partly because of these SciPy conferences, because of the mailing list conversations, I knew there was a lot of need for this, right?

S3

Speaker 3

58:13

And so I had this, it wasn't like I was alone in terms of no feedback. I had these people who knew, but it was crazy. Like people who at the time said, yeah, we didn't think you'd be able to do it. We thought it was crazy.

S2

Speaker 2

58:22

And also instructive, like practically speaking, that you had a cool feature that you were chasing, the morphology, like the- Yes. Like it's not just like-

S3

Speaker 3

58:32

There's an end result.

S2

Speaker 2

58:33

It's not some visionary thing, I'm going to unite the community. You were like, you were actually practically, this is what 1 person actually could do and actually build.

S3

Speaker 3

58:43

Because that is important. Because you can get over your skis. You can definitely get over your skis.

S3

Speaker 3

58:48

And I had, in fact, this almost got me over my skis, right? I would say, well, in retrospect, I hate looking back. We can, I can tell you all the flaws with NumPy, right? When I go into it, I would, there's lots of stuff that I'm like, oh man, that's embarrassing.

S3

Speaker 3

59:01

That was wrong. I wish I had somebody slap me with a wet fish there.

S3

Speaker 3

59:04

Like, I needed, what I wished I'd had was somebody with more experience, certainly in library writing and array libraries. It's like, I wish I had me. I could go back in time and go, do this, do that.

S3

Speaker 3

59:14

There's a Morton Bean. There's things we did that are still there that are problematic, that created challenges for later. And I didn't know it at the time, didn't understand how important that was. And in many cases, didn't know what to do.

S3

Speaker 3

59:26

Like there was pieces of the design of NumPy, I didn't know what to do until 5 years ago. Now I know what they should have been, but I didn't know at the time and I couldn't get the help. Anyway, so I wrote it. It took about, it took 4 months to write the first version, then about 14 months to make it usable.

S3

Speaker 3

59:43

But it was that first 4 months of intense writing, coding, getting something out the door that worked. That was, it was definitely challenging. And then the big thing I did was create a new type object called dtype. That was probably the contribution.
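
As a rough illustration of what a dtype object is in present-day NumPy: it is a Python object that describes how the bytes of each array element are laid out, and it can describe compound records as well as plain numbers. The record layout and field names below are hypothetical, chosen only for this sketch.

```python
import numpy as np

# Hypothetical record layout: two integer coordinates and one float reading.
pixel = np.dtype([("x", np.int32), ("y", np.int32), ("intensity", np.float32)])

print(isinstance(pixel, np.dtype))  # True: the description is itself an object
print(pixel.names)                  # ('x', 'y', 'intensity')
print(pixel.itemsize)               # 12 bytes per record (4 + 4 + 4)

# One array, one contiguous buffer, three named fields per element.
samples = np.zeros(3, dtype=pixel)
samples["x"] = [10, 20, 30]
samples["intensity"] = [0.5, 1.0, 0.25]

# Fields can be used like ordinary arrays, for example to filter records.
bright = samples[samples["intensity"] > 0.4]
print(bright["x"].tolist())         # [10, 20]
```
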

S3

Speaker 3

59:58

And then the fact that I added the.