1 hour 44 minutes 10 seconds
Speaker 1
00:00
The following is a conversation with Jeremy Howard. He's the founder of fast.ai, a research institute dedicated to making deep learning more accessible. He's also a distinguished research scientist at the University of San Francisco, a former president of Kaggle, as well as a top-ranking competitor there. And in general, he's a successful entrepreneur, educator, researcher, and inspiring personality in the AI community. When someone asks me how to get started with deep learning, fast.ai is one of the top places I point them to.
Speaker 1
00:33
It's free, it's easy to get started with, it's insightful and accessible, and, if I may say so, it has very little of the BS that can sometimes dilute the value of educational content on popular topics like deep learning. fast.ai has a focus on practical application of deep learning and hands-on exploration of the cutting edge that is both incredibly accessible to beginners and useful to experts.
Speaker 1
00:58
This is the Artificial Intelligence podcast. If you enjoy it, subscribe on YouTube, give it 5 stars on iTunes, support it on Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D-M-A-N. And now, here's my conversation with Jeremy Howard.
Speaker 2
01:18
What's the first program you ever wrote?
Speaker 3
01:21
First program I wrote that I remember would be at high school. I did an assignment where I decided to try to find out if there were some better musical scales than the normal 12-tone, 12-interval scale. So I wrote a program on my Commodore 64 in BASIC that searched through other scale sizes to see if it could find one where there were more accurate, you know, harmonies.
Speaker 3
01:51
Like mid-tone? Like you want an exact 3 to 2 ratio, whereas with a 12-interval scale, it's not exactly 3 to 2, for example.
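[A rough sketch of that kind of search, in modern Python rather than the original Commodore 64 BASIC; the range of scale sizes is arbitrary and purely illustrative:]

```python
import math

# Hypothetical reconstruction of the scale search described above (not the original BASIC code):
# for each number of equal divisions of the octave, find the scale step that best
# approximates a just perfect fifth (3:2) and report the error in cents.
target = 3 / 2
for n in range(5, 54):
    best_step = min(range(1, n), key=lambda k: abs(2 ** (k / n) - target))
    err_cents = abs(1200 * math.log2(2 ** (best_step / n) / target))
    print(f"{n:2d} divisions: step {best_step:2d}, error {err_cents:5.2f} cents")
# The familiar 12-interval scale lands about 2 cents away from a pure 3:2.
```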
Speaker 2
02:01
So that's well-tempered,
Speaker 3
02:03
as they say.
Speaker 2
02:05
And BASIC on a Commodore
Speaker 1
02:06
64.
Speaker 2
02:07
Yeah. What was the interest in music from? Or is it just...
Speaker 3
02:10
I did music all my life. So I played saxophone and clarinet and piano and guitar and drums and whatever.
Speaker 2
02:17
So how does that thread go through your life? Where's music today?
Speaker 3
02:24
It's not where I wish it was. I, for various reasons, couldn't really keep it going, particularly because I had a lot of problems with RSI with my fingers. And so I had to kind of like cut back anything that used hands and fingers.
Speaker 3
02:39
I hope one day I'll be able to get back to it, health-wise.
Speaker 2
02:43
So there's a love for music underlying
Speaker 3
02:45
it all. Sure, yeah.
Speaker 2
02:47
What's your favorite instrument?
Speaker 3
02:49
Saxophone. Sax. Baritone saxophone. Well, probably bass saxophone, but they're awkward.
Speaker 2
02:57
Well, I always love it when music is coupled with programming. There's something about a brain that utilizes both that emerges with creative ideas. So you've used and studied quite a few programming languages.
Speaker 2
03:11
Can you give an overview of what you've used? What are the pros and cons of each?
Speaker 3
03:17
My favorite programming environment almost certainly was Microsoft Access back in the earliest days. That was Visual Basic for Applications, which is not a good programming language, but the programming environment was fantastic. The ability to create, you know, user interfaces and tie data and actions to them and create reports and all that, I've never seen anything as good.
Speaker 3
03:46
There's things nowadays like Airtable, which are like small subsets of that, which people love for good reason, but unfortunately nobody's ever achieved anything like that.
Speaker 2
04:01
What is that? If you could pause on that for a second.
Speaker 3
04:03
Oh, Access.
Speaker 2
04:03
Access is it a database
Speaker 3
04:06
program that Microsoft produced, part of Office, and that kind of withered, you know, but basically it lets you in a totally graphical way create tables and relationships and queries and tie them to forms and set up, you know, event handlers and calculations. And it was a very complete, powerful system designed for not massive, scalable things, but for like useful little applications that I loved.
Speaker 2
04:36
So what's the connection between Excel and Access?
Speaker 3
04:40
So very close. So Access kind of was the relational database equivalent, if you like. So people still do a lot of the stuff that should be in Access in Excel.
Speaker 3
04:52
Excel is great as well. But it's just not as rich a programming model as VBA combined with a relational database. And so I've always loved relational databases, but today programming on top of a relational database is just a lot more of a headache. You know, you generally need something that runs some kind of database server, unless you use SQLite, which has its own issues.
Speaker 3
05:24
And often, if you want to get a nice programming model, you'll need to, like, add an ORM on top. And then, I don't know, there's all these pieces tied together, and it's just a lot more awkward than it should be. There are people that are trying to make it easier. So in particular, I think of F#, you know, Don Syme, who, with his team, has done a great job of making something like a database appear in the type system.
Speaker 3
05:51
So you actually get, like, tab completion for fields and tables and stuff like that. Anyway, so that whole VBA Office thing, I guess, was a starting point, which I still miss. And then I got into standard Visual Basic,
Speaker 2
06:07
which- That's interesting just to pause on that for a second. It's interesting that you're connecting programming languages to the ease of management of data. Yeah.
Speaker 2
06:18
So in your use of programming languages, you always had a love and a connection with data.
Speaker 3
06:24
I've always been interested in doing useful things for myself and for others, which generally means getting some data and doing something with it and putting it out there again. So that's been my interest throughout. So I also did a lot of stuff with AppleScript back in the early days.
Speaker 3
06:43
So it's kind of nice being able to get the computer and computers to talk to each other and to do things for you. And then I think that the programming language I most loved then would have been Delphi, which was Object Pascal, created by Anders Hejlsberg, who previously did Turbo Pascal and then went on to create .NET and then went on to create TypeScript. Delphi was amazing because it was a compiled, fast language that was as easy to use as Visual Basic.
Speaker 2
07:20
Delphi, what is it similar to in more modern languages?
Speaker 3
07:27
Visual Basic.
Speaker 2
07:28
Visual Basic.
Speaker 3
07:29
Yeah, but a compiled, fast version. So I'm not sure there's anything quite like it anymore. If you took, like, C# or Java and got rid of the virtual machine and replaced it with something where you could compile to a small, tight binary.
Speaker 3
07:46
I feel like it's where Swift could get to with the new SwiftUI and the cross-platform development going on. Like, that's one of my dreams, that we'll hopefully get back to where Delphi was. There is actually a Free Pascal project nowadays called Lazarus, which is also attempting to kind of recreate Delphi. So they're making good progress.
Speaker 2
08:16
So okay, Delphi, that's one of your favorite programming languages.
Speaker 3
08:20
Or at least programming environments. Again, I'd say Pascal's not a nice language. If you wanted to know specifically about what languages I like, I would definitely pick J as being an amazingly wonderful language.
Speaker 3
08:35
What's J? J, are you aware of APL? I am not,
Speaker 2
08:40
except from doing a little research on the work you've done.
Speaker 3
08:44
OK, so not at all surprising you're not familiar with it, because it's not well known, but it's actually one of the main families of programming languages, going back to the late 50s, early 60s. So there were a couple of major directions. One was the kind of lambda calculus, Alonzo Church direction, which I guess kind of Lisp
Speaker 2
09:07
and
Speaker 3
09:08
Scheme and whatever, which has a history going back to the early days of computing. The second was the kind of imperative direction, you know, ALGOL, Simula, going on to C, C++, and so forth. There was a third, which are called array-oriented languages, which started with a paper by a guy called Ken Iverson, which was actually a math theory paper, not a programming paper.
Speaker 3
09:37
It was called Notation as a Tool for Thought. And it was the development of a new way, a new type of math notation. And the idea is that this math notation was much more flexible, expressive, and also well-defined than traditional math notation, which is none of those things. Math notation is awful.
Speaker 3
09:59
And so he actually turned that into a programming language. And because this was the early 50s, or the, sorry, late 50s, all the names were available. So he called his language a programming language or APL.
Speaker 2
10:10
APL.
Speaker 3
10:11
So APL is an implementation of notation as a tool for thought, by which he means math notation. And Ken and his son went on to do many things, but eventually they actually produced, you know, a new language that was built on top of all the learnings of APL. And that was called J.
Speaker 3
10:30
And J is the most expressive, composable, you know, beautifully designed language I've ever seen.
Speaker 2
10:42
Does it have object oriented components? Does it have that kind of thing?
Speaker 3
10:45
Not really, it's an array oriented language. It's a new, it's the third path.
Speaker 2
10:51
Are you saying
Speaker 3
10:52
array? Array-oriented. Yeah.
Speaker 2
10:53
What does it mean to be array-oriented?
Speaker 3
10:55
So array-oriented means that you generally don't use any loops, but the whole thing is done with kind of an extreme version of broadcasting, if you're familiar with that NumPy slash Python concept. So you do a lot with one line of code. It looks a lot like math notation.
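[To make the broadcasting idea concrete, a minimal NumPy sketch, not from the conversation itself: the per-column normalization and the all-pairs distance matrix are each a single loop-free expression, the style that J and APL take to the extreme:]

```python
import numpy as np

x = np.random.rand(1000, 3)

# Normalize every column in one expression: the (3,) mean and std broadcast
# against the (1000, 3) array with no explicit loop.
xn = (x - x.mean(axis=0)) / x.std(axis=0)

# All-pairs squared distances, again without loops: (1000, 1, 3) minus
# (1, 1000, 3) broadcasts to (1000, 1000, 3), then we sum over the last axis.
d2 = ((xn[:, None, :] - xn[None, :, :]) ** 2).sum(axis=-1)
print(d2.shape)  # (1000, 1000)
```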
Speaker 3
11:17
So it's basically... Highly compact. Mm-hmm. And the idea is that, because you can do so much with one line of code, you very rarely need more than a single screen of code to express your program.
Speaker 3
11:30
And so you can kind of keep it all in your head and you can kind of clearly communicate it. It's interesting that APL created 2 main branches, K and J. J is this kind of like open source niche community of crazy enthusiasts like me. And then the other path, K, was fascinating.
Speaker 3
11:52
It's an astonishingly expensive programming language, which many of the world's most ludicrously rich hedge funds use. So the entire K machine is so small it sits inside level 3 cache on your CPU and it easily wins every benchmark I've ever seen in terms of data processing speed. But you don't come across it very much because it's like $100,000 per CPU to run it. It's like this path of programming languages is just so much, I don't know, so much more powerful in every way than the ones that almost anybody uses every day.
Speaker 2
12:33
So it's all about computation. It's really focusing.
Speaker 3
12:38
It's pretty heavily focused on computation. I mean, so much of programming is data processing by definition. And so there's a lot of things you can do with it, but, yeah, there's not much work being done on making, like, user interface toolkits or whatever.
Speaker 3
12:56
I mean, there's some, but they're not great.
Speaker 2
12:59
At the same time, you've done a lot of stuff with Perl and Python. Yeah. So where does that fit into the picture of J and K and APL?
Speaker 3
13:08
Well, it's just much more pragmatic. Like in the end, you have to end up where the libraries are. Because to me, my focus is on productivity.
Speaker 3
13:21
I just want to get stuff done and solve problems. So Perl was great. I created an email company called FastMail, and Perl was great because, back in the late 90s, early 2000s, it just had a lot of stuff it could do. I still had to write my own monitoring system and my own web framework, my own whatever, because none of that stuff existed.
Speaker 3
13:45
But it was a super flexible language to do that in.
Speaker 2
13:50
And you used Perl for FastMail, you used it as a backend? So everything was written in Perl?
Speaker 3
13:56
Yeah. Yeah. Everything, everything was Perl.
Speaker 2
13:58
Why do you think Perl hasn't succeeded or hasn't dominated the market where Python really takes over a lot of
Speaker 3
14:07
the same tasks? Yeah. Well, I mean, Perl did dominate.
Speaker 3
14:09
It was everything, everywhere. But then the guy that ran Perl, Larry Wall, kind of just didn't put the time in anymore. And no project can be successful if there isn't, you know, particularly 1 that started with a strong leader that loses that strong leadership. So then Python has kind of replaced it.
Speaker 3
14:37
You know, Python is a lot less elegant language in nearly every way, but it has the data science libraries and a lot of them are pretty great. So I kind of use it because it's the best we have, but it's definitely not good enough.
Speaker 2
15:01
But what do you think the future of programming looks like? What do you hope the future of programming looks like? If we zoom in on the computational fields, on data science, on machine learning?
Speaker 3
15:11
I hope Swift is successful. Because the goal of Swift, the way Chris Lattner describes it, is to be infinitely hackable. And that's what I want.
Speaker 3
15:23
I want something where me and the people I do research with and my students can look at and change everything from top to bottom. There's nothing mysterious and magical and inaccessible. Unfortunately, with Python, it's the opposite of that, because Python's so slow, it's extremely unhackable. You get to a point where it's like, okay, from here on down, it's C.
Speaker 3
15:45
So your debugger doesn't work in the same way, your profiler doesn't work in the same way, your build system doesn't work in the same way. It's really not very hackable at all.
Speaker 2
15:53
What's the part you like to be hackable? Is it for the objective of optimizing training of neural networks, inference of neural networks, is it performance of the system? Or is there some non-performance related just
Speaker 3
16:07
creative idea? It's everything. I mean, in the end, I want to be productive as a practitioner.
Speaker 3
16:13
So that means that, so like at the moment, our understanding of deep learning is incredibly primitive. There's very little we understand. Most things don't work very well, even though it works better than anything else out there. There's so many opportunities to make it better.
Speaker 3
16:28
So you look at any domain area, like, I don't know, speech recognition with deep learning or natural language processing classification with deep learning or whatever. Every time I look at an area with deep learning, I always see like, oh, it's terrible. There's lots and lots of obviously stupid ways to do things that need to be fixed. So then I want to be able to jump in there and quickly experiment and make them better.
Speaker 2
16:54
You think the programming language has a role in that? Huge role.
Speaker 3
17:00
Yeah. So currently Python has a big gap in terms of our ability to innovate, particularly around recurrent neural networks and natural language processing, because it's so slow. The actual loop where we actually loop through words, we have to do that whole thing in CUDA C. So we actually can't innovate with the kernel, the heart of that most important algorithm.
Speaker 3
17:31
And it's just a huge problem. And this happens all over the place. So we hit, you know, research limitations. Another example, convolutional neural networks, which are actually the most popular architecture for lots of things, maybe most things in deep learning.
Speaker 3
17:48
We almost certainly should be using sparse convolutional neural networks. But only, like, 2 people are, because to do it you have to rewrite all of that CUDA C-level stuff. And, yeah, researchers and practitioners just don't. So there's just big gaps in what people actually research on and what people actually implement, because of the programming language problem.
Speaker 2
18:13
So you think it's just too difficult to write in CUDA C, and a higher-level programming language like Swift should enable easier fooling around, creative stuff, with RNNs or with sparse convolutional networks. Kind of. Who's at fault?
Speaker 2
18:38
Who's in charge of making it easy for a researcher to play around?
Speaker 3
18:42
I mean, no one's at fault. It's just nobody's got around to it yet. Or it's just, it's hard.
Speaker 3
18:46
Right. And I mean, part of the fault is that we ignored that whole APL kind of direction. Most or nearly everybody did for 60 years, 50 years. But recently people have been starting to reinvent pieces of that and kind of create some interesting new directions in the compiler technology.
Speaker 3
19:07
So the place where that's particularly happening right now is something called MLIR, which is something that, again, Chris Lattner, the Swift guy, is leading. And, yeah, because it's actually not going to be Swift on its own that solves this problem. Because the problem is that currently writing an acceptably fast, you know, GPU program is too complicated, regardless of what language you use. And that's just because you have to deal with the fact that I've got, you know, 10,000 threads and I have to synchronize between them all and I have to put my thing into grid blocks and think about warps and all this stuff.
Speaker 3
19:47
It's just so much boilerplate that to do that well, you have to be a specialist at that. And it's going to be a year's work to, you know, optimize that algorithm in that way. But with things like Tensor Comprehensions and Tile and MLIR and TVM, there's all these various projects which are all about saying, let's let people create domain-specific languages for tensor computations. These are the kinds of things we do generally on the GPU for deep learning.
Speaker 3
20:21
And then have a compiler which can optimize that tensor computation. A lot of this work is actually sitting on top of a project called Halide, which is a mind-blowing project where they came up with such a domain-specific language. In fact, two: one domain-specific language for expressing, this is what my tensor computation is.
Speaker 3
20:43
And another domain-specific language for expressing this is the kind of the way I want you to structure the compilation of that, like do it block by block and do these bits in parallel. And they were able to show how you can compress the amount of code by 10x compared to optimized GPU code and get the same performance. So that's like, so these other things are kind of sitting on top of that kind of research and MLIR is pulling a lot of those best practices together. And now we're starting to see work done on making all of that directly accessible through Swift, so that I could use Swift to kind of write those domain-specific languages.
Speaker 3
21:25
And hopefully we'll get then Swift CUDA kernels written in a very expressive and concise way that looks a bit like J in APL. And then Swift layers on top of that, and then a Swift UI on top of that. And, you know, that'll be so nice if we can get to that point.
Speaker 2
21:42
Now, does it all eventually boil down to CUDA and NVIDIA GPUs?
Speaker 3
21:48
Unfortunately at the moment it does. But 1 of the nice things about MLIR, if AMD ever gets their act together, which they probably won't, is that they or others could write MLIR backends for other GPUs or other tensor computation devices, of which today there are increasing number like Graphcore or Vertex AI or whatever. So yeah, being able to target lots of backends would be another benefit of this.
Speaker 3
22:24
And the market really needs competition because at the moment, NVIDIA is massively overcharging for their kind of enterprise class cards because there is no serious competition because nobody else is doing the software properly.
Speaker 2
22:39
In the cloud there is some competition right?
Speaker 3
22:42
But not really other than TPUs perhaps. TPUs are almost unprogrammable at the moment.
Speaker 2
22:48
So you can't... The TPUs have the same problem, that you can't.
Speaker 3
22:51
It's even worse. So TPUs, Google actually made an explicit decision to make them almost entirely unprogrammable because they felt that there was too much IP in there. And if they gave people direct access to program them, people would learn their secrets.
Speaker 3
23:04
So you can't actually directly program the memory in a TPU. You can't even directly create code that runs on, and that you can look at on, the machine that has the TPU; it all goes through a virtual machine. So all you can really do is this kind of cookie-cutter thing of plugging high-level stuff together, which is just super tedious and annoying and totally unnecessary.
Speaker 2
23:33
So what was the, tell me if you could, the origin story of Fast.ai? What is the motivation, its mission, its dream?
Speaker 3
23:45
So I guess the founding story is heavily tied to my previous startup, which is a company called Enlitic, which was the first company to focus on deep learning for medicine. And I created that because I saw there was a huge opportunity there: there's about a 10x shortage in the number of doctors in the developing world relative to what we need. It was expected to take about 300 years to train enough doctors to meet that gap, but I guessed that maybe if we used deep learning for some of the analytics, we could maybe make it so you don't need as highly trained doctors.
Speaker 2
24:27
For diagnosis?
Speaker 3
24:28
For diagnosis and treatment planning.
Speaker 2
24:29
Where's the biggest benefit, just before we get to fast AI, where's the biggest benefit of AI in medicine that you see today?
Speaker 3
24:39
Not much happening today in terms of like stuff that's actually out there. It's very early, but in terms of the opportunity, It's to take markets like India and China and Indonesia, which have big populations, Africa, small numbers of doctors, and provide diagnostic, particularly treatment planning and triage, kind of on device so that if you do a, you know, test for malaria or tuberculosis or whatever, you immediately get something that even a healthcare worker that's had a month of training can get a very high quality assessment of whether the patient might be at risk and tell, you know, okay, we'll send them off to a hospital. So for example, in Africa, outside of South Africa, there's only 5 pediatric radiologists for the entire continent.
Speaker 3
25:35
So most countries don't have any. So if your kid is sick and they need something diagnosed through medical imaging, the person, even if you're able to get medical imaging done, the person that looks at it will be, you know, a nurse at best. But actually in India, for example, and China, almost no x-rays are read by anybody, by any trained professional, because they don't have enough. So if instead we had an algorithm that could take the most likely high-risk 5% and say triage, basically say, okay, someone needs to look at this.
Speaker 3
26:13
It would massively change the kind of way that what's possible with medicine in the developing world. And remember, they have, increasingly, they have money. They're the developing world, they're not the poor world, they're developing world. So they have the money, so they're building the hospitals, they're getting the diagnostic equipment, but they just, there's no way for a very long time will they be able to have the expertise.
Speaker 2
26:38
Shortage of expertise. Okay, and that's where the deep learning systems can step in and magnify the expertise they do have.
Speaker 3
26:46
Exactly.
Speaker 2
26:46
Yeah. So you do see just to linger it a little bit longer, the interaction, do you still see the human experts still at the core of these systems? Is there something in medicine that could be automated almost completely?
Speaker 3
27:03
I don't see the point of even thinking about that, because we have such a shortage of people. Why would we want to find a way not to use them? Right? Like, we have people. So even from an economic point of view, if you can make them 10x more productive, getting rid of the person doesn't impact your unit economics at all. And it totally ignores the fact that there are things people do better than machines.
Speaker 3
27:28
So it's just, to me, that's not a useful way of framing
Speaker 2
27:33
the problem. I guess, just to clarify, I guess I meant there may be some problems where you can avoid even going to the expert ever, sort of maybe preventative care or some basic stuff, low-hanging fruit, allowing the expert to focus on the things that are really that you know.
Speaker 3
27:51
Well, that's what the triage would do, right? So the triage would say, okay, is
Speaker 1
27:56
99%
Speaker 3
27:58
sure there's nothing here? So that can be done on device, and they can just say, okay, go home. So the experts are being used to look at the stuff which has some chance it's worth looking at, which most things, it's not, you know, it's fine.
Speaker 2
28:16
Why do you think we haven't quite made progress on that yet in terms of the scale of how much AI is applied in the medical field? Oh, there's
Speaker 3
28:27
a lot of reasons. I mean, one is it's pretty new. I only started Enlitic in, like, 2014.
Speaker 3
28:32
And before that, like, it's hard to express to what degree the medical world was not aware of the opportunities here. So I went to RSNA, which is the world's largest radiology conference, and I told everybody I could, you know, like, I'm doing this thing with deep learning, please come and check it out. And no one had any idea what I was talking about and no one had any interest in it. So, like, we've come from absolute zero, which is hard.
Speaker 3
29:05
And then the whole regulatory framework, education system, everything is just set up to think of doctoring in a very different way. So today there is a small number of people who are deep learning practitioners and doctors at the same time. And we're starting to see the first ones come out of their PhD programs. So Zak Kohane over in Boston, Cambridge, has a number of students now who are data science experts, deep learning experts, and actual medical doctors.
Speaker 3
29:46
Quite a few doctors have completed our fast.ai course now and are publishing papers and creating journal reading groups in the American College of Radiology. And, like, it's just starting to happen, but it's going to be a long process. The regulators have to learn how to regulate this. They have to build, you know, guidelines.
Speaker 3
30:08
And then the lawyers at hospitals have to develop a new way of understanding that sometimes it makes sense for data to be, you know, looked at in raw form in large quantities in order to create world-changing results.
Speaker 2
30:26
Yeah, so regulation around data, all that, it sounds, it's probably the hardest problem, but sounds reminiscent of autonomous vehicles as well. Many of the same regulatory challenges, many of the same data challenges.
Speaker 3
30:40
Yeah. I mean, funnily enough, the problem is less the regulation and more the interpretation of that regulation by lawyers in hospitals. So HIPAA was actually designed to, well, the P in HIPAA does not stand for privacy. It stands for portability.
Speaker 3
30:57
It's actually meant to be a way that data can be used. And it was created with lots of gray areas because the idea is that would be more practical and it would help people to use this legislation to actually share data in a more thoughtful way. Unfortunately, it's done the opposite because when a lawyer sees a gray area, they say, oh, if we don't know, we won't get sued, then we can't do it. So HIPAA is not exactly the problem.
Speaker 3
31:26
The problem is more that hospital lawyers are not incented to make bold decisions about data portability.
Speaker 2
31:36
Or even to embrace technology that saves lives. Right. They more want to not get in trouble for embracing that
Speaker 3
31:43
technology. Right. Also, it saves lives in a very abstract way, which is like, oh, we've been able to release these 100,000 anonymized records. I can't point to the specific person whose life that saved.
Speaker 3
31:55
I can say like, oh, we ended up with this paper which found this result, which, you know, diagnosed a thousand more people than we would have otherwise. But it's like, which ones were helped? It's very abstract.
Speaker 2
32:07
And on the counter side of that, you may be able to point to a life that was taken because of something that was...
Speaker 3
32:14
Yeah, or a person whose privacy was violated. It's like, oh, this specific person, you know, was re-identified. Re-identified.
Speaker 2
32:25
Just a fascinating topic. We're jumping around. We'll get back to fast.ai.
Speaker 2
32:29
But on the question of privacy: data is the fuel for so much innovation in deep learning. What's your sense on privacy, whether we're talking about Twitter, Facebook, YouTube, or technologies like those in the medical field that rely on people's data in order to create impact? How do we get that right, respecting people's privacy and yet creating technology that learns from data?
Speaker 3
33:03
One of my areas of focus is on doing more with less data. So most vendors, unfortunately, are strongly incented to find ways to require more data and more computation. So Google and IBM being the most obvious.
Speaker 3
33:24
IBM. Yeah. Sorry. So Watson, you know, so Google and IBM both strongly push the idea that you have to be, you know, that they have more data and more computation and more intelligent people than anybody else.
Speaker 3
33:37
And so you have to trust them to do things because nobody else can do it. And Google's very upfront about this. Like, Jeff Dean has gone out there and given talks and said, our goal is to require a thousand times more computation, but less people. Our goal is to use the people that you have better and the data you have better and the computation you have better. So one of the things that we've discovered, or at least highlighted, is that you very, very, very often don't need much data at all.
Speaker 3
34:13
And so the data you already have in your organization will be enough to get state-of-the-art results. So my starting point around privacy is that a lot of people are looking for ways to share data and aggregate data, but I think often that's unnecessary. They assume that they need more data than they do because they're not familiar with the basics of transfer learning, which is this critical technique for needing orders of magnitude less data.
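[A minimal sketch of that transfer learning point, not code from the interview; the data folder is a placeholder for a small in-house dataset. The idea is to reuse an ImageNet-pretrained backbone and train only a small new head:]

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Placeholder dataset: a folder of images arranged one subfolder per class.
tfms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train_ds = datasets.ImageFolder("data/train", transform=tfms)
train_dl = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)

model = models.resnet34(weights="IMAGENET1K_V1")  # pretrained=True on older torchvision
for p in model.parameters():                      # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(train_ds.classes))  # new head for our classes

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
model.train()
for xb, yb in train_dl:  # even a single pass over a small dataset goes a long way
    opt.zero_grad()
    loss_fn(model(xb), yb).backward()
    opt.step()
```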
Speaker 2
34:41
Is your sense, 1 reason you might want to collect data from everyone is like in the recommender system context, where your individual, Jeremy Howard's individual data is the most useful for providing a product that's impactful for you. So for giving you advertisements, for recommending to you movies, for doing medical diagnosis. Is your sense we can build with a small amount of data, general models that will have a huge impact for most people that we don't need to have data from each individual?
Speaker 3
35:19
On the whole I'd say yes. I mean there are things like you know recommender systems have this cold start problem where you know Jeremy is a new customer we haven't seen him before so we can't recommend him things based on what else he's bought and liked with us. And there's various workarounds to that.
Speaker 3
35:38
Like in a lot of music programs, we'll start out by saying, which of these artists do you like? Which of these albums do you like? Which of these songs do you like? Netflix used to do that.
Speaker 3
35:50
Nowadays, they tend not to. People kind of don't like that because they think, oh, we don't want to bother the user. So you could work around that by having some kind of data sharing where you get my marketing record from Acxiom or whatever and try to guesstimate. To me, the benefit to me and to society of saving me 5 minutes on answering some questions, versus the negative externalities of the privacy issue, doesn't add up.
Speaker 3
36:24
So I think like a lot of the time the places where people are invading our privacy in order to provide convenience is really about just trying to make them more money and, and they move these negative externalities to places that they don't have to pay for them. So, when you actually see regulations appear that actually cause the companies that create these negative externalities to have to pay for it themselves, they say, well, we can't do it anymore. So the cost is actually too high. But for something like medicine, yeah, I mean, the hospital has my, you know, medical imaging, my pathology studies, my medical records.
Speaker 3
37:09
And also, I own my medical data. So I help a startup called doc.ai. One of the things doc.ai does is there's an app that you can connect to, you know, Sutter Health and LabCorp and Walgreens, and download your medical data to your phone and then upload it again at your discretion to share it as you wish. So with that kind of approach, we can share our medical information with the people we want to.
Speaker 2
37:44
Yeah, so control. I mean, really being able to control who you share with and so on. Yeah.
Speaker 2
37:49
So that has a beautiful, interesting tangent, but to return back to the origin story of Fast.AI.
Speaker 3
37:59
Right. So before I started Fast.ai, I spent a year researching where are the biggest opportunities for deep learning. Because I knew from my time at Kaggle in particular that deep learning had kind of hit this threshold point where it was rapidly becoming the state-of-the-art approach in every area that looked at it. And I'd been working with neural nets for over 20 years.
Speaker 3
38:25
I knew that from a theoretical point of view, once it hit that point, it would do that in kind of just about every domain. And so I kind of spent a year researching what are the domains that's going to have the biggest low-hanging fruit in the shortest time period. I picked medicine, but there were so many I could have picked. And so there was a kind of level of frustration for me of like, okay, I'm really glad we've opened up the medical deep learning world and today it's huge, as you know, but we can't do, you know, I can't do everything.
Speaker 3
38:58
I don't even know, like, like in medicine, it took me a really long time to even get a sense of like what kind of problems do medical practitioners solve, what kind of data do they have, who has that data. So I kind of felt like I need to approach this differently if I want to maximize the positive impact of deep learning. Rather than me picking an area and trying to become good at it and building something, I should let people who are already domain experts in those areas and who already have the data do it themselves.
Speaker 2
39:29
So
Speaker 3
39:29
that was the reason for Fast.ai is to basically try and figure out how to get deep learning into the hands of people who could benefit from it and help them to do so in as quick and easy and effective a way as possible.
Speaker 2
39:47
Got it. So sort of empower the domain experts.
Speaker 3
39:50
Yeah. And like partly it's because like, unlike most people in this field, my background is very applied and industrial, like My first job was at McKinsey and Company. I spent 10 years in management consulting. I spent a lot of time with domain experts,
Speaker 2
40:09
you
Speaker 3
40:10
know, so I kind of respect them and appreciate them, and I know that's where the value generation in society is. And so I also know how most of them can't code and most of them don't have the time to invest, you know, 3 years in a graduate degree or whatever. So it's like, how do I upskill those domain experts?
Speaker 3
40:33
I think that would be a super powerful thing, you know, biggest societal impact I could have. So yeah, that was the thinking.
Speaker 2
40:42
So so much of fast AI students and researchers and the things you teach are pragmatically minded, practically minded, figuring out ways how to solve real problems and fast. So from your experience, what's the difference between theory and practice of deep learning?
Speaker 3
41:03
Well, most of the research in the deep learning world is a total waste of time.
Speaker 2
41:09
Right, that's what I was getting at.
Speaker 3
41:11
Yeah. It's a problem in science in general. Scientists need to be published, which means they need to work on things that their peers are extremely familiar with and can recognize an advance in that area. So that means that they all need to work on the same thing.
Speaker 3
41:30
And so it really, and the thing they work on, there's nothing to encourage them to work on things that are practically useful. So you get just a whole lot of research, which is minor advances and stuff that's been very highly studied and has no significant practical impact. Whereas the things that really make a difference, like I mentioned transfer learning, like if we can do better at transfer learning, then it's this like world-changing thing where suddenly like lots more people can do world-class work with less resources and less data. But almost nobody works on that.
Speaker 3
42:08
Or another example, active learning, which is the study of, like, how do we get more out of the human beings in the loop?
Speaker 2
42:15
That's my favorite topic.
Speaker 3
42:17
Yeah, so active learning's great, but there's almost nobody working on it because it's just not a trendy thing right now.
Speaker 2
42:23
You know what, sorry to interrupt, you're saying that nobody's publishing on active learning, but there's people inside companies, anybody who actually has to solve a problem, they're going to innovate on active learning.
Speaker 3
42:39
Yeah. Everybody kind of reinvents active learning when they actually have to work in practice because they start labeling things And they think, gosh, this is taking a long time and it's very expensive. And then they start thinking, well, why am I labeling everything? I'm only, the machine's only making mistakes on those 2 classes.
Speaker 3
42:55
They're the hard ones. Maybe I'll just start labeling those 2 classes. And then you start thinking, well, why did I do that manually? Why can't I just get the system to tell me which things are going to be hardest?
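[That workflow is essentially uncertainty sampling; a minimal sketch with a placeholder classifier and random stand-in data, purely to show the loop: fit on what's labeled, score the unlabeled pool by uncertainty, and send only the most uncertain examples to a human:]

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(50, 10))        # stand-in for the examples labeled so far
y_labeled = rng.integers(0, 2, size=50)
X_unlabeled = rng.normal(size=(5000, 10))    # stand-in for the big unlabeled pool

# 1. Fit on whatever has been labeled so far.
clf = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)

# 2. Score the unlabeled pool by predictive uncertainty (entropy here).
probs = clf.predict_proba(X_unlabeled)
entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)

# 3. Ask a human to label only the most uncertain examples, then repeat.
query_idx = np.argsort(entropy)[-100:]
print("label these next:", query_idx[:10])
```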
Speaker 3
43:04
It's an obvious thing to do, but, yeah, it's just like transfer learning. It's understudied, and the academic world just has no reason to care about practical results. The funny thing is, I've only really ever written one paper. I hate writing papers.
Speaker 3
43:21
And I didn't even write it. It was my colleague, Sebastian Ruder, who actually wrote it. I just did the research for it. But it was basically introducing transfer learning, successful transfer learning, to NLP for the first time.
Speaker 3
43:34
The algorithm is called ULMFiT. And I actually wrote it for the course, for the first day of the course. I wanted to teach people NLP, and I thought I only want to teach people practical stuff, and I think the only practical stuff is transfer learning. And I couldn't find any examples of transfer learning in NLP.
Speaker 3
43:53
So I just did it. And I was shocked to find that as soon as I did it, which, you know, the basic prototype took a couple of days, it smashed the state of the art on one of the most important datasets in a field that I knew nothing about. And I just thought, well, this is ridiculous. And so I spoke to Sebastian about it, and he kindly offered to write up the results.
Speaker 3
44:17
And so it ended up being published in ACL, which is the top computational linguistics conference. So people do actually care once you do it, but I guess it's difficult for, maybe, like, junior researchers. Like, I don't care whether I get citations or papers or whatever. There's nothing in my life that makes that important, which is why I've never actually bothered to write a paper myself. But for people who do, I guess they have to pick the kind of safe option, which is, like, yeah, make a slight improvement on something that everybody's already working on.
Speaker 2
44:55
Yeah. Nobody does anything interesting or succeeds in life with the safe option. Although, I
Speaker 3
45:01
mean, the nice thing is nowadays, everybody is now working on NLP transfer learning because since that time we've had GPT and GPT-2 and BERT and you know, it's like, it's so yeah, once you show that something's possible, everybody jumps in, I guess. So
Speaker 2
45:17
So I hope to be a part of it, and I hope to see more innovation in active learning in the same way. I think transfer learning and active learning
Speaker 3
45:24
are fascinating public open work. I actually helped start a startup called platform.ai, which is really all about active learning. And, yeah, it's been interesting trying to kind of see what research is out there and make the most of it.
Speaker 3
45:37
And there's basically none. So we've had to do all our own research.
Speaker 2
45:40
Once again, and just as you described, can you tell the story of the Stanford competition, DawnBench and FastAI's achievement on it?
Speaker 3
45:51
Sure. So something which I really enjoy is that I basically teach 2 courses a year: Practical Deep Learning for Coders, which is kind of the introductory course, and then Cutting Edge Deep Learning for Coders, which is the kind of research-level course. And while I teach those courses, I basically have a big office at the University of San Francisco, big enough for like 30 people, and I invite anybody, any student who wants to, to come and hang out with me while I build the course. And so generally it's full, and so we have 20 or 30 people in a big office with nothing to do but study deep learning.
Speaker 3
46:33
So it was during one of these times that somebody in the group said, oh, there's a thing called DawnBench that looks interesting. And I was like, what the hell is that? And they said it's some competition to see how quickly you can train a model. Seems kind of not exactly relevant to what we're doing, but it sounds like the kind of thing which you might be interested in.
Speaker 3
46:52
I checked it out and I was like, oh crap, there's only 10 days till it's over. It's pretty much too late. And we're kind of busy trying to teach this course. But we were like, oh, it would make an interesting case study for the course. Like, it's all the stuff we're already doing. Why don't we just put together our current best practices and ideas?
Speaker 3
47:12
So me and I guess about 4 students just decided to give it a go, and we focused on the small one called CIFAR-10, which is little 32 by 32 pixel images. Can
Speaker 2
47:24
you say what dimensions?
Speaker 3
47:25
Yeah, so it's a competition to train a model as fast as possible. It was run by Stanford, and
Speaker 2
47:30
as cheap as possible, too?
Speaker 3
47:32
That's also another one, for as cheap as possible. And there's a couple of categories, ImageNet and CIFAR-10. So ImageNet's this big 1.3-million-image thing that took a couple of days to train.
Speaker 3
47:45
I remember a friend of mine, Pete Warden, who's now at Google. I remember he told me how he trained ImageNet a few years ago, and he basically had this little granny flat out the back that he turned into his ImageNet training center. And he figured, you know, after like a year of work, he figured out how to train it in like 10 days or something. It's like, that was a big job.
Speaker 3
48:08
Well, CIFAR-10 at that time, you could train in a few hours. You know, it was much smaller and easier. So we thought we'd try CIFAR
Speaker 1
48:15
10.
Speaker 3
48:18
And, yeah, I'd really never done that before. Like, things like using more than one GPU at a time was something I tried to avoid. Because to me, it's very against the whole idea of accessibility; you should be able to do things with one GPU.
Speaker 2
48:34
I mean, have you asked in the past before, after having accomplished something, how do I do this faster, much faster?
Speaker 3
48:42
Oh, always. But it's always, for me, it's always, how do I make it much faster on a single GPU that a normal person could afford in their day-to-day life. It's not how could I do it faster by, you know, having a huge data center?
Speaker 3
48:55
Because to me, it's all about, like, as many people should be able to use something as possible without fussing around with infrastructure. So anyways, in this case, it's like, well, we can use 8 GPUs just by renting an AWS machine. So we thought we'd try that. And, yeah, basically using the stuff we were already doing, we were able to get the speed down, within a few days, to, I don't know, a very small number of minutes.
Speaker 3
49:25
I can't remember exactly how many minutes it was, but it might've been like 10 minutes or something. And so, yeah, we found ourselves at the top of the leaderboard easily for both time and money, which really shocked me because the other people competing in this were like Google and Intel and stuff, who I like know a lot more about this stuff than I think we do. So then we were emboldened. We thought, let's try the ImageNet 1 too.
Speaker 3
49:50
I mean, it seemed way out of our league, but our goal was to get under 12 hours. And we did, which was really exciting. But we didn't put anything up on the leaderboard; we were down to, like, 10 hours. But then Google put in something like 5 hours or something.
Speaker 3
50:09
And we're just like, oh, we're so screwed. But we kind of thought, we'll keep trying, you know, if Google can do it. In fact, I mean, Google did it in 5 hours on, like, a TPU pod or something, like a lot of hardware.
Speaker 3
50:24
But we kind of like had a bunch of ideas to try, like a really simple thing was, why are we using these big images? They're like
Speaker 1
50:31
224, 256
Speaker 3
50:33
by 256 pixels. You know, why don't we try smaller ones?
Speaker 2
50:37
And just to elaborate, there's a constraint on the accuracy that your trained model is supposed to achieve.
Speaker 3
50:42
Yeah, you got to achieve 93%, I think it was for ImageNet. Exactly.
Speaker 2
50:49
Which is very tough. So you have to.
Speaker 3
50:50
Yeah, 93 percent. Like they picked a good threshold. It was a little bit higher than what the most commonly used ResNet-50 model could achieve at that time.
Speaker 3
51:03
So yeah, so it's quite a difficult problem to solve. But yeah, we realized if we actually just use 64 by 64 images, it trained a pretty good model. And then we could take that same model and just give it a couple of epochs to learn
Speaker 1
51:19
224
Speaker 3
51:20
by 224 images. And it was basically already trained. Which makes a lot of sense.
Speaker 3
51:25
Like, if you teach somebody, here's what a dog looks like, and you show them low-res versions, and then you say, here's a really clear picture of a dog, they already know what a dog looks like. So, like, we just jumped to the front, and we ended up winning parts of that competition.
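[The trick described above is often called progressive resizing; a rough sketch with a placeholder image folder, not the actual DAWNBench code: do most of the training on cheap small images, then finish with a couple of epochs at full size.]

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

DATA = "data/train"  # placeholder: one subfolder of images per class

def loader(size, batch_size=64):
    # Same dataset, just decoded at a different resolution.
    tfms = transforms.Compose([transforms.Resize((size, size)), transforms.ToTensor()])
    ds = datasets.ImageFolder(DATA, transform=tfms)
    return torch.utils.data.DataLoader(ds, batch_size=batch_size, shuffle=True)

def run_epochs(model, dl, epochs, lr):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for xb, yb in dl:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()

num_classes = len(datasets.ImageFolder(DATA).classes)
model = models.resnet50(weights=None, num_classes=num_classes)
run_epochs(model, loader(64), epochs=10, lr=0.1)   # most of the training on small images
run_epochs(model, loader(224), epochs=2, lr=0.01)  # a short finish at full resolution
```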
Speaker 3
51:46
We actually ended up doing a distributed version over multiple machines a couple of months later and ended up at the top of the leaderboard. We had 18 minutes. ImageNet. Yeah.
Speaker 3
51:56
And it was, and people have just kept on blasting through again and again since then.
Speaker 2
52:02
So what's your view on multi GPU or multiple machine training in general as a way to speed code up? I think it's largely a waste of time. Both multi GPU on a single machine and.
Speaker 3
52:15
Yeah, particularly multi machines. Cause it's just clunky. Multi GPUs is less clunky than it used to be, but to me anything that slows down your iteration speed is a waste of time.
Speaker 3
52:31
So you could maybe do your very last, you know, perfecting of the model on multi GPUs if you need to. But, so for example, I think doing stuff on ImageNet is generally a waste of time. Why test things on 1.3 million images? Most of us don't use 1.3 million images.
Speaker 3
52:50
And we've also done research that shows that doing things on a smaller subset of images gives you the same relative answers anyway. So from a research point of view, why waste that time? So actually, I released a couple of new datasets recently. One is called Imagenette, the French ImageNet, which is a small subset of ImageNet that is designed to be easy to classify.
Speaker 2
53:15
What's, how do you spell Imagenette?
Speaker 3
53:17
It's got an extra T and E at the end, because it's very French.
Speaker 2
53:20
Image, okay.
Speaker 3
53:21
Yeah, and then another one called ImageWoof, which is a subset of ImageNet that only contains dog breeds. And that's a hard one, right? That's a hard one.
Speaker 3
53:33
Yeah. And I've discovered that if you just look at these 2 subsets, you can train things on a single GPU in 10 minutes. And the results you get are directly transferable to ImageNet nearly all the time. And so now I'm starting to see some researchers start to use these
Speaker 2
53:47
much smaller datasets. I so deeply love the way you think, because I think you might have written a blog post saying that sort of going to these big datasets is encouraging people to not think creatively. Absolutely.
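[For reference, that small-dataset workflow is cheap to try; a hedged sketch using the fastai library, with API names as of fastai v2 (older releases use cnn_learner instead of vision_learner):]

```python
from fastai.vision.all import *

# Imagenette: the small, easy-to-classify subset of ImageNet mentioned above.
path = untar_data(URLs.IMAGENETTE_160)
dls = ImageDataLoaders.from_folder(
    path, valid='val', item_tfms=Resize(160),
    batch_tfms=Normalize.from_stats(*imagenet_stats))
learn = vision_learner(dls, resnet18, metrics=accuracy)
learn.fine_tune(5)  # minutes on a single consumer GPU; findings tend to carry over to full ImageNet
```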
Speaker 2
54:04
So it sort of constrains you to train on large resources. And because you have these resources, you think more resources will be better. And then, so, like, somehow you kill the creativity. Yeah.
Speaker 3
54:17
And even worse than that, Lex, I keep hearing from people who say, I decided not to get into deep learning because I don't believe it's accessible to people outside of Google to do useful work. So like I see a lot of people make an explicit decision to not learn this incredibly valuable tool, because they've drunk the Google Kool-Aid, which is that only Google's big enough and smart enough to do it. And I just find that so disappointing and it's so wrong.
Speaker 2
54:45
And I think all of the major breakthroughs in AI in the next 20 years will be doable on a single GPU. Like I would say, my sense is all the big sort of... Well, let's
Speaker 3
54:57
put it this way. None of the big breakthroughs of the last 20 years have required multiple GPUs. So, like, batch norm, ReLU, dropout, to demonstrate that there's something to that.
Speaker 3
55:08
Every one of them, none of them has required multiple GPUs.
Speaker 2
55:11
GANs, the original GANs didn't require multiple GPUs.
Speaker 3
55:15
Well, and we've actually recently shown that you don't even need GANs. So we've developed GAN level outcomes without needing GANs. And we can now do it with, again, by using transfer learning, we can do it in a couple of hours on a single GPU.
Speaker 3
55:29
You're just
Speaker 2
55:30
using a generator model, like without the adversarial part?
Speaker 3
55:32
Yeah, so we've found loss functions that work super well without the adversarial part. And then one of our students, a guy called Jason Antic, has created a system called DeOldify, which uses this technique to colorize old black-and-white movies. You can do it on a single GPU, colorize a whole movie in a couple of hours.
Speaker 3
55:52
And 1 of the things that Jason and I did together was we figured out how to add a little bit of GAN at the very end, which it turns out for colorization makes it just a bit brighter and nicer. And then Jason did masses of experiments to figure out exactly how much to do, but it's still all done on his home machine on a single GPU in his lounge room. And like, if you think about like colorizing Hollywood movies, that sounds like something a huge studio would have to do. But he has the world's best results on this.
Speaker 2
56:25
There's this problem of microphones. We're just talking about microphones now. Yeah.
Speaker 2
56:28
It's such a pain in the ass to have these microphones to get good quality audio. And I tried to see if it's possible to plop down a bunch of cheap sensors and reconstruct higher-quality audio from multiple sources. Because right now I haven't seen work where, okay, we can take these inexpensive mics and automatically combine audio from multiple sources to improve the combined audio. Right.
Speaker 2
56:52
People haven't done that. And that feels like a learning problem. Right. So hopefully somebody can.
Speaker 3
56:56
Well, I mean, it's, it's eminently doable and it should have been done by now. I felt, I felt the same way about computational photography 4 years ago. Why are we investing in big lenses when 3 cheap lenses plus actually a little bit of intentional movement, so like hold, you know, like take a few frames, gives you enough information to get excellent sub-pixel resolution, which particularly with deep learning, you would know exactly what you're meant to be looking at.
Speaker 3
57:25
We can totally do the same thing with audio. I think it's madness that it hasn't been done yet.
Speaker 2
57:30
Is there been progress on the photography? Yeah,
Speaker 3
57:33
photography is basically standard now. So the Google Pixel Night Sight, I don't know if you've ever tried it, but it's astonishing. You take a picture in almost pitch black and you get back a very high quality image.
Speaker 3
57:48
And it's not because of the lens. Same stuff with like adding the bokeh to the, you know, the background blurring. It's done computationally.
Speaker 2
57:56
This is the pixel here.
Speaker 3
57:58
Yeah, basically everybody now is doing most of the fanciest stuff on their phones with computational photography. And also, increasingly, people are putting more than one lens on the back of the camera. So the same will happen for audio, for sure.
Speaker 2
58:14
And there's applications in the audio side. If you look at an Alexa type device, most people I've seen, especially I worked at Google before, when you look at noise background removal, you don't think of multiple sources of audio. You don't play with that as much as I would hope people would.
Speaker 2
58:31
But I mean,
Speaker 3
58:31
you can still do it even with one. Like, again, not much work's been done in this area. So we're actually going to be releasing an audio library soon, which hopefully will encourage development of this because it's so underused.
Speaker 3
58:42
The basic approach we used for our super resolution, which Jason uses for DeOldify of generating high quality images, the exact same approach would work for audio. No one's done it yet, but it would be a couple of months work.
Speaker 2
58:56
Okay. Also learning rate in terms of DawnBench. There's some magic on learning rate that you played around with. It's interesting.
Speaker 3
59:05
Yeah. So this is all work that came from a guy called Leslie Smith. Leslie's a researcher who, like us, cares a lot about just the practicalities of training neural networks quickly and accurately. Which you would think is what everybody should care about, but almost nobody does.
Speaker 3
59:25
And he discovered something very interesting, which he calls super-convergence, which is that there are certain networks that, with certain settings of hyperparameters, could suddenly be trained 10 times faster by using a 10 times higher learning rate. Now, no one published that paper because it's not an area of kind of active research in the academic world. No academics recognize this is important. And also, deep learning in academia is not considered an experimental science.
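[Leslie Smith's one-cycle schedule, the practical upshot of that super-convergence result, is now built into PyTorch; a minimal sketch with a toy model and random data, purely to show the shape of the schedule:]

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
total_steps = 200

# One-cycle policy: ramp the learning rate up to a surprisingly high maximum,
# then anneal it back down, once over the whole training run.
sched = torch.optim.lr_scheduler.OneCycleLR(opt, max_lr=1.0, total_steps=total_steps)

loss_fn = nn.CrossEntropyLoss()
for step in range(total_steps):
    x = torch.randn(32, 10)             # toy batch
    y = torch.randint(0, 2, (32,))
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
    sched.step()                        # the learning rate changes every step
```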