Regina Barzilay: Deep Learning for Cancer Diagnosis and Treatment | Lex Fridman Podcast #40

1 hour 17 minutes 28 seconds

🇬🇧 English

S1

Speaker 1

00:00

The following is a conversation with Regina Barzilay. She's a professor at MIT and a world-class researcher in natural language processing and applications of deep learning to chemistry and oncology, or the use of deep learning for early diagnosis, prevention and treatment of cancer. She has also been recognized for teaching of several successful AI-related courses at MIT, including the popular Introduction to Machine Learning course. This is the Artificial Intelligence Podcast.

S1

Speaker 1

00:32

If you enjoy it, subscribe on YouTube, give it 5 stars on iTunes, support it on Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D-M-A-N. And now, here's my conversation with Regina Barzilay.

S2

Speaker 2

00:48

In an interview, you've mentioned that if there's one course you would take, it would be a literature course that a friend of yours teaches. Just out of curiosity, because I couldn't find anything on it, are there books or ideas that had profound impact on your life journey, books and ideas, perhaps outside of computer science and the technical fields?

S3

Speaker 3

01:11

I think because I'm spending a lot of my time at MIT and previously in other institutions where I was a student, I have a limited ability to interact with people. So a lot of what I know about the world actually comes from books. And there were quite a number of books that had profound impact on me and how I view the world.

S3

Speaker 3

01:31

Let me just give you one example of such a book. Maybe a year ago I read a book called The Emperor of All Maladies. It's kind of a history of science book on how the treatments and drugs for cancer were developed. And that book, despite the fact that I am in the business of science, really opened my eyes on how imprecise and imperfect the discovery process is, how imperfect our current solutions are, and what makes science succeed and be implemented.

S3

Speaker 3

02:10

And sometimes it's actually not the strength of the idea, but the devotion of the person who wants to see it implemented. So this is one of the books that, you know, at least for the last year, quite changed the way I'm thinking about the scientific process just from the historical perspective, and what I need to do to make my ideas really implemented. Let me give you another example, a fiction book called Americanah. And this is a book about a young female student who comes from Africa to study in the United States.

S3

Speaker 3

02:54

And it describes her path, you know, within her studies and her life transformation in a new country and kind of adaptation to a new culture. And when I read this book, I saw myself in many different points of it, but it also kind of gave me a lens on different events, and some events that I never actually paid attention to. One of the funny stories in this book is how she arrives at her new college and she starts speaking in English and she has this beautiful British accent because that's how she was educated in her country. This is not my case.

S3

Speaker 3

03:41

And then she notices that the person who talks to her, you know, talks to her in a very funny way, in a very slow way, and she's thinking that this woman is disabled, and she's also trying to kind of accommodate her. And then after a while, when she finishes her discussion with this officer from her college, she sees how she interacts with other students, with American students, and she discovers that actually she talked to her this way because she saw that she doesn't understand English. And I thought, wow, this is a fun experience. And literally within a few weeks, I went to LA to a conference and I asked somebody in the airport how to find a cab or something.

S3

Speaker 3

04:26

And then I noticed that this person is talking in a very strange way. And my first thought was that this person has some pronunciation issues or something, and I'm trying to talk very slowly to him. And I was with another professor, Ernst Frankel, and he's like laughing because it's funny that I don't get that the guy is talking in this way because he thinks that I cannot speak. So it was really kind of a mirroring experience and it led me to think a lot about my own experiences moving from different countries.

S3

Speaker 3

04:56

So I think that books play a big role in my understanding of the world.

S2

Speaker 2

05:03

On the science question, you mentioned that it made you discover that personalities of human beings are more important than perhaps ideas. Is that what I heard?

S3

Speaker 3

05:13

It's not necessarily that they are more important than ideas, but I think that ideas on their own are not sufficient. And many times, at least at the local horizon, it's the personalities and their devotion to their ideas that really changes the landscape locally. Now, if you're looking at AI, like let's say 30 years ago, you know, the dark ages of AI, or symbolic times, you can use any word.

S3

Speaker 3

05:42

You know, there were some people, now we're looking at a lot of that work and we are kind of thinking this was not really maybe relevant work, but you can see that some people managed to take it and to make it so shiny and dominate the academic world and make it the standard. If you look at the area of natural language processing, it is a well-known fact that the reason that statistics in NLP took such a long time to become mainstream is that there were quite a number of personalities who didn't believe in this idea and stopped research progress in this area. So I do not think that asymptotically personalities matter, but I think locally they do make quite a bit of impact.

S2

Speaker 2

06:33

And it's

S3

Speaker 3

06:35

generally, you know, speeds up the rate of adoption of the new ideas.

S2

Speaker 2

06:41

Yeah, and the other interesting question is in the early days of a particular discipline. I think you mentioned that book is ultimately a book about cancer.

S3

Speaker 3

06:52

It's called The Emperor of All Maladies.

S2

Speaker 2

06:55

Yeah, and those maladies included the trying to, the medicine, was it centered around that?

S3

Speaker 3

07:00

So it was actually centered on how people thought of curing cancer. Like for me, it was really a discovery what the science of chemistry behind drug development was, that it actually grew out of the dyeing, like coloring, industry: the people who developed chemistry in the 19th century in Germany and Britain to make really new dyes looked at the molecules and identified that they do certain things to cells. And from there, the process started.

S3

Speaker 3

07:34

And like historians say, yeah, this is fascinating that they managed to make the connection and look under the microscope and do all this discovery. But as you continue reading about it, you read about how chemotherapy drugs were developed in Boston, and some of them were developed by Dr. Farber from Dana-Farber, and how the experiments were done, that there was some miscalculation, let's put it this way, and they tried it on the patients and those were children with leukemia and they died.

S3

Speaker 3

08:09

And then they tried another modification. You look at the process, how imperfect is this process? And if we're again looking back like 60 years ago, 70 years ago, you can kind of understand it. But some of the stories in this book, which were really shocking to me, were really happening, you know, maybe decades ago.

S3

Speaker 3

08:28

And we still don't have a vehicle to do it much faster and more effectively and scientifically, the way I'm thinking, computer science scientific.

S2

Speaker 2

08:38

So from the perspective of computer science, you've gotten a chance to work on the application to cancer and to medicine in general. From the perspective of an engineer and a computer scientist, how far along are we from understanding the human body, biology, of being able to manipulate it in a way we can cure some of the maladies, some of the diseases?

S3

Speaker 3

08:59

So this is a very interesting question. And if you're thinking as a computer scientist about this problem, I think one of the reasons that we as computer scientists succeeded in the areas where we succeeded is because we are not trying to understand, in some ways. Like if you're thinking about e-commerce, Amazon, Amazon doesn't really understand you and that's why it recommends you certain books or certain products, correct?

S3

Speaker 3

09:31

And in, you know, traditionally when people were thinking about marketing, you know, they divided the population into different kinds of subgroups, identified the features of the subgroup and came up with a strategy which is specific to that subgroup. If you're looking at recommendation systems, they're not claiming that they're understanding somebody, they're just managing from the patterns of your behavior to recommend you a product. Now if you look at traditional biology, and obviously I wouldn't say that I am in any way educated in this field, but what I see is that there's really a lot of emphasis on mechanistic understanding. And it was very surprising to me coming from computer science how much emphasis is on this understanding.

S3

Speaker 3

10:17

And given the complexity of the system, maybe the deterministic full understanding of this process is beyond our capacity. And in the same way as in computer science, when we're doing recognition, when you do recommendation and many other areas, it's just a probabilistic matching process. And in some way, maybe in certain cases, we shouldn't even attempt to understand, or we can attempt to understand, but in parallel, we can actually do this kind of matching that would help us to find a cure, to do early diagnostics and so on. And I know that in these communities it's really important to understand, but I'm sometimes wondering what exactly does it mean to understand here?
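As a side illustration of the "matching from patterns of behavior" idea described above, here is a minimal Python sketch of item co-occurrence recommendation. It is not how any real retailer's system works; the purchase histories, the function names `build_cooccurrence` and `recommend`, and the tie-breaking rule are all assumptions made up for this example. The point is only that the recommender never models why a person buys something, it just counts what tends to appear together.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each ordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # De-duplicate and sort so each unordered pair is visited once per basket.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def recommend(item, counts, top_n=2):
    """Return up to top_n items most often seen with `item`.

    Ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    scored = [(other, n) for (first, other), n in counts.items() if first == item]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in scored[:top_n]]


# Hypothetical purchase histories, invented for illustration only.
baskets = [
    ["novel", "cookbook"],
    ["novel", "cookbook", "atlas"],
    ["novel", "atlas"],
    ["cookbook", "apron"],
]
counts = build_cooccurrence(baskets)
print(recommend("novel", counts))
```

Nothing in this sketch represents the customer's motives or identity; swapping the item names for arbitrary codes would give the same recommendations, which is the sense in which the matching works without understanding.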

S2

Speaker 2

11:02

Well, there's stuff that works, but that can be, like you said, separate from this deep human desire to uncover the mysteries of the universe, of science, of the way the body works, the way the mind works. It's the dream of symbolic AI, of being able to reduce human knowledge into logic and be able to play with that logic in a way that's very explainable and understandable for us humans. I mean, that's a beautiful dream.

S2

Speaker 2

11:31

So I understand it, but it seems that what seems to work today, and we'll talk about it more, is as much as possible, reduce stuff into data, reduce whatever problem you're interested in to data and try to apply statistical methods, apply machine learning to that. On a personal note, you were diagnosed with breast cancer in 2014. What did facing your mortality make you think about? How did it change you?

S3

Speaker 3

12:00

You know, this is a great question. And I think that I was interviewed many times, and nobody actually asked me this question. I think I was 43 at the time. And it was the first time in my life I realized that I may die, and I never thought about it before. And there was a long time from when you are diagnosed until you actually know what you have and how severe your disease is.

S3

Speaker 3

12:20

For me, it was like maybe 2 and a half months. And I didn't know where I am during this time because I was getting different tests and one would say it's bad and another would say, no, it is not. So until I knew where I am, I really was thinking about all these different possible outcomes.

S2

Speaker 2

12:38

Were you imagining the worst, or were you trying to be optimistic?

S3

Speaker 3

12:41

It would be really, I don't remember what was my thinking. It was really a mixture with many components at the time, speaking, you know, in our terms. And one thing that I remember, you know, every test comes and then you're saying, oh, it could be this, or it may not be this, and you're hopeful and then you're desperate.

S3

Speaker 3

13:04

So it's like there is a whole slew of emotions that goes through you. But what I remember is that when I came back to MIT, I was kind of going the whole time through the treatment to MIT, but my brain was not really there. But when I came back, really finished my treatment and I was here teaching and everything, I looked back at what my group was doing, what other groups were doing, and I saw these trivialities. It's like people are building their careers on improving some parts around 2% or 3% or whatever.

S3

Speaker 3

13:37

It's like, seriously, I did work on how to decipher Ugaritic, like a language that nobody speaks, and whatever. What is the significance? When all of a sudden I walked out of MIT, which is where people really do care what happened to your ICLR paper, what is your next publication to ACL, to the world where people... You see a lot of suffering.

S3

Speaker 3

14:02

I'm kind of totally shielded from it on a daily basis. And it's like the first time I've seen real life and real suffering. And I was thinking, why are we trying to improve the parser or deal with some trivialities when we have the capacity to really make a change? And it was really challenging to me because on one hand, you know, I have my graduate students who really want to do their papers and their work, and they want to continue to do what they were doing, which was great.

S3

Speaker 3

14:31

And then it was me who really kind of reevaluated what is the importance. And also at that point, because I had to take some break, I looked back into my years in science and I was thinking, you know, like 10 years ago, this was the biggest thing, I don't know, topic models. We have millions of papers on topic models and variations of topic models, and now it's totally irrelevant. And you start looking at this, what you perceive as important at different points of time and how it fades over time.

S3

Speaker 3

15:08

And since we have a limited time, all of us have limited time on Earth, it's really important to prioritize things that really matter to you, maybe matter to you at that particular point, but it's important to take some time and understand what matters to you, which may not necessarily be the same as what matters to the rest of your scientific community, and pursue that vision.

S2

Speaker 2

15:34

So that moment, did it make you cognizant, you mentioned suffering, of just the general amount of suffering in the world. Is that what you're referring to? So as opposed to topic models and specific detailed problems in NLP, did you start to think about other people who have been diagnosed with cancer?

S2

Speaker 2

15:56

Is that the way you started to see the world, perhaps?

S3

Speaker 3

16:00

Oh, absolutely, and it actually creates, because like, for instance, you know, there are parts of the treatment where you need to go to the hospital every day, and you see, you know, the community of people that you see, and many of them are much worse than I was at the time, and you all of a sudden see it all. And people who are happier some days just because they feel better, and for people who are in our normal realm, you take it totally for granted that you feel well, that if you decide to go running, you can go running and you're pretty much free to do whatever you want with your body. I saw a community, my community became those people.

S3

Speaker 3

16:44

And I remember one of my friends, Dina Katabi, took me to Prudential to buy me a gift for my birthday. And it was like the first time in months that I went to kind of see other people. And I was like, wow, first of all, these people, you know, they are happy and they're laughing and they're very different from these other people of mine. And second thing, I think it's totally crazy.

S3

Speaker 3

17:04

They're like laughing and wasting their money on some stupid gifts. And they may die. They already may have cancer and they don't understand it. So you can really see how the mind changes, that you can see that, you know, before that you can ask, didn't you know that you're going to die?

S3

Speaker 3

17:24

Of course I knew, but it was a kind of a theoretical notion. It wasn't something which was concrete. And at that point when you really see it and see how little means sometimes the system has to help them, you really feel that we need to take a lot of our brilliance that we have here at MIT and translate it into something useful.

S2

Speaker 2

17:48

Yeah, and useful can have a lot of definitions, but of course, alleviating suffering, trying to cure cancer is a beautiful mission. So I of course know theoretically the notion of cancer, but just reading more and more about it, 1.7 million new cancer cases in the United States every year, 600,000 cancer-related deaths every year. So this has a huge impact, in the United States and globally. Broadly, before we talk about how machine learning, how MIT can help, when do you think we as a civilization will cure cancer?

S2

Speaker 2

18:32

How hard of a problem is it from everything you've learned from it recently?

S3

Speaker 3

18:37

I cannot really assess it. What I do believe will happen with the advancement in machine learning is that a lot of types of cancer we will be able to predict way early and more effectively utilize existing treatments. I think, I hope at least, that with all the advancements in AI and drug discovery, we would be able to much faster find relevant molecules.

S3

Speaker 3

19:04

What I'm not sure about is how long it will take the medical establishment and regulatory bodies to kind of catch up and to implement it. And I think this is a very big piece of the puzzle that is currently not addressed.

S2

Speaker 2

19:20

That's a really interesting question. So first, a small detail that I think the answer is yes, but is cancer one of the diseases where, when detected earlier, it significantly improves the outcomes? Because we will talk about there's the cure and then there is detection.

S2

Speaker 2

19:42

And I think one place machine learning can really help is earlier detection. So does detection help?

S3

Speaker 3

19:48

Detection is crucial. For instance, the vast majority of pancreatic cancer patients are detected at a stage where they are incurable. That's why they have such a terrible survival rate.

S3

Speaker 3

20:03

It's like just a few percent over 5 years; it's pretty much today a death sentence. But if you can discover this disease early, there are mechanisms to treat it. And in fact, I know a number of people who were diagnosed and saved just because they had food poisoning. They had terrible food poisoning.

S3

Speaker 3

20:25

They went to the ER and they got a scan. There were early signs on the scan and that's what saved their lives. But this was really an accidental case. So as we become better, we would be able to help many more people that are likely to develop diseases.

S3

Speaker 3

20:46

And I just want to say that as I got more into this field, I realized that cancer is, of course, a terrible disease, but there is really a whole slew of terrible diseases out there, like neurodegenerative diseases and others. So we, of course, a lot of us are fixated on cancer just because it's so prevalent in our society. And you see these people, but there are a lot of patients with neurodegenerative diseases and the kind of aging diseases that we still don't have a good solution for. And I felt that as computer scientists, we kind of decided that it's other people's job to treat these diseases, because it's like traditionally people in biology or in chemistry or MDs are the ones who are thinking about it, and after I kind of started paying attention, I think that it's really a wrong assumption and we all need to join the battle.

S2

Speaker 2

21:42

So how, it seems like in cancer specifically, that there's a lot of ways that machine learning can help. So what's the role of machine learning in the diagnosis of cancer?

S3

Speaker 3

21:55

So for many cancers today, we really don't know what is your likelihood to get cancer. And for the vast majority of patients, especially the younger patients, it really comes as a surprise. Like for instance, for breast cancer, 80% of the patients are first in their families, it's like me.

S3

Speaker 3

22:15

And I never saw that I had any increased risk because, you know, nobody had it in my family. And for some reason in my head, it was kind of an inherited disease. But even if I had paid attention, the very simplistic statistical models that are currently used in clinical practice really don't give you an answer, so you don't know. And the same is true for pancreatic cancer, the same is true for non-smoking lung cancer, and many others.

S3

Speaker 3

22:45

So what machine learning can do here is utilize all this data to tell us early who is likely to be susceptible, using all the information that is already there, be it imaging, be it your other tests and eventually liquid biopsies and others, where the signal itself is not sufficiently strong for the human eye to do good discrimination because the signal may be weak, but by combining many sources, a machine which is trained on large volumes of data can really detect it early, and that's what we've seen with breast cancer and people are reporting it in other diseases as well.

S2

Speaker 2

23:25

That really boils down to data, right? And in the different kinds of sources of data. And you mentioned regulatory challenges.

S2

Speaker 2

23:33

So what are the challenges in gathering large data sets in this space?

S3

Speaker 3

23:40

Again, another great question. So it took me after I decided that I want to work on it 2 years to get access to data.

S2

Speaker 2

23:48

Any data, like any

S3

Speaker 3

23:49

significant data set. Any significant amount, like right now in this country, there is no publicly available data set of modern mammograms that you can just go on your computer, sign a document and get it. It just doesn't exist.

S3

Speaker 3

24:02

I mean, obviously every hospital has its own collection of mammograms. There are data that came out of clinical trials. But we're talking about you as a computer scientist who just wants to run his or her model and see how it works. Data like ImageNet doesn't exist.

S3

Speaker 3

24:24

And there is a set which is called, like, the Florida data set, which is film mammograms from the nineties, which is totally not representative of the current developments. Whatever you're learning on them doesn't scale up. This is the only resource that is available. And today there are many agencies that govern access to data, like the hospital holds your data, and the hospital decides whether they would give it to the researcher to work with this data or not.

S2

Speaker 2

24:52

An individual hospital?

S3

Speaker 3

24:54

Yeah, I mean the hospital may, assuming that you're doing research collaboration, you can submit, there is a proper approval process guided by the IRB, and if you go through all the processes, you can eventually get access to the data, but as you yourself know, in our AI community there are not that many people who actually ever got access to data because it's a very challenging process.

S2

Speaker 2

25:20

And sorry, just a quick comment. MGH or any kind of hospital, are they scanning the data? Are they digitally storing it?

S3

Speaker 3

25:29

Oh, it is already digitally stored. You don't need to do any extra processing steps. It's already there in the right format.

S3

Speaker 3

25:36

It's that right now there are a lot of issues that govern access to the data because the hospital is legally responsible for the data. And they have a lot to lose if they give the data to the wrong person, but they may not have a lot to gain as a hospital, as a legal entity, by giving it to you. And the way, you know, what I would imagine happening in the future is the same thing that happens when you're getting your driving license. You can decide whether you want to donate your organs.

S3

Speaker 3

26:09

You can imagine that whenever a person goes to the hospital, it should be easy for them to donate their data for research. And it can be different kinds: do they only give you test results, or only imaging data, or the whole medical record? Because at the end, we all will benefit from all these insights. And it's not like you can say, I want to keep my data private, but I would really love to get it from other people, because other people are thinking the same way.

S3

Speaker 3

26:40

So if there is a mechanism to do this donation and the patient has an ability to say how they want to use their data for research, it would be really a game changer.

S2

Speaker 2

26:54

People, when they think about this problem, there's a, it depends on the population, depends on the demographics, but there's some privacy concerns. Generally, not just medical data, just any kind of data. It's what you said, my data, it should belong kind of to me, I'm worried how it's gonna be misused.

S2

Speaker 2

27:13

How do we alleviate those concerns? Because that seems like a problem that needs to be, that problem of trust, of transparency needs to be solved before we build large data sets that help detect cancer, help save those very people in the future.

S3

Speaker 3

27:30

So I think there are 2 things that could be done. There are technical solutions and there are societal solutions. On the technical end, we today have the ability to improve de-identification.

S3

Speaker 3

27:48

For instance, for imaging, you can do it pretty well. What's de-identification? Sorry, de-identification: removing the identification, removing the names of the people. There are other data, like if it is raw text, where you cannot really achieve 99.9%, but there are all these techniques, and actually some of them are developed at MIT, for how you can do learning on the encoded data, where you locally encode the image, you train a network which only works on the encoded images, and then you send the outcome back to the hospital and you can open it up.

S3

Speaker 3

28:26

So those are the technical solutions. There are a lot of people who are working in this space where the learning happens in the encoded form. We are still early, but this is an interesting research area where I think we'll make more progress. There is a lot of work in the natural language processing community on how to do the de-identification better.

S3

Speaker 3

28:50

But even today, there are already a lot of data which can be de-identified perfectly, like your test data, for instance, correct? Where you can just, you know the name of the patient, you just want to extract the part with the numbers. The big problem here is again, hospitals don't see much incentive to give this data away on 1 hand, and then there is general concern. Now when I'm talking about societal benefits and about the education, the public needs to understand.
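The "extract the part with the numbers" idea for structured test data can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of what any hospital or research group actually runs: the record layout, the field names, and the `IDENTIFYING_FIELDS` list are all hypothetical, and real de-identification has to handle far more than this (dates, free-text notes, rare values that identify a person on their own).

```python
# Hypothetical list of fields that directly identify a patient.
IDENTIFYING_FIELDS = {"name", "mrn", "date_of_birth", "address"}


def deidentify(record):
    """Keep only numeric measurements from a flat test-result record.

    Drops every field listed in IDENTIFYING_FIELDS (even numeric ones,
    such as a record number) and every non-numeric field, on the
    assumption that free text may contain identifying details.
    """
    return {
        key: value
        for key, value in record.items()
        if key not in IDENTIFYING_FIELDS
        and isinstance(value, (int, float))
        and not isinstance(value, bool)
    }


# Invented example record; the patient and values are not real.
record = {
    "name": "Jane Doe",
    "mrn": 12345,
    "hemoglobin_g_dl": 13.2,
    "wbc_k_ul": 6.1,
    "notes": "fasting",
}
clean = deidentify(record)
print(clean)
```

The design choice here is an allow-by-type filter rather than a block list alone: anything that is not a plain number is dropped by default, which errs on the side of losing data instead of leaking it.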

S3

Speaker 3

29:23

And I think that there are situations, and I still remember myself when I really needed an answer. I had to make a choice. There was no information to make a choice. You're just guessing.

S3

Speaker 3

29:36

And at that moment, you feel that your life is at stake, but you just don't have information to make the choice. And many times when I give talks, I get emails from women who say, you know, I'm in this situation, can you please run statistics and see what are the outcomes? We get almost every week a mammogram that comes by mail to my office at MIT, I'm serious, that people ask to run because they need to make life-changing decisions. And of course, I'm not planning to open a clinic here, but we do run and give them the results for their doctors.

S3

Speaker 3

30:16

But the point that I'm trying to make is that we all at some point, or our loved ones, will be in the situation where you need information to make the best choice. And if this information is not available, you would feel vulnerable and unprotected. And then the question is, what do I care about more? Because at the end, everything is a trade-off, correct?

S2

Speaker 2

30:40

Yeah, exactly. Just out of curiosity, it seems like 1 possible solution, I'd like to see what you think of it based on what you just said, based on wanting to know answers for when you're yourself in that situation. Is it possible for patients to own their data as opposed to hospitals owning their data?

S2

Speaker 2

31:00

Of course, theoretically, I guess patients own their data, but can you walk out there with a USB stick containing everything or upload it to the cloud where a company, you know, I remember Microsoft had a service, like I was really excited about, and Google Health was there. I tried to give, I was excited about it. Basically, companies helping you upload your data to the cloud so that you can move from hospital to hospital, from doctor to doctor. Do you see a promise of that kind of possibility?

S3

Speaker 3

31:32

I absolutely think this is the right way to exchange the data. I don't know now who's the biggest player in this field, but I can clearly see that even for totally selfish health reasons, when you are going to a new facility, and many of us are sent to some specialized treatment, they don't easily have access to your data. And today, you know, if you want to send a mammogram, you need to go to the hospital, find some small office which gives you the CD, and they ship it as a CD.

S3

Speaker 3

32:04

So you can imagine we're looking at kind of decades old mechanism of data exchange. So I definitely think this is an area where hopefully all the right regulatory and technical forces will align and we will see it actually implemented.

S2

Speaker 2

32:23

It's sad because unfortunately, and I need to research why that happened, but I'm pretty sure Google Health and Microsoft Health Vault or whatever it's called, both closed down. Which means that there was either regulatory pressure or there's not a business case or there's challenges from hospitals, which is very disappointing. So when you say you don't know what the biggest players are, the 2 biggest that I was aware of closed their doors.

S2

Speaker 2

32:50

So I'm hoping, I'd love to see why and I'd love to see who else can come up. It seems like one of those Elon Musk style problems that obviously need to be solved, and somebody needs to step up and actually do this large scale data collection.

S3

Speaker 3

33:07

So I know there is an initiative in Massachusetts, I think, actually led by the governor to try to create this kind of health exchange system, where at least to help people who kind of, when you show up in an emergency room and there is no information about what are your allergies and other things. So I don't know how far it will go. But another thing that you said, and I find it very interesting, is actually who are the successful players in this space and the whole implementation?

S3

Speaker 3

33:36

How does it go? To me, from the anthropological perspective, it's more fascinating than the AI that today goes into healthcare. We've seen so many attempts and so very few successes. And it's interesting to understand, and I by no means have the knowledge to assess it, why we are in the position where we are.

S2

Speaker 2

33:59

Yeah, it's interesting, because data is really fuel for a lot of successful applications. And when that data requires regulatory approval, like the FDA or any kind of approval, it seems that computer scientists are not quite there yet in being able to play the regulatory game, understanding the fundamentals of it.

S3

Speaker 3

34:21

I think that in many cases, even when people do have data, we still don't know what exactly you need to demonstrate to change the standard of care. Let me give you an example related to my breast cancer research. So in traditional breast cancer risk assessment, there is something called density, which determines the likelihood of a woman to get cancer.

S3

Speaker 3

34:50

And this pretty much says, how much white do you see on the mammogram? The whiter it is, the more likely the tissue is dense. And the idea behind density, it's not a bad idea. In 1967, a radiologist called Wolfe decided to look back at women who were diagnosed and see what is special in their images: can we look back and say that they're likely to develop cancer? So he came up with some patterns, and it was the best that his human eye could, you know, identify.

S3

Speaker 3

35:20

Then it was kind of formalized and coded into 4 categories. And that's what we are using today. And today, this density assessment is actually a federal law from

S1

Speaker 1

35:32

2019,

S3

Speaker 3

35:34

approved by President Trump and the previous FDA commissioner, where women are supposed to be advised by their providers if they have high density, putting them into a higher risk category. And in some states, you can actually get supplementary screening paid by your insurance because you're in this category. Now you can say, how much science do we have behind it?

S3

Speaker 3

35:56

Whether biological science or epidemiological evidence. So it turns out that between 40 and 50% of women have dense breasts. So about 40% of patients are coming out of their screening and somebody tells them, you are at high risk. Now what exactly does it mean if you, as half of the population, are high risk?

S3

Speaker 3

36:19

It's from saying, maybe I'm not, you know, or what do I really need to do with it? Because the system doesn't provide me a lot of solutions: because there are so many people like me, we cannot really provide very expensive solutions for them. And the reason this whole density became this big deal is that it was actually advocated by the patients, who felt very unprotected because many women went and did the mammograms, which were normal, and then it turned out that they already had cancer, quite developed cancer. So they didn't have a way to know who is really at risk and what is the likelihood that when the doctor tells you, you're okay, you're not okay.

S3

Speaker 3

36:57

So at the time, and it was 15 years ago, this maybe was the best piece of science that we had, and it took 15, 16 years to make it federal law. But now this is a standard. Now with a deep learning model, we can so much more accurately predict who is going to develop breast cancer just because you are trained on a logical thing. And instead of describing how much white and what kind of white, a machine can systematically identify the patterns, which was the original idea behind the thought of the radiologist. Machines can do it much more systematically and predict the risk when you're training the machine to look at the image and to say the risk in 1 to 5 years.

S3

Speaker 3

37:42

Now you can ask me how long it will take to substitute this density, which is broadly used across the country and really is not helping, and to bring in these new models. And I would say it's not a matter of the algorithm. Algorithms are already orders of magnitude better than what is currently in practice. I think it's really the question, who do you need to convince?

S3

Speaker 3

38:04

How many hospitals do you need to run the experiment? All this mechanism of adoption, and how do you explain to patients and to women across the country that this is really a better measure. And again, I don't think it's an AI question. We can work more and make the algorithm even better, but I don't think that this is the current barrier.

S3

Speaker 3

38:28

The barrier is really this other piece that for some reason is not really explored. It's like an anthropological piece. And coming back to your question about books, there is a book that I'm reading. It's called An American Sickness by Elisabeth Rosenthal.

S3

Speaker 3

38:48

And I got this book from my clinical collaborator, Dr. Connie Lehman, and I thought I knew everything that I need to know about the American health system, but you know, every page doesn't fail to surprise me. And I think there are a lot of interesting and really deep lessons for people like us from computer science who are coming into this field to really understand how complex the system of incentives is, to understand how you really need to play to drive adoption.

S2

Speaker 2

39:19

You just said it's complex, but if we're trying to simplify it, who do you think most likely would be successful if we push on this group of people? Is it the doctors? Is it the hospitals?

S2

Speaker 2

39:31

Is it the governments or policy makers, is it the individual patients, consumers, who needs to be inspired to most likely lead to adoption? Or is there no simple answer?

S3

Speaker 3

39:46

There's no simple answer, but I think there are a lot of good people in the medical system who do want to make a change. And I think a lot of power will come from us as consumers, because we all are consumers, or future consumers, of healthcare services. And I think we can do so much more in explaining the potential, and not in hype terms, and not saying that we have now cured Alzheimer's. I'm really sick of reading these kinds of articles which make these claims. But really to show with some examples what this implementation does and how it changes the care.

S3

Speaker 3

40:28

Because I can't imagine, it doesn't matter what kind of politician it is, we all are susceptible to these diseases. There is no 1 who is free. And eventually, we all are humans and we are looking for a way to alleviate the suffering. And this is 1 possible way that we are currently underutilizing, which I think can help.

S2

Speaker 2

40:51

So it sounds like the biggest problems are outside of AI in terms of the biggest impact at this point. But are there any open problems in the application of ML to oncology in general? So improving the detection or any other creative methods, whether it's on the detection, segmentation, or the vision perception side, or some other clever form of inference.

S2

Speaker 2

41:16

Yeah, what in general in your view are the open problems in this space?

S3

Speaker 3

41:20

Yeah, I just want to mention that besides detection, another area where I am kind of quite active, and I think it's really an increasingly important area in healthcare, is drug design. Absolutely. Because it's fine if you detect something early, but you still need to get drugs and new drugs for these conditions.

S3

Speaker 3

41:43

And today, in all of drug design, ML is nonexistent. We don't have any drug that was developed by an ML model, or, even if not developed, where we at least know that an ML model played some significant role. I think this area, with all the new ability to generate molecules with desired properties and to do in silico screening, is really a big open area. To be totally honest with you, when we are doing diagnostics and imaging, we are primarily taking the ideas that were developed for other areas and applying them with some adaptation.

S3

Speaker 3

42:20

The area of drug design is a really technically interesting and exciting area. You need to work a lot with graphs and capture various 3D properties. There are lots and lots of opportunities to be technically creative. And I think there are a lot of open questions in this area.

S3

Speaker 3

42:46

You know, we're already getting a lot of successes even with the first generation of these models, but there are many more new creative things that you can do. And what's very nice to see is that actually the more powerful, the more interesting models actually do do better. So there is a place to innovate in machine learning in this area. And some of these techniques are really unique to, let's say, graph generation and other things.

S3

Speaker 3

43:19

So.

S2

Speaker 2

43:21

Just to interrupt really quick, I'm sorry. Graph generation or graphs, drug discovery in general. How do you discover a drug?

S2

Speaker 2

43:31

Is this chemistry? Is this trying to predict different chemical reactions? Or is it some kind of, what do graphs even represent in this space?

S3

Speaker 3

43:42

Oh, sorry, sorry.

S2

Speaker 2

43:43

And what's a drug?

S3

Speaker 3

43:45

Okay, so there are many different types of drugs, but let's say we're going to talk about small molecules, because I think today the majority of drugs are small molecules. So a small molecule is a graph. Each node in the graph is an atom, and then you have the bonds as edges.
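As a minimal sketch of the graph view described here (atoms as nodes, bonds as edges), the snippet below encodes one small molecule in plain Python. The choice of ethanol, the omission of hydrogens, and the names `atoms`, `bonds`, and `adjacency` are illustrative assumptions, not anything taken from the conversation or from a chemistry library.

```python
from collections import defaultdict

# Hypothetical example molecule: ethanol (CH3-CH2-OH), heavy atoms only.
atoms = {0: "C", 1: "C", 2: "O"}   # node id -> element symbol
bonds = [(0, 1, 1), (1, 2, 1)]     # (atom i, atom j, bond order)


def adjacency(bonds):
    """Build an undirected adjacency list from a list of bonds."""
    adj = defaultdict(list)
    for i, j, order in bonds:
        adj[i].append((j, order))
        adj[j].append((i, order))
    return dict(adj)


graph = adjacency(bonds)

# Elements bonded to the middle carbon (node 1).
neighbors_of_middle_carbon = sorted(atoms[j] for j, _ in graph[1])
```

A real pipeline would also store charges, aromaticity, and 3D coordinates, but this 2D node-and-edge structure is the core representation being discussed.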

S3

Speaker 3

44:01

So it's really a graph representation if you're looking at it in 2D, correct? You can do it in 3D, but let's say, well, let's keep it simple and stick to 2D. So pretty much my understanding today of how it is done at scale in the companies without machine learning: you have high throughput screening. So you know that you are interested in getting a certain biological activity of the compound, so you scan a lot of compounds, like maybe hundreds of thousands, some really big number of compounds.

S3

Speaker 3

44:32

You identify some compounds which have the right activity, and then at this point, the chemists come and they're trying now to optimize this original hit for different properties: you want it to be maybe soluble, you want to decrease toxicity, you want to decrease the side effects.

S2

Speaker 2

44:51

So that- Are those, sorry again to interrupt, can that be done in simulation or just by looking at the molecules or do you need to actually run reactions in real labs with lab coats

S3

Speaker 3

45:01

and stuff. So when you do high-throughput screening, you really do screening, it's in the lab. It's really the lab screening, you screen the molecules, correct?

S3

Speaker 3

45:10

I don't

S2

Speaker 2

45:11

know what screening is.

S3

Speaker 3

45:12

The screening is just checking them for a certain property.

S2

Speaker 2

45:15

Like in the physical space, in the physical world, like actually there's a machine probably that's doing some, that's actually running the reaction.

S3

Speaker 3

45:21

Actually running the reactions, yeah. So there is a process where you can run, and that's why it's called high throughput, it became cheaper and faster to do it on a very big number of molecules. You run the screening, you identify potential good starts.

S3

Speaker 3

45:40

And then the chemists come in, who have done it many times, and they can try to look at it and say, how can I change the molecule to get the desired profile in terms of all other properties? So maybe how do I make it more bioactive and so on? And there, the creativity of the chemists really is the 1 that determines the success of this design, because again, they have a lot of domain knowledge of what works, how do you decrease the toxicity and so on, and that's what they do. So all the drugs that are currently FDA-approved, or even drugs that are in clinical trials, are designed by these domain experts, who go through this combinatorial space of molecules or graphs or whatever, and find the right 1 or adjust it to be the right 1.

S2

Speaker 2

46:35

Sounds like the breast density heuristic from 67, the same echoes.

S3

Speaker 3

46:40

It's not necessarily that. It's really driven by deep understanding. It's not like they just observe it.

S3

Speaker 3

46:46

I mean, they do deeply understand chemistry and they do understand how different groups change the properties. So there is a lot of science that gets into it and a lot of kind of simulation of how you want it to behave. It's very, very complex.

S2

Speaker 2

47:03

So they're quite effective at this design, obviously.

S3

Speaker 3

47:06

Now, effective, yeah, we have drugs. It depends on how you measure effective: if you measure it in terms of cost, it's prohibitive. If you measure it in terms of time, you know, we have lots of diseases for which we don't have any drugs and we don't even know how to approach them, not to mention the neurodegenerative disease drugs that fail, you know.

S3

Speaker 3

47:26

So there are lots of, you know, trials that fail, you know, in later stages, which is really catastrophic from the financial perspective. So is it the most effective mechanism? Absolutely no, but this is the only 1 that currently works. And I was closely interacting with people in the pharmaceutical industry.

S3

Speaker 3

47:49

I was really fascinated by how sharp they are and what a deep understanding of the domain they have. It's not observation driven. There is really a lot of science behind what they do. But if you ask me, can machine learning change it?

S3

Speaker 3

48:02

I firmly believe yes, because even the most experienced chemists cannot hold in their memory and understanding everything that you can learn from millions of molecules and reactions.

S2

Speaker 2

48:16

And the space of graphs is a totally new space. I mean, it's a really interesting space for machine learning to explore, graph generation.

S3

Speaker 3

48:23

Yeah, so there are a lot of things that you can do here. So we do a lot of work. So the first tool that we started with was the tool that can predict properties of the molecules.

S3

Speaker 3

48:36

So you can just give the molecule and the property, it can be a bioactivity property or it can be some other property, and you train on the molecules and you can now take a new molecule and predict this property. Now when people started working in this area, it was something very simple. They took kind of existing fingerprints, which are kind of handcrafted features of the molecule where you break the graph into substructures, and then you run them through a feedforward neural network. And what was interesting to see is that clearly, this was not the most effective way to proceed, and you need to have much more complex models that can induce the representation, which can translate this graph into embeddings and do these predictions.
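The early baseline described here, handcrafted substructure fingerprints fed to a feedforward network, can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the fingerprint hashes only each atom's immediate neighborhood (it is not the algorithm of any particular cheminformatics toolkit), the network weights are random rather than trained, and all names (`fingerprint`, `feedforward`, `N_BITS`, `N_HIDDEN`) are hypothetical.

```python
import math
import random
import zlib


def fingerprint(atoms, bonds, n_bits=32):
    """Hash each atom's radius-1 neighborhood into a fixed-length bit vector."""
    neighbors = {i: [] for i in atoms}
    for i, j in bonds:
        neighbors[i].append(atoms[j])
        neighbors[j].append(atoms[i])
    bits = [0] * n_bits
    for i, element in atoms.items():
        # Substructure string: the atom plus its sorted neighbor elements.
        fragment = element + "(" + "".join(sorted(neighbors[i])) + ")"
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        bits[zlib.crc32(fragment.encode()) % n_bits] = 1
    return bits


def feedforward(x, w_hidden, w_out):
    """One hidden ReLU layer followed by a sigmoid output in (0, 1)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))


# Untrained, randomly initialized weights: an assumption for illustration only.
rng = random.Random(0)
N_BITS, N_HIDDEN = 32, 8
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_BITS)] for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]

# Hypothetical input: ethanol, heavy atoms only.
ethanol_atoms = {0: "C", 1: "C", 2: "O"}
ethanol_bonds = [(0, 1), (1, 2)]

fp = fingerprint(ethanol_atoms, ethanol_bonds, N_BITS)
score = feedforward(fp, w_hidden, w_out)  # meaningless until the weights are trained
```

The point of the contrast in the conversation is that the fingerprint step is fixed by hand, whereas the later models learn the graph-to-embedding mapping jointly with the prediction.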

S3

Speaker 3

49:21

So this is 1 direction. Then another direction, which is kind of related, is not only to stop by looking at the embedding itself, but actually modify it to produce better molecules. So you can think about it as machine translation: you can start with a molecule and then there is an improved version of the molecule, and you can again, with an encoder, translate it into the hidden space and then learn how to modify it to produce an improved version of the molecule. So that's, it's kind of really exciting.

S3

Speaker 3

49:52

We already have seen that the property prediction works pretty well, and now we are generating molecules and there are actually labs which are manufacturing these molecules. So we'll see where it will get us.

S2

Speaker 2

50:06

Okay, that's really exciting. There's a lot of promise. Speaking of machine translation and embeddings, you have done a lot of really great research in NLP, natural language processing.

S2

Speaker 2

50:19

Can you tell me your journey through NLP? What ideas, problems, approaches were you working on? Were you fascinated with? Did you explore before this magic of deep learning reemerged and after?

S3

Speaker 3

50:33

So when I started my work in NLP, it was in 97. This was a very interesting time. It was exactly the time that I came to ACL and the time I could barely understand English.

S3

Speaker 3

50:45

But it was exactly like the transition point because half of the papers were really rule-based approaches where people took more kind of heavy linguistic approaches for small domains and try to build up from there. And then there were the first generation of papers, which were corpus-based papers. And they were very simple in our terms when you collect some statistics and do prediction based on them. But I found it really fascinating that 1 community can think so very differently about the problem.

S3

Speaker 3

51:19

And I remember my first paper that I wrote, it didn't have a single formula, it didn't have evaluation, it just had examples of outputs. And this was a standard of the field at the time. In some ways, I mean, people maybe just started emphasizing empirical evaluation, but for many applications like summarization, you just showed some examples of outputs. And then increasingly, you can see how the statistical approaches dominated the field and we've seen increased performance across many basic tasks.

S3

Speaker 3

51:56

The sad part of the story may be that if you look again through this journey, we see that the role of linguistics in some ways greatly diminishes. And I think that you really need to look through the whole proceeding to find 1 or 2 papers which make some interesting linguistic references. It's really

S2

Speaker 2

52:17

big. Today.

S3

Speaker 3

52:18

Today, today. This was definitely

S2

Speaker 2

52:20

not there. So things like syntactic trees, just even basically against our conversation about human understanding of language, which I guess what linguistics would be, structured, hierarchical, representing language in a way that's human explainable, understandable is missing today.

S3

Speaker 3

52:39

I don't know if it is, what is explainable and understandable. In the end, you know, we perform functions and it's okay to have machine which performs a function. Like when you're thinking about your calculator, correct?

S3

Speaker 3

52:53

Your calculator can do calculation very differently from how you would do the calculation, but it's very effective at it. And this is fine. If we can achieve certain tasks with high accuracy, it doesn't necessarily mean that it has to understand it the same way as we understand it. In some ways, it's even naive to request because you have so many other sources of information that are absent when you are training your system.

S3

Speaker 3

53:17

So it's okay. And they deliver it. And I will tell you 1 application that's really fascinating. In 97, when I came to ACL, there were some papers on machine translation.

S3

Speaker 3

53:25

They were primitive. People were trying really, really simple things. And the feeling, my feeling was that to make a real machine translation system, it's like to fly to the moon and build a house and a garden and live happily ever after. I mean, it's like impossible.

S3

Speaker 3

53:42

I never could imagine that within 10 years we would already see the system working, and now nobody is even surprised to utilize the system on a daily basis. So this was like a huge, huge progress, given that people for a very long time tried to solve it using other mechanisms and they were unable to solve it. That's why, coming back to a question about biology, in linguistics, people tried to go this way and tried to write the syntactic trees and tried to abstract it and to find the right representation. And they couldn't get very far with this understanding, while these models, using other sources, are actually capable of making a lot of progress.

S3

Speaker 3

54:31

Now I'm not naive to think that we are in this paradise space in NLP, and I'm sure, as you know, that when we slightly change the domain and when we decrease the amount of training, it can do like really bizarre and funny things, but I think it's just a matter of improving generalization capacity, which is just a technical question.

S2

Speaker 2

54:51

Well, so that's the question. How much of language understanding can be solved with deep neural networks? In your intuition, I mean, it's unknown, I suppose.

S2

Speaker 2

55:03

But as we start to creep towards romantic notions of the spirit of the Turing test and conversation and dialogue and something that maybe to me or to us silly humans feels like it needs real understanding. How much can that be achieved with these neural networks or statistical methods?

S3

Speaker 3

55:27

So I guess I am very much driven by the outcomes. Can we achieve the performance which would be satisfactory for us for different tasks? Now if you again look at machine translation systems, which are trained on large amounts of data, they really can do a remarkable job relative to where they were a few years ago.

S3

Speaker 3

55:51

And if you project into the future, if it will be the same speed of improvement, this is great. Now, does it bother me that it's not doing the same translation as we are doing? Now, if you go to cognitive science, we still don't really understand what we are doing. I mean, there are a lot of theories and there is obviously a lot of progress in studying, but our understanding of what exactly goes on in our brains when we process language is still not so crystal clear and precise that we can translate it into machines.

S3

Speaker 3

56:25

What does bother me is that, again, machines can be extremely brittle when you go out of their comfort zone, when there is a distributional shift between training and testing. And it has been years and years. Every year when I teach an NLP class, I show them some examples of translation from some newspaper in Hebrew, whatever, it was perfect. Then I have a recipe for Karelian pies that Tommi Jaakkola's sister sent me a while ago, and it was written in Finnish.

S3

Speaker 3

56:57

And it's just a terrible translation. You cannot understand anything, what it does. It's not like some syntactic mistakes, it's just terrible. And year after year I try it and it will translate, and year after year it does this terrible work because I guess the recipes are not a big part of the training repertoire.

S2

Speaker 2

57:15

So, but in terms of outcomes, that's a really clean, good way to look at it. I guess the question I was asking is, do you think, imagine a future, do you think the current approaches can pass the Turing test in the way, in the best possible formulation of the Turing test, which is, would you want to have a conversation with a neural network for an hour?

S3

Speaker 3

57:42

Oh God, no. No, there are not that many people that I would want to talk to for an hour. But- There

S2

Speaker 2

57:48

are some people in this world, alive or not, that you would like to talk to for an hour. Could a neural network achieve that outcome?

S3

Speaker 3

57:56

So I think it would be really hard to create a successful training set which would enable it to have a conversation, a contextual conversation, for an hour. So you think

S2

Speaker 2

58:06

it's a problem of data, perhaps? I

S3

Speaker 3

58:08

think in some ways it's not a problem of data. It's a problem both of data and of the way we're training our systems. Their ability to truly generalize, to be very compositional, in some ways it's limited, you know, in the current capacity. At least, you know, we can translate well, we can, you know, find information well, we can extract information. So there are many capacities in which it's doing very well.

S3

Speaker 3

58:35

And you can ask me, would you trust the machine to translate for you and use it as a source? I would say, absolutely, especially if we're talking about newspaper data or other data, which is in the realm of its own training set, I would say yes. But having conversations with a machine, it's not something that I would choose to do. But I would tell you something, talking about Turing tests and about all these kinds of ELIZA conversations, I remember visiting Tencent in China and they have this chatbot, and they claim that a really humongous amount of the local population talks to the chatbot for hours.

S3

Speaker 3

59:12

To me it was, I cannot believe it, but apparently it's like documented that there are some people who enjoy this conversation. And you know, it brought to me another MIT story about ELIZA and Weizenbaum. I don't know if you're familiar with the story. So Weizenbaum was a professor at MIT.

S3

Speaker 3

59:30

And he developed this ELIZA, which was just doing string matching, very trivial, like restating what you said with very few rules, no syntax. Apparently there were secretaries at MIT that would sit for hours and converse with this trivial thing. And at the time there were no beautiful interfaces, so you actually needed to go through the pain of communicating. And Weizenbaum himself was so horrified by this phenomenon, that people can believe enough in the machine.
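To make "string matching, restating what you said with very few rules, no syntax" concrete, here is a toy sketch in that spirit. It is not Weizenbaum's actual script: the three patterns, the response templates, and the names `RULES`, `FALLBACK`, and `respond` are invented for illustration, and it skips details such as swapping pronouns in the echoed text.

```python
import re

# Hypothetical rules: (pattern, response template that echoes the captured text).
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."


def respond(utterance):
    """Return the first matching rule's restatement, or a generic prompt."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return FALLBACK


reply = respond("I am tired of bad translations.")
```

There is no parsing or model of meaning anywhere in this loop, which is what makes the reported hours-long conversations so striking.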

S3

Speaker 3

59:58

Do you just need to give them a little bit of time to communicate? And so, I think that's the reason why we have to be very careful about the communication of the machine.