SPEAKER 1: You're listening to (Re)thinking Insurance, a podcast series from WTW where we discuss the issues facing P&C, life, and composite insurers around the globe, as well as explore the latest tools, techniques, and innovations that will help you rethink insurance.
RAVI SHARMA: Hello, and welcome to (Re)thinking Insurance. I'm your host, Ravi Sharma. On today's podcast, we'll be discussing layered GBMs with Madeline Main, an Associate Director in WTW's Insurance Consulting and Technology Practice, and Rachael McNaughton, also an Associate Director in WTW's Insurance Consulting and Technology Practice. Welcome to you both.
RACHAEL MCNAUGHTON: Hi, Ravi. Thanks for having me.
MADELINE MAIN: Hey. And thanks for having me as well.
RAVI SHARMA: I'm happy to have you both on the show. Now, as many of you know, we like to learn a little bit about our guests before we jump into our main topic. So, what's something that you both like to do outside of work? Madeline, let's start with you.
MADELINE MAIN: I enjoy doing holiday crafts with my toddler.
RAVI SHARMA: Oh, that sounds super interesting. And I'm sure it's very fun. So, Rachael, what about you?
RACHAEL MCNAUGHTON: I enjoy spending time with my dog, whom I have named Data, just because I love taking my work home with me.
RAVI SHARMA: Oh, I love it. That's a perfect name for a dog. So, let's transition into today's main topic, layered GBMs. For those who don't know what a layered GBM is, Rachael, could you define that for us?
RACHAEL MCNAUGHTON: Yeah, sure, Ravi. So a layered GBM is a modification of the traditional gradient boosting machine algorithm. The way a traditional gradient boosting machine works is that it essentially builds up lots and lots of decision trees, one after the other. Each of them tries to predict the difference between how the model is currently doing and what the actual values are.
So it gets closer and closer to the right answer as you add more and more trees. One of the problems with the way a gradient boosting machine works is that as it builds up more and more trees, and there can be thousands within these types of models, it's not at all clear what sort of effects or trends there are in the actual model.
So, if you compare that to a Generalized Linear Model, or GLM, where you'll have your main effects and your interaction effects very clearly isolated within the model definition itself, with a gradient boosting machine you lose all that isolation. And it's not clear at all. You can only approximate, after the fact, whether an effect is a main effect or an interaction, or what's going on in the model at all.
So, the way a layered GBM works is that it essentially breaks up the process. Instead of just building decision trees that capture your main effects and interaction effects all at once and get it all muddled, it essentially forces some structure. What it will do is build the first layer, which captures all the main effects.
Once it's captured all the main effects, and there's nothing left to find by just looking at a single variable, it can then move on to the second layer. It will build the second layer marginalizing all the one-way main effects and focusing just on building up the two-factor interaction effects. And so on and so forth.
It still achieves the same predictive accuracy that a GBM does, so it will be just as accurate as a traditional GBM. But it forces the effects to be split out over independent layers so that you can better understand and isolate where the effects are in the model.
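To make the layering idea concrete, here is a minimal sketch in Python with scikit-learn. The depth-1/depth-2 split, the toy data, and every parameter choice are illustrative assumptions for this sketch, not WTW's actual layered GBM algorithm (which, as Rachael notes, also marginalizes the one-way main effects out of the higher layers).

```python
# A simplified two-layer sketch: layer 1 uses depth-1 trees (stumps), so it can
# only capture one-way main effects; layer 2 is fit on what layer 1 could not
# explain, using depth-2 trees, so it targets two-way effects.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 5_000
X = rng.uniform(size=(n, 3))                       # e.g. three rating factors
y = (2.0 * X[:, 0] + np.sin(3.0 * X[:, 1])         # main effects
     + 1.5 * X[:, 0] * X[:, 2]                     # a two-way interaction
     + rng.normal(scale=0.1, size=n))              # noise

# Layer 1: stumps, so only main effects can be learned.
layer1 = GradientBoostingRegressor(max_depth=1, n_estimators=300, learning_rate=0.05)
layer1.fit(X, y)

# Layer 2: depth-2 trees fit to the residual, so it focuses on interactions.
layer2 = GradientBoostingRegressor(max_depth=2, n_estimators=300, learning_rate=0.05)
layer2.fit(X, y - layer1.predict(X))

# The final score is the sum of the layers, but each layer can be inspected
# (for example with partial dependence) on its own.
prediction = layer1.predict(X) + layer2.predict(X)
```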
RAVI SHARMA: Great. Removing one of the common pitfalls, the black box nature of a GBM, so you can have your cake and eat it too. How does this address the pitfalls of machine learning?
RACHAEL MCNAUGHTON: This is an interesting one. My team and I have spent a lot of time looking at applying machine learning to insurance. I've always found it really interesting, much more so than generic applications of machine learning, because in insurance we have a lot more to worry about.
It's not just about predictive accuracy. We have to think about regulation, and we have to think about being able to understand and interpret our models, much more so than is typically required elsewhere.
And so we found, when trying to apply machine learning methods that have been developed for other industries, that there were a lot of challenges in doing that. Often we found that, whilst we wanted the automation and the very high predictive power that machine learning models provide, it came at quite a big cost when it came to being able to interpret those models.
And as an industry, we've had a very long history of being able to use GLMs very, very effectively. So a lot of the time, whilst we can gain automation from using something like a gradient boosting machine, say, it comes at too much of a cost.
Where we know exactly what effects and variables are impacting our prices with a GLM implementation, when we try and use something like a GBM, or other machine learning methods, we just lose all that interpretability. And it is quite a big hurdle to cross.
So instead of just using the open-source machine learning approaches, and because we have some very, very smart people on the team, we thought, why not create an algorithm designed to have all that automation and the same level of predictive power, but with the interpretability built in by design, so that we didn't have to give it up.
RAVI SHARMA: That's great, yeah. And I know this is something Madeline's super passionate about, because I've heard her on several calls before talk about democratizing access to the models. And so, Madeline, I think that's a good segue into giving access to users who don't traditionally have access to this type of analysis. Could you elaborate a little bit on that?
MADELINE MAIN: Yeah. So, I think that traditionally GBMs, not to call out Rachael as a data scientist, have been kind of relegated to the realm of data scientists and some of those very highly analytical types. And oftentimes business users aren't able to see, in an intuitive fashion, how a GBM could apply to their actual business decisions.
And so one of the things I think is cool is that Rachael's team has developed a layered GBM that really allows you to see, in an intuitive way, how the different variables are impacting the model. And then, through our decision support platform Radar, you're able to give access to those GBMs to product management and senior leaders.
So that if, let's say, your CFO asks, in this current inflationary environment, how is that going to impact my rate changes over the next year, how are my results going to change, you're really able to bring all the insights of machine learning, in an intuitive fashion, to folks who might not have had access to that in the past.
RAVI SHARMA: So, in Rachael's description of GBMs and machine learning specifically, she talked a little bit about regulation. And I know in the US we're in a totally different regulatory environment than other places in the world. So, Rachael, I was hoping you could elaborate on how this is relevant in the UK.
RACHAEL MCNAUGHTON: Yeah, it's quite interesting. From a regulatory point of view specifically, it's probably a little bit less relevant than it is in the US. We certainly have regulations in place, but it was probably more of a cultural and business-led thing that really drove it in the UK. A lot of insurers are already using a lot of machine learning techniques in the UK; GBMs are very, very popular.
So, it's more just seeing the challenges that people faced when implementing those. We got to see insurers get to the point where they were implementing their GBMs, but at a real cost: when they were trying to understand what exactly those models were doing, and when they were trying to justify them to their internal model governance boards, any analysis or interpretation of what was going on was very much an approximation.
And some clients were OK with that. But often we found that was a really big hurdle, and they were losing a lot in not being able to fully understand, with confidence, what the model was doing. In fact, in a lot of cases they were trying to understand and interpret those gradient boosting machine models,
but they had to spend so much time doing it, because it was such a difficult task and such a black box model, that to gain the confidence they needed they were spending more time on the analysis than they were on building the model in the first place. So, there were a lot of efficiencies gained as well with the introduction of the layered GBM.
RAVI SHARMA: So, Madeline, I have a tall task for you here. What I'm hoping you can do is compare the deployment of machine learning, and specifically this layered GBM, in the UK and in the US. And, really, what does deploying this in the US regulatory environment look like?
MADELINE MAIN: So, with the disclaimer that I'm not a regulator and I'm not giving specific regulatory advice: we know, of course, that in the US the other challenge is that not only do we have more regulation than our friends in the UK, but we also have 50 states plus DC to contend with in terms of being able to file models and comply with each state's individual regulations.
So, I think there are a couple of different ways that we talk here about implementing GBMs and, more broadly, machine learning. One big way, which can also be very helpful for insurers who might be contending with legacy rating engines or policy admin systems that might not be compatible with deploying a GBM, is to use the GBMs in the background to enhance your decision-making process.
A quick example: you can use a GBM to identify the drivers of dislocation. This past year, in June, we released a library component in our software that allows you to easily create dashboards that use a GBM pointed at the dislocation, that is, the changes between your proposed and your current rates, to say which variables are driving that.
For all of us on the insurance side, the biggest question from a lot of your key stakeholders, regulators, senior management, agents, when you implement a new rating plan is, OK, well, what's causing this change? My rates went up 50%. Why did they go up 50%? And so being able to pinpoint and give relative weighting to these different variables, I think, becomes a very valuable thing for the business.
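As a sketch of that dislocation idea, and only a sketch: the column names, toy data, and model settings below are assumptions for illustration, not the actual Radar library component Madeline mentions.

```python
# Point a GBM at the dislocation (proposed vs. current rate change) and read off
# which rating variables drive it. Feature names and the toy data are assumptions.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 10_000
policies = pd.DataFrame({
    "driver_age": rng.integers(18, 85, size=n),
    "vehicle_group": rng.integers(1, 21, size=n),
    "region": rng.integers(1, 11, size=n),
})
# Toy dislocation: young drivers and high vehicle groups see the biggest increases.
dislocation = (0.10 * (policies["driver_age"] < 25)
               + 0.01 * policies["vehicle_group"]
               + rng.normal(scale=0.02, size=n))

features = ["driver_age", "vehicle_group", "region"]
gbm = GradientBoostingRegressor(max_depth=3, n_estimators=200)
gbm.fit(policies[features], dislocation)

# Relative weighting of the drivers of dislocation.
result = permutation_importance(gbm, policies[features], dislocation,
                                n_repeats=5, random_state=0)
for name, score in sorted(zip(features, result.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```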
RAVI SHARMA: Yeah. I can imagine those data points would be key points to have when having conversations with key stakeholders.
MADELINE MAIN: Right. As opposed to sitting there and saying, OK, well, I did a few test cases, or, here, let me show you, you're able to quantify the changes. Another thing that we think about, again within the context of understanding that a lot of folks still want to deploy a traditional GLM, which I know to Rachael and others in the UK sounds quite antiquated and quaint, is that we're able to use a GBM and other machine learning techniques to automatically identify segments that maybe need further review.
So, for example, looking at it and saying, OK, well, these are the variables that would be the most important in a traditional GLM, as opposed to doing some of the traditional techniques like forwards or backwards regression, that type of thing.
And so being able to automatically identify, OK, here are my top 20 variables I want to look at. Let me see what the gain for each of those is in the model. And then, and we have worked on this recently, maybe I can even see that when I add five more variables, I'm only getting a little bit more lift, so it's probably not worth the extra cost.
Which I think is another thing that's important to talk about from the business standpoint: OK, yes, I might have a slightly better model. But is it worth the 20 or 30 extra hours that I'm going to spend on it? Or the extra cost of, for example, purchasing external data, is that worth it? Or are we just not getting enough lift, because we're already explaining the signal sufficiently with our existing variables?
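A minimal sketch of that screening step, under assumptions: synthetic data stands in for a real book of business, and plain GBM feature importance stands in for the gain statistics Madeline refers to. The "top 20" cutoff is hers; the rest is illustrative.

```python
# Rank candidate variables by their importance in a GBM, then check how much
# extra lift the next few variables add before deciding whether they are worth
# the cost.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=2_000, n_features=40, n_informative=10,
                       noise=5.0, random_state=0)

# Fit once on everything and rank variables by the model's importance (gain).
full_model = GradientBoostingRegressor(max_depth=3, n_estimators=200).fit(X, y)
ranking = np.argsort(full_model.feature_importances_)[::-1]

def cv_lift(columns):
    """Cross-validated R^2 using only the given columns."""
    model = GradientBoostingRegressor(max_depth=3, n_estimators=200)
    return cross_val_score(model, X[:, columns], y, cv=3).mean()

print(f"Top 20 variables: R^2 = {cv_lift(ranking[:20]):.3f}")
print(f"Top 25 variables: R^2 = {cv_lift(ranking[:25]):.3f}  # is the extra lift worth the cost?")
```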
RAVI SHARMA: So, to put it in layman's terms, it sounds like rather than increasing the sophistication of your existing models, you can use this as a decision-making tool to highlight the areas that you need to deploy your resources to maximize the return on your existing investment in analytics.
MADELINE MAIN: Exactly. And I think another thing would be being able to identify interactions, because that can be another big challenge. It can be a very time-consuming task to weed through interactions and figure out, OK, is this variable important as a main effect, or is it just coming into play in the interactions?
And so I think that's another key area where the layered GBM can identify, OK, this much signal is explained by the main effect, but this variable is mostly coming into the model through the interaction portion. So maybe you realize something like payment frequency isn't as important to include as a main effect in the GLM, but really it's coming into play in the interactions.
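A sketch of that diagnostic, again under assumptions: two plain GBM layers stand in for the layered GBM, and "payment_frequency" and the toy data are hypothetical, constructed so that the variable's signal only appears through an interaction.

```python
# Once effects are split into a main-effects layer and an interaction layer,
# you can ask where a variable's signal actually lives.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
n = 10_000
payment_frequency = rng.integers(1, 13, size=n)        # instalments per year
driver_age = rng.uniform(18, 80, size=n)
X = np.column_stack([payment_frequency, driver_age])
y = (0.05 * driver_age
     + 0.03 * payment_frequency * (driver_age < 25)    # interaction-only signal
     + rng.normal(scale=0.1, size=n))

# Main-effects layer (stumps), then an interaction layer on the residual.
main_layer = GradientBoostingRegressor(max_depth=1, n_estimators=300).fit(X, y)
residual = y - main_layer.predict(X)
interaction_layer = GradientBoostingRegressor(max_depth=2, n_estimators=300).fit(X, residual)

# Compare payment frequency's importance (feature index 0) in each layer.
for name, model, target in [("main-effects", main_layer, y),
                            ("interaction", interaction_layer, residual)]:
    imp = permutation_importance(model, X, target, n_repeats=5, random_state=0)
    print(f"payment_frequency importance in {name} layer: {imp.importances_mean[0]:.4f}")
```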
RAVI SHARMA: Well, thanks for that. So, as usual, our time here today has flown by, but I want to give you both an opportunity: if you could leave our listeners with one parting thought, what would it be? Rachael, let's start with you.
RACHAEL MCNAUGHTON: I think the one thing I'd like to leave everyone with is almost just an encouragement that it's not just about, oh, we're the insurance industry, and we're a little bit behind everyone else, and we need to catch up and use machine learning, and all that kind of stuff.
I think it's important to look deep down at what problems we have that not many other people have, and really try and innovate around what we can do to solve those problems. Because it could be as simple as, all right, let's just use some open-source machine learning to solve this problem. Or it could be, as we've done here, creating an entirely new algorithm to solve that problem more directly.
And even then, in the UK we can just take this algorithm and deploy it directly into production to produce rates. But Madeline and her team in the US have taken a completely different direction, saying, well, yes, sure, we can't deploy GBMs in most cases, but here are these 10 other ways that we can use them to make everything a lot more efficient and a lot better.
So, I think the thought I'd like to leave everyone with is that insurance is an incredibly interesting, very challenging but exciting space to be in. And, yeah, just making sure that we are continuing to innovate, and really think about what's unique about the environment that we're in, and how we can impact data science and analytics itself.
RAVI SHARMA: I like that. So often we have solutions, and we go looking for a problem. But, really, let's just look at our problems and deploy solutions as necessary.
RACHAEL MCNAUGHTON: Yeah, exactly. What about you, Madeline?
MADELINE MAIN: Of course, I'd echo everything Rachael said. But I also think about being able to expand the reach of analytics. We've had a lot of exciting developments in the past few years, but often business users especially might have seen that as, OK, well, that's for the data scientists to handle, not relevant to my job, not relevant to the C-suite.
But really seeing that these machine learning techniques can make the decision-making process easier, democratizing access to models and data across the enterprise, and can be relevant to anyone, from a product manager to the CFO, not just the traditional actuaries and data scientists.
RAVI SHARMA: I really enjoyed this conversation. Thank you both for joining us today.
RACHAEL MCNAUGHTON: Yeah, thanks for having me, Ravi.
MADELINE MAIN: Thank you, Ravi. It was a lot of fun.
RAVI SHARMA: Well, thanks, Rachael and Madeline. And hopefully you can join us again soon. Thank you to our listeners for joining us today on (Re)thinking Insurance. We look forward to having you next time.
SPEAKER 1: Thank you for joining us for this WTW podcast featuring the latest perspectives on the intersection of people, capital, and risk. For more information, visit the Insights section of wtwco.com.