Deploying Python and open source to impact business results

In this episode of (Re)thinking Insurance, Mani Heer, UK Head of Data Science, talks to Pardeep Bassi, Global Head of Data Science, about overcoming the challenges in deploying open source effectively.

(Re)thinking Insurance Podcast Season 4, Episode 17: Putting theory into practice: Deploying Python and open source to impact business results

Transcript for this episode:

MANI HEER: Removing all of the issues and challenges we've just mentioned and allowing those who understand the business to apply their judgment, but also deploy models as quickly as possible into the market is a key part of the reason why some insurers succeed and others really struggle.

NARRATOR: You're listening to (Re)thinking Insurance, a podcast series from WTW, where we discuss the issues facing P&C, life and composite insurers around the globe, as well as exploring the latest tools, techniques, and innovations that will help you rethink insurance.

MANI HEER: Welcome to another episode of (Re)thinking Insurance. I'm Mani Heer, and I head up our data science consulting practice in the UK, and today I'm joined by Pardeep Bassi, global proposition leader for data science.

And we'll be discussing the challenges clients face when it comes to Python and open-source deployment. Welcome, Pardeep. Could you tell us a little bit about the trends that you're seeing in the market?

PARDEEP BASSI: Let's take a step back and look at the key trends over the last five years. So there have been four key things which have happened. The first is open-source algorithm development has really gathered pace. More and more data has become available.

Python skill set in particular has come out of universities and education, and lastly, the compute power and access to the compute power has increased, so this has led to insurers really wanting to drive flexibility and innovation via leveraging Python and open-source.

MANI HEER: You mentioned flexibility, and we've seen a lot of companies in the industry sort of using that flexibility to deploy open-source models. Can you expand on what that flexibility sort of provides the industry?

PARDEEP BASSI: So one of the key principles behind data scientists or those leveraging custom Python code is to have that flexibility of writing anything they want, doing any custom feature engineering, leveraging any open-source algorithm, and that is to really drive that innovation.

But what we've seen quite often is when it comes to the stage of deployment, they have to limit that flexibility. So it's a key barrier to actually extracting value from that innovation, having to convert into model forms such as PMML or ONNX, which are severely limited in terms of what they allow you to do.

So a lot of this flexibility and innovation isn't taken into practice and impacting real business decisions because of these intermediate model forms which restrict what you can take out.

MANI HEER: And converting into those intermediate forms can end up introducing additional testing and reconciliation that needs to be done, which just increases how long things take to deploy and make happen. You mentioned those barriers. What proportion of insurers do you think are actually able to deploy the models that they build offline?

PARDEEP BASSI: So I'd say the majority of insurers have played with open-source and Python, but in an offline manner. When it comes to actually putting them in for near real-time production use cases, I'd say probably less than a quarter have put those models live, and even out of that quarter, few have got a well-governed, secure, scalable, and efficient methodology to actually take those models up.

MANI HEER: Wow, 25% is quite a low proportion. What do you think those barriers are?

PARDEEP BASSI: So the key challenges insurers face to extract this value from that flexibility and innovation of writing custom Python code and leveraging open-source can be summarized in about five key categories.

I think the first is all around the security and governance. So if you break that down further, it's ensuring that there isn't any malicious code, your data isn't being sent to IP addresses where it shouldn't be. So you have that protection around PII and sensitive data.

You have that vulnerability scanning of your packages, you have the ability to understand who's made what decision when and provide the appropriate controls and access.

So authentication protocols as well as permissions and role-based access to production and deployment environments, they're all things that you really need when you're impacting those business critical decisions in a highly regulated environment such as insurance.

The second key area is about auditability and reproducibility. So from an auditability perspective, you need to be able to understand who's made what decision at what point in time, and that's particularly important when you're considering the deployment aspect.

From a reproducibility piece, the bit to really focus in on is, when you're using open-source packages, it's that control of versions and dependencies.

You have multiple different Python packages which are dependent on different versions of other packages, so being able to control that and allow you to reproduce any model prediction at any point in time is a key enabler to actually deploy these models and meet your regulatory and business requirements.
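Editor's illustration (not part of the recorded conversation): the point about controlling versions and dependencies can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of how Radar or any insurer does it. The function names `environment_snapshot`, `snapshot_fingerprint`, and `audit_record` are hypothetical, and the idea of storing a fingerprint next to each prediction is an assumption made for illustration. Only the standard library is used.

```python
import hashlib
import json
import sys
from importlib import metadata


def environment_snapshot() -> dict:
    """Collect the interpreter version and the version of every installed package."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip broken installs that lack metadata
            packages[name.lower()] = version
    return {
        "python": sys.version.split()[0],
        "packages": dict(sorted(packages.items())),
    }


def snapshot_fingerprint(snapshot: dict) -> str:
    """Hash the snapshot deterministically so two environments can be compared."""
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_record(model_id: str, prediction: float, snapshot: dict) -> dict:
    """Hypothetical audit entry: ties a prediction to the environment that produced it."""
    return {
        "model_id": model_id,
        "prediction": prediction,
        "environment_fingerprint": snapshot_fingerprint(snapshot),
    }


if __name__ == "__main__":
    snap = environment_snapshot()
    print(audit_record("example-model", 0.42, snap))
```

If the fingerprint stored with an old prediction differs from the fingerprint of the environment used to replay it, at least one package version has changed, which is one reason a replayed prediction may not match the original.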

Another key area to focus in on is the robustness of these systems that you're building. So very much when you are impacting business-critical decisions, because insurance is an essential service, it's not just having that low latency robustness for quotability, it's about providing that service to customers at point of claim.

So if you have machine learning models at that point of claim when the customer really needs you, you need to ensure that your models are performant and robust enough that they give you a response when you require it.

And then lastly, one of the very important things is the management of costs. So quite often when you are looking at this low-inference, low-latency, high-availability solution, and multiple environments, including test environments, hundreds of models, the ability to manage your costs becomes really important because you need to be able to deploy these models in an efficient manner.

So when you are leveraging cloud compute, which is flexible in terms of scaling up and down, you need to really rein in your costs.

MANI HEER: You made a good point, which is that's a lot for data scientists to have. It's not a skill set that they typically have, and so we've seen it a lot across the industry where you start with three data scientists and then you quickly realize you need a team of 50 plus, not just data scientists, but engineers, DevOps engineers, and so on, to make this thing work if you're building it all yourself.

These challenges that you mentioned, is it the same for all industries, or is there anything that's sort of unique to insurance?

PARDEEP BASSI: Yeah. So I think it's worth starting off with just explaining that insurance is highly regulated. So you really need to choose the right model form to give you that transparency and interpretability.

And then going slightly deeper, there's been analytics in insurance for a very, very long time, and the analytics is bespoke for a reason, and the main reason is, you're predicting an event which takes quite some time to occur. Once it does occur, it takes quite some time for it to actually develop, and mature, and have the end result.

So taking bespoke approaches, with that in mind, really does mean you need to bring together the newer algorithms as well as that deep underlying insurance domain expertise together in one place. And then a specific focus on the deployment piece, because you are impacting these business-critical decisions. And like I said, insurance is an essential service.

The need for robustness and those very high SLAs is a key part of the solution required for insurance. So if you take the several cloud outages that we've seen recently, not to name any of the big cloud providers, but the need to have that robustness in your solution with multi-region disaster recovery feeding into those high SLAs is critical for insurance.

MANI HEER: That is quite a lot of insurance-focused challenges there. It's something that, again, we've seen a lot across the market where when faced with these challenges, it can just add a lot of complexity to processes, and just makes everything sort of take a lot longer to deploy, and to extract that business value out of.

And so a lot of the time, a lot of the things that we spend time on, a lot of the effort that we're spending time on is how do we extract that business value from open-source, but also maintain the simplicity and maintain the speed at which we deploy things. How do you think we can help with that?

PARDEEP BASSI: So I think one of the key things to keep in mind is, insurers which are really succeeding from the use of advanced analytics, data science techniques, it's all about empowerment of those who really understand insurance, and the practitioners, those who are actually building models.
So removing all of the issues and challenges we've just mentioned and allowing those who understand the business to apply their judgment, but also deploy models as quickly as possible into the market is a key part of the reason why some insurers succeed and others really struggle.

MANI HEER: OK. I think we've covered some very interesting points today, and it is clear that having that ability to take some custom Python code but deploy it in a way that's simple and doesn't introduce all of this additional burden will be a powerful asset for any insurance company to have.

PARDEEP BASSI: I think it is a key differentiator being able to take that innovation and put that into production. I think, to summarize everything we've discussed today in terms of the two key points I really think we should focus in on, the first is we should not compromise on that flexibility driving innovation.
So where you have data scientists, those who are writing custom Python code with production in mind, they should not have to compromise in terms of the model form and the ability to write custom code, and take that into production, because you need to set yourself up for the future, not just now, in that there may be new model forms coming about.

So if you choose a deployment mechanism, a route to market which isn't sustainable and which restricts that flexibility and innovation, you are setting yourself up for issues in the future. So that would be my number one piece to take away.

And the second would be very much focus on business value and insurance-specific knowledge you have, and taking that out to market as quickly as possible. And I do want to labor on this particular point, which is, those insurers who have succeeded in the use of analytics, it's all about that empowerment.

Empowerment of those who build the models, those who understand insurance, and giving them, with the right controls and the right governance, the ability to take their decision-making ability out to market as quickly as possible.

MANI HEER: So as you know, Radar is used across the industry to do predictive analytics, to do sort of impact analysis based off of the models that we build, but you mentioned obviously you want to be able to build custom open-source models and get the value out of that as well. So how is Radar adapting to meet those needs?

PARDEEP BASSI: So one of the key capabilities which we've been working on recently coming out to market is the ability to take custom Python code, leveraging any open-source package within Conda straight into deployment without conversion to PMML, ONNX, or any other form.

MANI HEER: So that sounds like a good new development. What sort of benefits does that bring existing users of Radar and potential new users as well?

PARDEEP BASSI: We're taking away all of the burdens and challenges that insurers have with respect to the security, the governance, the auditability, and the cost management aspects, as we explained earlier today.

So now insurers have a direct mechanism to take that innovation, the flexibility provided by Python and open-source, directly impacting key business decisions.

We've seen some insurers try to build their own technology, put significant investment in terms of building their own deployment capability, but it requires significant effort, potentially new skills, and resources that they don't have, but where it's really hurting them is when it comes to the actual maintenance and the support of the models when they're live.

So my advice to many insurers would be, focus on the insurance-specific aspects of making the most appropriate decisions as quickly as possible and taking them out to market rather than trying to develop technology, which costs a lot to build, costs even more to maintain and support.

MANI HEER: That sounds like an exciting development. The flexibility to bring your own sort of custom Python code and deploy that within live rating without having those issues and challenges that we've spoken about today, sounds like a game changer. So thank you, Pardeep, for your time today.
Thank you for listening to this episode of (Re)thinking Insurance. If you would like to check out other episodes, please visit the (Re)thinking Insurance web page or platform.

NARRATOR: Thank you for joining us for this WTW podcast, featuring the latest perspectives on the intersection of people, capital, and risk. For more information, visit the insights section of wtwco.com.

This podcast is for general discussion and/or information only. It is not intended to be relied upon, and action based on, or in connection with, anything contained herein should not be taken without first obtaining specific advice from a suitably qualified professional.

Podcast host

Mani Heer

UK Data Science Lead, Insurance Consulting and Technology

Mani is Head of our UK Data Science Consulting practice at WTW. By combining Insurance domain expertise with data science capability, he has successfully delivered many data science initiatives across the insurance value chain such as Pricing, Claims and Marketing.

Podcast guest

Pardeep Bassi

Global Proposition Leader, Data Science

Pardeep is the Global Proposition Leader for Data Science at WTW's Insurance Consulting and Technology (ICT) division, responsible for driving WTW’s increased focus of growth into Data Science from a software and consulting perspective.

He has built and led Data Science teams and initiatives for organisations, with his previous role being Chief Data Science Officer at a leading UK insurer.
