December 30, 2024
IBM was early to AI, then lost its way. CEO Arvind Krishna explains what's next
IBM CEO Arvind Krishna talked with CNBC about his specific views on regulation, the business of generative AI, IBM's mistakes and its future plan. 

IBM is angling hard for an AI comeback story, and CEO Arvind Krishna is counting on a recent pivot to get it there. 

Since May, the company has reintroduced the Watson brand as part of its larger strategy shift to monetize its AI products for businesses. WatsonX is a development studio for companies to “train, tune and deploy” machine learning models. Krishna says the product already generated “low hundreds of millions of dollars” in bookings in the third quarter and could be on track for a billion dollars in bookings per year.

But IBM has steep competition in the enterprise AI realm: Microsoft, Google, Amazon and others all have similar offerings. And the company has long been critiqued for falling behind in the AI race, particularly when it comes to making money from its products.

Nearly two years ago, IBM sold its Watson Health unit for an undisclosed amount to private equity firm Francisco Partners. Now, the company is in the midst of selling its weather unit, including The Weather Channel mobile app and websites, Weather.com, Weather Underground and Storm Radar, to the same firm, also for an undisclosed sum. 

“I think that’s a fair criticism, that we were slow to monetize and slow to make really consumable the learnings from Watson winning Jeopardy, and the mistake we made was that I think we went after very big, monolithic answers, which the world was not ready to absorb,” IBM CEO Arvind Krishna told CNBC in an interview, adding, “Beginning that way was the wrong approach.” 

Krishna talked with CNBC about his specific views on regulation, the business of generative AI, IBM’s mistakes and its future plan. 

This interview has been lightly edited for length and clarity.

On the morning you took over as CEO in 2020, you sent an email to employees saying you’ll focus on AI and hybrid cloud as the future’s technologies. How has your view on AI’s use in business – real-life use cases, saturation – changed since that day? 

If you don’t mind, I’ll use a baseball analogy just because it helps to sort of say – at the time when I called those two technologies, I think people understood cloud and AI as ‘Okay, he’s saying it, but not clear – is that a market, is it big, is it small, is it really that important? Cloud is 10 times bigger.’ So to use a baseball analogy, at that point cloud was maybe the third inning, and AI had not even entered the field. 

If you fast-forward to today, I will tell you cloud is probably in its fifth or sixth inning of a game – so you know how it’s going, it’s a mature game, you kind of know where it’s going to play out. AI is in the first inning, so still unclear who all will be the winners, who all will not win, et cetera. The difference is that it is on the field, so it is a major league game. Unclear on who exactly is going to win – that may be the only question. 

So my view, I looked at the amount of data, I looked at the nature of automation needed in the demographic shifts that are going on and I looked at the sheer amount of work that we all have to do. And you go look at the backlog that’s sitting inside places, inside government – the VA has six months worth of claims to process, insurance companies take months to get going for the tougher claims, you look at the backlog in customer service. You look at all those things, and you say, ‘This mixture of the data explosion and this need to get work done – which technology could help us address that?’ And just from my experience, you look across and you say, ‘The only one I can think of is artificial intelligence.’

That’s why you get… a massive shift going on with people and with data, a big unmet need and a technology that could possibly address it. Now it’s up to us as innovators, as inventors, as technologists to go make it happen. 

Biden’s recent executive order had a long list of sections that related to AI-generated content and the risks involved, including the order that AI companies share safety test results with the U.S. government before the official release of AI systems. What changes will IBM need to make? 

We are one of, I think, a total of a dozen companies who participated in the signing of the executive order on the 30th of October, and we endorsed it with no qualifications. Look, to me… all regulation is going to be imperfect, by its very nature. There’s no way that, even in this case a 100-page document, can capture the subtleties of such a massive, emerging, impactful, nascent technology. So if I put that [thought] on it, then we are completely fine with the EO as written – we support it, we believe that having something is better than not having something, we believe that having safeguards is better than having no guardrails. 

Now, I think that this has now come down to how they want to implement it. Do I have any concerns with sharing what tests we have done with the federal government? Actually, I have none. I am one who’s publicly advocated that companies that put out AI models should be held accountable to their models. I actually go even further – I say you should put in legislation that requires us to be legally liable for what our models do, which means if your models do bad things, you can get sued. I’m not saying that’s a very popular viewpoint, but that is one that I have articulated. 

So do I have concerns with sharing it with the government? No. Do I have concerns if the government is now going to put this into a public database so everybody else knows my secret recipes and what I do? Yeah, I do have concerns about that. Because I do believe that there should be competition – we should be allowed to have our own copyrighted ways of doing things, and those don’t need to be made public. So my concern is kind of on the edges, but they haven’t yet told us how they want us to do all those things, and I’m hoping that we can influence – whether it’s NIST or Commerce or whoever is coming up with all these rules – to sort of allow for confidentiality. But behind confidentiality, I don’t really have concerns, per se, about this. 

There’s an industry-wide debate, especially in light of the executive order, about whether too much regulation stifles innovation: Some say it’s irresponsible and even inefficient to move forward without oversight for bias and harms; others say regulation stifles advancement and open-source AI development. What are your thoughts, and where do you think trust and governance are headed? 

I’m going to tell you what I told Senator Schumer… This is a really authentically and deeply-held point of view. Number one, we actually said that whatever we do should allow for a lot of open innovation and not stifle innovation. Two, I said that model developers should be held accountable for what they create. And three, I believe we should regulate use cases based on risk, not the technology or the algorithms themselves. 

So… we strongly advocated that we should allow for open innovation. What does that then preclude? It would preclude a very onerous, hard licensing regime. So if you create a licensing regime, you more or less shut everybody who’s not part of the license out – because that is the one that would shut down. If somebody does open innovation and they can’t deploy because you need a license to deploy, then if you’re two kids in a basement, it’s really hard to run the gauntlet of getting a license from the federal government. So we advocated for that to be open, so you can allow AI innovation. 

Now, if somebody’s going to deploy it, how are you going to be accountable? Well, accountability always depends on the depth of your pocketbook. So if you’re a larger company with more resources, by definition, you have more to lose, and more to gain – so that seems like a fair system of competition. And the reason we said to regulate the use case, not the technology, is so that open innovation can flourish. Because if you regulate the technology, now you’re stomping on the innovation – but use case, if it’s in medicine or self-driving cars, you probably want to be more careful than if it’s summarizing an email for you. So there is a different risk that we should accept that comes from real life. 

Speaking of WatsonX – the development studio IBM began rolling out in July for companies to train, tune and deploy AI – it’s a big bet for IBM. What sets it apart from competing offerings from other big tech companies? 

At one level, most of the companies are going to have their own studios, they have ways that their clients can both experiment with AI models and put them into production – so at that level, you’d say, “Hey, it kind of smells similar to this.” We use the word assistant, others use the word copilots – I’ll look at you and I’ll acknowledge that it’s kind of the same difference. Now it comes down to how do you deploy it, how much can you trust it, how curated is the data that went into it and what kind of protections do you give the end users? That’s where I’ll walk through some of the differences. 

So we don’t want to constrain where people deploy it. Many of the current tech players – I won’t say all, but many – insist that it gets deployed only in their public cloud environment. I have clients in the Middle East, and they want to deploy it on their sovereign territory; I have clients in India who want to deploy it in India; we have clients in Japan who want to deploy it in Japan; I might have, maybe, hypothetically, a bank that is worrying a lot about the data that they might put into it, so they want to deploy it in their private infrastructure. So as you go through those examples, we don’t want to constrain where people deploy it. So if they want to deploy it on a large public cloud, we’ll do it there. If they want to deploy it at IBM, we’ll do it at IBM. If they want to do it on their own, and they happen to have enough infrastructure, we’ll do it there. I think that’s a pretty big difference. 

Also, we believe that models, in the end, are not going to be generated by a single company. So we also want to allow for a hybrid model environment, meaning you might pick up models from open source, you might pick up models from other companies, you will get models from IBM, and then we want to give you the flexibility to say which is which because they will come with different attributes. Some could be more capable, some could be cheaper, some could be smaller, some could be larger, some may have IP protection, some may not. 

And how is WatsonX doing – can you give us growth numbers, specific clients that differ from the initial ones announced, etc.? Or any industries/sectors it’s being used for that surprised you? 

We released it at the end of July, so until the second quarter, the revenue was zero. We did say in our third-quarter earnings – and I think that that’s the number I’ll probably stick to – that we did low hundreds of millions of dollars in bookings, across both large and small. 

So going from zero to low hundreds [of millions], I think, is a pretty good rate. Now, that’s not a growth rate, that’s… sort of quarter-to-quarter. But you know, if I was to extrapolate low hundreds [of millions] – if I was just hypothetically, I’m not saying it is, but if you call it 200 [million], and you say you get a bit more over time, you’re getting close to a billion dollars a year, if you can maintain that rate for a year. That feels pretty good – it feels like you’re taking share, you’re getting a footprint, you’re getting there. This is across a mixture of large and small. So that characterizes it financially, probably, as much as I would at this time. 

Now, you said sectors – this actually is one of the surprising technologies where we’re finding interest across the sectors. Yes, you would expect that IBM is naturally going to get traction in financial and regulated industries, but it’s much, much more than that – it’s telecom, it’s retail, it’s manufacturing. I really am finding that there’s a lot of interest from a lot of things, but different use cases. Some want it for, “How do you answer phone calls?” Some want it for, “How do you train your own employees?” Some want it for, “How do I take bureaucracy out of an organization?” Some want it for, “How do I make the finance team more effective?” So you’re getting a lot of different use cases, across people. 

Critics say that IBM has fallen behind in the AI race. What would you tell them? 

Well, let’s see. Deep Blue was 1996, 1997 – we certainly did monetize it. And then I’d look at it tongue-in-cheek and say, “I don’t know, maybe 20 years of… all the supercomputing records had something to do with the fact that we built Deep Blue.” Because I think from ’96 to 2015, we typically had a supercomputer in the world’s top five list… and all of the work we did there, I think, applied to the way we did weather modeling…

I’d then roll forward to 2011, when Watson won Jeopardy. I think, honestly, history should show… that maybe was the moment when the world woke up to the potential for AI. I think then, I’ve got to give OpenAI credit – it’s kind of like the Netscape moment. Suddenly, the Netscape moment made the internet very tangible, very personal to everybody, and I think ChatGPT made AI very tangible to most people. So now the market need exploded, “Okay, I can get a sense of what this can do.” I’ve also got to give credit to many universities that worked on the underlying technology of large language models. 

So, while the critique that you stated is accurate – that’s what people say – I actually think that they really mean something different. What they mean is, “Hey, you guys talked about Watson and Jeopardy back in 2011. Where’s the proof? Where’s the pudding? Where’s the return? You’re talking about these clients now, why not five years ago?” So I think that’s a fair criticism, that we were slow to monetize and slow to make really consumable the learnings from Watson winning Jeopardy. And the mistake we made was that I think we went after very big, monolithic answers, which the world was not ready to absorb. People wanted to be able to tinker with it, people wanted to be able to fine-tune things, people wanted to be able to experiment, people wanted to be able to say, “I want to modify this for my use case.” And in hindsight – and hindsight is 20/20 – every technology market has gone like that. It begins with people wanting to experiment and iterate and tinker. And only then do you go towards the monolithic answer. And so beginning that way was the wrong approach. 

So that’s how we pivoted early this year, and that’s why we very quickly took the things we had, and the innovations – because we’ve been working on the same innovations as the rest of the industry – and then put them into the WatsonX platform. Because as you could imagine, you couldn’t really do it in three months. It’s not like we announced it in May, and we had it in July. As you can imagine, we had been working on it for three or four years. And the moment was now. So that’s why now.

Let’s talk about the business of generative AI. This past quarter, IBM released Granite generative AI models for composing and summarizing text. And there are consumer apps galore, but what does the technology really mean for businesses?

I think I would separate it across domains. In pure language, I think there will be a lot of – maybe not thousands, but there will be tens – of very successful models. I’ve got to give credit, in language, to what OpenAI does, what Microsoft does, what Google does, what Facebook does, because human language is a lot of what any consumer app is going to deal with. Now, you would say, “Okay, you give credit to all these people, and you’re acknowledging their very good models – why don’t you do it?” Well, because I do need a model in which I can offer indemnity to our clients, so I have to have something for which I know the data that is ingested, I know the guardrails built in… so we do our own.

I also want to separate the large language part and the generative part. I think the large language part is going to unlock massive productivity in enterprises. This is where I think the $4 trillion per year number from McKinsey is grounded in. By 2030 – I like McKinsey’s number, and we triangulate to about the same – they say $4.4 trillion of annual productivity by 2030. That’s massive for what enterprises and governments can achieve. The generative side is important because then the AI for simple use cases – “Hey, can you read this?” or “What is the example that my client was talking about yesterday…?” That is the large language side. 

The generative side, here, is important, but it’s a minor role, which is, “Give the output in a way that is appealing to me as opposed to kind of robotic.” Now, the other side of generative – in terms of modifying artwork, creating images, advertisements, pictorials, music – we’re not the experts, we’re not going to be doing any of that side of it. And I do worry a little bit about copyright and some of the issues that have been brought up by artists on that side of it. But making writing better so that it’s more appealing and easy to read? That’s a great use of generative, as far as I’m concerned. 

In that same vein, IBM today launched a governance product for businesses and companies who want to make sure their models comply with regulation, including “nutrition labels” for AI. What groups did the company work with to develop the bias and fairness monitoring metrics? Did you work with any minority leaders in the space? 

We have been open, before, in terms of exposing everything we do to the whole community, both universities and some of the people from the past – I’m not going to name all the names – who have been pretty vocal about how these models can be… 

Right now we try to be very careful. We don’t want to be the oracle, so we say, “What’s enshrined in law?” So in the US, I think there are 15 categories that are protected by law. Those are the categories that we will do the bias… Now, obviously, clients can choose to add more into that, but we try to stick to what’s enshrined in law in every place, and that is the way that we want to go forward… 

We want to be active in, we want to influence, we want to advocate for these rules and safety standards, but I hesitate to say that we should be the complete arbiters… We should work with those in government and regulatory bodies, and in the larger community, there. I worry that the community doesn’t have enough resources to do this. If you want to go verify a large model and run some tests and see how it’s trained, you’re talking about hundreds of billions of dollars of infrastructure. So it’s got to be done by government, because I fear that even a well-intentioned NGO will not be able to get this done.

You’ve said in the past that AI will create more jobs than it takes, but in recent months, IBM announced a decision to replace about 8,000 jobs with AI. Does the company have any plans to use AI to upskill current employees in those sectors, or types of roles it’ll replace versus not?

We’re actually massively upskilling all of our employees on AI. In August, we took a week and ran a challenge inside IBM, where we encouraged all our employees to create what I call mini-applications using WatsonX as a platform – 160,000 of our employees participated for the week, and we had 30,000 teams, who all came up with really cool ideas. We picked the top dozen, which we rewarded, and we got to take those all the way to full production. In the next couple of months, we’ll do it again. So we really are taking a lot of time, we give them a lot of material, we encourage them to go learn about this and see how to use it and deploy it. I’m convinced that will make them much better employees, and it will also make them much more interesting to our clients. So it’s great – they’re good for us, and they’re more marketable, so it’s actually good for them. 

I also think that many people when they hear this – I actually disagree with the way many economists and many people characterize it, that if you make somebody more productive, then you need less of them. That’s actually been false in history. If you are more productive, that means you have a natural economic advantage against your competition, which means you’re going to get more work, which means you’re going to need more people. And I think people forget that – they come from a zero-sum mentality to say it’s a zero-sum game… The world I live in, you’re more competitive, so that means you’re going to get more work, which means you need more people to do that work. So yes, certain roles will shrink because you don’t need so many people doing, maybe, email responses or phone calls, but then it will shift to maybe more applications will get done, or maybe you’ll be advertising to different markets that you previously couldn’t access. So there will be a shift – yes, the first bucket decreases, and everybody fixates on that. By the way, at our scale, that’s 3% of our entire employee population…

I fundamentally believe we’ll get more jobs. There wasn’t an internet job in 1995. How many are there today, 30 million…? There was no CNBC.com in 1995. There was a television channel.

In your eyes, what’s the most over-hyped and under-hyped aspect – specifically – of AI today?

The most overhyped is obviously this existential risk of AI taking over humanity. It is so overhyped that I think it’s fantastical, and I use that word publicly. The most underhyped is the productivity it’s going to bring to every one of the bureaucratic tasks we all live with, inside enterprises and with government.