LinkedIn is launching its new AI-powered people search this week, after what seems like a very long wait for what should have been a natural offering for generative AI.
It comes a full three years after the launch of ChatGPT and six months after LinkedIn launched its AI job search offering. For technical leaders, this timeline illustrates a key enterprise lesson: Deploying generative AI in real enterprise settings is challenging, especially at a scale of 1.3 billion users. It’s a slow, brutal process of pragmatic optimization.
The following account is based on several exclusive interviews with the LinkedIn product and engineering team behind the launch.
First, here’s how the product works: A user can now type a natural language query like, "Who is knowledgeable about curing cancer?" into LinkedIn’s search bar.
LinkedIn's old search, based on keywords, would have been stumped. It would have looked only for references to "cancer." If a user wanted to get sophisticated, they would have had to run separate, rigid keyword searches for "cancer" and then "oncology" and manually piece the results together.
The new AI-powered system, however, understands the intent of the search because the LLM under the hood grasps semantic meaning. It recognizes, for example, that "cancer" is conceptually related to "oncology" and, less directly, to "genomics research." As a result, it surfaces a far more relevant list of people, including oncology leaders and researchers, even if their profiles don't use the exact word "cancer."
The system also balances this relevance with usefulness. Instead of just showing the world's top oncologist (who might be an unreachable third-degree connection), it will also weigh who in your immediate network — like a first-degree connection — is "pretty relevant" and can serve as a crucial bridge to that expert.
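To make that trade-off concrete, here is a minimal sketch of how such a blended score might work. The weights and the per-degree discounts are illustrative assumptions; LinkedIn has not published its actual formula.

```python
# Illustrative only: LinkedIn has not published its scoring formula.
# The idea: blend semantic relevance with network proximity so that a
# "pretty relevant" first-degree connection can outrank a more relevant
# but unreachable third-degree expert.

def blended_score(relevance: float, degree: int, proximity_weight: float = 0.3) -> float:
    """relevance is in [0, 1]; degree is connection distance (1, 2, or 3)."""
    proximity = {1: 1.0, 2: 0.5, 3: 0.1}.get(degree, 0.0)  # assumed discounts
    return (1 - proximity_weight) * relevance + proximity_weight * proximity

candidates = [
    ("World-renowned oncologist", 0.95, 3),
    ("Oncology researcher you know", 0.80, 1),
]
for name, rel, deg in sorted(candidates, key=lambda c: -blended_score(c[1], c[2])):
    print(f"{name}: {blended_score(rel, deg):.3f}")
```

With these assumed weights, the known researcher (0.860) ranks above the unreachable expert (0.695), which is the behavior Berger describes.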
See the video below for an example.
Arguably, though, the more important lesson for enterprise practitioners is the "cookbook" LinkedIn has developed: a replicable, multi-stage pipeline of distillation, co-design, and relentless optimization. LinkedIn had to perfect this on one product before attempting it on another.
"Don't try to do too much all at once," writes Wenjing Zhang, LinkedIn's VP of Engineering, in a post about the product launch, and who also spoke with VentureBeat last week in an interview. She notes that an earlier "sprawling ambition" to build a unified system for all of LinkedIn's products "stalled progress."
Instead, LinkedIn focused on winning one vertical first. The success of its previously launched AI Job Search — which led to job seekers without a four-year degree being 10% more likely to get hired, according to VP of Product Engineering Erran Berger — provided the blueprint.
Now, the company is applying that blueprint to a far larger challenge. "It's one thing to be able to do this across tens of millions of jobs," Berger told VentureBeat. "It's another thing to do this across north of a billion members."
For enterprise AI builders, LinkedIn's journey provides a technical playbook for what it actually takes to move from a successful pilot to a billion-user-scale product.
The new challenge: a 1.3 billion-member graph
The job search product created a robust recipe that the new people search product could build upon, Berger explained.
The recipe started with a "golden data set" of just a few hundred to a thousand real query-profile pairs, meticulously scored against a detailed 20- to 30-page "product policy" document. To scale this for training, LinkedIn used this small golden set to prompt a large foundation model to generate a massive volume of synthetic training data. This synthetic data was used to train a 7-billion-parameter "Product Policy" model — a high-fidelity judge of relevance that was too slow for live production but perfect for teaching smaller models.
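Here is a hedged sketch of what that synthetic-labeling step might look like. The prompt format, score scale, and the `call_llm` helper are hypothetical stand-ins; LinkedIn has not disclosed its prompting setup or which foundation model it used.

```python
# Hypothetical sketch: `call_llm` stands in for whichever foundation model
# LinkedIn used; the few-shot prompt format and 0-3 score scale are assumptions.
import random

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your foundation-model client here")

def label_pair(golden_set: list[dict], query: str, profile: str) -> dict:
    # Few-shot prompt: show human-scored examples from the golden set,
    # then ask the model to score a new (query, profile) pair the same way.
    shots = "\n\n".join(
        f"Query: {ex['query']}\nProfile: {ex['profile']}\nScore: {ex['score']}"
        for ex in random.sample(golden_set, k=3)
    )
    prompt = (
        "Score each profile's relevance to the query from 0 to 3, "
        "following the product policy document.\n\n"
        f"{shots}\n\nQuery: {query}\nProfile: {profile}\nScore:"
    )
    return {"query": query, "profile": profile, "score": call_llm(prompt).strip()}
```

Run at scale over sampled queries and profiles, a loop like this turns roughly a thousand human judgments into the much larger corpus needed to train the 7B judge.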
However, the team hit a wall early on. For six to nine months, they struggled to train a single model that could balance strict policy adherence (relevance) against user engagement signals. The "aha moment" came when they realized they needed to break the problem down. They distilled the 7B policy model into a 1.7B teacher model focused solely on relevance. They then paired it with separate teacher models trained to predict specific member actions, such as job applications for the jobs product, or connecting and following for people search. This "multi-teacher" ensemble produced soft probability scores that the final student model learned to mimic via KL divergence loss.
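For readers who want the mechanics, below is a minimal PyTorch sketch of multi-teacher distillation with a KL divergence loss, the technique described above. The temperature, mixing weights, and tensor shapes are illustrative assumptions, not LinkedIn's published hyperparameters.

```python
# Minimal multi-teacher distillation sketch (assumed hyperparameters).
import torch
import torch.nn.functional as F

def multi_teacher_kl_loss(student_logits, teacher_logits_list, teacher_weights,
                          temperature=2.0):
    """Student mimics a weighted blend of the teachers' soft probabilities."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Blend the relevance teacher and the engagement teacher(s) into one soft target.
    p_target = sum(
        w * F.softmax(t / temperature, dim=-1)
        for w, t in zip(teacher_weights, teacher_logits_list)
    )
    # KL(target || student); the T^2 factor is standard in distillation.
    return F.kl_div(log_p_student, p_target, reduction="batchmean") * temperature**2

# Example: a relevance teacher (from the 1.7B policy distillate) plus an
# engagement teacher (connect/follow prediction), weighted 70/30 (assumed).
student_logits = torch.randn(8, 2, requires_grad=True)  # batch of 8, 2 classes
teacher_logits = [torch.randn(8, 2), torch.randn(8, 2)]
loss = multi_teacher_kl_loss(student_logits, teacher_logits, [0.7, 0.3])
loss.backward()
```

Note that the student never sees hard labels here: it learns the teachers' full probability distributions, which is what lets a small student approximate the judgment of far larger models.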
The resulting architecture operates as a two-stage pipeline. First, a larger 8B parameter model handles broad retrieval, casting a wide net to pull candidates from the graph. Then, the highly distilled student model takes over for fine-grained ranking. While the job search product successfully deployed a 0.6B (600-million) parameter student, the new people search product required even more aggressive compression. As Zhang notes, the team pruned their new student model from 440M down to just 220M parameters, achieving the necessary speed for 1.3 billion users with less than 1% relevance loss.
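In rough Python pseudocode, the cascade looks like this. The function names and candidate-pool size are assumptions; only the overall shape (broad, cheap retrieval followed by expensive, fine-grained ranking on a small pool) comes from LinkedIn's description.

```python
# Conceptual sketch of the two-stage cascade; identifiers are hypothetical.

def search_people(query: str, member_index, retriever_8b, ranker_220m, k=10):
    # Stage 1: the larger model embeds the query and pulls a wide pool of
    # candidates from the member graph (low per-candidate cost).
    query_embedding = retriever_8b.embed(query)
    candidates = member_index.nearest_neighbors(query_embedding, top_n=1000)

    # Stage 2: the pruned 220M student scores each (query, profile) pair
    # in detail; it is expensive, so it only sees the retrieved pool.
    scored = [(ranker_220m.score(query, c.profile), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]
```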
But applying this recipe to people search broke the old architecture. The new problem was not just ranking but retrieval at an entirely different scale.
“A billion records," Berger said, is a "different beast."
The team’s prior retrieval stack was built on CPUs. To handle the new scale and the latency demands of a "snappy" search experience, the team had to move its indexing to GPU-based infrastructure. This was a foundational architectural shift that the job search product did not require.
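LinkedIn did not name its specific index technology. The sketch below uses FAISS, a common open-source library for GPU-accelerated nearest-neighbor search, purely to illustrate the general pattern of moving a vector index onto GPUs; the dimensions and index type are assumptions.

```python
# Illustrative only: LinkedIn has not said which GPU index it uses.
import numpy as np
import faiss  # requires the faiss-gpu build

d = 256                                               # embedding dim (assumed)
embeddings = np.random.rand(100_000, d).astype("float32")

cpu_index = faiss.IndexFlatIP(d)                      # inner-product similarity
gpu_index = faiss.index_cpu_to_all_gpus(cpu_index)    # shard across available GPUs
gpu_index.add(embeddings)

query = np.random.rand(1, d).astype("float32")
scores, ids = gpu_index.search(query, 100)            # wide candidate pool
```

At a billion-record scale, a flat index would give way to a sharded, quantized one, but the CPU-to-GPU move is the shift Berger described.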
Organizationally, LinkedIn benefited from letting multiple approaches compete. For a time, two separate teams — job search and people search — attempted to solve the problem in parallel. But once the job search team achieved its breakthrough using the policy-driven distillation method, Berger and his leadership team intervened. They brought over the architects of the job search win — product lead Rohan Rajiv and engineering lead Wenjing Zhang — to transplant their "cookbook" directly to the new domain.
Distilling for a 10x throughput gain
With the retrieval problem solved, the team faced the ranking and efficiency challenge. This is where the cookbook was adapted with new, aggressive optimization techniques.
Zhang’s technical post (I’ll insert the link once it goes live) provides the specific details our audience of AI engineers will appreciate. One of the more significant optimizations was input size.
To feed the model, the team trained another LLM with reinforcement learning (RL) for a single purpose: to summarize the input context. This "summarizer" model was able to reduce the model's input size by 20-fold with minimal information loss.
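LinkedIn has not published the reward design for this RL step. One plausible shape, consistent with the stated goal of heavy compression with minimal information loss, would reward shorter summaries while penalizing any drift in the downstream ranker's scores:

```python
# Speculative sketch of a summarizer reward; LinkedIn's actual design is unpublished.

def summarizer_reward(ranker, query: str, full_context: str, summary: str,
                      alpha: float = 1.0, beta: float = 5.0) -> float:
    # Reward compression: how much shorter the summary is than the original.
    compression = 1.0 - len(summary) / max(len(full_context), 1)
    # Penalize information loss, proxied by how far the ranker's relevance
    # score moves when it sees the summary instead of the full context.
    drift = abs(ranker.score(query, full_context) - ranker.score(query, summary))
    return alpha * compression - beta * drift
```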
The combined result of the 220M-parameter model and the 20x input reduction? A 10x increase in ranking throughput, allowing the team to serve the model efficiently to its massive user base.
Pragmatism over hype: building tools, not agents
Throughout our discussions, Berger was adamant about something else that might catch people's attention: The real value for enterprises today lies in perfecting recommender systems, not in chasing "agentic hype." He also declined to name the specific models the company used for the searches, suggesting it almost doesn't matter; LinkedIn selects whichever model it finds most efficient for the task.
The new AI-powered people search is a manifestation of Berger’s philosophy that it’s best to optimize the recommender system first. The architecture includes a new "intelligent query routing layer," as Berger explained, that itself is LLM-powered. This router pragmatically decides if a user's query — like "trust expert" — should go to the new semantic, natural-language stack or to the old, reliable lexical search.
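As a rough illustration, such a router could be as simple as a classification prompt. Everything below, from the prompt wording to the labels and the `call_llm` stand-in, is an assumption based on Berger's description, not LinkedIn's implementation.

```python
# Hypothetical sketch of an LLM-powered query router.

ROUTER_PROMPT = """Classify this people-search query.
Answer SEMANTIC if it expresses a topic or intent best served by
natural-language understanding (e.g., "trust expert").
Answer LEXICAL if it is a name or exact keyword lookup (e.g., "Jane Doe").
Query: {query}
Answer:"""

def route_query(query: str, call_llm) -> str:
    """Returns which stack should serve the query; `call_llm` is a stand-in."""
    label = call_llm(ROUTER_PROMPT.format(query=query)).strip().upper()
    return "semantic_stack" if label.startswith("SEMANTIC") else "lexical_stack"
```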
This entire, complex system is designed to be a "tool" that a future agent will use, not the agent itself.
"Agentic products are only as good as the tools that they use to accomplish tasks for people," Berger said. "You can have the world's best reasoning model, and if you're trying to use an agent to do people search but the people search engine is not very good, you're not going to be able to deliver."
Now that people search is available, Berger suggested that the company will one day offer agents that use it, though he didn't provide details on timing. He also said the recipe used for job and people search will spread across the company's other products.
For enterprises building their own AI roadmaps, LinkedIn's playbook is clear:
- Be pragmatic: Don't try to boil the ocean. Win one vertical, even if it takes 18 months.
- Codify the "cookbook": Turn that win into a repeatable process (policy docs, distillation pipelines, co-design).
- Optimize relentlessly: The real 10x gains come after the initial model, in pruning, distillation, and creative optimizations like an RL-trained summarizer.
LinkedIn's journey shows that for real-world enterprise AI, emphasis on specific models or cool agentic systems should take a back seat. The durable, strategic advantage comes from mastering the pipeline — the "AI-native" cookbook of co-design, distillation, and ruthless optimization.
(Editor's note: We will be publishing a full-length podcast with LinkedIn's Erran Berger, which will dive deeper into these technical details, on the VentureBeat podcast feed soon.)