

Recommendation systems, traditionally used by e-commerce platforms to suggest products to consumers, have evolved significantly with the advent of Large Language Models (LLMs). LLMs add a layer of text comprehension to the recommendation process, which extends what these systems can do.

Text understanding capabilities of AI have ushered in efficient recommendation systems, particularly when user-related textual data is available.

This article explores using Large Language Models (LLMs) to identify top influencers for any offer, based on their social media content.

The process we’re discussing has been successfully implemented for a social media aggregator startup, simplifying access to influencer content. However, its potential extends far beyond this specific use-case. In fact, any company can replicate this process using publicly available content, making it a versatile tool for identifying the best influencers for any given offer.

The implementation of semantic recommendation systems has yielded remarkable results, enabling us to pinpoint the most suitable influencers for any particular offer at any given time.

1. The problem we are solving

The challenge of recommending suitable products to users is not a recent phenomenon. Traditional e-commerce recommendation systems, popularized since 2017, primarily rely on users’ interaction data with a webpage to comprehend user behaviors. These systems then suggest products to users based on the interactions of other users, aiming to identify products that may pique the user’s interest.

The traditional recommendation method focuses on analyzing user interactions with a website, such as page views, clicks, and transactions. Although it can include some user information, it primarily emphasizes the user’s actions rather than their inherent traits. This approach, while effective, lacks a comprehensive understanding of the user’s intrinsic characteristics.

The problem we are addressing is the scarcity of interaction data. However, in many instances, we have access to user-generated textual content, particularly from their social media pages and posts. This rich source of information provides an alternative to traditional interaction data, offering insights into user preferences and behaviors that can be leveraged in influencer identification.


Social media content is valuable because it provides detailed insights into a person’s interests, personality traits, and buying tendencies over time. The adage “social media knows you better than your life partner” holds true in this context. This knowledge can be leveraged to identify the best influencers for any offer, a problem that businesses often grapple with.


We are tackling the challenge of creating an AI model, utilizing language models like OpenAI’s GPT-4, to decipher user interests from social media content. The goal is to identify the most influential figures for a specific commercial offer by employing a scoring method. This innovative approach aims to streamline the process of finding the right influencers to promote a product or service, thereby enhancing marketing strategies and boosting business growth.


2. What are semantic recommendation systems

Recommendation systems are a specialized branch of AI designed to predict the most suitable item for a user based on specific algorithms and relevant historical data. The complexity of personalizing recommendations lies in the continuous process of learning, predicting, and adapting to each user’s unique preferences. The goal is to deliver an engaging, seamless, and highly personalized experience. Before we delve into the nuances of semantic recommendation systems, it’s important to understand traditional recommendation systems. These systems are the foundation upon which semantic approaches are built, addressing key tasks and employing associated data and models. In the following sections, we will begin by reviewing how traditional recommendation systems operate before exploring the unique aspects of semantic recommendation systems.

Traditional recommendation systems


Traditional recommendation systems utilize various data types to discern user patterns and offer personalized suggestions. 

  • User data
    User data includes demographics and psychographics, which are often anonymized and segmented for better personalization.
  • Product data
    Product (or item) data describes the characteristics of the recommended items, such as descriptions, specifications, pricing, and author or manufacturer information, which are crucial for crafting precise recommendations. This data can be enriched with external sources such as customer reviews, ratings, and even social media trends to improve the system’s understanding of each item’s appeal and relevance. By analyzing this enriched product data, recommendation systems can more accurately match products with users’ preferences and needs, leading to a more personalized shopping experience.
  • Interaction data
    Interaction data captures user interactions with items, including ratings, viewing history, and purchase history, and is crucial for understanding user preferences. It is often complemented by contextual data, which describes the context in which interactions occur and helps explain situational factors influencing user choices.

Model (aka algorithm)

Traditional recommendation systems employ various algorithms to provide personalized suggestions. Collaborative Filtering, for instance, works like a friend recommending a movie based on your shared preferences. It uses past behavior to suggest items liked by people with similar tastes. Content-Based Filtering, on the other hand, is akin to a friend who knows your tastes so well, they suggest new items based on what you’ve liked before. It analyzes the properties of items you’ve liked and recommends similar ones. Hybrid Systems combine the two, leveraging both methods to provide more accurate recommendations. They suggest items that are popular among similar users and resemble your past preferences. These systems are like having the insights of all your friends combined, providing you with the most suitable recommendations.
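To make the collaborative filtering idea concrete, here is a minimal sketch of a user-based variant in plain Python. The ratings, user names, and the similarity-weighted scoring rule are illustrative assumptions, not the method used in the project described in this article.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}. All names and values are illustrative.
ratings = {
    "alice": {"film_a": 5, "film_b": 4, "film_c": 1},
    "bob": {"film_a": 5, "film_b": 5, "film_d": 4},
    "carol": {"film_c": 5, "film_d": 2},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user, ratings, top_n=1):
    """Score items the user has not rated by similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("alice", ratings))
```

A content-based variant would replace the rating dicts with item feature vectors and compare items rather than users.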

Limitations of traditional recommendation systems

Traditional recommendation systems are limited in several ways. Architecturally, they primarily depend on user-platform interactions, often overlooking the user’s inherent characteristics. Additionally, these systems are not designed to process unstructured data, such as user descriptions, product details, or user-generated content like social media posts. This inability to utilize such rich data sources poses a significant limitation to their effectiveness.


Semantic recommendation systems

Semantic recommendation systems, unlike traditional systems, don’t rely on interaction data but use textual data to characterize users and products. They construct a semantic representation of each entity and compare the two to produce a numerical score indicating the product’s relevance to the user. These systems rely on a feature of large language models called embeddings. An embedding is a vector representation of a text: a sequence of numbers that encodes the text’s meaning in a semantic space, so that texts with similar meanings end up close to each other. Thus, while both semantic and traditional systems aim to suggest the best products, the way they score relevance to the user is significantly different.
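As a rough illustration of scoring with embeddings, the sketch below compares a user vector with two product vectors using cosine similarity. The three-dimensional vectors and the product names are made-up stand-ins; real embeddings come from an embedding model and typically have hundreds or thousands of dimensions.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy vectors standing in for real embeddings (assumption for illustration).
user_vec = [0.9, 0.1, 0.0]
product_vecs = {
    "face_serum": [0.8, 0.2, 0.1],
    "game_controller": [0.0, 0.2, 0.9],
}

# One relevance score per product, then pick the closest one to the user.
scores = {name: cosine_similarity(user_vec, vec) for name, vec in product_vecs.items()}
best_match = max(scores, key=scores.get)
print(best_match, scores)
```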


How to select the right approach (traditional vs. semantic)?

Semantic recommendation systems differ from traditional ones in that they use textual data to understand correlations between products and users. The choice between the two approaches depends on the problem, the available data, the significance of textual data, and the volume and variety of interaction data. As a rule of thumb, the scarcer the interaction data and the richer the available text, the stronger the case for a semantic approach.

Can we combine and get the best of the two approaches?

In some cases, we can combine the two approaches. By utilizing both interaction data (clicks, page views, transactions) and textual data (descriptions of users and offers), these systems first recommend products based on interaction data. Then, a second filter is applied using textual data, refining the selection for increased relevance. This dual-approach system can yield highly accurate recommendations when sufficient and varied data is available.
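The two-stage idea can be sketched as a shortlist step followed by a re-ranking step. In this minimal example the interaction and text scores are assumed to be precomputed, and the function name, parameters, and values are hypothetical.

```python
def hybrid_recommend(candidates, interaction_score, text_score, shortlist_size=3, top_n=2):
    """Shortlist by interaction score, then re-rank the shortlist by text score."""
    shortlist = sorted(candidates, key=lambda c: interaction_score[c], reverse=True)
    shortlist = shortlist[:shortlist_size]
    return sorted(shortlist, key=lambda c: text_score[c], reverse=True)[:top_n]


# Illustrative scores only.
candidates = ["a", "b", "c", "d"]
interaction_score = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
text_score = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.99}

print(hybrid_recommend(candidates, interaction_score, text_score))
```

Note that item "d" is dropped in the first stage despite its high text score, which is the trade-off of applying the filters in sequence rather than blending the two scores.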

3. Technical approach

Our technical approach begins with a database of approximately 250,000 influencers, covering a vast array of fields. We also have a promotional offer for a brand that could belong to any sector. The challenge is to identify the influencers with the greatest potential for the specific brand from this vast pool. We have data about the offer and its content, presented as a 200-word paragraph, and extensive data for the influencers, including all their social media content from recent years. Additionally, we have quantitative data about the influencers, such as their social media follower count, and the number of views and clicks on each of their content. This comprehensive data set forms the basis of our technical approach to finding the best influencers for any offer using semantic recommendation systems.

Step 1: scoring metrics definitions

We start by defining two scoring criteria to assess an influencer’s relevance to an offer. The first, a quantitative criterion, measures the influencer’s power of influence through their follower count, number of posts, and content views and clicks. The second, a qualitative criterion, gauges how well the influencer’s content aligns with the offer. By merging these two criteria into one score, we can rank influencers and place the most relevant ones at the top.
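The article does not specify how the quantitative criterion is computed. One plausible sketch, assuming a weighted sum of log-scaled counts (the weights and the log scaling are assumptions chosen for illustration), looks like this:

```python
from math import log1p


def influence_power(followers, posts, views, clicks, weights=(0.4, 0.1, 0.3, 0.2)):
    """Weighted sum of log-scaled counts; the weights are hypothetical, not the author's."""
    features = (followers, posts, views, clicks)
    # log1p dampens very large counts so one huge account does not dominate the scale.
    return sum(w * log1p(x) for w, x in zip(weights, features))


big_account = influence_power(followers=100_000, posts=200, views=50_000, clicks=1_000)
small_account = influence_power(followers=1_000, posts=200, views=50_000, clicks=1_000)
print(big_account, small_account)
```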

Step 2: compute topic relevance and influence power

We calculate semantic relevance by computing the embedding of each influencer post and the embedding of the offer using a Large Language Model. Each post is then scored against the offer using a similarity metric, either cosine similarity or Euclidean distance. These per-post scores capture the topic similarity between the offer and the influencer’s content.
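The sketch below shows both metrics and one possible way to turn per-post scores into a single relevance score per influencer. The toy vectors stand in for real embeddings, and averaging the top-k posts is an assumption; the article does not say how per-post scores are combined.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Higher means more similar; ranges from -1 to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean_distance(a, b):
    """Lower means more similar; 0 for identical vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def influencer_relevance(offer_vec, post_vecs, top_k=2):
    """Mean cosine similarity of the top_k most relevant posts (an assumed rule)."""
    scores = sorted((cosine_similarity(offer_vec, p) for p in post_vecs), reverse=True)
    best = scores[:top_k]
    return sum(best) / len(best) if best else 0.0


# Toy vectors standing in for real embeddings.
offer = [1.0, 0.0, 0.0]
beauty_creator_posts = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2], [0.1, 0.9, 0.0]]
gaming_creator_posts = [[0.0, 0.1, 0.9], [0.1, 0.0, 0.9]]

print(influencer_relevance(offer, beauty_creator_posts))
print(influencer_relevance(offer, gaming_creator_posts))
```

Keeping only the top-k posts avoids penalizing a creator whose feed covers several topics, at the cost of ignoring how often they post about the offer's topic.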

Step 3: build an empirical aggregation function

After calculating the influence power and semantic relevance scores for each influencer, we apply an aggregation function that combines the two into a single score per influencer. This single score represents an influencer’s combined power of influence and semantic relevance, so the final ranking accounts for both their reach and the relevance of their content to the offer.
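The exact aggregation function is described only as empirical, so the following is a hedged sketch of one common choice: min-max normalize each score across influencers, then blend them with a weight. The `alpha` parameter, the function names, and the data are hypothetical.

```python
def min_max(values):
    """Rescale values to the [0, 1] range; constant inputs map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_influencers(names, influence, relevance, alpha=0.5):
    """Blend normalized relevance and influence; alpha weights relevance (assumed)."""
    inf_n = min_max(influence)
    rel_n = min_max(relevance)
    combined = {n: alpha * r + (1 - alpha) * i for n, i, r in zip(names, inf_n, rel_n)}
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative inputs: "a" has the most reach, "b" the most relevant content.
ranking = rank_influencers(
    names=["a", "b", "c"],
    influence=[10.0, 5.0, 1.0],
    relevance=[0.2, 0.9, 0.8],
)
print(ranking)
```

In practice the weight would be tuned against labeled examples, which is presumably what makes the function empirical.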

4. Results

Performance metrics: precision, recall, and macro-F1 scores

Before we delve into the results, it’s crucial to understand our performance evaluation method. We categorize recommendations into ‘positive creators’ (correctly targeted creators that match search criteria) and ‘negative creators’ (incorrect recommendations). We then measure ‘precision’ (the share of recommended creators that are actually relevant), ‘recall’ (the share of all relevant creators that the system recommends) and ‘F1 score’ (the harmonic mean of precision and recall). These metrics help us assess the effectiveness of our semantic recommendation systems in identifying the best influencers.
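These definitions translate directly into code. The sketch below computes the three metrics from a set of recommended creators and a set of creators known to be relevant; the creator names are placeholders.

```python
def precision_recall_f1(recommended, relevant):
    """Return (precision, recall, F1) for a set of recommendations."""
    recommended, relevant = set(recommended), set(relevant)
    true_positives = len(recommended & relevant)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Placeholder creators: 4 recommended, 3 relevant, 2 in common.
p, r, f1 = precision_recall_f1(
    recommended={"creator_a", "creator_b", "creator_c", "creator_d"},
    relevant={"creator_a", "creator_b", "creator_e"},
)
print(p, r, f1)
```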

Performance considerations

Evaluating the results of using semantic recommendation systems to identify the best influencers for any offer is not as straightforward as it may seem. The results are not absolute and are influenced by the quality and relevance of the data used. In other words, if the initial database lacks creators on a specific topic, the system cannot return any suitable recommendations. This underlines the importance of having a diverse and comprehensive database for the system to draw from.

Additionally, it’s crucial to compare the performance of the system with that of a human performing the same task. This is because the relative performance of the model is often more significant than its absolute performance. For instance, if a human can manually identify better-suited influencers for a certain offer, then the system may need further refinement. In conclusion, while semantic recommendation systems can be powerful tools in identifying influencers, their effectiveness is heavily reliant on the data used and should be evaluated in comparison to human performance.

Test results

We conducted a test on our model using an initial database of content and two distinct offers: skincare products and merchandise items. The aim was to evaluate the system’s ability to accurately identify the most influential individuals for these specific offers, using semantic recommendation systems. The results would provide an understanding of the model’s effectiveness and efficiency in real-world applications.

Metric      Skincare brand   Merchandise products
Precision   70%              75%
Recall      92%              87%
F1          80%              81%
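As a quick sanity check, the reported F1 values follow from the reported precision and recall when F1 is taken as their harmonic mean. The short script below recomputes them and rounds to the nearest percent.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the results above.
reported = {"skincare": (0.70, 0.92), "merchandise": (0.75, 0.87)}
f1_percent = {offer: round(100 * f1_score(p, r)) for offer, (p, r) in reported.items()}
print(f1_percent)  # {'skincare': 80, 'merchandise': 81}
```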

The primary advantage of using semantic recommendation systems is not just the enhanced accuracy in identifying the best influencers for any offer, but also the speed at which these recommendations are made. This rapidity in providing high-quality recommendations is something that manual methods could never match, thereby making this model an invaluable tool for businesses.

To enhance the model’s effectiveness, the subsequent step would be to incorporate more significant content. This would not only augment the semantic footprint of each creator but also boost the overall precision of the results. This will ensure a more accurate identification of the most suitable influencers for any given offer.



The advent of semantic recommendation systems marks a pivotal shift in the landscape of influencer marketing, offering an innovative and highly effective method for identifying the best influencers for any brand. Through the integration of Large Language Models (LLMs) and analysis of social media data, these systems not only address the limitations of traditional recommendation approaches but also harness the rich, unstructured textual data available on social media platforms. The project discussed in this article showcases the potential of semantic recommendation systems to transform the way brands connect with influencers, providing a method that is both precise and efficient.

The results of implementing this approach speak volumes about its efficacy. With notable precision, recall, and F1 scores for different product categories, the system demonstrates its ability to accurately match influencers to brands, significantly improving upon the time-consuming processes traditionally employed. Furthermore, the system’s ability to rapidly generate high-quality recommendations underlines its potential as a tool for businesses looking to optimize their influencer marketing strategies.

As we continue to refine and expand these systems, their role in shaping the future of marketing and brand promotion becomes increasingly evident, promising a new era of precision-driven influencer marketing.

Mehdi B.


Mehdi is the founder of reco-genius.com, an AI agency specializing in performance solutions for reward platforms. He brings over a decade of private equity experience and a flair for innovative tech solutions. Mehdi is a software engineer, a graduate of École Polytechnique (aka "The French MIT"). He also holds a Professional Certificate in AI from Stanford and the AWS Machine Learning Certification.