Shubham's blog

Conversational Intelligence Platform for Indian BFSI

I did this analysis as part of a hiring assignment for a company looking to enter the Conversational Intelligence market.

PART 1: MARKET RESEARCH

Market Landscape

Key Players in the Conversation Intelligence (CI) Market:

Convin AI

Mihup

Uniphore

Observe.ai

Other players such as Corover.ai and Yellow.ai offer conversational platforms but do not offer CI as a solution.

The Opportunity

Based on the existing market players, there is still a large white space for deep conversational intelligence for sales. Except for Convin, no one seems to be solving for sales intelligence. BFSI as an industry is huge and the need for industry-specific solutions is prominent, which again are absent from all except Convin at this moment.

Problem Statement

Existing CI platforms for sales intelligence in Indic languages suffer from low ASR (automatic speech recognition) accuracy and are too focused on agent-level insights & customer support, thereby losing focus on sales intelligence, market intelligence & product insights from call data.

In BFSI, where there is a high churn of agents, clients will be willing to pay a premium for in-depth agent coaching modules, but if the platform can be made useful for all teams within the firm, it makes for even stronger stickiness beyond just the sales team. Customer understanding can go beyond surveys (which can be biased) towards hard numbers derived from multiple conversation data points.

Hypothesis

Building a highly accurate call transcription capability across languages, combined with audio data & multiple small language models that extract deep customer, product & market-related insights from bulk conversational data, will empower not just the sales team but also the Product, Marketing & UX teams and the CXOs of the firm in their decision making.

Target Audience

Primary Target: Business Heads

The initial target audience for pitching the product will be Business Heads, since our end goal is to increase sales conversion while also offering key market intelligence, customer insights & the product improvements needed. Business Heads/Senior leaders also have a direct say in the P&L, which allows for fast-tracking closure with potential clients. Targeting only sales leaders will limit the scope of the product and might fail to draw attention to our overall capabilities. With the Digital Personal Data Protection Act (DPDP) coming into the picture, every inbound lead counts, which makes such intelligence all the more valuable instead of being left untapped.

Currently, most sales leaders are reliant on the feedback of 1-2 top agents as to what's working and what's not working. Improving agent productivity is not data-driven, nor is compliance adherence. Regulatory compliance alone will be difficult to sell as a solution. Hence, compliance adherence improvement will be an add-on to the potential P&L impact we are proposing.

End Users: Team Leads

The primary end users for the Sales intelligence platform will be the team leads based out of call centres who manage agents. If the team lead who has the pressure to achieve their daily targets sees a clear value in this → they will adopt the tool quickly, work towards implementing agent improvement recommendations and give the results that the company wants. Good feedback from team leads will convince the business managers of more adoption as they see an increase in on-ground sales.

Differentiation

Along with owning the communication suite, a major way in which a company can differentiate itself is through strength in ASR for Indic languages, built by fine-tuning open-source language models. Building vertical-specific expertise will further allow us to capture value from the BFSI market, which is largely conversation-driven.

If the company owns the communication suite, it can bundle the communication suite with the intelligence that comes with it. This is especially valuable for SMBs, which would not want to manage multiple vendors, and it allows us to fast-track the onboarding of new clients.

Customer Segment

The initial focus will be to capture the high-growth Indian SMB market through our competitive pricing and then move up-market. Our technological superiority should give us an edge in pricing by building efficient ASR & small language models that allow for cheaper and faster inference, without a huge decrease in accuracy. This market is a complete blue ocean which has not been prominently focused on.

The deep integration with Indian SMBs will allow us to upsell future solutions as part of an integrated AI suite. This also allows us to strengthen the product & understand the nuances of tier 3-4 Indian cities, which can translate into strengths for our ASR & TTS API.

Indian SMBs in the BFSI market would involve:

These Indian SMBs often deal with multiple regional languages and dialects, which plays to the company's strength: speech capabilities that cater to Indic languages with high transcription accuracy. Combined with our conversational suite, CI will offer a compelling proposition as it reduces the complexity of having to integrate with multiple vendors. This market positioning allows us to build on our product's strengths, which no other firm has as of now in India.

PART 2: PRODUCT VISION & STRATEGY

Product Vision

Value Proposition

Compliance Insurance: Continued non-compliance can result in a huge penalty for many of these firms, leading to a complete business shutdown. A CI platform will not be a cost centre but will be an insurance policy against heavy fines due to non-compliance.

Sales Optimisation: On the sales end, a CI platform can reduce agent onboarding time (solving for lean period during high churn), replicate success at scale by improving agents' learning, thereby improving conversion & sales.

Business Intelligence: It can help generate trend-wise market intelligence, customer insights and product insights for use across the company → thereby surfacing unmet customer needs, required changes in risk policy, etc.

Given the complexities involved in the sales process & that it is a profit centre - the focus can first be on pitching CI for sales.

Success Metrics

Before measuring the below metrics, it will be important to measure the baseline pre-implementation of the CI platform so that a clear impact of using CI can be measured. A control group can also be created wherein the performance increase of the test group against the control group can be measured; this will also make it easy for decision-makers to try the product.
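As a rough illustration of the baseline-plus-control-group idea, the sketch below compares the change in conversion rate for the test group against the change for the control group (a simple difference-in-differences). The function names and all the numbers are hypothetical and only show the arithmetic.

```python
def conversion_rate(converted: int, calls: int) -> float:
    """Share of calls that ended in a sale."""
    if calls <= 0:
        raise ValueError("calls must be positive")
    return converted / calls


def diff_in_diff(test_before: float, test_after: float,
                 control_before: float, control_after: float) -> float:
    """Change in the test group minus the change in the control group."""
    return (test_after - test_before) - (control_after - control_before)


# Hypothetical numbers, for illustration only.
test_before = conversion_rate(40, 1000)     # test group, before the CI platform
test_after = conversion_rate(55, 1000)      # test group, with the CI platform
control_before = conversion_rate(42, 1000)  # control group, same earlier period
control_after = conversion_rate(45, 1000)   # control group, no CI platform

uplift = diff_in_diff(test_before, test_after, control_before, control_after)
print(f"Uplift attributable to CI: {uplift * 100:.1f} percentage points")
```

Subtracting the control group's change removes effects such as seasonality that would have moved both groups anyway.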

North Star Metrics

Supporting Metrics

PART 3: MVP DEFINITION & DESIGN

MVP Core Features

The initial MVP should be made to solve the most acute pain point of our customers. For conversational sales intelligence, the dashboards need to have:

Included in MVP:

Excluded from MVP:

UX Design

The design is here. Please download the HTML file and render it in Google Chrome. Made with the help of Gemini-Pro-2.5 by prompting an outline and design of the dashboard with the details.

PART 4: GO-TO-MARKET CONSIDERATIONS

Pricing & Packaging

The best pricing method would be usage-based pricing, i.e. based on the number of hours analysed, with a free demo limited to a certain period and a certain number of hours. This allows the client to test quickly whether the data ingestion, key insights and report creation meet their expectations, and to specify the level of detail they want in the final product. The idea will be to get the key features in and then sell additional value-added services, the most effective playbook for most B2B SaaS businesses.

Partnerships with CRM providers used by Indian businesses (Zoho, Leadsquared), and telephony providers (Exotel, Genesys) can help reduce the time-to-value of the CI platform for our clients.

PART 5: THE AI ADVANTAGE - BEYOND TRANSCRIPTION

Product and Platform Features

Building a commanding presence in the BFSI market will require deep domain knowledge and an understanding of the sales process of each client. As a best practice, taking 50-100 calls manually for sales will be a good starting point for the TPM/PM to understand what it is they are dealing with and what modifications need to be made.

We should build on the AI advantage in the following areas, which will allow us to create a deep moat. P0 are the most important features, followed by P1.

P0 Features (Critical)

Highly Accurate Transcriptions

Inaccurate transcriptions across languages which do not capture the language-specific nuance remain a major point of concern for businesses with CI providers.

Each company has its own terminology, product names, etc. To further improve our transcription, our base audio language model can be fine-tuned with text from the client's blogs and webpages so that all terminology is captured correctly in the transcription. A steady reduction in WER (word error rate) at a company level will build trust in the product amongst all the stakeholders in the organisation using it.
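For reference, WER (word error rate) is the number of word substitutions, deletions and insertions needed to turn the transcript into a human-verified reference, divided by the number of words in the reference. A minimal sketch of that calculation follows; the function name and the sample sentences are illustrative, and a production setup would also normalise punctuation and numerals before comparing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Illustrative example: one misheard word out of six.
wer = word_error_rate("please share your pan card number",
                      "please share your pen card number")
print(f"WER: {wer:.3f}")
```

Tracking this number per client on a fixed, manually transcribed sample of calls is one way to show the reduction over time.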

Lead Prioritisation & Automated Next Best Action on a Call

Automatically identify the next steps on the call & tag a disposition against the same along with reason:

Disposition tagging of each call, i.e. tagging the result after the call has happened, is done manually today. Often, agents in a hurry to attend more calls don't correctly fill out the dispositions. These dispositions are extremely critical since they decide how many customers are not interested, not eligible for the loan or need to be followed up. Automating this will improve the accuracy of tagging.

Our LM should have enough context that, based on the details given by the user in the transcript, it should be able to infer if a user is eligible for a particular loan/policy. If a user is 'Not eligible', the disposition should be tagged so, and the reason can be entered as 'sub-disposition' of the call.

If a customer needs to be followed up, relevant information can be extracted (when to follow up, key points to discuss next) and fed into the CRM to be logged against that lead ID. In that way, the agent, when he/she talks next, has the context of what needs to be discussed and what the next steps are.
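To make the disposition flow above concrete, here is a minimal sketch of the record that could be written to the CRM once the language model has extracted the details from a transcript. The field names, the allowed disposition values and the `to_crm_payload` helper are all assumptions for illustration; a real integration would follow the schema of the client's CRM.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
from typing import Optional

# Hypothetical disposition values; each client will have its own list.
ALLOWED_DISPOSITIONS = {"Interested", "Not interested", "Not eligible", "Follow up"}


@dataclass
class CallDisposition:
    lead_id: str
    disposition: str
    sub_disposition: Optional[str] = None  # e.g. the reason for "Not eligible"
    follow_up_on: Optional[date] = None
    talking_points: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.disposition not in ALLOWED_DISPOSITIONS:
            raise ValueError(f"unknown disposition: {self.disposition!r}")
        if self.disposition == "Follow up" and self.follow_up_on is None:
            raise ValueError("a follow-up needs a date")


def to_crm_payload(record: CallDisposition) -> dict:
    """Convert the record into a JSON-friendly dict keyed by lead ID."""
    payload = asdict(record)
    if record.follow_up_on is not None:
        payload["follow_up_on"] = record.follow_up_on.isoformat()
    return payload


# Illustrative values only.
record = CallDisposition(
    lead_id="LEAD-001",
    disposition="Follow up",
    follow_up_on=date(2025, 1, 15),
    talking_points=["Share the processing fee breakdown"],
)
payload = to_crm_payload(record)
print(payload)
```

Validating the record before it reaches the CRM keeps automated tags at least as clean as a mandatory manual form would.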

Predictive conversion: We can build emotion models (beyond just sentiment models that rely only on text) that combine both the audio and the transcript to track emotion changes throughout the call. Emotion changes can be tracked by analysing changes in F0 (fundamental frequency, i.e. pitch), speed of speech & other vocal patterns. Pairing audio conversations with CRM data can help build a classification model that classifies a conversation as high, medium or low conversion probability.

We can assign a score or classify leads into buckets as high conversion, medium conversion or low conversion. This conversion model will be trained on pairs of audio + transcripts with call status data from CRM (successfully converted, not converted). Mapping emotional state changes throughout the call (with base emotional state as the state during the first 15 seconds of the call) can help predict if a customer is truly interested and worth pursuing → thereby allowing for lead prioritisation. This improves agent productivity by targeting highly interested leads & reduces customer dissatisfaction with agents calling multiple times to disinterested customers.
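The sketch below shows only the two simplest pieces of this idea: measuring how mean F0 drifts relative to the first-15-second baseline, and mapping a model score to a bucket. It assumes F0 has already been extracted as (time in seconds, F0 in Hz) pairs by some pitch tracker, and that the score comes from a trained classifier that is not shown here. The 30-second window and the 0.7/0.3 thresholds are placeholders, not tuned values.

```python
from statistics import mean

BASELINE_SECONDS = 15.0  # the first 15 seconds define the base emotional state


def f0_drift(samples: list[tuple[float, float]], window: float = 30.0) -> list[float]:
    """Relative change in mean F0 per window, versus the baseline period."""
    baseline_values = [f0 for t, f0 in samples if t < BASELINE_SECONDS]
    if not baseline_values:
        raise ValueError("no samples in the baseline window")
    baseline = mean(baseline_values)
    windows: dict[int, list[float]] = {}
    for t, f0 in samples:
        if t >= BASELINE_SECONDS:
            index = int((t - BASELINE_SECONDS) // window)
            windows.setdefault(index, []).append(f0)
    return [mean(values) / baseline - 1.0 for _, values in sorted(windows.items())]


def conversion_bucket(score: float, high: float = 0.7, low: float = 0.3) -> str:
    """Map a conversion probability from the classifier to a lead bucket."""
    if score >= high:
        return "high"
    if score >= low:
        return "medium"
    return "low"


# Illustrative samples: pitch rises over the call.
samples = [(0.0, 200.0), (10.0, 200.0), (20.0, 210.0), (40.0, 220.0), (50.0, 230.0)]
drift = f0_drift(samples)
print([round(d, 3) for d in drift], conversion_bucket(0.82))
```

In practice the per-window drift values would be features fed into the conversion model alongside transcript features, not a prediction on their own.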

Market Insights + Product/Process Insights

Our CI platform can help make Call centres the most powerful market research tool for our clients. This is something that most tools today are not strong on.

Strategic Use of the Platform

Make the CI platform so integral that it bridges the gap between boardrooms and the on-ground salesforce. CXOs are often looking for answers to fundamental questions:

Solution: Let any person in the organisation chat with the transcripts to extract relevant insights. (Point of concern: chatting with the entire range of transcripts is the most natural solution; however, it becomes difficult as the range of transcripts grows, and embedding and indexing everything will incur heavy vector DB costs.)
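To get a feel for the vector DB concern, a back-of-the-envelope estimate of raw vector storage can help: number of chunks × embedding dimensions × bytes per value. Every input below (call volume, words per call, chunk size, 768 dimensions, 4-byte floats) is a hypothetical placeholder, and real costs also include index overhead, metadata and the embedding compute itself.

```python
def vector_index_bytes(num_calls: int, words_per_call: int,
                       words_per_chunk: int, dims: int,
                       bytes_per_value: int = 4) -> int:
    """Raw storage needed for the embedding vectors alone."""
    chunks_per_call = -(-words_per_call // words_per_chunk)  # ceiling division
    return num_calls * chunks_per_call * dims * bytes_per_value


# Hypothetical volume: one million calls of about 900 words each.
size = vector_index_bytes(num_calls=1_000_000, words_per_call=900,
                          words_per_chunk=200, dims=768)
print(f"{size / 1e9:.2f} GB of raw vectors")
```

An estimate like this makes it easier to decide whether to index every call or only a filtered subset, such as a recent time window or calls with selected dispositions.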

Conversation insights can be made accessible to every team in the organization → product (where is the most friction in the digital journey or the CRM platform for the agent), UX & Marketing team (what kind of conversation patterns are the users most comfortable with; is our brand messaging consistent with the leads that come to the platform), support teams (for users complaining of misselling, what is the reason for the same so that the relevant feedback is communicated to sales teams)

Increased adoption across all teams will build a strong moat for the CI platform.

Real-time Assist

For building 10x moments, we will need to move from a reactive approach to a proactive approach where the conversation is analysed in real-time. If a customer seems to be getting frustrated, it should pull up relevant snippets of text for the agent to tell the customer to alleviate their concern. If the customer seems not to understand a policy rule, the real-time assist can show an easier explanation. If an agent is not able to explain why their product is better, the real-time assist can show this on the CRM for the agent.

P1 Features (Important)

Compliance Check & Avoiding Mis-selling

This is marked P1 because, although it is important, it is already offered by most solution providers today.

Once the transcription has been done, we can train a model to identify compliance misses by fine-tuning a base model on a set of sample conversations, each labelled with the kind of compliance miss it does or does not contain. A small language model can be trained to classify a segment of text as having 'compliance transgression' or 'no compliance transgression'. A hybrid fine-tuning + RAG strategy will need to be used. While fine-tuning can help identify patterns, RAG can then be used to retrieve up-to-date factual knowledge on recent RBI/SEBI circulars.

We can also ensure that there is no mis-selling by ensuring each transcript satisfies a checklist of information to be disclosed.
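A minimal sketch of the disclosure checklist idea is shown below. It uses plain phrase matching as a stand-in for the language model, and the checklist items and phrases are hypothetical examples, not an actual regulatory list.

```python
# Hypothetical checklist: disclosure item -> phrases that count as covering it.
REQUIRED_DISCLOSURES = {
    "call recording": ["call is being recorded", "call is recorded"],
    "interest rate": ["interest rate", "rate of interest"],
    "processing fee": ["processing fee"],
}


def missing_disclosures(transcript: str,
                        checklist: dict[str, list[str]] = REQUIRED_DISCLOSURES) -> list[str]:
    """Return the checklist items that no phrase in the transcript covers."""
    text = " ".join(transcript.lower().split())  # normalise case and whitespace
    return [item for item, phrases in checklist.items()
            if not any(phrase in text for phrase in phrases)]


# Illustrative transcript snippet.
transcript = "This call is being recorded. The interest rate is 12 percent per annum."
missing = missing_disclosures(transcript)
print(missing)
```

Phrase matching will miss paraphrases and code-mixed speech, which is where a fine-tuned classifier per checklist item would take over.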

Agent Training

Since churn is a huge problem, faster onboarding is a key problem to solve for. Capturing the essence of the top 2-3 superstar agents and injecting their skills into the rest of the agents can help replicate success at scale.

Call transcripts of successful conversion calls can be analysed to look at the common conversational patterns to build a 'Digital Sales Playbook'. We can analyse multiple calls to identify key language & audio characteristics of a successful sales call.

For easy training, a searchable audio library of call snippets can be created. Team leads can instantly find examples of the perfect handling of a "competitor offering a better deal" objection, the most effective way to explain a processing fee, etc.

With this, new hires can be productive in two weeks instead of six. Their curriculum can involve listening to a curated playlist of the top 10 "perfect" discovery calls for a personal loan & start taking live calls with Real-Time Agent Assist. When the system detects a keyword like "processing fee," a small pop-up appears on the agent's screen with the best-practice script to explain it.
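The keyword-triggered pop-up can be sketched as a small lookup over live utterances. The class name, the playbook entries and the rule of showing each script only once per call are assumptions for illustration; a real system would sit behind streaming transcription and would likely match intent, not exact keywords.

```python
class AssistSession:
    """Tracks one live call and decides which scripts to pop up."""

    def __init__(self, playbook: dict[str, str]) -> None:
        self.playbook = playbook
        self.shown: set[str] = set()  # keywords already surfaced on this call

    def on_utterance(self, text: str) -> list[str]:
        """Return scripts triggered by this utterance, each keyword at most once."""
        lowered = text.lower()
        prompts = []
        for keyword, script in self.playbook.items():
            if keyword in lowered and keyword not in self.shown:
                self.shown.add(keyword)
                prompts.append(script)
        return prompts


# Hypothetical playbook entries.
playbook = {
    "processing fee": "Explain what the fee covers and when it is charged.",
    "better deal": "Acknowledge the competing offer and compare total cost.",
}
session = AssistSession(playbook)
first = session.on_utterance("Why is there a processing fee on this loan?")
second = session.on_utterance("The processing fee still seems high.")
print(first, second)
```

Suppressing repeats keeps the agent's screen quiet after the first prompt, which matters when the same keyword comes up several times in a call.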

For Users Already Using the company's Communication Platform

With clients who are already using the company's suite of products, we can use the CI platform to improve our agent responses & TTS by analysing at what instances our customers lose interest. Some of the questions that we can answer and build on: