ML Framework for auto-responding to customer support queries

1,320 views

This presentation describes how ML can be used to build a bot that understands natural language and provides suitable responses.

Published in: Data & Analytics

  1. Frankbot - ML framework for auto-responding to customer support queries
  2. Introduction to Freshdesk
  3. Introduction to Freshdesk
     Freshdesk is a multi-channel, cloud-based customer support product that enables businesses to:
     ● Streamline all customer conversations in one place (conversations between the business and its end customers)
     ● Automate repetitive work and make support agents more efficient
     ● Enable support agents to collaborate with other teams to resolve issues faster
     Freshdesk tickets are a record of customer conversations across channels (phone, chat, e-mail, social, etc.):
       ○ A typical conversation includes customer queries and agent responses
       ○ Frequently recurring customer queries are called L1 tickets
     Freshdesk currently has ~150,000 customers from across the world.
  4. Motivation & Objectives
  5. Exponential ticket inflow
     [Chart: tickets per year] 2014: 32,000; 2016: 88,000; 2018: 186,000; 2019 (projection): 241,000
     [Chart: "Bringing reinforcements", support agents per year] 2014: 9; 2016: 50; 2018: 85; 2019 (Q4 2019): 130
     Need for automation
     Data sourced from Freshdesk’s Support portal
  6. Analyzed ticket complexity
     [Chart: Ticket Type Analysis]
     Data sourced from Freshdesk’s Support portal
  7. Objectives: Automate Level 1 Support
     - Deflect L1 tickets
     - Free up agent time
  8. Frankbot in production: Ask Freddy
  9. Data and Methodology
  10. Data sources for Model Training
      ● Source: Freshdesk data pertaining to customer (business) accounts
        ○ Knowledge base articles, FAQs
        ○ Tickets from different channels such as e-mail, portal (raised on the website), chat, social and phone
      ● Accounts included: all active, paid accounts with at least 100 tickets in the last 3 months and at least 1 article in the knowledge base
      ● Training strategy
        ○ One model per account, trained end-to-end
        ○ Embeddings trained at industry level, models at account level
      Note: Tickets from the email, portal-direct, chat and phone channels account for close to 95% of the ticket volume
  11. Distribution of ticket volume by source
      [Chart: share of ticket volume (axis 0% to 40%) by ticket source (Email, Chat, Portal, Feedback Widget, Phone, Social) for 2015, 2016, 2017 and 2018]
      Data sourced from Freshdesk’s Support portal
  12. Modelling Pipeline
      Data
      ❖ Historical Freshdesk tickets and Knowledge base articles
      Preprocessing
      ❖ Stopword removal, stemming/lemmatization, TF-IDF-based bigram selection
      ❖ Removal of signatures and footers, code constructs, non-ASCII characters, salutations, disclaimers, etc.
      L1 Embedding Layer
      ❖ Vectorization of preprocessed text using two different schemes
        ➢ LSA (Latent Semantic Analysis)
        ➢ FastText
      ❖ The LSA vector space is learnt for each account in isolation
      ❖ FastText vectors are trained at industry level
      *KB - Knowledge Base
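The preprocessing and weighting steps on this slide can be illustrated with a minimal sketch. It is not Frankbot's implementation: the stopword list, the tokenizer and the function names `preprocess` and `tfidf_vectors` are assumptions made for illustration, stemming and bigram selection are left out, and the SVD step that turns TF-IDF vectors into an LSA space is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "is", "to", "my", "i", "how", "do", "can"}


def preprocess(text):
    """Drop non-ASCII characters, lowercase, tokenize, and remove stopwords."""
    ascii_text = text.encode("ascii", "ignore").decode("ascii")
    tokens = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per document, using a smoothed IDF."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors
```

In this sketch a term that appears in every document (such as a product name) gets a lower weight than a term that distinguishes one ticket from another, which is the property an LSA model built on top of TF-IDF relies on.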
  13. Modelling Pipeline
      L2 Layer
      ❖ A classification model (XGBoost) acts as a gating function for L1 responses
      ❖ The model is trained using manually tagged data and real-time customer feedback
      ❖ Features: % word match between the nouns/verbs/adjectives in the query and in the responses, Word Mover's Distance, ordered bigram and trigram counts
      Response Retrieval
      ❖ Given a customer query, generate two L1 vectors, using the LSA and FastText representations
      ❖ Rank eligible solution articles by cosine similarity in the L1 space; the top 3 responses are chosen
      ❖ The L2 model is used for gating the responses
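The retrieval step (rank articles by cosine similarity, keep the top 3) can be sketched as follows. This is an illustrative sketch under assumptions: vectors are plain lists of floats, `top_k_articles` is a hypothetical name, and ranking happens in a single vector space, since the slides do not say how the LSA and FastText scores are combined.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_articles(query_vec, article_vecs, k=3):
    """Return the k (article_id, score) pairs most similar to the query, best first."""
    scored = [
        (article_id, cosine_similarity(query_vec, vec))
        for article_id, vec in article_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The returned scores would play the role of the L1 score that the gating logic later compares against per-account thresholds.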
  14. L1 and L2 Gating Logic
      The chosen response(s) are served iff at least one of the following conditions is satisfied:
      ❖ L1 score > L1_Upper
      ❖ L2 score > L2_Upper
      ❖ (L1 score > L1_Lower) & (L2 score > L2_Lower)
      where the upper and lower thresholds are fine-tuned for each account based on observed deflection performance.
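The gating rule above translates directly into a small predicate. The sketch below is illustrative only: `GateThresholds` and `should_serve` are hypothetical names, and any threshold values passed in are placeholders, since the slides say only that thresholds are tuned per account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateThresholds:
    """Per-account thresholds (values are tuned per account in the slides)."""
    l1_upper: float
    l1_lower: float
    l2_upper: float
    l2_lower: float


def should_serve(l1_score, l2_score, t):
    """Serve the response if either score is very high, or both are moderately high."""
    return (
        l1_score > t.l1_upper
        or l2_score > t.l2_upper
        or (l1_score > t.l1_lower and l2_score > t.l2_lower)
    )
```

The third condition lets two moderately confident signals agree on a response that neither would release alone, while the first two let one very confident signal override a weak one.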
  15. Additional features of the bot
  16. Teach the bot
  17. Teach the bot
      ● Teach the bot is a feature that allows customer support agents to explicitly train the bot by ingesting Q → A mappings
      ● When the Answer bot fails to respond to a query (Q), the agent can point the bot to the expected response (A) which should have been returned
      ● If a suitable response (A) does not exist in the Knowledge base, it can be created on the fly
      ● This expected response (A) is consumed and mapped to be close to the query vector (Q) in the L1 vector space
        ○ This ensures that article A would show up for future queries that are similar to Q
        ○ The same feature is re-purposed to resolve incorrect bot responses as well
        ○ This feature also helps to improve the overall coverage levels of the Answer bot
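One way to picture "mapping A close to Q" is an index in which a taught query vector becomes an extra entry pointing at article A. The slides do not describe the actual mechanism, so everything below is an assumption for illustration, including the `TeachableIndex` class and its method names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class TeachableIndex:
    """Hypothetical index: each article may be represented by several vectors."""

    def __init__(self):
        self._vectors = {}  # article_id -> list of vectors

    def add_article(self, article_id, vec):
        """Register the article's own vector."""
        self._vectors.setdefault(article_id, []).append(list(vec))

    def teach(self, query_vec, article_id):
        """Record a taught Q -> A mapping by storing Q as another vector for A."""
        self._vectors.setdefault(article_id, []).append(list(query_vec))

    def best_match(self, query_vec):
        """Return (article_id, score), scoring each article by its closest vector."""
        best_id, best_score = None, -1.0
        for article_id, vecs in self._vectors.items():
            score = max(cosine(query_vec, v) for v in vecs)
            if score > best_score:
                best_id, best_score = article_id, score
        return best_id, best_score
```

Under this assumption, a future query similar to Q scores highly against the stored copy of Q and therefore retrieves A, which matches the behaviour the slide describes.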
  18. Periodic model refresh
      ● Model refresh is key to ensuring that the models are up to date and stay relevant over time
      ● This is done once a week, or as soon as an account accumulates a sizeable number of new queries or Knowledge base updates
      ● It involves the following steps
        ○ Retraining the LSA model after including the newly accumulated data
        ○ Incremental training of FastText vectors with new data
        ○ Retraining the L2 model on recent data, using recent feedback provided by customers
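The refresh trigger (weekly, or earlier once enough new data accumulates) could be expressed as a simple check. The slides do not quantify "sizeable", so both count thresholds below are hypothetical placeholders, as is the function name `needs_refresh`.

```python
from datetime import datetime, timedelta

REFRESH_INTERVAL = timedelta(weeks=1)  # "once a week" from the slide
NEW_QUERY_THRESHOLD = 500              # hypothetical value for "sizeable"
KB_UPDATE_THRESHOLD = 20               # hypothetical value for "sizeable"


def needs_refresh(last_refresh, now, new_queries, kb_updates):
    """Return True if a week has passed or enough new data has accumulated."""
    if now - last_refresh >= REFRESH_INTERVAL:
        return True
    return new_queries >= NEW_QUERY_THRESHOLD or kb_updates >= KB_UPDATE_THRESHOLD
```

A scheduler would call such a check per account and, when it returns True, run the three retraining steps listed on the slide.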
  19. Results
  20. Metrics and business impact
      ❖ # Active clients: number of customers who are exposing the bot to their customers in their support portal
      ❖ % Deflection: ratio of # helpful bot responses (# Helpful) to # Requests
  21. Metrics and business impact
      ❖ # Requests: number of requests that the bot gets
      ❖ # Responded: number of requests responded to/answered by the bot
      ❖ # Helpful: number of requests where the bot responses were helpful
      ❖ # No Feedback: number of bot responses for which there was no feedback from users
      ● CSAT*: 79% with bots and 72% without bots
      ● Average First Response Time (overall): 13 hrs with bots and 19 hrs without bots
      *CSAT - Customer Satisfaction Score
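The deflection metric defined on these slides is a simple ratio of # Helpful to # Requests. A minimal sketch follows; the function name `deflection_rate` is an assumption, and the numbers in the usage note are made up for illustration, not results from the presentation.

```python
def deflection_rate(helpful, requests):
    """% Deflection = (# Helpful / # Requests) * 100."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return 100.0 * helpful / requests
```

For example, a hypothetical account with 25 helpful responses out of 200 requests would have `deflection_rate(25, 200) == 12.5`. Note that the denominator is all requests, not only those the bot responded to, so unanswered queries lower the metric.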
  22. Why are some suggestions not helpful and some queries not answered?
      ● The query could relate to a new topic for which there may not be enough FAQs or articles
      ● The query could relate to an existing topic but contain keywords that are not in the vocabulary. This may result in low L1 and L2 confidence, which may not satisfy the thresholds
      ● The query may be related to a particular action. Example: “Can you connect me to an agent?”, which is a question for a task completion bot that has intent detection capabilities
      ● The query may not contain a question or issue. Example: “I have an open ticket 3335924”
      ● The query may be ambiguous or unclear. Example: “discussion”
  23. Challenges and learnings
      Challenges:
      ● Developing a preprocessing mechanism that can extract only the salient components from messy emails
      ● Handling the complexity of storing and retrieving vectors of floats (IDFs, SVD components, word vectors) for every account
      ● Serving predictions at low latency
      ● Using the right tools to monitor the codebase and find bugs proactively
      Lessons Learnt:
      ● Focus on content creation and nurturing the AI
      ● Define success metrics and inform stakeholders about what a reasonable target is
      ● Define strategies for model refresh and feedback consumption
  24. Thank You