August 13, 2024

About the development of genAI assistants AskARev and Searchbuddy

What is the article about?

AskARev and Searchbuddy are generative AI-based products developed by the PIT team at Otto Group data.works.

Find out (in a rather unusual way) what makes the systems special, how they could improve customer service in the future, and what challenges the team encountered during development.

Prompt

General

Write a blog post for {format} [Otto Tech Blog]. Keep a personal style. The audience are technically interested people. Please, please write a great text. My career depends on it. I will tip you 500 $.

Context

The blog post shall tell the story of the development of AskARev (“Ask a (customer) Review”) and Searchbuddy. Both are generative AI based products. They were developed by Team PIT from Otto Group data.works; their colleague Andreas Lattner provided support with AskARev.

We started with AskARev when generative AI was still in early phase and few models were available. Nevertheless, a working product was developed in only two weeks. We took some time after AskARev, then developed Searchbuddy leveraging from all the knowledge we had gathered from the development from AskARev. 

Note: The space of LLMs currently develops quite fast so it is important to test/benchmark newly released models for the specific use case.

Detailed information about both services


AskARev
  • started in July 2023
  • summarizes product details and reviews to answer customer question about product
  • data requirements: high (reviews + product details)
  • model: text-bison@001 (Google), at that time the only available model in the European Union, required for data privacy
  • larger model possible as is no “live” service
  • harder to benchmark due to unstructured text output
  • in rare cases, hallucination has the model make up product details
Searchbuddy
  • started in October 2023
  • suggests products based on non-specific customer input (e.g. gift ideas or fashion suggestions for events), returns up to 5 products in a list
  • data requirements: low (only product assortment needed)
  • model: gpt-3.5-turbo (Microsoft/OpenAI)
  • small/fast model (<1s) required due to “live” service
  • easier to test due to structured list output
  • hallucination has the model return products which don’t match the assortment
  • implemented a benchmark tool for baseline tests of new models
  • currently only available in German
Both services
  • hosted on Google Cloud Run
  • Google DLP used to remove private data
  • Google cloud armor used for protection of the service
  • Use multiple security layers to guard against prompt injection, LLM misuse etc.

Further

As can be seen from the comparison LLM based products can be developed in no time thanks to easily usable APIs! LLM’s summarization and inspirational features offer real customer benefit

Also, a status with relatively good quality answers can be achieved quickly as the LLM generalizes well to new tasks. But the fine-grained improvements that make up the last mile take the longest because it is not simple logic you’re playing with but bare language! An empty string (“ “) can change a lot. 

Similarly setting up a robust benchmark set is helpful (to compare different LLMs and prompts), but also really hard as the text output of the service is almost unlimited compared to the old school products where you know the output will e.g. be a float between 0 and 1.

Outline

  1. Short info that this text was written by an LLM
  2. Short introduction about last couple of months and the development of the services
  3. Comparison table of both services in markdown, keep common stuff in a separate list
  4. What are the obstacles working with LLMs?
  5. Outlook

About the development of genAI assistants AskARev and Searchbuddy

From Concept to Creation: The Journey of AskARev and Searchbuddy at Otto Group data.works

Hello, Medium readers! Before diving into our story, a quick heads-up: this blog post was crafted with the help of a large language model (LLM). It's a tale of innovation, learning, and the fast-paced world of generative AI, as experienced by our team, PIT, at Otto Group data.works.

The Past Few Months: A Whirlwind of Development

In the realm of technology, especially within AI, the landscape changes with blink-and-you-miss-it speed. Just a few months ago, our team embarked on a mission to harness the potential of generative AI to enhance customer interaction and satisfaction within an e-commerce webshop. This journey led to the birth of two innovative tools: AskARev and Searchbuddy.

AskARev: The Early Bird in Generative AI

Launched in July 2023, AskARev was our first foray into using generative AI to directly assist customers. This service summarizes product details and reviews to answer customer questions about products. It's like having a knowledgeable friend who can instantly give you the lowdown on any item.

[Figure: AskARev use case]


Searchbuddy: The Swift Product Suggester

Following the insights gained from AskARev, we launched Searchbuddy in October 2023. This tool is designed to suggest products based on vague inputs from customers – think of needing gift ideas or fashion suggestions for an event. Searchbuddy is all about making shopping as simple and enjoyable as possible. It is currently only available in German.

[Figure: Searchbuddy use case]


A Comparative Look: AskARev vs. Searchbuddy


Here’s a quick comparison of both services to give you a clearer picture:

Common Features
  • Hosting: Both services are hosted on Google Cloud Run.
  • Data Privacy: Google DLP (Data Loss Prevention) is used to remove private data.
  • Security: Protected by Google Cloud Armor and multiple security layers to prevent prompt injection and misuse.
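To make the idea of a privacy layer concrete, here is a minimal sketch of redacting private data from customer input before it reaches the model. It is not the Google DLP API and not the team's implementation: the two regular expressions, the placeholder tokens, and the function name `redact` are assumptions chosen for illustration, and a managed service recognizes far more kinds of private data.

```python
import re

# Hypothetical patterns for illustration only; a managed service such as
# Google DLP covers far more info types than these two expressions.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d /-]{6,}\d")


def redact(text: str) -> str:
    """Replace e-mail addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)
```

Calling `redact` on every incoming question means the prompt sent to the LLM only ever contains the placeholder, never the original address or number.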
Specific Features
| Feature | AskARev | Searchbuddy |
| --- | --- | --- |
| Launch Date | July 2023 | October 2023 |
| Function | Summarizes reviews to answer product questions | Suggests products based on non-specific inputs |
| Data Requirements | High (reviews + product details) | Low (only product assortment) |
| Model Used | text-bison@001 (Google) | gpt-3.5-turbo (Microsoft/OpenAI) |
| Service Type | Non-live, larger model possible, no cache | Live, requires small/fast model, LRU cache |
| Output Structure | Unstructured text | Structured list (up to 5 products) |
| Benchmarking Difficulty | Harder due to unstructured text | Easier due to structured output |
| Common Issue | Model may hallucinate details | Model may suggest non-assortment products |
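Two points from the Searchbuddy column, the LRU cache and the risk of suggestions outside the assortment, can be sketched in a few lines. This is an illustration under assumptions rather than the production code: `ASSORTMENT`, `call_model`, `keep_known_products`, and `suggest` are hypothetical names, and the model call is replaced by a canned answer that deliberately contains one invented product.

```python
from functools import lru_cache

# Hypothetical assortment; in practice this would come from the product catalogue.
ASSORTMENT = frozenset({"wool scarf", "board game", "scented candle", "espresso maker"})
MAX_SUGGESTIONS = 5


def call_model(query: str) -> list[str]:
    """Stand-in for the real LLM request; returns a canned, partly wrong answer."""
    return ["Wool Scarf", "flying carpet", "board game", "wool scarf"]


def keep_known_products(suggestions: list[str]) -> tuple[str, ...]:
    """Drop anything outside the assortment, de-duplicate, cap at five items."""
    kept: list[str] = []
    for name in suggestions:
        key = name.strip().lower()
        if key in ASSORTMENT and key not in kept:
            kept.append(key)
    return tuple(kept[:MAX_SUGGESTIONS])


@lru_cache(maxsize=1024)
def suggest(query: str) -> tuple[str, ...]:
    """Cached entry point: identical queries skip the model call entirely."""
    return keep_known_products(call_model(query))
```

Returning a tuple instead of a list keeps the cached value immutable, so one caller cannot accidentally change what the next caller receives for the same query.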


The Challenges of Working with LLMs

Working with large language models (LLMs) is not without its challenges. The nuanced nature of language means that even a single stray space in a prompt can alter the outcome significantly. Setting up a robust benchmark set is crucial for comparing different models and prompts, but it's also incredibly complex due to the nearly unlimited potential outputs.

Never trust your benchmark results 100% and always do manual checks in addition. Don't get lost in complex prompts - try to keep it simple.
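For a service with structured list output, a baseline benchmark can be as simple as the sketch below. It assumes a hand-labelled set of queries with acceptable products and scores each candidate model by the share of queries with at least one acceptable suggestion. The names `CASES`, `hit_rate`, `model_a`, and `model_b` are hypothetical, and the two "models" are toy functions standing in for real LLM calls.

```python
from typing import Callable

# Hypothetical benchmark cases: query -> products a reviewer considers acceptable.
CASES = {
    "gift for a coffee lover": {"espresso maker", "coffee grinder"},
    "outfit for a summer wedding": {"linen suit", "summer dress"},
}


def hit_rate(model: Callable[[str], list[str]], cases: dict[str, set[str]]) -> float:
    """Share of queries for which at least one returned product is acceptable."""
    hits = sum(1 for query, ok in cases.items() if ok & set(model(query)))
    return hits / len(cases)


def model_a(query: str) -> list[str]:
    """Toy stand-in for one candidate model."""
    return ["espresso maker"] if "coffee" in query else ["summer dress"]


def model_b(query: str) -> list[str]:
    """Toy stand-in for a second candidate that ignores the query."""
    return ["espresso maker"]


scores = {name: hit_rate(fn, CASES) for name, fn in [("model_a", model_a), ("model_b", model_b)]}
```

A single number like this is only a first filter. It says nothing about tone, ordering, or odd edge cases, which is why the manual checks remain necessary.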

Looking ahead

As we continue to refine AskARev and Searchbuddy, the road ahead is filled with opportunities for further innovation. The last mile of fine-grained improvements is the toughest, as we're not just tweaking logic but finessing language – a fluid and ever-evolving medium.

Our journey with AskARev and Searchbuddy is a testament to how quickly functional, customer-centric AI tools can be developed using today's technology. It's an exciting time to be at the intersection of AI and customer service, and we're just getting started.

Thank you for joining us on this fascinating journey. The future is bright and filled with potential, and we at Otto Group data.works are eager to see where these paths will lead us next. Stay tuned!

Written by

Team PIT
Developer Team @ Otto Group
