Web data processing tool PoC creation for an e-commerce company
E-commerce & Retail development services
Dedicated Team
Product Owner, Full stack tech lead, Data Engineer, Computer Vision engineer, NLP Engineer, UX designer, ML Solutions Architect, Delivery Manager
Web-based application
Tech Stack: Node.js, ReactJS, Python, Google, LAION, BERT language model, Google Cloud, Firestore, BigQuery
CHALLENGE
The client is an e-commerce aggregator from Europe. The company needed a suite of solutions to improve listing optimization, primarily a web data processing tool that would help it scale the business and serve existing customers better. That’s why the client decided to work with Brightgrove on a data processing tool.
HOW WE HELPED
PoC creation
The Brightgrove team was fully responsible for the end-to-end creation of a PoC to help the client determine whether the proposed technical solution could cover its business requirements.
The PoC goal was to develop a concept recommender engine. This would help the client select correct product variations on e-commerce platforms based on user data, activity, market trends, reviews, and comments.
The main data source for accurate recommendations is metadata from competitor marketplaces, such as product media and text data.
The outcome of the PoC is a basis for developing a complete recommender engine solution, either embedded into the client’s infrastructure for internal use or offered as SaaS.
Taxonomy and categories implementation
Taxonomy is used to organize products into categories and subcategories and to define the relationships between them. For this PoC, taxonomy was used as the intermediate representation of data to make it possible to connect language models and computer vision models.
In the context of a recommender engine for an e-commerce platform and the client’s business model, taxonomy refers to the hierarchical classification of products and their description. Taxonomy is a way to categorize or classify products in a structured manner based on their attributes, characteristics, or features.
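The hierarchical classification described above can be pictured as a simple tree. The following is a minimal sketch, not the project's actual data model: the `TaxonomyNode` class and the category names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyNode:
    """One category in a hypothetical product taxonomy tree."""
    name: str
    children: dict[str, "TaxonomyNode"] = field(default_factory=dict)

    def add_path(self, path):
        """Insert a category path such as ["Apparel", "Shoes"], creating nodes as needed."""
        node = self
        for part in path:
            node = node.children.setdefault(part, TaxonomyNode(part))
        return node

    def paths(self, prefix=()):
        """Yield every category path below this node as a tuple of names."""
        for child in self.children.values():
            current = prefix + (child.name,)
            yield current
            yield from child.paths(current)


# Illustrative categories only; the real taxonomy is not given in the source.
root = TaxonomyNode("root")
root.add_path(["Apparel", "Shoes", "Sneakers"])
root.add_path(["Apparel", "Shoes", "Boots"])
root.add_path(["Home", "Lighting"])

all_paths = list(root.paths())
for path in all_paths:
    print(" > ".join(path))
```

Because both text and images can be mapped to paths in the same tree, a structure like this can serve as the shared representation between language models and computer vision models.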
Language models and image-to-taxonomy encoding
For the PoC and further development, we needed to research language models capable of encoding text data (product descriptions and other metadata) into a taxonomy representation.
We needed to evaluate various language models by several criteria:
- Model size
- Performance
- Ease of adaptation to new data (fine-tuning)
For the PoC, lightweight models like tiny-GPT or tiny-BERT were used with a limited number of categories.
To implement image-to-text (taxonomy) encoding, we needed an engine capable of finding similar media based on their semantics rather than their visual features. With a stable image similarity search engine, the system can compare new image data against an existing annotated image database and collect taxonomy tags from similar media.
Taxonomy tags will be aggregated based on the similarity score, and the taxonomy data will be generated automatically for new images.
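The aggregation step can be sketched as follows. This is a simplified illustration under stated assumptions, not the PoC's implementation: it assumes every image has already been converted to an embedding vector by some model, uses cosine similarity as the similarity score, and sums the scores per tag. The function names, the `top_k` and `min_score` parameters, and the sample vectors and tags are all hypothetical.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_tags(query_embedding, annotated, top_k=3, min_score=0.5):
    """Collect tags from the most similar annotated images.

    `annotated` is a list of (embedding, tags) pairs. Each tag's weight is
    the sum of the similarity scores of the neighbours that carry it.
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), tags)
        for embedding, tags in annotated
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    tag_scores = defaultdict(float)
    for score, tags in scored[:top_k]:
        if score < min_score:
            continue  # ignore neighbours that are not similar enough
        for tag in tags:
            tag_scores[tag] += score
    return sorted(tag_scores.items(), key=lambda item: item[1], reverse=True)


# Toy annotated database; vectors and tags are made up for illustration.
annotated_images = [
    ([1.0, 0.0, 0.0], ["shoes", "sneakers"]),
    ([0.9, 0.1, 0.0], ["shoes", "boots"]),
    ([0.0, 1.0, 0.0], ["lighting"]),
]
new_image = [1.0, 0.05, 0.0]

suggested = suggest_tags(new_image, annotated_images, top_k=2)
print(suggested)
```

In this toy run the tag shared by both nearest neighbours ranks first, which is the intended effect of weighting tags by similarity. A production system would typically replace the linear scan with a dedicated vector index.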
Product deployment and further development
The PoC was built as a set of microservices or separate containerized applications integrated into the client’s Google Cloud infrastructure.
For further development, the PoC will be used to create detailed software requirements specifications and to generate ideas for improving the approach and the data collection and annotation setup. A future product roadmap was also created as part of the PoC.
RESULTS AND ACHIEVEMENTS
- Created the application UI fully from scratch using the client’s existing brand style
- Performed comprehensive research of language models to find the most suitable solution from the engineering and cost perspectives
- Created application architecture that was adapted to the client’s Google infrastructure
- Wrote documentation that covers each step of the development and is easy to understand even for non-technical specialists
- Outlined a solid product roadmap for the next development phases, taking into account dependencies, resource allocation, and product milestones
What happens now
We are actively preparing for the next product development phase. We’re looking forward to starting the main development phase as soon as the customer approves all the requirements.