Improve catalog quality with AI-based attribute extraction and normalization

The Data Platform department collaborates with several data science teams, especially research scientists at the Rakuten Institute of Technology, to improve catalog quality by extracting attributes from titles and descriptions, normalizing key attributes, and cleaning product titles and descriptions. A key initiative in this program is extracting important attributes: for example, brand, color, size, pattern, material, gender, sleeves, and neckline for the fashion category, and brand, model, technical specs, and color for electronic goods. Normalizing key attributes is equally important, because it substantially improves search results. Normalization ensures that attributes such as brand, size, and color are consistent across product listings, so that search returns accurate results for users to choose from. Several advanced natural language processing techniques are being used to make this initiative a success.
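To make the idea of normalization concrete, here is a minimal sketch of mapping inconsistent brand and color values onto one canonical form. It is a plain dictionary lookup, not the NLP techniques the program itself uses, and every name in it (`BRAND_ALIASES`, `COLOR_ALIASES`, `normalize_listing`, the sample values) is a hypothetical chosen for illustration.

```python
import re

# Hypothetical alias tables: canonical value -> known surface variants.
# A real catalog would likely learn or curate these at much larger scale.
BRAND_ALIASES = {
    "Nike": ["nike inc", "nike inc."],
    "Levi's": ["levis", "levi strauss"],
}
COLOR_ALIASES = {
    "gray": ["grey", "charcoal"],
    "navy": ["navy blue"],
}


def _key(text):
    """Lowercase, drop punctuation, and collapse whitespace for matching."""
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


def _build_lookup(aliases):
    """Invert an alias table into {matching key: canonical value}."""
    lookup = {}
    for canonical, variants in aliases.items():
        for variant in variants + [canonical]:
            lookup[_key(variant)] = canonical
    return lookup


# One lookup per attribute that this sketch knows how to normalize.
LOOKUPS = {
    "brand": _build_lookup(BRAND_ALIASES),
    "color": _build_lookup(COLOR_ALIASES),
}


def normalize_listing(listing):
    """Return a copy of the listing with known attribute values canonicalized.

    Values that match no alias are left unchanged rather than guessed.
    """
    normalized = dict(listing)
    for attr, lookup in LOOKUPS.items():
        raw = listing.get(attr)
        if raw is not None:
            normalized[attr] = lookup.get(_key(raw), raw)
    return normalized


if __name__ == "__main__":
    print(normalize_listing({"title": "Running shoe", "brand": "NIKE Inc.", "color": "Grey"}))
```

With this in place, listings that spell a brand as "NIKE Inc." or "nike" resolve to the same value, so a search filtered on that brand can match all of them. Unknown values pass through untouched, which keeps the sketch from inventing attributes it has no evidence for.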