Principal, Data Science [Catalog]
Coupang is one of the largest and fastest-growing e-commerce platforms on the planet. We are on a mission to revolutionize everyday lives for our customers, employees and partners. We solve problems no one has solved before to create a world where people ask, “How did we ever live without Coupang?” Coupang is a global company with offices in Beijing, Los Angeles, Seattle, Seoul, Shanghai, and Silicon Valley.
As our Principal, Data Science for Catalog, you will be responsible for operational reporting and insights to make our consumer experience world-class.
Our goal is to build the best e-commerce experience for our customers. We receive millions of products from sellers, and we want to build a consistent experience by automatically detecting features from the catalog and enriching it with structured information. We use machine learning to develop models that extract missing data from text, detect inaccuracies, and fix them automatically. We strive to build efficient workflows that allow humans to apply their judgment only when necessary.
On a daily basis, we solve problems across many kinds of product categories, ranging from cell phone cases to fashion, and consume various sources of data, such as the catalog, reviews, and views, to continually enhance the catalog. We do all of this at a scale that is growing at a rapid pace.
As a data scientist, you will use your knowledge to build algorithms that help us understand text automatically (NLP/information extraction), and to build robust, scalable, and maintainable machine learning models. You will bring scientific rigour to problem-solving and provide key inputs to business and engineering teams on the overall strategy for the catalog. You will work with our top engineers to put your solutions into production systems that impact how our customers shop.
- Extract product data from unstructured data by designing and testing new algorithms and techniques, thereby improving discovery of products.
- Analyze large amounts of data to discover patterns and build robust models to extract valuable information from various sources (e.g. product catalog, customer reviews, clicks, etc.) that vary in data quality and structure.
- Automatically classify products into customer-facing categories with high accuracy.
- Normalize variations (by language and spelling) for attributes such as brand, size, or color.
- Identify products that are identical or similar among millions of incoming products.
- Identify illegal products that are not allowed on the website.
- Define product information that is important for customer experience.
- You will apply such algorithms and techniques to improve what customers see on the website, resolving many use cases automatically and identifying cases that need input from catalog experts.
- You will help improve the customer experience on the website by enabling customers to see high-quality product data and discover items that are not otherwise visible, and you will help merchants improve their business.
- Master's degree in Computer Science (machine learning, data mining, NLP, information retrieval), Statistics, or a related field.
- 2+ years of experience in machine learning, data mining, big data.
- Good working knowledge of R/Python.
- Experience with distributed frameworks such as Spark/MapReduce/Hadoop.
- Excellent problem-solving skills with out-of-the-box solutions.
- Ability to decompose informal business problems into problem statements and build solutions.
- Ph.D. degree in Computer Science (machine learning, data mining, information retrieval), Statistics, or a related field.
- Proven practical experience in machine learning, data mining, or statistics, with a track record of publications.
- Desire to guide junior engineers and data scientists.
- Strong verbal and written communication skills.