
Are words enough?: a study on text-based representations and retrieval models for linking pins to online shops

User-generated content offers opportunities to learn about people’s interests and hobbies. We can leverage this information to help users find interesting shops and businesses find interested users. However, this content, as posted on social media sites and blogs, is highly noisy and unstructured. In this work we evaluate different textual representations and retrieval models that aim to make sense of social media data for retail applications. Our task is to link the text of pins (from Pinterest.com) to online shops (formed by clustering Amazon.com’s products). Our results show that document representations that combine latent concepts with single words yield the best performance.
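To make the idea of combining latent concepts with single words more concrete, here is a minimal sketch. It is not the method from the paper: it assumes that each pin and shop already comes with a topic distribution (hand-specified here, where in practice a topic model would infer it), and it simply interpolates a word-level cosine similarity with a topic-level cosine similarity. The names `combined_score`, `rank_shops`, the weight `lam`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def word_vector(text):
    """Bag-of-words term counts for a whitespace-tokenized, lowercased text."""
    return dict(Counter(text.lower().split()))


def combined_score(pin, shop, lam=0.5):
    """Interpolate word-level and topic-level similarity.

    `lam` is an assumed mixing weight: 1.0 uses words only, 0.0 topics only.
    """
    word_sim = cosine(word_vector(pin["text"]), word_vector(shop["text"]))
    topic_sim = cosine(pin["topics"], shop["topics"])
    return lam * word_sim + (1 - lam) * topic_sim


def rank_shops(pin, shops, lam=0.5):
    """Return the shops sorted from best to worst match for the pin."""
    return sorted(shops, key=lambda s: combined_score(pin, s, lam), reverse=True)


# Toy data; the topic distributions are made up for illustration.
pin = {
    "text": "cozy knitted wool scarf",
    "topics": {"fashion": 0.8, "crafts": 0.2},
}
shops = [
    {"name": "garden tools", "text": "steel garden shovel rake",
     "topics": {"garden": 1.0}},
    {"name": "winter accessories", "text": "wool scarf hat gloves",
     "topics": {"fashion": 0.9, "outdoors": 0.1}},
]

ranked = rank_shops(pin, shops)
print([shop["name"] for shop in ranked])
```

In this sketch, a pin that shares few exact words with a shop can still rank well if their topic distributions overlap, which is the intuition behind mixing the two representations.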

Check out our paper

Are words enough?: a study on text-based representations and retrieval models for linking pins to online shops.

Proceedings of the 2013 International Workshop on Mining Unstructured Big Data using Natural Language Processing in conjunction with The 24th ACM International Conference on Information and Knowledge Management (CIKM 2013)

Ivan Vulic, Susana Zoghbi and Sien Moens
