KDD Cup 2024 Workshop:

A Multi-task Online Shopping Challenge for Large Language Models

Held in conjunction with KDD'24, Aug 25, 2024 - Aug 29, 2024, Barcelona, Spain


INTRODUCTION

The objective of this workshop is to discuss the winning solutions of the Amazon KDD Cup 2024: A Multi-task Online Shopping Challenge for LLMs. In this challenge, we introduce the Shopping MMLU dataset, a collection of natural-language questions covering four aspects of online shopping knowledge: Shopping Concept Understanding, Shopping Knowledge Reasoning, User Behavior Alignment, and Multi-lingual Abilities. The challenge aims to encourage the development of a new paradigm for incorporating deep learning into online shopping: conversational online shopping with LLMs. Under this paradigm, a single versatile LLM, rather than a set of task-specific models, is trained for a broad range of online shopping tasks, reducing task-specific engineering effort. In addition, such an LLM can act as an interactive shop assistant that responds to customer questions in real time. More details of this challenge are available here: [Challenge Link] and [OpenReview Link].


SCHEDULE

August 28, 2024, 11:00–16:30 (CEST), Barcelona, Spain.

Centre de Convencions Internacional de Barcelona

This will be a hybrid session. A Zoom link will be provided later.

  Opening
  11:00-11:30

Introduction by organizers.  
Moderator: Yilun Jin, Ph.D. Student at Hong Kong University of Science and Technology
Speaker: Dr. Zheng Li, Senior Applied Scientist at Amazon

   Invited Speaker
  11:30-12:10

    Professor Huan Liu
    Regents Professor, Arizona State University

    Title: Social Media Mining: Embracing LLMs and Beyond
    Abstract: Social media data differs from conventional data in many ways. It is big, noisy, linked, multimodal, and user generated. Through the lens of social media, unprecedented opportunities emerge for researchers in knowledge discovery and data mining. The prevalent usage of Large Language Models (LLMs) has added new challenges. We will use some of the challenges to illustrate the immediate need for (1) pondering what we should do to make LLMs more trustworthy and usable, and (2) contemplating a future beyond LLMs.

   Invited Speaker
  12:10-12:50

    Professor Jingbo Shang
    Associate Professor, University of California, San Diego

    Title: Incubating Text Classifiers Following User Instruction with Nothing but LLM
    Abstract: Automated knowledge extraction and discovery methods can address the diverse needs of different users. A fundamental open problem is how much user effort automated methods require to obtain useful knowledge. My research introduces a novel paradigm, extremely weak supervision (XWS), aimed at minimizing user effort. XWS involves only brief natural-language input from users to define tasks, such as a list of topics for classifying news articles, akin to the guidelines given to human annotators. In this talk, using text classification as an example, we will present a series of XWS methods. These methods encompass three primary approaches to pseudo-labeling: (1) mining-based X-Class, (2) generation-based Incubator, and (3) a hybrid approach, Text Grafting.

   Lunch Break
  12:50-14:00

   Poster Session
  16:00-16:30


ACCEPTED PAPERS

In total, we have accepted 9 submissions, including 7 oral presentations and 2 poster presentations. The accepted submissions are publicly available on the OpenReview page: https://openreview.net/group?id=KDD.org/2024/Workshop/Amazon_KDD_Cup

Winning Amazon KDD Cup'24
  Track 1   Track 2   Track 3   Track 4   Track 5
Authors: Chris Deotte, Ivan Sorokin, Ahmet Erdem, Benedikt Schifferer, Gilberto Titericz Jr, Simon Jegou

Second Place Overall Solution for Amazon KDD Cup 2024
  Track 1   Track 2   Track 3   Track 4   Track 5
Authors: Pengyue Jia, Jingtong Gao, Xiaopeng Li, Zixuan Wang, Yiyao Jin, Xiangyu Zhao

LLaSA: Large Language and E-Commerce Shopping Assistant
  Track 1   Track 4
Authors: Shuo Zhang, Boci Peng, Xinping Zhao, Boren Hu, Yun Zhu, Yanjia Zeng, Xuming Hu

Tailoring LLMs for Online Shopping: A Strategy for Multi-Task Learning and Domain-Specific Performance Enhancement
  Track 5
Authors: Liu Yankai, Hao Yifan, Guo Ruipu, Cui Zhaojun

EC-Guide: A Comprehensive E-Commerce Guide for Instruction Tuning and Quantization by ZJU-AI4H
  Track 2
Authors: Zhaopeng Feng, Zijie Meng, Zuozhu Liu

Enhancing User Behavior Alignment by Input-Level Model Cooperation and Model-Level Parameter Optimization
  Track 3
Authors: Yingyi Zhang, Zhipeng Li, Zhewei Zhi, Xianneng Li

Fine-Tuning Large Language Models for Multitasking in Online Shopping Using Synthetic Data
Authors: Fernando Sebastián Huerta, Carla Martín Monteso, Carlos de Leguina León, Leyre Sánchez Viñuela, Julián Rojo García, Rubén Sordo López, Roberto Lara Martin

More diverse more adaptive: Comprehensive Multi-task Learning for Improved LLM Domain Adaptation in E-commerce
Authors: Piao Tong, Pei Tang, Zhipeng Zhang, Jiaqi Li, Qiao Liu, Zufeng Wu

Optimizing Few-Shot Learning: From Static to Adaptive in Qwen2-7B
Authors: Wenhao Liu, Tianxing Bu, Erchen Yu, Dailin Li, Ding Ai, Zhenyi Lu, Haoran Luo

SUBMISSION GUIDELINES

The objective of this workshop is to discuss the winning submissions of the Amazon KDD Cup 2024: Multi-task Online Shopping Challenge for LLMs. Submissions to the workshop are single-blind (author names and affiliations should be listed). A team that ranks in the top 5 in any one of the 5 tracks is guaranteed an opportunity for an in-person oral or poster presentation. Other submissions will be evaluated by a committee based on their novelty and insights. The submission deadline is August 4, 2024 (Anywhere on Earth time), extended from August 2, 2024. Authors of accepted submissions will be notified by August 6, 2024 at the latest. Please note that the KDD Cup workshop will have no proceedings, and authors retain full rights to submit or post their papers at any other venue.

Link to the submission website: https://openreview.net/group?id=KDD.org/2024/Workshop/Amazon_KDD_Cup

Submissions describing solutions to 1 or 2 tracks are limited to a maximum of 4 pages, including all content and references, and must be in PDF format. Submissions describing solutions to more than 2 tracks can have up to 6 pages. Please use the ACM Conference templates (two-column format). One recommended setting for the LaTeX file is: \documentclass[sigconf, review]{acmart}. Template guidelines are here: https://www.acm.org/publications/proceedings-template.
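
As a starting point, a minimal skeleton using the recommended document class might look like the sketch below. Only the \documentclass line comes from the guidelines above; the title, author details, section names, and the bibliography file name refs.bib are placeholders chosen for illustration. The author block is filled in because submissions are single-blind.

```latex
\documentclass[sigconf, review]{acmart}

\begin{document}

\title{Placeholder Title of Your KDD Cup Solution}

% Single-blind review: author names and affiliations are listed.
\author{First Author}
\affiliation{%
  \institution{Placeholder Institution}
  \city{Placeholder City}
  \country{Placeholder Country}}
\email{first.author@example.org}

% In acmart, the abstract must appear before \maketitle.
\begin{abstract}
Placeholder abstract.
\end{abstract}

\maketitle

\section{Introduction}
Placeholder text.

\section{Method}
Placeholder text.

% Assumes a BibTeX file named refs.bib next to this file.
\bibliographystyle{ACM-Reference-Format}
\bibliography{refs}

\end{document}
```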

In addition, authors can provide an optional one-page supplement focused on reproducibility at the end of their submitted paper (it must be in the same PDF file). After the submission deadline, the names and order of authors cannot be changed.

Our dataset paper is available on arXiv, and the Shopping MMLU dataset is available on GitHub. Please cite our paper if you find our work helpful.

If you have any questions, please contact us at yilun.jin@connect.ust.hk and amzzhe@amazon.com.

DATA

The data and its license are available on GitHub.
If you plan to use this dataset for your own research, please cite this paper.

@article{jin2024shopping,
title={Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models},
author={Jin, Yilun and Li, Zheng and Zhang, Chenwei and Cao, Tianyu and Gao, Yifan and Jayarao, Pratik and Li, Mao and Liu, Xin and Sarkhel, Ritesh and Tang, Xianfeng and others},
journal={arXiv preprint arXiv:2410.20745},
year={2024}
}