PalmX 2025: The First Shared Task on Benchmarking LLMs on Arabic Culture

This shared task at ArabicNLP 2025 evaluates Arabic cultural understanding through two subtasks: General Culture and Islamic Culture. Using a high-quality multiple-choice question (MCQ) dataset in Modern Standard Arabic (MSA), the task promotes the development of culturally aware models tailored to the diverse heritage of the Arab world.

Shared Task Subtasks

Two complementary tasks addressing core challenges in Arabic cultural understanding.

1

General Culture Evaluation

Open Track

Objective

This subtask evaluates the ability of large language models to comprehend and reason about diverse aspects of Arabic general culture, including traditional customs, local etiquette, cuisine, historical events, famous figures, geography, arts, and dialectal expressions across different Arab countries. Models must select the culturally appropriate answer to multiple-choice questions written in Modern Standard Arabic, which tests their cultural grounding and contextual understanding.

Dataset

  • Training: A total of 2,000 multiple-choice question-answer (MCQ) pairs for fine-tuning or developing few-shot learning systems.
  • Development: A total of 500 similar MCQ pairs for intermediate evaluation and hyperparameter tuning.
  • Blind Test: A total of 2,000 previously unseen MCQs, balanced across countries and domains, for final performance assessment.
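
As a rough illustration of how the training split could feed a few-shot system, the sketch below formats solved training items and one unsolved target item into a single prompt. The `MCQ` record layout, the field names, and the prompt wording are all assumptions made for this example; the released data files and any official baseline may look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MCQ:
    """Hypothetical record layout; the released files may use other fields."""
    question: str
    options: dict[str, str]       # option label -> option text
    answer: Optional[str] = None  # gold label; None for blind-test items


def format_item(item: MCQ, with_answer: bool) -> str:
    """Render one question, its options, and (optionally) its gold label."""
    lines = [f"Question: {item.question}"]
    lines += [f"{label}. {text}" for label, text in item.options.items()]
    lines.append(f"Answer: {item.answer}" if with_answer else "Answer:")
    return "\n".join(lines)


def build_prompt(shots: list[MCQ], target: MCQ) -> str:
    """Concatenate solved examples, then the target with its answer left open."""
    blocks = [format_item(shot, with_answer=True) for shot in shots]
    blocks.append(format_item(target, with_answer=False))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Made-up items for illustration only; not taken from the PalmX data.
    shot = MCQ("Which city is the capital of Jordan?",
               {"A": "Amman", "B": "Muscat"}, "A")
    target = MCQ("Which city is the capital of Oman?",
                 {"A": "Amman", "B": "Muscat"})
    print(build_prompt([shot], target))
```

The prompt ends with an open `Answer:` line so that a model's next token can be read as the predicted option label.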

Evaluation

  • Primary: Accuracy (percentage of correctly answered questions)
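
The accuracy metric can be sketched in a few lines. This is not the official evaluation script: the question IDs, the option labels, and the choice to count a missing prediction as wrong are assumptions made for illustration.

```python
def accuracy(gold: dict[str, str], predicted: dict[str, str]) -> float:
    """Percentage of questions whose predicted label equals the gold label.

    Assumption: a question with no prediction counts as incorrect.
    """
    if not gold:
        raise ValueError("gold answers must not be empty")
    correct = sum(1 for qid, label in gold.items()
                  if predicted.get(qid) == label)
    return 100.0 * correct / len(gold)


if __name__ == "__main__":
    # Hypothetical IDs and labels: q1 and q3 match, q2 is wrong, q4 is missing.
    gold = {"q1": "A", "q2": "C", "q3": "B", "q4": "D"}
    predicted = {"q1": "A", "q2": "B", "q3": "B"}
    print(accuracy(gold, predicted))  # 50.0
```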

2

General Islamic Evaluation

Open Track

Objective

This subtask assesses language models' understanding of key elements of Islamic culture, which plays a foundational role in Arab societies. It covers Islamic rituals and practices (e.g., prayers and fasting), Quranic knowledge, Hadith literature, historical developments in Islam, and religious holidays. Models are expected to answer multiple-choice questions that require both religious literacy and contextual sensitivity, so that culturally and theologically significant content is handled with accuracy and respect.

Dataset

  • Training: A total of 600 multiple-choice question-answer (MCQ) pairs for fine-tuning or developing few-shot learning systems.
  • Development: A total of 300 similar MCQ pairs for intermediate evaluation and hyperparameter tuning.
  • Blind Test: A total of 1,000 previously unseen MCQs for final performance assessment.

Evaluation

  • Primary: Accuracy (percentage of correctly answered questions)

Important Dates

Key milestones for the ArabicNLP 2025 Shared Task.

Training Data Release

June 10, 2025

Release of training/dev data and evaluation scripts.

Registration Deadline

July 20, 2025

Final registration deadline and test set release.

Submission Deadline

July 25, 2025

Test submission deadline via CodaLab.

Results Announcement

July 30, 2025

Final results released to participants.

Paper Submission

August 15, 2025

System description papers due.

Workshop

November 5–9, 2025

ArabicNLP 2025 Workshop in Suzhou, China.

Participation Guidelines

How to participate in the ArabicNLP 2025 Shared Task.

Registration Process

  1. Team Formation: Participants may register as individuals or as teams (teams are recommended)
  2. Registration Form: Fill out the registration form
  3. Submission Portal: Register through the CodaLab platform (link will be provided)
  4. Data Agreement: All participants must agree to the data license terms
  5. Track Selection: Teams may participate in one or more subtasks

Submission Requirements

  1. System Outputs: Submit predictions for the test set in the specified format
  2. System Description: Paper describing the methodology (4–8 pages)
  3. Reproducibility: Code submission encouraged but not mandatory
  4. Evaluation: Results will be verified by organizers
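
Before uploading system outputs, a team may want a quick local sanity check. The sketch below reports missing, invalid, and unexpected predictions. The submission format is not given here, so the ID-to-label mapping and the default label set `("A", "B", "C", "D")` are assumptions; the organizers' specification takes precedence.

```python
from typing import Iterable


def check_predictions(
    test_ids: Iterable[str],
    predictions: dict[str, str],
    valid_labels: tuple[str, ...] = ("A", "B", "C", "D"),  # assumed label set
) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    ids = list(test_ids)
    known = set(ids)
    problems = []
    for qid in ids:
        if qid not in predictions:
            problems.append(f"missing prediction for {qid}")
        elif predictions[qid] not in valid_labels:
            problems.append(f"invalid label {predictions[qid]!r} for {qid}")
    for qid in predictions:
        if qid not in known:
            problems.append(f"unexpected question id {qid}")
    return problems


if __name__ == "__main__":
    # Hypothetical IDs and labels for illustration only.
    issues = check_predictions(["q1", "q2", "q3"],
                               {"q1": "A", "q2": "E", "q9": "B"})
    for issue in issues:
        print(issue)
```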

Contact

Team Registration Questions

If you have any questions regarding your team's registration, please email us at:

palmx2025@gmail.com

Updates and General Inquiries

For more updates or inquiries, join the PalmX Google group:

Organizing Committee

Researchers in Arabic NLP.

Fakhraddin Alwajih, The University of British Columbia (Canada)

Abdellah EL Mekki, The University of British Columbia (Canada)

Hamdy Mubarak, Qatar Computing Research Institute (Qatar)

Majd Hawasly, Qatar Computing Research Institute (Qatar)

Abubakr Mohamed, Qatar Computing Research Institute (Qatar)

Muhammad Abdul-Mageed, The University of British Columbia (Canada)
