PROPHET

PROPHET: An Inferable Future Forecasting Benchmark with Causal Intervened Likelihood Estimation

A Benchmark for Evaluating AI Systems in Predicting Real-World Events 🚀

Work in Progress

Overview

PROPHET is a benchmark for evaluating the future forecasting capabilities of AI systems. Its focus is inferability: a question is kept only if its answer can be grounded in sufficient supporting evidence. Unlike existing benchmarks, PROPHET rigorously filters questions using the proposed Causal Intervened Likelihood (CIL), a statistical measure that quantifies how strongly news articles support an answer.

Key Contributions

CIL (Causal Intervened Likelihood)

$$CIL_i = P(y=\hat{y}|do(X_i=1)) - P(y=\hat{y}|do(X_i=0))$$
- A causal-inference-based metric for estimating whether a question is answerable from the retrieved news articles.
- Models the impact of "intervening" on news events (e.g., "What if this event did not happen?") to measure how strongly each event supports the prediction.
- Validated by experiments showing that CIL correlates strongly with model performance.
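
The definition above can be illustrated with a minimal Python sketch. It is not the estimation procedure from the paper: `cil`, `estimate_interventional`, and `toy_model` are hypothetical names, and the toy model simply assumes that the correct answer is reached with probability 0.8 when the news event is forced to happen and 0.3 when it is forced not to happen.

```python
import random
from typing import Callable


def cil(p_do_1: float, p_do_0: float) -> float:
    """CIL_i = P(y = y_hat | do(X_i = 1)) - P(y = y_hat | do(X_i = 0))."""
    for p in (p_do_1, p_do_0):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_do_1 - p_do_0


def estimate_interventional(
    model: Callable[[int, random.Random], bool],
    x_i: int,
    n_samples: int = 20000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of P(y = y_hat | do(X_i = x_i)).

    `model` is a stand-in simulator (an assumption of this sketch): it
    receives the forced value of X_i and returns True when the sampled
    outcome matches the answer y_hat.
    """
    rng = random.Random(seed)
    hits = sum(model(x_i, rng) for _ in range(n_samples))
    return hits / n_samples


def toy_model(x_i: int, rng: random.Random) -> bool:
    # Hypothetical numbers, chosen only for illustration.
    p_correct = 0.8 if x_i == 1 else 0.3
    return rng.random() < p_correct


if __name__ == "__main__":
    p1 = estimate_interventional(toy_model, 1)
    p0 = estimate_interventional(toy_model, 0)
    print(f"estimated CIL: {cil(p1, p0):.3f}")
```

A CIL near 1 means the event strongly supports the answer, a value near 0 means it is uninformative, and a negative value means the event makes the answer less likely.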

PROPHET Benchmark

- Inferable Dataset: Questions labeled as inferable (L1) or non-inferable (L2), sourced from the Metaculus and Manifold platforms.
- Real-World Focus: Questions span diverse domains (politics, finance, climate), each paired with 100+ news articles.
- Reproducibility: Static news retrieval ensures consistent evaluation.
- Updating Dataset: We will keep updating the dataset with newly collected trending data.
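
As a rough illustration of how per-article CIL scores could drive the L1/L2 split, the sketch below partitions questions with a simple rule. The README does not specify the actual rule, so the function name `split_by_inferability`, the "at least one supportive article" criterion, and the threshold of 0.1 are all assumptions made for this example.

```python
from typing import Dict, List


def split_by_inferability(
    article_cil: Dict[str, List[float]],
    threshold: float = 0.1,
) -> Dict[str, List[str]]:
    """Partition question ids into L1 (inferable) and L2 (non-inferable).

    `article_cil` maps a question id to the CIL scores of its retrieved
    articles. Hypothetical rule: a question counts as inferable when at
    least one article has CIL >= threshold.
    """
    groups: Dict[str, List[str]] = {"L1": [], "L2": []}
    for question_id, scores in article_cil.items():
        if scores and max(scores) >= threshold:
            groups["L1"].append(question_id)
        else:
            groups["L2"].append(question_id)
    return groups


if __name__ == "__main__":
    # Made-up scores, not taken from the dataset.
    example = {
        "q1": [0.02, 0.35, -0.01],
        "q2": [0.01, 0.03],
        "q3": [],
    }
    print(split_by_inferability(example))
```

Other aggregation choices, such as the mean CIL or the number of articles above the threshold, would fit the same interface.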

Usage

The first version of the dataset is available for reproducibility.
It is intended for researchers developing RAG systems, causal reasoning models, or forecasting tools.

The paper is available at: https://arxiv.org/abs/2504.01509

