HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture

International Institute of Information Technology, Hyderabad
*Equal Contributions

Abstract

We present a transformer architecture-based foundation model for tasks at high-energy particle colliders such as the Large Hadron Collider. We train the model to classify jets using a self-supervised strategy inspired by the Joint Embedding Predictive Architecture (JEPA). We use the JetClass dataset, containing 100M jets of various known particles, to pre-train the model with a data-centric approach: the model uses a fraction of the jet constituents as context to predict the embeddings of the unseen target constituents. Our pre-trained model transfers well to other datasets on standard classification benchmark tasks. We test the model on two additional downstream tasks: top tagging and differentiating light-quark jets from gluon jets. We also evaluate our model with task-specific metrics and baselines and compare it with state-of-the-art models in high-energy physics.



Schematic Diagram for HEP-JEPA

HEP-JEPA Framework

Schematic diagram illustrating the working of the HEP-JEPA model. The model has a structure similar to that of vision transformers. In the first step, the entire jet is divided into patches by a particle jet tokeniser. These tokens are then masked to form the context and target blocks. Each block is fed into its respective encoder to generate embeddings. The context embeddings, together with special mask tokens, are used by the predictor to predict the embeddings of the masked target blocks.
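The flow above can be summarised in a short PyTorch sketch. It is a minimal illustration under assumed module names, sizes, and loss (not the released HEP-JEPA code): constituent tokens are split into a context and a target block, each block is encoded, and the predictor regresses the target embeddings from the context embeddings plus learnable mask tokens. In practice, the target encoder would typically be an exponential-moving-average copy of the context encoder, and the predictor would also receive the positions of the target tokens.

```python
import torch
import torch.nn as nn

class JEPAStep(nn.Module):
    """Illustrative HEP-JEPA-style pre-training step (names and sizes are assumptions)."""

    def __init__(self, d_model=128, n_heads=8, n_layers=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.context_encoder = nn.TransformerEncoder(layer, n_layers)
        self.target_encoder = nn.TransformerEncoder(layer, n_layers)   # in practice an EMA copy
        self.predictor = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True), 2)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))

    def forward(self, tokens, ctx_idx, tgt_idx):
        # tokens: (batch, n_tokens, d_model) embedded jet-constituent tokens from the tokeniser
        ctx = self.context_encoder(tokens[:, ctx_idx])                 # embeddings of the context block
        with torch.no_grad():                                          # no gradients through the target branch
            tgt = self.target_encoder(tokens)[:, tgt_idx]              # embeddings of the target block
        queries = self.mask_token.expand(tokens.size(0), len(tgt_idx), -1)
        out = self.predictor(torch.cat([ctx, queries], dim=1))         # context tokens + mask tokens
        pred = out[:, -len(tgt_idx):]                                  # predicted target embeddings
        return nn.functional.smooth_l1_loss(pred, tgt)                 # regression in latent space

# toy usage: 32 jets with 64 tokens each, last 16 tokens used as the target block
model = JEPAStep()
jets = torch.randn(32, 64, 128)
loss = model(jets, ctx_idx=list(range(48)), tgt_idx=list(range(48, 64)))
loss.backward()
```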

Experiments

Few-Shot Learning Evaluations

The model was evaluated on the JetClass dataset, where it consistently outperformed models trained from scratch, particularly in low-label regimes:

  • Two regimes were evaluated: frozen (pre-trained backbone not updated) and fine-tuned (see the sketch after this list)
  • Evaluated at label fractions: 0.05%, 0.5%, 2%, 10%, and 100%
  • Compared pre-trained HEP-JEPA model with model trained from scratch
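A minimal sketch of the two evaluation regimes, assuming a pre-trained backbone that returns a pooled jet embedding of width 128 (the helper name and hyperparameters are illustrative, not the released training code):

```python
import torch
import torch.nn as nn

def build_classifier(backbone: nn.Module, n_classes: int = 10, frozen: bool = True):
    """Attach a linear classification head to the pre-trained backbone.

    frozen=True  -> the backbone is kept fixed and only the head is trained;
    frozen=False -> the whole network is fine-tuned end-to-end.
    Assumes the backbone maps a batch of jets to embeddings of shape (batch, 128).
    """
    if frozen:
        for p in backbone.parameters():
            p.requires_grad = False
    model = nn.Sequential(backbone, nn.Linear(128, n_classes))
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimiser = torch.optim.AdamW(trainable, lr=1e-3 if frozen else 1e-4)
    return model, optimiser
```

The label fractions in the table under "JetClass Metrics" then simply determine how much of the JetClass training split is used in either regime.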

Downstream Task Evaluations

The model was tested on two critical tagging tasks (the standard evaluation metric for both is sketched after the list):

  • Top tagging using the Top Tagging Reference dataset
  • Quark-gluon jet differentiation using the quark-gluon tagging dataset
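Both downstream tasks are binary tagging problems, so alongside accuracy and AUC the standard task-specific figure of merit is the background rejection 1/ε_B at a fixed signal efficiency ε_S (commonly 30% or 50%). A small, self-contained sketch of that metric; the scores and labels below are random placeholders:

```python
import numpy as np

def background_rejection(scores, labels, signal_eff=0.5):
    """Background rejection 1/eps_B at the working point where the signal
    efficiency equals `signal_eff`. `scores` are tagger outputs (higher = more
    signal-like); `labels` are 1 for signal jets and 0 for background jets."""
    sig, bkg = scores[labels == 1], scores[labels == 0]
    threshold = np.quantile(sig, 1.0 - signal_eff)   # keep the top signal_eff fraction of signal jets
    eps_b = np.mean(bkg >= threshold)                # background mistag rate at that threshold
    return np.inf if eps_b == 0 else 1.0 / eps_b

# toy usage with random scores
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(1.0, 1.0, 1000), rng.normal(-1.0, 1.0, 1000)])
labels = np.concatenate([np.ones(1000), np.zeros(1000)])
print(background_rejection(scores, labels, signal_eff=0.3))
```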

Ablation Studies

The study explored various design choices, including:

  • Masking strategies (random vs. contiguous token selection; see the sketch after this list)
  • Number of target tokens to predict
  • Physics bias in the attention mechanism
  • Integration of register tokens
  • Impact of physics-inspired data augmentations
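For concreteness, the two masking strategies compared in the ablations can be sketched as below. The helper is hypothetical and only illustrates how a single contiguous target block differs from randomly sampled target tokens; contiguity here is with respect to the token ordering (e.g., pT-ordered constituents).

```python
import torch

def select_context_and_target(n_tokens: int, n_target: int, contiguous: bool = True):
    """Split constituent-token indices into a context set and a target set.

    contiguous=True  -> one contiguous block of tokens is the target
                        (the configuration reported to work best, with a single target);
    contiguous=False -> target tokens are sampled at random.
    Hypothetical helper, shown only to contrast the two masking strategies.
    """
    if contiguous:
        start = torch.randint(0, n_tokens - n_target + 1, (1,)).item()
        target = torch.arange(start, start + n_target)
    else:
        target = torch.randperm(n_tokens)[:n_target]
    keep = torch.ones(n_tokens, dtype=torch.bool)
    keep[target] = False
    context = torch.arange(n_tokens)[keep]            # everything not in the target is context
    return context, target

ctx_idx, tgt_idx = select_context_and_target(n_tokens=64, n_target=16, contiguous=True)
```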

Results and Findings

Key Findings

  • Physics bias in the attention mechanism improved performance by approximately 2% (one assumed form of such a bias is sketched after this list)
  • Register tokens increased performance by around 2%
  • The contiguous masking strategy with one target token performed best
  • Physics-inspired augmentations did not significantly improve performance
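The physics bias referred to above follows the general idea of injecting pairwise kinematic information into the attention logits, in the spirit of interaction-feature attention as used in Particle Transformer. The exact features used in HEP-JEPA may differ; the sketch below shows one assumed form based on the angular distance ΔR between constituents.

```python
import torch

def pairwise_bias(eta, phi):
    """Pairwise physics bias: -log(DeltaR) between constituents, so that nearby
    constituents receive larger attention logits. An assumed example, not
    necessarily the exact features used in HEP-JEPA."""
    deta = eta.unsqueeze(-1) - eta.unsqueeze(-2)                                   # (B, N, N)
    dphi = (phi.unsqueeze(-1) - phi.unsqueeze(-2) + torch.pi) % (2 * torch.pi) - torch.pi
    delta_r = torch.sqrt(deta ** 2 + dphi ** 2).clamp_min(1e-6)
    return -torch.log(delta_r)

def biased_attention(q, k, v, bias):
    """Scaled dot-product attention with the physics bias added to the logits."""
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5                           # (B, H, N, N)
    scores = scores + bias.unsqueeze(1)                                            # broadcast over heads
    return torch.softmax(scores, dim=-1) @ v

# toy usage: 8 jets, 4 heads, 64 tokens, head dimension 32
B, H, N, d = 8, 4, 64, 32
q = k = v = torch.randn(B, H, N, d)
eta, phi = torch.randn(B, N), torch.rand(B, N) * 2 * torch.pi - torch.pi
out = biased_attention(q, k, v, pairwise_bias(eta, phi))                           # (B, H, N, d)
```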

JetClass Metrics

% of Labels (Size)    Model                    Accuracy
0.05% (5K)            From Scratch             0.505
                      HEP-JEPA, Fine-Tuning    0.564
0.5% (50K)            From Scratch             0.586
                      HEP-JEPA, Fine-Tuning    0.624
2% (2M)               From Scratch             0.668
                      HEP-JEPA, Fine-Tuning    0.669
10% (10M)             From Scratch             0.683
                      HEP-JEPA, Fine-Tuning    0.685
100% (100M)           From Scratch             0.698
                      HEP-JEPA, Fine-Tuning    0.698

Validation Loss Performance

Validation loss vs. training step for the two benchmark models trained in a few-shot learning setting for jet classification on the JetClass dataset with 0.5% labels (i.e., 50,000 training samples). One model is trained from scratch, whereas the pre-trained HEP-JEPA model is fine-tuned.

Validation loss vs. Training step

The validation loss falls quickly for the HEP-JEPA model: it reaches the same minimum validation loss as the model trained from scratch three times faster.

Visualization

We visualise the representations learned by HEP-JEPA on 50k JetClass samples drawn uniformly from each class. We construct the embedding for a sample by concatenating the max and mean pooling of the context-encoder outputs and apply t-SNE to the pooled embeddings.
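A minimal sketch of that pooling and projection, assuming the context encoder returns per-token embeddings of shape (n_tokens, d) for each jet; the arrays below are random placeholders for those outputs:

```python
import numpy as np
from sklearn.manifold import TSNE

def pooled_embedding(token_embeddings: np.ndarray) -> np.ndarray:
    """Concatenate max- and mean-pooled context-encoder outputs for one jet.
    `token_embeddings` has shape (n_tokens, d); the result has shape (2 * d,)."""
    return np.concatenate([token_embeddings.max(axis=0), token_embeddings.mean(axis=0)])

# per-jet context-encoder outputs (placeholder data: 500 jets, 64 tokens, width 128)
embeddings = [np.random.randn(64, 128) for _ in range(500)]
pooled = np.stack([pooled_embedding(e) for e in embeddings])       # (n_jets, 2 * d)
coords = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(pooled)
```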

t-SNE projection of the pooled HEP-JEPA jet embeddings

We observe that events that contain lepton(s) are pushed to the right, while hadronic events are more towards the left.


Related Links

Several recent works have explored foundation models and self-supervised learning in high-energy physics (HEP).

OmniLearn and Particle Transformer use transformer-based architectures for HEP tasks, relying on supervised learning with simulated data and generative modelling.

Masked Particle Modelling (MPM) and OmniJet-α adapt masked modeling and generative pre-training from natural language processing to collider physics.

Concurrent to our work, J-JEPA adapts the JEPA paradigm for the task of top tagging: the authors pre-train the model on 1% of the top-jet and light-jet samples from JetClass and evaluate downstream performance on the Top Tagging Reference dataset. However, unlike our data-centric approach, they generate context and target tokens by clustering subjets. We also present more comprehensive evaluations on the entire JetClass dataset and on downstream applications, with better performance on those tasks.

Contrastive learning methods, such as those in Dillon et al. (2022), follow frameworks like SimCLR, but require carefully selected negative samples. In contrast, Joint Embedding Predictive Architectures (JEPA) have shown promising results in images, videos, and point clouds by learning in latent space without a decoder.

For an extensive survey on foundation models in HEP, see this and this.


Sponsors

We are looking for sponsors and collaborators!

If you would like to sponsor or collaborate on future iterations of this work (or other works from our lab), please reach out to subhadip.mitra<AT>iiit.ac.in or jai.bardhan<AT>alumni.iiit.ac.in


BibTeX

@misc{bardhan2025hepjepafoundationmodelcollider,
      title={HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture}, 
      author={Jai Bardhan and Radhikesh Agrawal and Abhiram Tilak and Cyrin Neeraj and Subhadip Mitra},
      year={2025},
      eprint={2502.03933},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2502.03933}, 
}