CrunchDAO Docs V3

Duplicate Predictions

How the duplicate badge is triggered.


Last updated 1 month ago

Some competitions have a duplicate detection feature that flags every model whose predictions correlate with another model's above a given threshold, making the flagged models ineligible for prizes.

The correlation is computed using pandas.DataFrame.corr(method="spearman").
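As a minimal sketch of what a Spearman correlation between models looks like, the snippet below builds a small table of predictions and calls pandas.DataFrame.corr(method="spearman") on it. The model names and values are invented for illustration. Because Spearman correlation compares ranks, a monotonic transformation of another model's predictions (here, cubing them) still yields a correlation of 1, while the Pearson correlation drops below 1.

```python
import pandas as pd

# Hypothetical predictions on the same five rows (illustrative values only).
base = pd.Series([0.10, 0.40, 0.35, 0.80, 0.90])
predictions = pd.DataFrame({
    "model_a": base,
    "model_b": base ** 3,  # monotonic transform of model_a: same ranking
    "model_c": [0.90, 0.10, 0.50, 0.20, 0.40],  # unrelated ranking
})

# One column per model; the result is a model-by-model correlation matrix.
spearman = predictions.corr(method="spearman")
pearson = predictions.corr(method="pearson")

print(spearman.round(3))
print(pearson.round(3))
```

In this toy data, model_a and model_b have identical rankings, so their Spearman correlation is 1 even though the raw values differ.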

Grouping

Predictions are first grouped by:

  • user, to avoid duplication between a user's multiple models

  • team, to avoid duplication between all models of every team member

The correlation function is then called on every pair of predictions.

Keeping

Predictions are then grouped into correlated pairs to isolate them.

Within each set of linked pairs, one model is always kept. The other models are treated as copies.

Example

  • If A & B are considered duplicates, and C & D are also considered duplicates, then A and C will be retained and B and D will have the duplicate badge.

  • However, if B & C are also considered duplicates, then only A will be retained. B, C, D will have the duplicate badge.

  • In some cases, even if A and D are not themselves considered duplicates, D still receives the duplicate badge because of the chain of links from A to B to C to D.
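The chaining behaviour in the examples above can be reproduced with a small union-find sketch. The function name mark_duplicates is hypothetical, and the rule that the earliest model in the input order is the one kept is an assumption inferred from the examples (where A is retained), not a documented rule.

```python
def mark_duplicates(models: list[str], pairs: list[tuple[str, str]]) -> dict[str, bool]:
    """Map each model to True if it would get the duplicate badge.

    Assumption: within each set of linked pairs, the model that appears
    first in `models` is kept and all others are badged.
    """
    parent = {m: m for m in models}
    order = {m: i for i, m in enumerate(models)}

    def find(m: str) -> str:
        # Follow parent links to the representative, shortening the path as we go.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the earlier model as the representative of the merged set.
            if order[root_a] > order[root_b]:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    return {m: find(m) != m for m in models}


models = ["A", "B", "C", "D"]
two_groups = mark_duplicates(models, [("A", "B"), ("C", "D")])
one_chain = mark_duplicates(models, [("A", "B"), ("C", "D"), ("B", "C")])
print(two_groups)
print(one_chain)
```

With only the A-B and C-D links, A and C are kept. Adding the B-C link merges everything into one set, so only A is kept even though A and D were never directly compared as duplicates.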

pandas.DataFrame.corr(method="spearman")