At the end of December 2023, the LHCb experiment released all of its data from Run 1 of the Large Hadron Collider. Collected in 2011 and 2012, the data amounts to approximately 800 terabytes of information from proton–proton collisions. It has been made available in a pre-filtered format suitable for a wide range of physics studies, for both research and education.
LHCb data from Runs 1 and 2 has been used in over 700 scientific publications, including numerous significant findings. All of the collaboration’s results are already publicly accessible in open-access papers, and the numerical results behind the graphs can be consulted in the HEPData database. With the new release, the data used by researchers to produce these results is now accessible as well. The release takes place within the framework of CERN’s Open Data Policy, which reflects values that have been enshrined in the CERN Convention for more than sixty years and applies to all of CERN’s activities.
The collaboration has pre-processed the data by reconstructing experimental signatures, such as the trajectories of charged particles, from the raw information delivered by the complex detector system. The data is filtered, classified according to a large number of processes and decays, and made available in the same format that is used internally by LHCb physicists. The data can be downloaded from the CERN Open Data portal.
To aid the user’s understanding, the samples come with extensive documentation and metadata, as well as a glossary explaining several hundred specialised terms used in the pre-processing. The data can be analysed using dedicated LHCb algorithms, which are available as open-source software.
All data sets have digital object identifiers (DOIs) for reference and citation. The experiment also welcomes feedback on how the data is used and invites users to post questions and join discussions in the CERN Open Data Forum.