Ippon Blog

Discover Skore, The Newest Member of The Scikit-Learn Family

Written by Hugo Vassard | May 8, 2025 6:00:00 AM

In early 2025, comfortably sitting on my sofa and scrolling through LinkedIn, I came across a post announcing the release of a brand-new open-source Python library: Skore! Strange: it's a brand-new library, yet it already feels oddly familiar...

The peculiar spelling of the name and the newcomer's orange and blue logo will have fooled no data scientist: there is indeed a link with scikit-learn, hence the feeling of déjà vu. The announcement comes from Probabl, the new entity that now oversees scikit-learn. This company, which is responsible for maintaining the library, is not making its first announcement: a few months earlier, it unveiled to the community the very first certification program for this essential machine-learning library.

Curious to discover what this new tool had to offer, I worked through the few tutorials offered in the documentation (which, by the way, is as excellent as the iconic scikit-learn documentation). So here is a summary of my findings on this new tool, promoted as the “scikit-learn sidekick”.

Benefits of This Package

In one sentence, Skore is the ally of any data scientist looking to develop quickly, follow AI best practices, and avoid the pitfalls that litter the path of machine learning model development.

To achieve this, Skore offers the following features:

  1. A mechanism to persist results, thereby avoiding unnecessary calculations
  2. A set of classes to reduce the time spent evaluating model quality
  3. A warning system to avoid methodological errors

Let’s now explore these three concepts that make up version 0.7 of Skore in more detail.

1 - First: A Persistence Mechanism

The first concept is skore.Project, which represents the project you’re working on. This project stores the information of your choice in a key/value pair format. To begin, you need to create a project:

This creates a my_project.skore/ subfolder in the directory structure. You can then save almost anything you want, such as an integer:

You can then retrieve the value by its name:

You can save various objects in the project, such as:

  • A string
  • A list
  • A dictionary
  • A DataFrame (pandas or polars)
  • A plot (matplotlib, plotly, ...)
  • A pre-trained model
  • A preprocessing pipeline
  • A skrub.TableReport (Exploratory Data Analysis report from the skrub library)
  • Custom objects

After saving a few objects, it becomes difficult to remember which key corresponds to which item, especially since the my_project.skore/ folder structure does not reflect the saved content (picture 1). Luckily, the .keys() method allows you to list the saved items:


Picture 1: Folder structure once the 3 items are saved

A valuable feature: you can save a new value for an existing key without worrying about losing the previous value, as a history is kept:
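A sketch of how that versioning plays out; the `note` keyword and the `version` selector are as I understand the 0.7 API, so treat them as indicative:

```python
import skore

my_project = skore.open("my_project", create=True)

my_project.put("my_int", 3, note="first attempt")
# Saving again under the same key does not overwrite: a new version is kept
my_project.put("my_int", 4, note="second attempt")

my_project.get("my_int")  # → 4 (the latest version)
# Depending on your version, get() can also return the full history,
# e.g. with something like version="all"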

As you may have noticed with the previous output, you can add a note when saving an item, which acts as a memo.

2 - Simplified Model Evaluation

Well, so far, it’s nice, but not revolutionary. I find the rest more interesting, especially the set of classes that allow easy and fast evaluation of model quality.

The first class is skore.EstimatorReport. Here’s an example where this class evaluates a logistic regression model on a binary classification task:

Usually, to evaluate model quality, you would need to calculate the metrics yourself (e.g., recall or precision) and plot graphs like the famous ROC curve. However, this is time-consuming and repetitive. It's true that we build up some snippets throughout our experiments (you may already have some of your own), but it's still a bit tedious.

With Skore, the EstimatorReport class simplifies the process. When you provide the model and evaluation dataset, it generates all the relevant evaluation results. For example, with the .help() method, you can see the available metrics (picture 2):


Picture 2: Displayed help (result of .help() method)

If this had been a regression task rather than a classification task, this output would have been different! The metrics displayed would have been the R² coefficient or the RMSE, for example. In any case, these results are easy to display and use:

And if we prefer a summary:

Picture 3: Displayed summary DataFrame

You can also generate quality graphs with just a few lines of code:


Picture 4: ROC curve, output of report.metrics.roc()

In the case of multi-class classification, the metrics and graphs show results for each class, which is very convenient. The only disappointment is the lack of a confusion matrix, which might be added in a future version.

To take things a step further, Skore also offers a caching mechanism to avoid unnecessary recalculations, as well as other interesting classes:

  • ComparisonReport: for comparing several EstimatorReports
  • CrossValidationReport: like an EstimatorReport, but for cross-validation. The result is roughly the same as above, but with details for each split, as shown below:

Picture 5: DataFrame displaying precision for each cross-validation split

Picture 6: ROC curve for each cross-validation split

3 - Last But Not Least: A Warning System to Avoid Pitfalls

With simplified evaluation, you can iterate and train models to improve performance. However, avoiding methodological errors is crucial. This is where Skore’s warning system comes in. It alerts you to potential errors or bad practices in ML.

In the current version (0.7), the warnings relate to the well-known train_test_split() function from sklearn, which is used to create our training and test datasets. Although seemingly straightforward, this method hides a number of pitfalls (as proof, one of the Skore tutorials notes that the train_test_split() page is the most visited page in the scikit-learn documentation). That's why Skore provides a wrapper that raises warnings when train_test_split() is called in a potentially problematic way.

Here’s a very first example with some simple code :

Do you have any ideas on how to improve this little piece of code? Here's a hint: what happens if you run it several times? If you give up, here's the answer, thanks to the warning displayed by Skore when executing the previous code:

Yes, that's right: without a fixed random_state, the split is different on every run. Here's another example, this time manipulating a dataset containing employee information (job title, gender, department, hire date, etc.):

Any clue about the methodological error here? It’s a bit trickier than the previous ones. Here’s a hint: it involves dates. Check out Skore’s response.


Yes, a dataset containing dates must be handled with care: randomly shuffling time-stamped rows can leak future information into the training set. Personally, the reminder was welcome, as I can't say the issue really jumped out at me.

Other warnings (which I won't go into here) exist, notably in relation to imbalanced datasets. I'll leave you to discover them here. Finally, you may have noticed in the previous examples that Skore's wrapper for the train_test_split() method lets you use keyword arguments when passing X and y:

Here, we simply used “X=” and “y=” to pass the parameters, which is not possible with the sklearn method:

The sklearn syntax requires positional arguments (the parameters must be passed in the right order, sending us to the documentation almost every time to avoid mistakes):
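For comparison, the sklearn version of the same call:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=100, random_state=0)

# X and y must come first, in that order; passing them as X=... / y=...
# raises a TypeError, since sklearn collects them via *arrays
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
```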

It's not much, but it can prevent a bug when X and y are accidentally swapped in the call, so I'll take it!

Conclusion

So that's it for this overview of this new library, which complements scikit-learn by facilitating the evaluation of ML models.

In my opinion, Skore is clearly intended for the experimental phase (when you're exploring and testing things in notebooks in a somewhat homemade way) rather than for an MLOps workflow where you're looking to industrialize model training and evaluation. Nonetheless, Skore is an interesting tool for both beginners (who will benefit greatly from the warning system) and pros (who will appreciate the various classes for obtaining an evaluation report).

This library is still in its infancy (it's quite difficult to obtain relevant results on search engines by typing “Skore”), but it's very promising. For the time being, it doesn't have many features, but the ones it does have are interesting, new versions are released regularly (0.8 just introduced feature importance), and the documentation is top-notch.

This newcomer is reminiscent of Skrub, another library in the scikit-learn ecosystem released at the end of 2023, which facilitates the data pre-processing phase. Combined with Skore, these two “scikit-learn sidekicks” are undoubtedly the ideal allies to facilitate ML model development. This winning pair is also the subject of a very good Skore tutorial presenting a data science project carried out from start to finish with Skrub, scikit-learn, and Skore.

Thanks for reading this article! If you are interested in training models, building chatbots, or leveraging ML or LLMs in your organization, reach out to sales@ipponusa.com.