Searching and categorizing datasets for auditability

Categorizing your datasets effectively helps you manage, analyze, and compare different planning problems and scenarios within the platform.

Dataset names and tags

Our platform provides two ways to add metadata to datasets:

Name: Each dataset can be assigned a descriptive name.
Tags: Datasets can be labeled with one or more tags to facilitate filtering and organization.

Both the name and tags can be provided when submitting a dataset (via the Platform UI or as part of the JSON file) or edited later on the Plans Overview page in the Platform UI.
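As a rough illustration of supplying a name and tags in the JSON file, the sketch below wraps a model input with metadata before submission. The keys `metadata`, `name`, `tags`, and `modelInput`, and the helper `build_submission`, are hypothetical and used only for illustration; check the platform's API reference for the actual field names.

```python
import json


def build_submission(model_input, name, tags):
    """Wrap a model input with a dataset name and tags.

    All JSON keys used here are hypothetical placeholders, not the
    platform's documented schema.
    """
    # Drop blank tags and duplicates while keeping the original order.
    stripped = (t.strip() for t in tags)
    clean_tags = list(dict.fromkeys(t for t in stripped if t))
    return {
        "metadata": {"name": name, "tags": clean_tags},  # hypothetical keys
        "modelInput": model_input,  # hypothetical key
    }


payload = build_submission(
    {"vehicles": [], "visits": []},  # placeholder model input
    "EMEA nightly 2024-05-01",
    ["emea", "nightly", " emea ", ""],
)
print(json.dumps(payload, indent=2))
```

Normalizing tags before submission avoids near-duplicate tags (for example, a tag with stray whitespace) that would later split your filter results.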

Searching and filtering datasets

You can search for datasets by name using the search bar in Plans Overview.

Click Add filter to drill down further:

ID filtering: Find a specific dataset by filtering on its unique dataset ID.
Tag filtering: Only display datasets that have a certain tag, or exclude datasets with a certain tag from the results.
Status filtering: Only display datasets in a specific status. See the Dataset lifecycle for a complete overview of all possible statuses.
Configuration profile filtering: Only display datasets that ran with a certain configuration profile.
Date range filtering: To only show datasets from a certain date range, filter on the field "Started at".
Deleted datasets: A deleted dataset is kept in Trash for a certain amount of time. Filter on "Deleted: Yes" to display deleted datasets.
Created from filtering: Filter datasets based on how they were created:
- Request: Created from a new API call.
- Input: Created from an existing dataset input (for example, a re-solve).
- Patch: Created from a patch update (see Real-time planning with /from-patch (preview)).
Model version filtering: Filter datasets based on the model version used to generate them, including:
- Model SDK version
- Model version
- Model build time
- Model build branch
- Model branch

You can combine multiple filters.
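To make the effect of combining filters concrete, here is a minimal client-side sketch. It assumes that combined filters narrow the results, so a dataset must pass every active filter; the platform's exact semantics may differ. The `Dataset` class, the `matches` function, and the status strings are hypothetical and do not represent the platform's API.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Dataset:
    # Hypothetical record mirroring some columns of Plans Overview.
    id: str
    name: str
    status: str
    started_at: datetime
    tags: set = field(default_factory=set)


def matches(ds, include_tag=None, exclude_tag=None, status=None,
            started_from=None, started_to=None):
    """Return True only if the dataset passes every filter that is set."""
    if include_tag is not None and include_tag not in ds.tags:
        return False
    if exclude_tag is not None and exclude_tag in ds.tags:
        return False
    if status is not None and ds.status != status:
        return False
    if started_from is not None and ds.started_at < started_from:
        return False
    if started_to is not None and ds.started_at > started_to:
        return False
    return True


# Status values below are made up for the example.
datasets = [
    Dataset("a1", "EMEA nightly", "COMPLETED", datetime(2024, 5, 1), {"emea", "nightly"}),
    Dataset("b2", "EMEA what-if", "COMPLETED", datetime(2024, 5, 2), {"emea", "simulation"}),
    Dataset("c3", "APAC nightly", "FAILED", datetime(2024, 5, 2), {"apac", "nightly"}),
]

selected = [
    ds.id for ds in datasets
    if matches(ds, include_tag="emea", exclude_tag="simulation", status="COMPLETED")
]
print(selected)
```

In this sketch, only the dataset tagged `emea` without the `simulation` tag and with the requested status remains.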

On the Plans Overview page, you can choose which columns to show for each dataset, including any of the metrics defined by the model. This allows you to analyze the evolution of these metrics over time or quickly spot anomalies within the filtered datasets.

Best practices for using tags

We recommend using tags to:

Segment your data: Assign different tags to represent distinct segments in your data, like regions or departments. For example, give each region you plan in a separate tag identifying that region. This allows you to later search for and compare all plans in a specific region efficiently.
Distinguish planning types: Use tags to differentiate between nightly planning, real-time planning, and reference plans. This helps you track how often real-time plan adjustments are necessary and compare nightly planning with actual executions.
Separate simulations from operational plans: Clearly mark simulation datasets (for example, goal alignment experiments or test scenarios) so they are easy to distinguish from datasets used in actual operations. This ensures that test results don’t interfere with live planning data.
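One way to keep tags consistent across these three uses is to generate them from a small helper instead of typing them by hand. The sketch below assumes a `key:value` naming convention and that the platform accepts such strings as tags; the convention, the allowed values, and the `make_tags` helper are illustrative assumptions, not platform requirements.

```python
# Hypothetical allowed values, following the recommendations above.
PLANNING_TYPES = {"nightly", "real-time", "reference"}
PURPOSES = {"operational", "simulation"}


def make_tags(region, planning_type, purpose):
    """Build a consistent tag list for a dataset (assumed key:value convention)."""
    if planning_type not in PLANNING_TYPES:
        raise ValueError(f"unknown planning type: {planning_type!r}")
    if purpose not in PURPOSES:
        raise ValueError(f"unknown purpose: {purpose!r}")
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    return [
        f"region:{region}",
        f"type:{planning_type}",
        f"purpose:{purpose}",
    ]


tags = make_tags("EMEA", "nightly", "operational")
print(tags)
```

Rejecting unknown values early keeps misspelled tags out of your datasets, so tag filters return complete results.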

To keep the datasets of your production systems distinct from those of development or staging environments, we recommend using separate tenants rather than tags. This ensures a clear separation of environments, preventing test or experimental data from affecting production operations.

By consistently categorizing your datasets using meaningful names and well-structured tags, you can streamline your workflow and make data-driven decisions more effectively.