Issue with data formatting for TimeSeriesDataLoader ("outcome" input) #326

Open
codingkimchi opened this issue Mar 19, 2025 · 0 comments

Comments

@codingkimchi
Description

Hello,
I've been trying out TimeSeriesDataLoader with a custom dataset. When the "outcome" input is set to None, model.fit(loader) fails with the following error:

/usr/local/lib/python3.11/dist-packages/synthcity/plugins/core/models/ts_model.py in dataloader(self, static_data, temporal_data, observation_times, outcome)
    426         stratify = None
    427         _, out_counts = torch.unique(outcome, return_counts=True)
--> 428         if out_counts.min() > 1:
    429             stratify = outcome.cpu()
    430 

RuntimeError: min(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
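
The error message indicates that `out_counts` has no elements, which is what `torch.unique` would return if `outcome` reached this function as an empty tensor. Below is a minimal pure-Python sketch of the same stratification check with a guard for a missing or empty outcome. The helper name `choose_stratify` is hypothetical and is not part of synthcity; it only illustrates the kind of guard that would avoid the failing `min()` call, under the assumption that skipping stratification is acceptable when there is no outcome.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def choose_stratify(outcome: Optional[Sequence[Hashable]]) -> Optional[list]:
    """Return labels to stratify on, or None when stratifying is not possible.

    Hypothetical helper, not part of synthcity: it mirrors the check in the
    traceback (stratify only if every class occurs more than once) and adds
    a guard for a missing or empty outcome.
    """
    if outcome is None or len(outcome) == 0:
        # Nothing to count, so skip the min() that fails on empty input.
        return None
    counts = Counter(outcome)
    if min(counts.values()) > 1:
        return list(outcome)
    return None


print(choose_stratify(None))          # None
print(choose_stratify([]))            # None
print(choose_stratify([0, 1, 0, 1]))  # [0, 1, 0, 1]
print(choose_stratify([0, 1, 1]))     # None
```

In this sketch an empty outcome simply disables stratification instead of raising, which matches the behavior I would expect from an optional input.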

To reproduce this, I followed the recommended tutorial notebook, tutorial6_time_series_data_preparation.ipynb.

Tutorial 6 is linked here: https://github.com/vanderschaarlab/synthcity/blob/main/tutorials/tutorial6_time_series_data_preparation.ipynb

Using the same tutorial notebook, I set "outcome" and "static_data" to None after the data preparation step. The same error occurred.

I have only tried this with the 'timevae' model so far, so I'm not sure whether the error also occurs with the other time series models.

I read the synthcity documentation on the inputs to TimeSeriesDataLoader, which states that the "outcome" and "static_data" inputs are optional and default to None.

The synthcity documentation is linked here (please refer to TimeSeriesDataLoader specifically):
https://synthcity.readthedocs.io/en/latest/generated/synthcity.plugins.core.dataloader.html

How to Reproduce

Please refer to the GitHub gist below, which demonstrates how I produced the error.
https://gist.github.com/codingkimchi/fbac549c8eafc0cf977b52df10d9e5fe

Expected Behavior

I expected 'timevae' to train even with no specified 'outcome' input.

Notes

Apologies in advance if this issue has been addressed before. I saw advice in a separate GitHub issue that all inputs to TimeSeriesDataLoader must be the same length, but to the best of my knowledge this particular problem has not been addressed.

@codingkimchi codingkimchi changed the title Issue with data formatting for TimeSeriesDataLoader (outcome and static_data inputs) Issue with data formatting for TimeSeriesDataLoader ("outcome" input) Mar 19, 2025