Datapipes
pdearena.data.datapipes_common
RandomTimeStepConditionedPDETrainData
Bases: IterDataPipe
Randomized data for training conditioned PDEs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dp | IterDataPipe | Data pipe that returns individual PDE trajectories. | required |
n_input_scalar_components | int | Number of input scalar components. | required |
n_input_vector_components | int | Number of input vector components. | required |
n_output_scalar_components | int | Number of output scalar components. | required |
n_output_vector_components | int | Number of output vector components. | required |
trajlen | int | Length of a trajectory in the dataset. | required |
reweigh | bool | Whether to rebalance the dataset so that longer-horizon predictions get equal weight, even though a trajectory contains fewer such data points. Defaults to True. | True |
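To make the reweigh description concrete, the following is a minimal sketch in plain Python, not the library's implementation. It assumes a training sample is a pair (start, delta_t) with start + delta_t inside the trajectory; the helper names pairs_per_horizon and sample_pair are hypothetical.

```python
import random
from collections import Counter


def pairs_per_horizon(trajlen):
    """Count how many (start, start + delta_t) pairs exist for each horizon."""
    return {delta_t: trajlen - delta_t for delta_t in range(1, trajlen)}


def sample_pair(trajlen, reweigh, rng):
    """Draw one (start, delta_t) pair from a trajectory of length trajlen."""
    if reweigh:
        # Assumed scheme: pick the horizon first so every horizon is equally likely,
        # then pick a valid start for that horizon.
        delta_t = rng.randrange(1, trajlen)
        start = rng.randrange(0, trajlen - delta_t)
    else:
        # Pick uniformly among all valid pairs, so short horizons dominate.
        pairs = [(s, d) for d in range(1, trajlen) for s in range(trajlen - d)]
        start, delta_t = rng.choice(pairs)
    return start, delta_t


rng = random.Random(0)
print(pairs_per_horizon(10))
for reweigh in (False, True):
    counts = Counter(sample_pair(10, reweigh, rng)[1] for _ in range(10_000))
    print(reweigh, sorted(counts.items()))
```

Without rebalancing, a horizon of delta_t has only trajlen - delta_t candidate pairs, which is why long horizons are under-represented.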
Source code in pdearena/data/datapipes_common.py
RandomizedPDETrainData
Bases: IterDataPipe
Randomized data for training PDEs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dp | IterDataPipe | Data pipe that returns individual PDE trajectories. | required |
n_input_scalar_components | int | Number of input scalar components. | required |
n_input_vector_components | int | Number of input vector components. | required |
n_output_scalar_components | int | Number of output scalar components. | required |
n_output_vector_components | int | Number of output vector components. | required |
trajlen | int | Length of a trajectory in the dataset. | required |
time_history | int | Number of time steps of inputs. | required |
time_future | int | Number of time steps of outputs. | required |
time_gap | int | Number of time steps between inputs and outputs. | required |
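The relationship between time_history, time_gap, and time_future can be sketched with plain index arithmetic. This is an illustration under the assumption that inputs and outputs are contiguous windows separated by the gap; window_indices and valid_starts are hypothetical helpers, not part of pdearena.

```python
def window_indices(start, time_history, time_future, time_gap=0):
    """Return (input step indices, target step indices) for one sample."""
    inputs = list(range(start, start + time_history))
    first_target = start + time_history + time_gap
    targets = list(range(first_target, first_target + time_future))
    return inputs, targets


def valid_starts(trajlen, time_history, time_future, time_gap=0):
    """Start indices for which the whole window fits inside the trajectory."""
    span = time_history + time_gap + time_future
    return range(0, trajlen - span + 1)


# Two input steps, one skipped step, one target step, in a 10-step trajectory.
starts = valid_starts(trajlen=10, time_history=2, time_future=1, time_gap=1)
for start in starts:
    print(start, window_indices(start, time_history=2, time_future=1, time_gap=1))
```

A randomized training pipe would then draw start at random from the valid range instead of iterating over it.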
Source code in pdearena/data/datapipes_common.py
TimestepConditionedPDEEvalData
Bases: IterDataPipe
Data for evaluation of time-conditioned PDEs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dp | IterDataPipe | Data pipe that returns individual PDE trajectories. | required |
trajlen | int | Length of a trajectory in the dataset. | required |
delta_t | int | Evaluates predictions conditioned at that delta_t. | required |
Tip
Make sure delta_t is less than half of trajlen.
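The tip can be turned into a small check. The sketch below is an assumption about how evaluation pairs might be enumerated (every step paired with the step delta_t later); eval_pairs is a hypothetical helper and the library may enumerate pairs differently.

```python
def eval_pairs(trajlen, delta_t):
    """List (input step, target step) pairs that are delta_t apart."""
    # Enforce the documented tip: delta_t must be less than half of trajlen.
    if not 0 < delta_t < trajlen / 2:
        raise ValueError("delta_t should be positive and less than half of trajlen")
    return [(t, t + delta_t) for t in range(trajlen - delta_t)]


print(eval_pairs(trajlen=6, delta_t=2))
```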
Source code in pdearena/data/datapipes_common.py
ZarrLister
Bases: IterDataPipe
Customized lister for zarr files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
root | Union[str, Sequence[str], IterDataPipe] | Root directory. Defaults to ".". | '.' |
Yields:
Type | Description |
---|---|
str | Path to the zarr file. |
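As a rough illustration of what a lister for zarr stores does, here is a standard-library sketch. It assumes that stores are entries whose names end in ".zarr" directly under each root; list_zarr is a hypothetical stand-in, not the ZarrLister class itself.

```python
import os
import tempfile


def list_zarr(root="."):
    """Yield paths of entries ending in ".zarr" directly under each root."""
    # Accept a single directory or a sequence of directories.
    roots = [root] if isinstance(root, str) else list(root)
    for directory in roots:
        for name in sorted(os.listdir(directory)):
            if name.endswith(".zarr"):
                yield os.path.join(directory, name)


# Usage on a throwaway directory with one zarr store and one unrelated folder.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "train_0.zarr"))
    os.mkdir(os.path.join(tmp, "notes"))
    print([os.path.basename(p) for p in list_zarr(tmp)])
```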
Source code in pdearena/data/datapipes_common.py
build_datapipes(pde, data_path, limit_trajectories, usegrid, dataset_opener, lister, sharder, filter_fn, mode, time_history=1, time_future=1, time_gap=0, onestep=False, conditioned=False, delta_t=None, conditioned_reweigh=True)
Build datapipes for training and evaluation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pde | PDEDataConfig | PDE configuration. | required |
data_path | str | Path to the data. | required |
limit_trajectories | int | Number of trajectories to use. | required |
usegrid | bool | Whether to use spatial grid as input. | required |
dataset_opener | Callable[..., IterDataPipe] | Dataset opener. | required |
lister | Callable[..., IterDataPipe] | List files. | required |
sharder | Callable[..., IterDataPipe] | Shard files. | required |
filter_fn | Callable[..., IterDataPipe] | Filter files. | required |
mode | str | Mode of the data. ["train", "valid", "test"] | required |
time_history | int | Number of time steps in the past. Defaults to 1. | 1 |
time_future | int | Number of time steps in the future. Defaults to 1. | 1 |
time_gap | int | Number of time steps between the past and the future to be skipped. Defaults to 0. | 0 |
onestep | bool | Whether to use one-step prediction. Defaults to False. | False |
conditioned | bool | Whether to use conditioned data. Defaults to False. | False |
delta_t | Optional[int] | Time step size. Defaults to None. Only used for conditioned data. | None |
conditioned_reweigh | bool | Whether to reweight conditioned data. Defaults to True. | True |
Returns:
Name | Type | Description |
---|---|---|
dpipe | IterDataPipe | IterDataPipe for training and evaluation. |
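The four callable parameters suggest a list, filter, shard, open sequence. The sketch below shows that composition with plain Python generators. The order of the stages, the treatment of filter_fn as a per-path predicate, and all the fake_* stand-ins are assumptions made for illustration; none of this is the library's code.

```python
def build_pipeline(data_path, lister, filter_fn, sharder, dataset_opener, mode):
    """Chain the four callables: list, filter, shard, then open."""
    paths = (p for p in lister(data_path) if filter_fn(p))
    return dataset_opener(sharder(paths), mode=mode)


# Hypothetical stand-ins so the sketch runs without any data on disk.
def fake_lister(root):
    return [f"{root}/train_0.h5", f"{root}/valid_0.h5", f"{root}/train_1.h5"]


def fake_sharder(paths):
    return paths  # single process: nothing to split


def fake_opener(paths, mode):
    for path in paths:
        yield (mode, path)


pipe = build_pipeline(
    "data", fake_lister, lambda p: "train" in p, fake_sharder, fake_opener, mode="train"
)
print(list(pipe))
```

Passing the stages in as callables keeps the builder independent of the file format, which is presumably why the same signature can serve different datasets.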
Source code in pdearena/data/datapipes_common.py
pdearena.data.oned.datapipes.kuramotosivashinsky1d
KuramotoSivashinskyDatasetOpener
Bases: IterDataPipe
DataPipe to load the Kuramoto-Sivashinsky dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dp | IterDataPipe | List of | required |
mode | str | Mode to load data from. Can be one of | required |
preload | bool | Whether to preload all data into memory. Defaults to True. | True |
allow_shuffle | bool | Whether to shuffle the data, recommended when preloading data. Defaults to True. | True |
resolution | int | Which resolution to load. Defaults to full data resolution. | -1 |
usegrid | bool | Whether to output spatial grid or not. Defaults to False. | False |
Yields:
Type | Description |
---|---|
Tuple[Tensor, Tensor, Optional[Tensor], Optional[Tensor]] | Tuple containing particle scalar field, velocity vector field, and optionally buoyancy force parameter value and spatial grid. |
Source code in pdearena/data/oned/datapipes/kuramotosivashinsky1d.py
pdearena.data.twod.datapipes.navierstokes2d
NavierStokesDatasetOpener
Bases: IterDataPipe
DataPipe to load Navier-Stokes dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dp | IterDataPipe | List of | required |
mode | str | Mode to load data from. Can be one of | required |
limit_trajectories | int | Limit the number of trajectories to load from individual | None |
usegrid | bool | Whether to output spatial grid or not. Defaults to False. | False |
Yields:
Type | Description |
---|---|
Tuple[Tensor, Tensor, Optional[Tensor], Optional[Tensor]] | Tuple containing particle scalar field, velocity vector field, and optionally buoyancy force parameter value and spatial grid. |
Source code in pdearena/data/twod/datapipes/navierstokes2d.py
pdearena.data.twod.datapipes.shallowwater2d
ShallowWaterDatasetOpener
Bases: IterDataPipe
DataPipe for loading the shallow water dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dp | IterDataPipe | datapipe with paths to load the dataset from. | required |
mode | str | "train" or "valid" or "test" | required |
limit_trajectories | Optional[int] | number of trajectories to load from the dataset | None |
usevort | bool | whether to use vorticity or velocity. If False, velocity is returned. | False |
usegrid | bool | whether to use grid or not. If False, no grid is returned. | False |
sample_rate | int | sample rate for the data. Default is 1, which means no sub-sampling. | 1 |
Note
We manually manage the data distribution across workers and processes. So make sure not to use torchdata's dp.iter.Sharder with this data pipe.
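The note above does not say how the data is distributed. One common way to split items across workers by hand is a round-robin slice, sketched below in plain Python. This is an assumed scheme for illustration only, and shard_for_worker is a hypothetical helper.

```python
def shard_for_worker(items, worker_id, num_workers):
    """Give each worker every num_workers-th item, starting at its own id."""
    return items[worker_id::num_workers]


paths = [f"seed={i}" for i in range(7)]
for worker_id in range(3):
    print(worker_id, shard_for_worker(paths, worker_id, num_workers=3))
```

Because each item goes to exactly one worker, adding a second sharding stage on top would drop data, which is consistent with the warning against combining this data pipe with an external sharder.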
Source code in pdearena/data/twod/datapipes/shallowwater2d.py
pdearena.data.threed.datapipes
build_maxwell_datapipes(pde, data_path, limit_trajectories, usegrid, dataset_opener, lister, sharder, filter_fn, mode, time_history=1, time_future=1, time_gap=0, onestep=False)
Build datapipes for training and evaluation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pde | PDEDataConfig | PDE configuration. | required |
data_path | str | Path to the data. | required |
limit_trajectories | int | Number of trajectories to use. | required |
usegrid | bool | Whether to use spatial grid as input. | required |
dataset_opener | Callable[..., IterDataPipe] | Dataset opener. | required |
lister | Callable[..., IterDataPipe] | List files. | required |
sharder | Callable[..., IterDataPipe] | Shard files. | required |
filter_fn | Callable[..., IterDataPipe] | Filter files. | required |
mode | str | Mode of the data. ["train", "valid", "test"] | required |
time_history | int | Number of time steps in the past. Defaults to 1. | 1 |
time_future | int | Number of time steps in the future. Defaults to 1. | 1 |
time_gap | int | Number of time steps between the past and the future to be skipped. Defaults to 0. | 0 |
onestep | bool | Whether to use one-step prediction. Defaults to False. | False |
Returns:
Name | Type | Description |
---|---|---|
dpipe | IterDataPipe | IterDataPipe for training and evaluation. |