Utilities
entsoe.utils.extract_records
extract_records(
    data: BaseModel | list[BaseModel],
    domain: Optional[str] = None,
    ignore_fields: Optional[List[str]] = ["m_rid", "time_series.m_rid"],
    deduplicate: bool = True,
) -> List[Dict[str, int | float | str | None]]
Convert a Pydantic model or a list of Pydantic models to a list of flattened records suitable for building a pandas DataFrame.
This function accepts either a single BaseModel instance or a list of BaseModel instances and flattens all data into one unified list of records. When a list is provided, the records of every model are combined, and each record keeps the metadata and structure of the model it came from.
Parameters:
- data (BaseModel | list[BaseModel]) – Single Pydantic model instance or list of Pydantic model instances. When a list is provided, all models are processed and their records are combined into a single result list.
- domain (Optional[str], default: None) – Optional key selecting a specific domain in each model. When specified, only the data under this key is extracted from each BaseModel instance.
- ignore_fields (Optional[List[str]], default: ['m_rid', 'time_series.m_rid']) – Optional list of field names to exclude from the flattened records. Fields are matched by their full dotted path (e.g., "m_rid", "time_series.m_rid"). Pass None to disable field filtering.
- deduplicate (bool, default: True) – Whether to remove duplicate records while preserving order.
Returns:
- List[Dict[str, int | float | str | None]] – List of flattened dictionaries (records) from all BaseModel instances.
Raises:
- KeyError – If the specified domain is not found in the data.
- TypeError – If data is neither a BaseModel nor a list of BaseModels.
Note
If mixed BaseModel types are detected in a list, a warning is logged as this may result in inconsistent record structures.
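The flattening, field filtering, and order-preserving deduplication described above can be illustrated with a minimal sketch. It is not the library's implementation: it works on plain nested dictionaries (for example, what Pydantic's `model_dump()` would return) rather than on BaseModel instances, and the helper names `flatten` and `to_records` and the sample `doc` are hypothetical.

```python
from typing import Any, Dict, List, Optional, Sequence


def flatten(obj: Dict[str, Any], prefix: str = "") -> List[Dict[str, Any]]:
    """Flatten a nested dict into records keyed by dotted paths.

    A list of dicts fans out into one record per element (assumed behavior).
    """
    records: List[Dict[str, Any]] = [{}]
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            children = flatten(value, path)
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            children = [rec for item in value for rec in flatten(item, path)]
        else:
            children = [{path: value}]
        # Combine every record built so far with every child record.
        records = [{**base, **child} for base in records for child in children]
    return records


def to_records(
    obj: Dict[str, Any],
    ignore_fields: Optional[Sequence[str]] = ("m_rid", "time_series.m_rid"),
    deduplicate: bool = True,
) -> List[Dict[str, Any]]:
    """Flatten, drop ignored dotted paths, and optionally deduplicate in order."""
    ignored = set(ignore_fields or ())
    seen = set()
    out: List[Dict[str, Any]] = []
    for rec in flatten(obj):
        rec = {k: v for k, v in rec.items() if k not in ignored}
        fingerprint = tuple((k, repr(v)) for k, v in rec.items())
        if deduplicate:
            if fingerprint in seen:
                continue  # keep only the first occurrence
            seen.add(fingerprint)
        out.append(rec)
    return out


# Hypothetical sample data, shaped loosely like a dumped model.
doc = {
    "m_rid": "doc-1",
    "time_series": {
        "m_rid": "ts-1",
        "period": {
            "resolution": "PT60M",
            "point": [
                {"position": 1, "quantity": 10.0},
                {"position": 2, "quantity": 12.5},
            ],
        },
    },
}

records = to_records(doc)
```

With this sample, `records` holds one dictionary per point, with keys such as "time_series.period.point.position", and both "m_rid" paths are filtered out. A list of such dictionaries can be passed directly to `pandas.DataFrame(records)`.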
entsoe.utils.add_timestamps
add_timestamps(
    records: List[Dict[str, Any]],
    start_field: str = "period.time_interval.start",
    resolution_field: str = "period.resolution",
    position_field: str = "period.point.position",
    timestamp_field: str = "timestamp",
    interval_type: Literal["start", "end"] = "start",
) -> List[Dict[str, Any]]
Add calculated timestamps to records from extract_records().
This function takes the output from extract_records() and adds a timestamp field to each record by calculating it from the period information fields.
Field matching supports both exact matches and suffix matches. For example, if you specify "period.resolution" as the resolution_field, it will match both "period.resolution" (exact) and "time_series.period.resolution" (suffix).
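The exact-or-suffix matching rule can be sketched as a small lookup helper. This is an illustration only, not the library's code: the name `resolve_key` is hypothetical, and requiring the suffix to start at a dot boundary is an assumption made here to avoid partial-word matches.

```python
from typing import Any, Dict


def resolve_key(record: Dict[str, Any], field: str) -> str:
    """Return the record key that matches `field` exactly or as a dotted suffix."""
    if field in record:
        return field  # exact match wins
    for key in record:
        # Assumption: a suffix match must begin at a "." boundary.
        if key.endswith("." + field):
            return key
    raise KeyError(f"No key matching {field!r} found in record")


record = {
    "time_series.period.resolution": "PT60M",
    "value": 100,
}
matched = resolve_key(record, "period.resolution")
```

Here `matched` is "time_series.period.resolution", and a field with no exact or suffix match raises KeyError.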
Parameters:
- records (List[Dict[str, Any]]) – List of dictionaries from extract_records().
- start_field (str, default: 'period.time_interval.start') – Dotted path to the start time field in the records. Can be a full path or a suffix that matches the end of a key.
- resolution_field (str, default: 'period.resolution') – Dotted path to the resolution field in the records. Can be a full path or a suffix that matches the end of a key.
- position_field (str, default: 'period.point.position') – Dotted path to the position field in the records. Can be a full path or a suffix that matches the end of a key.
- timestamp_field (str, default: 'timestamp') – Name of the new field that receives the calculated timestamp.
- interval_type (Literal['start', 'end'], default: 'start') – Whether to use the start or the end of the time interval:
  - "start": use the beginning of the interval (default).
  - "end": use the end of the interval (adds one resolution unit).
Returns:
- List[Dict[str, Any]] – List of dictionaries with the added timestamp field. The original records are not modified.
Raises:
- KeyError – If required fields are not found in a record.
- ValueError – If field values are invalid for timestamp calculation.
Examples:
>>> records = [
...     {
...         'time_series.period.time_interval.start': '2024-08-19T22:00Z',
...         'time_series.period.resolution': 'PT60M',
...         'time_series.period.point.position': 1,
...         'value': 100
...     }
... ]
>>> # Works with partial field names
>>> result = add_timestamps(records,
...     start_field="period.time_interval.start",
...     resolution_field="period.resolution",
...     position_field="period.point.position")
>>> print(result[0]['timestamp'])
2024-08-19T22:00:00+00:00
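The timestamp arithmetic itself can be sketched with the standard library. This is a sketch under stated assumptions, not the library's implementation: positions are assumed to be 1-based, so the timestamp is start + (position - 1) × resolution, and "end" adds one further resolution unit. The helpers `parse_resolution` and `point_timestamp` are hypothetical, and only day, hour, minute, and second components of an ISO 8601 duration are handled.

```python
import re
from datetime import datetime, timedelta

# Matches durations such as "PT60M", "PT15M", "PT1H30M" or "P1D".
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_resolution(value: str) -> timedelta:
    """Parse a fixed-length ISO 8601 duration into a timedelta."""
    match = _DURATION.match(value)
    if not match or not any(match.groupdict().values()):
        raise ValueError(f"Unsupported resolution: {value!r}")
    parts = {name: int(num) for name, num in match.groupdict().items() if num}
    return timedelta(**parts)


def point_timestamp(
    start: str, resolution: str, position: int, interval_type: str = "start"
) -> datetime:
    """Compute the timestamp of one point (assumes 1-based positions)."""
    if interval_type not in ("start", "end"):
        raise ValueError(f"Invalid interval_type: {interval_type!r}")
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    begin = datetime.fromisoformat(start.replace("Z", "+00:00"))
    step = parse_resolution(resolution)
    # "end" shifts the result forward by one resolution unit.
    steps = position - 1 if interval_type == "start" else position
    return begin + step * steps


first = point_timestamp("2024-08-19T22:00Z", "PT60M", 1)
third = point_timestamp("2024-08-19T22:00Z", "PT60M", 3)
first_end = point_timestamp("2024-08-19T22:00Z", "PT60M", 1, interval_type="end")
```

Under these assumptions, `first.isoformat()` gives 2024-08-19T22:00:00+00:00, which agrees with the example output above, `third` falls at 2024-08-20T00:00:00+00:00, and `first_end` at 2024-08-19T23:00:00+00:00.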