accim.data.postprocessing package
Submodules
accim.data.postprocessing.main module
- class accim.data.postprocessing.main.Table(datasets: list | None = None, source_concatenated_csv_filepath: str | None = None, source_frequency: str | None = None, frequency: str | None = None, frequency_agg_func: str | None = None, standard_outputs: bool | None = None, output_cols_to_keep: list | None = None, concatenated_csv_name: str | None = None, idf_path: str | None = None, level: list | None = None, level_agg_func: list | None = None, level_excluded_zones: list | None = None, block_zone_hierarchy: dict | None = None, split_epw_names: bool = False, normalised_energy_units: bool = True, rename_cols: bool = True, energy_units_in_kwh: bool = True, drop_nan: bool = False, name_export_rows_with_NaN: str | None = None, name_export_rows_not_corr_agg: str | None = None)[source]
Bases: object
Generates a table or dataframe using the EnergyPlus simulation results CSV files available in the current folder.
- Parameters:
datasets (list) – A list of strings. The strings are the names of the CSV files in the working directory that you want to work with.
source_concatenated_csv_filepath (str) – The filepath of a CSV file previously concatenated and exported with the argument concatenated_csv_name.
source_frequency (str) – Used to inform accim about the frequency of the input CSVs. If there are multiple frequencies in a single CSV, the columns for frequencies other than the selected one will be discarded. String can be ‘timestep’, ‘hourly’, ‘daily’, ‘monthly’ or ‘runperiod’.
frequency (str) – Rows will be aggregated based on this frequency. String can be ‘timestep’, ‘hourly’, ‘daily’, ‘monthly’ or ‘runperiod’. For instance, if ‘daily’, hourly or timestep rows will be aggregated into days.
frequency_agg_func (str) – Aggregates the rows based on the defined frequency by sum or mean. Can be ‘sum’ or ‘mean’.
standard_outputs (bool) – Used to consider only standard outputs from accim. It can be True or False.
output_cols_to_keep (list) – A list of columns from EnergyPlus Output:Variable. Used to remove all other columns from EnergyPlus Output:Variable objects except the ones specified in this list.
concatenated_csv_name (str) – Used as the name for the concatenated csv file.
idf_path (path-like) – The path of the IDF used for the simulations.
drop_nan (bool) – If True, drops the rows with NaNs before exporting the CSV using concatenated_csv_name.
level (list) – A list of strings. Strings can be ‘block’ and/or ‘building’. Used to create columns with block or building values.
level_agg_func (list) – A list of strings. Strings can be ‘sum’ and/or ‘mean’. Used to create the columns for the levels previously stated by summing and/or averaging.
level_excluded_zones (list) – A list of strings. Strings must be the zones excluded from level computations. Used to try to match the cities in the EPW file name with actual cities. To be used if sample_EPWs have not been previously renamed with rename_epw_files().
split_epw_names (bool) – Splits the EPW name into the Country_City_RCPscenario-Year format. To be used if sample_EPWs have been previously renamed with rename_epw_files().
normalised_energy_units (bool) – A bool, can be True or False. Used to show Wh or Wh/m2 units.
rename_cols (bool) – A bool, can be True or False. Used to keep the original name of EnergyPlus outputs or rename them for understanding purposes.
energy_units_in_kwh (bool) – A bool, can be True or False. If True, energy units will be in kWh or kWh/m2, otherwise these will be in Wh or Wh/m2.
name_export_rows_with_NaN (str) – This parameter should not generally be used. A string used as the name of an xlsx file exported with the rows containing NaNs. Used only to check the rows with NaNs.
name_export_rows_not_corr_agg (str) – This parameter should not generally be used. A string used as the name of an xlsx file exported with the rows not correctly aggregated. Used only to check that the aggregations are correct.
- Variables:
df – The pandas DataFrame instance. It is modified when method format_table is called.
df_backup – The full pandas DataFrame instance resulting from class Table. It is not modified, so it can be used to revert the DataFrame instance to its initial state.
cols_for_multiindex – The list of columns (or variables) that change in the dataset. These represent the variables that might be interesting to study, and therefore the variables suggested for use in arguments vars_to_gather, vars_to_gather_cols or vars_to_gather_rows.
wrangled_df_unstacked – The resulting pandas DataFrame after calling the method wrangled_table with reshaping='unstack'.
wrangled_df_stacked – The resulting pandas DataFrame after calling the method wrangled_table with reshaping='stack'.
wrangled_df_multiindex – The resulting pandas DataFrame after calling the method wrangled_table with reshaping='multiindex'.
wrangled_df_pivoted – The resulting pandas DataFrame after calling the method wrangled_table with reshaping='pivot'.
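As a rough sketch of how the constructor arguments fit together, the snippet below collects a set of keyword arguments and checks them against the allowed values listed in the parameter descriptions. The helpers check_table_kwargs and build_table, and the CSV name, are hypothetical and not part of accim; the import of Table is deferred so the checks can run without accim installed.

```python
# Allowed values, taken from the parameter descriptions of Table.
FREQUENCIES = {"timestep", "hourly", "daily", "monthly", "runperiod"}
AGG_FUNCS = {"sum", "mean"}
LEVELS = {"block", "building"}


def check_table_kwargs(kwargs):
    """Hypothetical helper (not part of accim): validate documented options."""
    for key in ("source_frequency", "frequency"):
        value = kwargs.get(key)
        if value is not None and value not in FREQUENCIES:
            raise ValueError(f"{key} must be one of {sorted(FREQUENCIES)}")
    func = kwargs.get("frequency_agg_func")
    if func is not None and func not in AGG_FUNCS:
        raise ValueError("frequency_agg_func must be 'sum' or 'mean'")
    for level in kwargs.get("level") or []:
        if level not in LEVELS:
            raise ValueError("level entries must be 'block' and/or 'building'")
    for level_func in kwargs.get("level_agg_func") or []:
        if level_func not in AGG_FUNCS:
            raise ValueError("level_agg_func entries must be 'sum' and/or 'mean'")
    return kwargs


table_kwargs = check_table_kwargs({
    "source_frequency": "hourly",
    "frequency": "daily",
    "frequency_agg_func": "sum",
    "standard_outputs": True,
    "level": ["building"],
    "level_agg_func": ["sum", "mean"],
    "concatenated_csv_name": "my_results",  # hypothetical name
})


def build_table(kwargs):
    """Create the Table; needs accim installed and simulation CSVs in the working directory."""
    from accim.data.postprocessing.main import Table
    return Table(**kwargs)
```

Calling build_table(table_kwargs) would then be the step that actually reads the CSV files, so it is left out of the sketch.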
- custom_order(ordered_list: list | None = None, column_to_order: str | None = None)[source]
Used to order the string values of a column in a custom order.
- Parameters:
ordered_list (list) – A list of strings. Used to order the string values of a column in a custom order.
column_to_order (str) – A string. It should be the column whose string values should be ordered.
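To illustrate what a custom order means here, the sketch below sorts string values by their position in ordered_list using plain Python. It only illustrates the idea and is not accim's implementation; the helper order_values and the sample values are made up.

```python
def order_values(values, ordered_list):
    """Sort values by their position in ordered_list (hypothetical helper)."""
    rank = {name: position for position, name in enumerate(ordered_list)}
    # Values missing from ordered_list go to the end; sorted() is stable,
    # so ties keep their original relative order.
    return sorted(values, key=lambda value: rank.get(value, len(ordered_list)))


ordered_list = ["Present", "RCP45-2050", "RCP85-2100"]  # made-up scenario names
values = ["RCP85-2100", "Present", "RCP45-2050", "Present"]
ordered = order_values(values, ordered_list)
```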
- enter_vars_to_gather(vars_to_gather=None)[source]
Function used by accim to gather variables to be combined in columns.
- Parameters:
vars_to_gather (list) – The list of strings containing the variables.
- format_table(type_of_table: str = 'all', custom_cols: list | None = None)[source]
It filters the columns.
- Parameters:
type_of_table – To get previously set out tables. Can be ‘energy demand’ or ‘comfort hours’. The default is ‘all’.
custom_cols – A list of strings. The strings will be used as a filter, and the columns that match will be selected.
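The filtering by custom_cols could look roughly like the sketch below. It assumes that a column "matches" when it contains one of the filter strings; the source does not state the exact matching rule, so this is an assumption, and the helper filter_columns and the column names are hypothetical.

```python
def filter_columns(columns, custom_cols):
    """Keep the columns that contain any filter string (assumed substring match)."""
    return [col for col in columns if any(text in col for text in custom_cols)]


# Hypothetical column names, for illustration only.
columns = [
    "Building_Total_Cooling Energy Demand (kWh/m2) (summed)",
    "Building_Total_Heating Energy Demand (kWh/m2) (summed)",
    "Building_Total_Comfortable Hours",
]
selected = filter_columns(columns, ["Energy Demand"])
```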
- gather_vars_query(vars_to_gather: list | None = None)[source]
Used to inform the user of the variables suitable to be analysed and of the options available for a given set of gathered variables.
- Parameters:
vars_to_gather (list) – A list of variables.
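One way to picture the information this query gives is to list, for each variable, the distinct values found in the dataset, keeping only the variables that actually change (the idea behind cols_for_multiindex). The sketch below does this in plain Python; the helper changing_columns and the sample records are hypothetical and do not reproduce accim's output format.

```python
def changing_columns(rows):
    """Return {column: sorted unique values} for columns with more than one value."""
    options = {}
    for row in rows:
        for column, value in row.items():
            options.setdefault(column, set()).add(value)
    return {col: sorted(vals) for col, vals in options.items() if len(vals) > 1}


# Made-up records: hvac_mode never changes, so it is not reported.
rows = [
    {"comfort_model": "static", "hvac_mode": "full_ac", "epw": "city_A"},
    {"comfort_model": "adaptive", "hvac_mode": "full_ac", "epw": "city_A"},
    {"comfort_model": "adaptive", "hvac_mode": "full_ac", "epw": "city_B"},
]
available = changing_columns(rows)
```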
- generate_fig_data(vars_to_gather_cols: list | None = None, vars_to_gather_rows: list | None = None, detailed_cols: list | None = None, detailed_rows: list | None = None, custom_cols_order: list | None = None, custom_rows_order: list | None = None, data_on_y_axis_baseline_plot: list | None = None, baseline: str | None = None, colorlist_baseline_plot_data: list | None = None, data_on_x_axis: str | None = None, data_on_y_main_axis: list | None = None, data_on_y_sec_axis: list | None = None, colorlist_y_main_axis: list | None = None, colorlist_y_sec_axis: list | None = None, best_fit_deg_y_main_axis: list | None = None, best_fit_deg_y_sec_axis: list | None = None, best_fit_deg: list | None = None, rows_renaming_dict: dict | None = None, cols_renaming_dict: dict | None = None)[source]
Generates the list of data to be plotted.
- Parameters:
vars_to_gather_cols – A list of strings. The list should be the variables you want to show in subplot columns.
vars_to_gather_rows – A list of strings. The list should be the variables you want to show in subplot rows.
detailed_cols – A list of strings. The list should be the specific data you want to show in subplots columns. Used to filter.
detailed_rows – A list of strings. The list should be the specific data you want to show in subplots rows. Used to filter.
custom_cols_order – A list of strings. The list should be the specific order for the items shown in subplot columns.
custom_rows_order – A list of strings. The list should be the specific order for the items shown in subplot rows.
data_on_y_axis_baseline_plot – A list of strings. Used to select the data you want to show in the graph. Should be a list of the column names you want to plot in each subplot.
baseline – A string, used only with data_on_y_axis_baseline_plot. The baseline should be one of the combinations in vars_to_gather_cols. It is plotted on the x-axis, while the combination being compared is plotted on the y-axis.
colorlist_baseline_plot_data – A list of strings. Should be the colors using the matplotlib color notation for the columns entered in data_on_y_axis_baseline_plot in the same order.
data_on_x_axis – A string. The column name you want to plot in the x-axis.
data_on_y_main_axis – A list with nested lists and strings. Used to select the data you want to show in the scatter plot main y-axis. It needs to follow this structure: [[‘name_on_y_main_axis’, [list of column names you want to plot]]]
data_on_y_sec_axis – A list with nested lists and strings. Used to select the data you want to show in the scatter plot secondary y-axis. It needs to follow this structure: [[‘name_on_1st_y_sec_axis’, [list of column names you want to plot]], [‘name_on_2nd_y_sec_axis’, [list of column names you want to plot]], etc]
colorlist_y_main_axis – A list with nested lists and strings. It should follow the same structure as data_on_y_main_axis, but replacing the column names with the colors using the matplotlib notation.
colorlist_y_sec_axis – A list with nested lists and strings. It should follow the same structure as data_on_y_sec_axis, but replacing the column names with the colors using the matplotlib notation.
rows_renaming_dict – A dictionary. Should follow the pattern {‘old row name 1’: ‘new row name 1’, ‘old row name 2’: ‘new row name 2’}
cols_renaming_dict – A dictionary. Should follow the pattern {‘old col name 1’: ‘new col name 1’, ‘old col name 2’: ‘new col name 2’}
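As a loose illustration of how the row and column arguments relate to a grid of subplots, the sketch below maps every (row item, column item) pair to a grid position and applies a renaming dictionary to the row labels. This is only a mental model under assumed inputs; subplot_grid and all item names are hypothetical, and accim's internal data layout may differ.

```python
from itertools import product


def subplot_grid(row_items, col_items):
    """Map each (row item, column item) pair to its (row index, column index)."""
    return {
        (row, col): (i, j)
        for (i, row), (j, col) in product(enumerate(row_items), enumerate(col_items))
    }


# Hypothetical items, already in the desired custom order.
custom_rows_order = ["city_A", "city_B"]
custom_cols_order = ["static", "adaptive", "mixed"]
grid = subplot_grid(custom_rows_order, custom_cols_order)

# Renaming dictionaries follow the {'old name': 'new name'} pattern.
rows_renaming_dict = {"city_A": "City A"}
row_labels = [rows_renaming_dict.get(name, name) for name in custom_rows_order]
```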
- scatter_plot(vars_to_gather_cols: list | None = None, vars_to_gather_rows: list | None = None, detailed_cols: list | None = None, detailed_rows: list | None = None, custom_cols_order: list | None = None, custom_rows_order: list | None = None, data_on_x_axis: str | None = None, data_on_y_main_axis: list | None = None, data_on_y_sec_axis: list | None = None, colorlist_y_main_axis: list | None = None, colorlist_y_sec_axis: list | None = None, best_fit_deg_y_main_axis: list | None = None, best_fit_deg_y_sec_axis: list | None = None, rows_renaming_dict: dict | None = None, cols_renaming_dict: dict | None = None, sharex: bool = True, sharey: bool = True, supxlabel: str | None = None, figname: str | None = None, figsize: float = 1, ratio_height_to_width: float = 1, dpi: int = 500, confirm_graph: bool = False, set_facecolor: any = (0, 0, 0, 0.1), best_fit_background_linewidth: float = 1, best_fit_linewidth: float = 0.5, best_fit_linestyle: any = (0, (5, 10)))[source]
Used to plot a scatter plot.
- Parameters:
vars_to_gather_cols (list) – A list of strings. The list should be the variables you want to show in subplot columns.
vars_to_gather_rows (list) – A list of strings. The list should be the variables you want to show in subplot rows.
detailed_cols (list) – A list of strings. The list should be the specific data you want to show in subplots columns. Used to filter.
detailed_rows (list) – A list of strings. The list should be the specific data you want to show in subplots rows. Used to filter.
custom_cols_order (list) – A list of strings. The list should be the specific order for the items shown in subplot columns.
custom_rows_order (list) – A list of strings. The list should be the specific order for the items shown in subplot rows.
data_on_x_axis (str) – A string. The column name you want to plot in the x-axis.
data_on_y_main_axis (list) – A list with nested lists and strings. Used to select the data you want to show in the scatter plot main y-axis. It needs to follow this structure: [[‘name_on_y_main_axis’, [list of column names you want to plot]]]
data_on_y_sec_axis (list) – A list with nested lists and strings. Used to select the data you want to show in the scatter plot secondary y-axis. It needs to follow this structure: [[‘name_on_1st_y_sec_axis’, [list of column names you want to plot]], [‘name_on_2nd_y_sec_axis’, [list of column names you want to plot]], etc]
colorlist_y_main_axis (list) – A list with nested lists and strings. It should follow the same structure as data_on_y_main_axis, but replacing the column names with the colors using the matplotlib notation.
colorlist_y_sec_axis (list) – A list with nested lists and strings. It should follow the same structure as data_on_y_sec_axis, but replacing the column names with the colors using the matplotlib notation.
best_fit_deg_y_sec_axis (list) – A list with nested lists and strings. It should follow the same structure as data_on_y_sec_axis, but replacing the column names with the polynomial degree for the best fit lines.
best_fit_deg_y_main_axis (list) – A list with nested lists and strings. It should follow the same structure as data_on_y_main_axis, but replacing the column names with the polynomial degree for the best fit lines.
rows_renaming_dict (dict) – A dictionary. Should follow the pattern {‘old row name 1’: ‘new row name 1’, ‘old row name 2’: ‘new row name 2’}
cols_renaming_dict (dict) – A dictionary. Should follow the pattern {‘old col name 1’: ‘new col name 1’, ‘old col name 2’: ‘new col name 2’}
sharey (bool) – True to share the y-axis across all subplots.
sharex (bool) – True to share the x-axis across all subplots.
supxlabel (str) – A string. The label shown in the x-axis.
figname (str) – A string. The name of the saved figure without extension.
figsize (float) – A float. It is the figure size.
ratio_height_to_width (float) – A float. The default is 1 (square). If 0.5 is entered, the figure will be half as high as it is wide.
dpi (int) – An integer. The resolution of the image in dots per inch.
confirm_graph (bool) – A bool. True to skip confirmation step.
set_facecolor (any) – Usage is similar to matplotlib.axes.Axes.set_facecolor
best_fit_linestyle (any) – Anything in matplotlib linestyle notation. Use to change the style of the best fit lines.
best_fit_linewidth (float) – A float. Used to change the width of the best fit lines.
best_fit_background_linewidth (float) – A float. Used to change the width of the background best fit lines. Must be greater than best_fit_linewidth.
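The nested-list arguments are easier to read with a concrete case. The sketch below writes out data_on_y_main_axis and data_on_y_sec_axis together with matching colour and best-fit-degree lists, and checks that they are parallel. The axis labels and column names are hypothetical, and same_structure is an illustrative helper rather than an accim function.

```python
# Each entry is [axis name, [items]]; column names here are made up.
data_on_y_main_axis = [
    ["Energy (kWh/m2)", ["cooling_demand", "heating_demand"]],
]
colorlist_y_main_axis = [
    ["Energy (kWh/m2)", ["b", "r"]],  # matplotlib colour codes
]
best_fit_deg_y_main_axis = [
    ["Energy (kWh/m2)", [1, 2]],  # polynomial degree per column
]

data_on_y_sec_axis = [
    ["Operative temperature (C)", ["operative_temperature"]],
    ["Comfort limits (C)", ["upper_limit", "lower_limit"]],
]
colorlist_y_sec_axis = [
    ["Operative temperature (C)", ["g"]],
    ["Comfort limits (C)", ["orange", "cyan"]],
]


def same_structure(reference, other):
    """Check that both lists have the same number of axes and items per axis."""
    if len(reference) != len(other):
        return False
    return all(
        len(ref_items) == len(other_items)
        for (_, ref_items), (_, other_items) in zip(reference, other)
    )
```

A check such as same_structure(data_on_y_sec_axis, colorlist_y_sec_axis) can catch a missing colour before the plotting call is made.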
- scatter_plot_with_baseline(vars_to_gather_cols: list | None = None, vars_to_gather_rows: list | None = None, detailed_cols: list | None = None, detailed_rows: list | None = None, custom_cols_order: list | None = None, custom_rows_order: list | None = None, data_on_y_axis_baseline_plot: list | None = None, baseline: str | None = None, colorlist_baseline_plot_data: list | None = None, best_fit_deg: list | None = None, rows_renaming_dict: dict | None = None, cols_renaming_dict: dict | None = None, supxlabel: str | None = None, supylabel: str | None = None, figname: str | None = None, figsize: int = 1, markersize: int = 1, dpi: int = 500, confirm_graph: bool = False, set_facecolor: any = (0, 0, 0, 0.1), best_fit_background_linewidth: float = 1, best_fit_linewidth: float = 0.5, best_fit_linestyle: any = (0, (5, 10)))[source]
Used to plot a scatter plot with baseline.
- Parameters:
vars_to_gather_cols (list) – A list of strings. The list should be the variables you want to show in subplot columns.
vars_to_gather_rows (list) – A list of strings. The list should be the variables you want to show in subplot rows.
detailed_cols (list) – A list of strings. The list should be the specific data you want to show in subplots columns. Used to filter.
detailed_rows (list) – A list of strings. The list should be the specific data you want to show in subplots rows. Used to filter.
custom_cols_order (list) – A list of strings. The list should be the specific order for the items shown in subplot columns.
custom_rows_order (list) – A list of strings. The list should be the specific order for the items shown in subplot rows.
data_on_y_axis_baseline_plot (list) – A list of strings. Used to select the data you want to show in the graph. Should be a list of the column names you want to plot in each subplot.
baseline (str) – A string, used only with data_on_y_axis_baseline_plot. The baseline should be one of the combinations in vars_to_gather_cols. It is plotted on the x-axis, while the combination being compared is plotted on the y-axis.
colorlist_baseline_plot_data (list) – A list of strings. Should be the colors using the matplotlib color notation for the columns entered in data_on_y_axis_baseline_plot in the same order.
best_fit_deg (list) – A list with nested lists and strings. It should follow the same structure as data_on_y_axis_baseline_plot, but replacing the column names with the polynomial degree for the best fit lines.
rows_renaming_dict (dict) – A dictionary. Should follow the pattern {‘old row name 1’: ‘new row name 1’, ‘old row name 2’: ‘new row name 2’}
cols_renaming_dict (dict) – A dictionary. Should follow the pattern {‘old col name 1’: ‘new col name 1’, ‘old col name 2’: ‘new col name 2’}
supxlabel (str) – A string. The label shown in the x-axis.
supylabel (str) – A string. The label shown in the y-axis.
figname (str) – A string. The name of the saved figure without extension.
figsize (float) – A float. It is the figure size.
markersize (int) – An integer. The size of the markers.
dpi (int) – An integer. The resolution of the image in dots per inch.
confirm_graph (bool) – A bool. True to skip confirmation step.
best_fit_linestyle (any) – Anything in matplotlib linestyle notation. Use to change the style of the best fit lines.
best_fit_linewidth (float) – A float. Used to change the width of the best fit lines.
best_fit_background_linewidth (float) – A float. Used to change the width of the background best fit lines. Must be greater than best_fit_linewidth.
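The baseline idea can be sketched without any plotting: the values of the baseline combination become the x coordinates, and the values of every other combination become the y coordinates. The helper baseline_pairs and the numbers below are hypothetical, and the sketch assumes the series are already aligned row by row.

```python
def baseline_pairs(series_by_combination, baseline):
    """Pair baseline values (x) with each other combination's values (y)."""
    x_values = series_by_combination[baseline]
    return {
        name: list(zip(x_values, y_values))
        for name, y_values in series_by_combination.items()
        if name != baseline
    }


# Made-up values of one column for two combinations.
series = {
    "static": [10.0, 12.0, 15.0],
    "adaptive": [7.0, 9.0, 11.0],
}
pairs = baseline_pairs(series, baseline="static")
```

Points below the diagonal y = x would then indicate values lower than the baseline.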
- time_plot(vars_to_gather_cols: list | None = None, vars_to_gather_rows: list | None = None, detailed_cols: list | None = None, detailed_rows: list | None = None, custom_cols_order: list | None = None, custom_rows_order: list | None = None, data_on_y_main_axis: list | None = None, data_on_y_sec_axis: list | None = None, colorlist_y_main_axis: list | None = None, colorlist_y_sec_axis: list | None = None, rows_renaming_dict: dict | None = None, cols_renaming_dict: dict | None = None, sharex: bool = True, sharey: bool = True, figname: str | None = None, figsize: float = 1, ratio_height_to_width: float = 1, dpi: int = 500, confirm_graph: bool = False, set_facecolor: any = (0, 0, 0, 0.1))[source]
Used to plot a timeplot.
- Parameters:
vars_to_gather_cols (list) – A list of strings. The list should be the variables you want to show in subplot columns.
vars_to_gather_rows (list) – A list of strings. The list should be the variables you want to show in subplot rows.
detailed_cols (list) – A list of strings. The list should be the specific data you want to show in subplots columns. Used to filter.
detailed_rows (list) – A list of strings. The list should be the specific data you want to show in subplots rows. Used to filter.
custom_cols_order (list) – A list of strings. The list should be the specific order for the items shown in subplot columns.
custom_rows_order (list) – A list of strings. The list should be the specific order for the items shown in subplot rows.
data_on_y_main_axis (list) – A list with nested lists and strings. Used to select the data you want to show in the time plot main y-axis. It needs to follow this structure: [[‘name_on_y_main_axis’, [list of column names you want to plot]]]
data_on_y_sec_axis (list) – A list with nested lists and strings. Used to select the data you want to show in the time plot secondary y-axis. It needs to follow this structure: [[‘name_on_1st_y_sec_axis’, [list of column names you want to plot]], [‘name_on_2nd_y_sec_axis’, [list of column names you want to plot]], etc]
colorlist_y_main_axis (list) – A list with nested lists and strings. It should follow the same structure as data_on_y_main_axis, but replacing the column names with the colors using the matplotlib notation.
colorlist_y_sec_axis (list) – A list with nested lists and strings. It should follow the same structure as data_on_y_sec_axis, but replacing the column names with the colors using the matplotlib notation.
rows_renaming_dict (dict) – A dictionary. Should follow the pattern {‘old row name 1’: ‘new row name 1’, ‘old row name 2’: ‘new row name 2’}
cols_renaming_dict (dict) – A dictionary. Should follow the pattern {‘old col name 1’: ‘new col name 1’, ‘old col name 2’: ‘new col name 2’}
sharey (bool) – True to share the y-axis across all subplots.
sharex (bool) – True to share the x-axis across all subplots.
figname (str) – A string. The name of the saved figure without extension.
figsize (float) – A float. It is the figure size.
ratio_height_to_width (float) – A float. The default is 1 (square). If 0.5 is entered, the figure will be half as high as it is wide.
dpi (int) – An integer. The resolution of the image in dots per inch.
confirm_graph (bool) – A bool. True to skip confirmation step.
set_facecolor (any) – Usage is similar to matplotlib.axes.Axes.set_facecolor
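The effect of ratio_height_to_width amounts to simple arithmetic: the height is the width multiplied by the ratio. The sketch below shows this with a hypothetical width value; how figsize translates into a physical width is not stated in this reference, so the width is an assumed input and figure_dimensions is an illustrative helper.

```python
def figure_dimensions(width, ratio_height_to_width=1.0):
    """Return (width, height), with height = width * ratio_height_to_width."""
    return (width, width * ratio_height_to_width)


square = figure_dimensions(8.0)     # default ratio of 1 gives a square figure
wide = figure_dimensions(8.0, 0.5)  # half as high as it is wide
```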
- wrangled_table(reshaping: str | None = None, vars_to_gather: list | None = None, baseline: str | None = None, comparison_mode: list = ['others compared to baseline'], comparison_cols: list | None = None, check_index_and_cols: bool = False, vars_to_keep: list | None = None, rename_dict: dict | None = None, transpose: bool = False, excel_filename: str | None = None)[source]
Creates a table based on the arguments.
- Parameters:
reshaping (str) – A string. Can be ‘pivot’, ‘unstack’, ‘stack’ or ‘multiindex’, to perform these actions.
vars_to_gather (list) – A list of the variables to be transposed from rows to columns.
baseline (str) – The already transposed column you want to use as a baseline for comparisons. If omitted, you will be asked which one to use.
comparison_mode (list) – A list of strings. Can be ‘others compared to baseline’ and/or ‘baseline compared to others’. Used to customise the comparison of variables.
comparison_cols (list) – A list of strings. ‘absolute’ to get the difference or ‘relative’ to get the percentage of reduction.
check_index_and_cols (bool) – A boolean. True to check index and cols, False to skip.
vars_to_keep (list) – A list of strings. To remove all variables from the multiindex except those to be kept.
excel_filename (str) – A string. If entered, the wrangled DataFrame will be exported to an Excel file with that string as its name.
transpose (bool) – True to transpose the dataframe
rename_dict (dict) – Renames all data in the dataframe based on the format {‘old_string’: ‘new_string’}
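The reshaping and comparison steps can be pictured with a small plain-Python sketch: one variable is moved from rows to columns, and each resulting column is then compared with the baseline. The sign convention used here (absolute = baseline minus other, relative = that difference as a percentage of the baseline) is an assumption made for illustration, as are the helpers unstack and compare_to_baseline and the sample records; accim's own convention may differ.

```python
def unstack(records, index_key, var_to_gather, value_key):
    """Move the values of var_to_gather from rows to columns."""
    wide = {}
    for record in records:
        row = wide.setdefault(record[index_key], {})
        row[record[var_to_gather]] = record[value_key]
    return wide


def compare_to_baseline(row, baseline):
    """Assumed convention: positive numbers mean a reduction versus the baseline."""
    base = row[baseline]
    result = {}
    for name, value in row.items():
        if name == baseline:
            continue
        result[name] = {
            "absolute": base - value,
            "relative": 100.0 * (base - value) / base,
        }
    return result


# Made-up long-format records: one row per city and mode.
records = [
    {"city": "A", "mode": "static", "energy": 100.0},
    {"city": "A", "mode": "adaptive", "energy": 75.0},
    {"city": "B", "mode": "static", "energy": 50.0},
    {"city": "B", "mode": "adaptive", "energy": 40.0},
]
wide = unstack(records, "city", "mode", "energy")
comparison = {city: compare_to_baseline(row, "static") for city, row in wide.items()}
```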
accim.data.postprocessing.utils module
Classes and functions to perform data analytics after simulation runs.
- accim.data.postprocessing.utils.genCSVconcatenated(datasets: list | None = None, source_frequency: str | None = None, frequency: str | None = None, output_cols_to_keep: list | None = None, datasets_per_chunk: int = 50, concatenated_csv_name: str | None = None, drop_nan: bool = True)[source]
Function to generate concatenated CSV files from a large number of CSV files resulting from simulation runs. Useful when there are many CSVs, which could otherwise cause memory errors.
- Parameters:
datasets (list) – List of strings containing the names of the CSV files to be concatenated. If omitted, all CSV files are concatenated.
source_frequency (str) – Used to inform accim about the frequency of the input CSVs. String can be ‘timestep’, ‘hourly’, ‘daily’, ‘monthly’ or ‘runperiod’.
frequency (str) – Rows will be aggregated based on this frequency. String can be ‘timestep’, ‘hourly’, ‘daily’, ‘monthly’ or ‘runperiod’.
datasets_per_chunk (int) – The number of CSV files per chunk to be concatenated.
concatenated_csv_name (str) – A string used as the name for the concatenated csv file.
drop_nan (bool) – True to drop nan values.
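The role of datasets_per_chunk can be sketched as splitting the list of CSV names into groups that are processed one at a time, which keeps memory use bounded. The helper chunked and the file names are hypothetical; this is not accim's implementation.

```python
def chunked(datasets, datasets_per_chunk=50):
    """Split a list of CSV names into consecutive chunks of at most the given size."""
    if datasets_per_chunk < 1:
        raise ValueError("datasets_per_chunk must be at least 1")
    return [
        datasets[start:start + datasets_per_chunk]
        for start in range(0, len(datasets), datasets_per_chunk)
    ]


datasets = [f"simulation_{i}.csv" for i in range(120)]  # made-up file names
chunks = chunked(datasets)
chunk_sizes = [len(chunk) for chunk in chunks]
```

With 120 files and the default of 50 files per chunk, the last chunk holds the remaining 20 files.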
- accim.data.postprocessing.utils.preview_Table_cols(datasets: list = [])[source]
Function to return the list of EnergyPlus Output:Variable columns suitable to be computed in the class Table. The columns are read from the first CSV file in the path or, if entered, from the datasets. It is useful for finding out the full list of columns in the CSV files from simulation, so that columns can be filtered using the argument output_cols_to_keep in the class Table.
- Parameters:
datasets (list) – The list of CSV files to be concatenated.
- Returns:
The list of columns within the CSV files.
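Conceptually, previewing the columns comes down to reading the header row of a results CSV. The sketch below does this with the standard csv module on an in-memory sample; header_columns is a hypothetical helper, the sample content is made up, and the actual function may filter or rename the columns it returns.

```python
import csv
import io


def header_columns(csv_text):
    """Return the column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return [name.strip() for name in next(reader)]


# Made-up sample in the style of an EnergyPlus results CSV.
sample = (
    "Date/Time,"
    "Environment:Site Outdoor Air Drybulb Temperature [C](Hourly),"
    "ZONE1:Zone Operative Temperature [C](Hourly)\n"
    " 01/01  01:00:00,5.2,20.1\n"
)
cols = header_columns(sample)
```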