diva.tools module

class diva.tools.GetLinkQuestionForMissing

Bases: object

diva.tools.add_download_button_after_prompts(title, data_csv, prompt, config, key)
diva.tools.add_download_feedback_button(prompt, config, fig, key='out_prompt')

Adds a feedback button alongside a generated graph for user interaction.

This function adds buttons for downloading the graph data, providing positive feedback, or providing negative feedback. The feedback is logged to a file based on the user’s input.

Parameters:

config: dict

A dictionary containing the configuration information for the graph, which may include visual properties and settings.

fig: dict

A Plotly figure object containing the graph data to be displayed and interacted with.

key: str

A unique key for the Streamlit button elements to avoid conflicts when rendering multiple buttons.

Returns:

None

diva.tools.add_feedback(prompt, answer, feedback)

Logs positive feedback for a generated graph.

This function records positive feedback for a graph in a CSV log file. It appends the current timestamp, session ID, graph configuration, and feedback type (“Good”) to the log file. If the log file does not exist, it creates a new one.

Parameters:

config: dict

A dictionary containing the configuration information for the graph, which is logged along with feedback.

Returns:

None
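
The append-or-create logging described above can be sketched as follows. This is an illustrative stand-in, not the module's source: the helper name log_feedback, the column names, and the header row are all assumptions, and the session ID is passed in explicitly instead of being read from the Streamlit session state.

```python
import csv
import os
import tempfile
from datetime import datetime


def log_feedback(log_path: str, session_id: str, config: dict, feedback: str = "Good") -> None:
    """Append one feedback row; write a header first if the file is new (hypothetical helper)."""
    is_new = not os.path.exists(log_path)
    with open(log_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            # Assumed column layout, chosen for illustration only.
            writer.writerow(["timestamp", "session_id", "config", "feedback"])
        writer.writerow([datetime.now().isoformat(), session_id, str(config), feedback])


# Usage: two calls against a fresh file produce one header row and two data rows.
log_path = os.path.join(tempfile.mkdtemp(), "feedback_log.csv")
log_feedback(log_path, "abc123", {"graph_type": "line"})
log_feedback(log_path, "abc123", {"graph_type": "map"})

with open(log_path, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))
```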

diva.tools.add_feedback_msg_button(prompt, response)

Adds feedback buttons for a given prompt and response pair.

This function adds two feedback buttons for users to provide feedback on the given prompt and response: one for positive feedback (“👍”) and one for negative feedback (“👎”). Clicking these buttons will log the feedback.

Parameters:

prompt: str

The prompt that was used to generate the response.

response: str

The response generated for the prompt.

Returns:

None

diva.tools.add_header()
diva.tools.add_zipcode_to_city_name(location)

Adds the country code to the city name if it’s not already included.

This function first retrieves detailed information about a location, including the city name. It then checks if the location is already in the list of countries. If the location is not in the list, it appends the country code to the city name in the format city (country_code).

Parameters:

location (str): The name of the location (city or country) to which the country code will be added.

Returns:

str: The location name, possibly updated to include the country code if it’s not already in the list of countries.

diva.tools.append_if_not_in(list_: list, elem)

Appends an element (elem) to a list (list_) if the element is not already in the list.

diva.tools.change_date_str_format(date: str, new_format: str)

Takes a date in str format, and a new format as inputs, and tries to convert the date into the new format.

Parameters

date: str

a date in a str format

new_format: str

a new format given as str. The format should be supported by datetime.strftime

Returns

str

If the conversion succeeds, the date is in the new str format. Else, it is the same date as the one passed in.
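
A minimal sketch of this convert-or-return-unchanged behaviour is shown below. The page does not say which input formats the real function accepts, so the input_format parameter (defaulting to %Y-%m-%d, the format used by datetime_from_str) is an assumption added for illustration.

```python
from datetime import datetime


def change_date_str_format(date: str, new_format: str, input_format: str = "%Y-%m-%d") -> str:
    """Reformat a date string; fall back to the original string on failure (sketch)."""
    try:
        # input_format is an assumed parameter, not part of the documented signature.
        return datetime.strptime(date, input_format).strftime(new_format)
    except (ValueError, TypeError):
        return date


converted = change_date_str_format("2021-07-14", "%d/%m/%Y")
unchanged = change_date_str_format("not a date", "%d/%m/%Y")
print(converted, "|", unchanged)
```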

diva.tools.chech_if_empty_prompt(text)
diva.tools.check_token_validity(keycloak_openid)
diva.tools.clean_answer(text: str, characters: str | list[str] | tuple, remove_accent=True) -> str

Removes characters from a text.

Parameters

text: str | list[str]

Text from which to remove characters.

characters: iterable

An iterable containing the characters (str) to remove.

remove_accent: bool

Defaults to True. Also replaces accented characters in the text with their unaccented equivalents.

Returns

str

The text with the specified characters removed (and accented characters replaced if remove_accent is True).
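
The following sketch illustrates one way to get this behaviour with the standard library. It is not the module's implementation: accent removal via Unicode NFKD decomposition is a common technique chosen here for illustration, and treating a plain str passed as characters as a single substring to remove is an assumption.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_answer(text: str, characters, remove_accent: bool = True) -> str:
    """Remove every entry of `characters` from `text` (sketch)."""
    if isinstance(characters, str):
        characters = [characters]  # assumption: a bare str is one item to remove
    for item in characters:
        text = text.replace(item, "")
    return strip_accents(text) if remove_accent else text


cleaned = clean_answer("['Orléans']", ["[", "]", "'"])
kept_accent = clean_answer("Orléans!", "!", remove_accent=False)
print(cleaned, kept_accent)
```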

diva.tools.clean_prompt(prompt)
diva.tools.convert_to_dict_config(config)

Converts a configuration object to a dictionary of parameters.

This function extracts parameters from the configuration object and converts them into a dictionary format. The parameters include start times, end times, locations, the element of interest, graph type, and aggregation type.

Parameters:

config: object

The configuration object containing various settings.

Returns:

dict

A dictionary containing the configuration parameters.

diva.tools.datetime_from_str(date: str) -> datetime

Takes a date string in %Y-%m-%d format as input and returns it as a datetime.

diva.tools.dic_month_id_to_name()
diva.tools.display_config(dict_params) -> str

Generates a descriptive text about the graph configuration to be displayed before the graph.

This function creates a text description of the graph based on the provided parameters. The description includes details such as the element of interest, start and end times, location, and additional time indications if multiple time ranges are provided. The text aims to provide context about what the graph represents and encourages user interaction with the graph.

Parameters:

dict_params: dict

A dictionary containing configuration parameters, including ‘climate_variable’, ‘starttime’, ‘endtime’, and ‘location’.

Returns:

str

A randomly selected descriptive sentence about the graph configuration.

diva.tools.drop_duplicated_columns(data)
diva.tools.enumeration(list_: list, end='and') -> str

Takes a list of elements, for instance strings, and builds a text enumeration of them, e.g. [‘cats’, ‘dogs’] -> ‘cats and dogs’, or [‘cats’, ‘dogs’, ‘birds’] -> ‘cats, dogs and birds’. By default, elements are separated by commas, with ‘and’ before the last element.

Parameters

list_: list

a list of elements

end: str

Defaults to ‘and’. Separator placed between the penultimate and the last element of the list.

Returns

str:

A text enumeration
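
The two examples in the description can be reproduced with a short sketch like the one below. It is a re-implementation for illustration only; how the real function handles empty or single-element lists is not documented, so that branch is an assumption.

```python
def enumeration(list_: list, end: str = "and") -> str:
    """Join elements with commas, using `end` before the last one (sketch)."""
    items = [str(item) for item in list_]
    if len(items) <= 1:
        # Assumed behaviour for 0 or 1 elements: return the element alone, or "".
        return "".join(items)
    return f"{', '.join(items[:-1])} {end} {items[-1]}"


print(enumeration(["cats", "dogs"]))
print(enumeration(["cats", "dogs", "birds"]))
print(enumeration(["Paris", "Rome"], end="or"))
```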

diva.tools.export_results(dir, now, data, df_perf, perf, type_)
diva.tools.extract_data_from_plotlyFig(fig, location)

Extracts data from a Plotly figure and converts it to a CSV format.

This function extracts the x and y data points from each trace in a Plotly figure and compiles them into a DataFrame. The DataFrame is then converted into CSV format for easy export.

Args:

fig: dict

A Plotly figure represented as a dictionary containing ‘data’ (traces) and ‘layout’ (layout information) attributes.

Returns:

tuple(str, bytes)

A tuple containing:
  • title (str): The filename for the CSV, derived from the figure’s title and the current timestamp.

  • data_csv (bytes): The CSV data encoded as bytes, suitable for download or storage.
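
As a rough illustration of the trace-to-CSV step, the sketch below walks the traces of a figure given as a plain dictionary and writes them with the standard csv module. The documented function builds a DataFrame instead; the helper name figure_to_csv, the long-format column layout, the location of the title under layout.title.text, and the filename pattern are all assumptions.

```python
import csv
import io
from datetime import datetime


def figure_to_csv(fig: dict) -> tuple[str, bytes]:
    """Collect the x/y points of every trace and return (filename, CSV bytes) (sketch)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["trace", "x", "y"])  # assumed column layout
    for trace in fig["data"]:
        name = trace.get("name", "")
        for x, y in zip(trace["x"], trace["y"]):
            writer.writerow([name, x, y])
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    title = f"{fig['layout']['title']['text']}_{stamp}.csv"  # assumed filename pattern
    return title, buffer.getvalue().encode("utf-8")


fig = {
    "data": [{"name": "Paris", "x": ["2020-01", "2020-02"], "y": [5.1, 6.3]}],
    "layout": {"title": {"text": "temperature"}},
}
title, data_csv = figure_to_csv(fig)
print(title)
print(data_csv.decode("utf-8"))
```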

diva.tools.find_best_match(text: str, list_: list | Series | ndarray, verbose=False, extra=0, prefilter=True, remove_accents=True) -> tuple[str | None, int | None]

Finds the best match for a text in a list, array, or Series. It is optimized to find, quickly and accurately, the best match between the retrieved location parameter (text) and the locations listed in the shapefiles (list_). The function assigns a score to every element in list_; the highest score indicates the best match with the text. To save time, an early filtering step excludes the elements of list_ whose first letters do not match the first letters of the text. Run time on a table of 86 000 rows: < 0.01 s.

Parameters

text: str

element for which the function searches the best match in list_

list_: list | pd.Series | np.ndarray

where the best match is searched

verbose: bool

whether to print details. Mainly for debugging

extra: int

Any integer, added to the score as a bonus or a malus. A malus makes a positive score harder to reach, ensuring that the returned value is either None or very close to the text passed as argument.

prefilter: bool

default to True. The prefilter excludes some list_ elements that do not show any correspondence with the first letters in text.

remove_accents: bool

default to True. Removes the accents in the list_ elements.

Returns

tuple[str, int] or tuple[None, None]

tuple[None, None] if the best score is <= 0. Otherwise tuple[str, int], where str is the best match and int is its index in the list/array/series.
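
The prefilter-then-score idea can be sketched as follows. The actual scoring formula is not given on this page, so this example swaps in difflib.SequenceMatcher from the standard library as the similarity measure; the threshold of 60 and the single-letter prefilter are likewise assumptions made only to give scores that can fall to zero or below.

```python
import unicodedata
from difflib import SequenceMatcher


def _normalize(text: str) -> str:
    """Lowercase and strip accents so that 'Orléans' and 'orleans' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def find_best_match(text, candidates, extra=0, prefilter=True):
    """Return (best match, index), or (None, None) if no score is positive (sketch)."""
    target = _normalize(text)
    best_name, best_index, best_score = None, None, 0
    for index, candidate in enumerate(candidates):
        name = _normalize(candidate)
        if prefilter and name[:1] != target[:1]:
            continue  # cheap early exit: first letters differ
        # Assumed score: similarity percentage minus an arbitrary threshold, plus `extra`.
        score = round(100 * SequenceMatcher(None, target, name).ratio()) - 60 + extra
        if score > best_score:
            best_name, best_index, best_score = candidate, index, score
    return best_name, best_index


cities = ["Orléans", "Oslo", "Porto"]
print(find_best_match("orleans", cities))
print(find_best_match("zzz", cities))
```

With a negative extra (a malus), even an exact match can be rejected, which mirrors the role the extra parameter plays in the description above.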

diva.tools.find_country(display_name)
diva.tools.format_date(year: int, month: int, day: int) -> str

Formats a date, given as integers for year, month and day, as a YYYY-MM-DD string.

diva.tools.from_dict_to_str(params)
diva.tools.full_pipeline_config(module_config: ModuleConfig, module_chat: ModuleChat, user_prompt: str, clear: bool = True, config: bool = True)

Runs pipeline to get graph parameters (config) from the user_prompt.

Parameters

module_config: ModuleConfig

an instance of ModuleConfig

module_chat: ModuleChat

an instance of ModuleChat

user_prompt: str

the user prompt from which the graph parameters are extracted

clear: bool

default to True. Whether to clear the config history (and chat history) (True) or not (False)

config: bool

default to True. Whether to search for the config parameters in the user prompt (True) or not (False)

diva.tools.generate_connection_page(auth_url)
diva.tools.generate_css_disconnect_button()
diva.tools.generate_map()

Generates and displays an interactive map using Pydeck.

This function creates a map visualization using the Pydeck library, with the following properties:

  • Map style: Road

  • Initial view settings:

      • Latitude: 47

      • Longitude: 12.5

      • Zoom level: 4

      • Pitch: 50

This map visualization is intended to be displayed within a Streamlit application.

Returns:

None

diva.tools.generate_session_id()

Generate a unique ID for the ongoing session.

This function checks if a session ID already exists in the Streamlit session state. If it does not exist, a new unique session ID is generated using a combination of the process ID, a random number, and the current timestamp, hashed with SHA-256. The generated session ID is then stored in the Streamlit session state for later use.

Returns:

str

The generated session ID, either newly created or retrieved from the session state.
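
The generate-once-then-reuse pattern described above might look like the sketch below. A plain dictionary stands in for the Streamlit session state so that the example runs without Streamlit; the key name "session_id" and the exact way the three ingredients are combined into the hashed string are assumptions.

```python
import hashlib
import os
import random
import time

# Stand-in for st.session_state (assumption: the real code stores the ID there).
session_state = {}


def generate_session_id(state: dict) -> str:
    """Create a SHA-256 session ID on first call, then return the stored one (sketch)."""
    if "session_id" not in state:
        seed = f"{os.getpid()}-{random.random()}-{time.time()}"
        state["session_id"] = hashlib.sha256(seed.encode("utf-8")).hexdigest()
    return state["session_id"]


first = generate_session_id(session_state)
second = generate_session_id(session_state)
print(first == second, len(first))
```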

diva.tools.generate_waiting_for_graph_msg()

Generates a random message indicating that a graph is being generated.

This function selects a random message from a predefined list that informs the user that the graph is in the process of being created. It is useful for providing feedback during long-running operations like graph generation.

Returns:

str: A randomly chosen message from the list indicating that the graph is being generated.

diva.tools.get_10_colors()
diva.tools.get_10_colors_gray()
diva.tools.get_city_zipcode(city)

Retrieves the country and LAU ID (Local Administrative Unit) for a given city.

This function searches for the provided city name in a shapefile of European cities. If the city is found, it retrieves the corresponding country code and LAU ID from the shapefile, then uses a world map CSV to retrieve the full country name based on the country code.

Parameters:

city (str): The name of the city for which the country and LAU ID are to be retrieved.

Returns:

tuple: A tuple containing:
  • str or None: The name of the country corresponding to the city. Returns None if the city is not found.

  • str or None: The LAU ID of the city. Returns None if the city is not found.

diva.tools.get_code_of_function(function_name)
diva.tools.get_description_of_graph(llm, context)

Generates a concise description of a graph based on the provided context.

This function utilizes a language model (LLM) to generate a brief and direct description of the graph, focusing solely on the essential elements of the context. The description avoids introductory phrases and is tailored to be succinct.

Parameters:

llm: object

The language model instance used to generate the description of the graph.

context: str

The context information regarding the graph, which serves as input for generating the description.

Returns:

str

The generated description of the graph, concise and based on the provided context.

diva.tools.get_infos_from_location(location)

Retrieves detailed information about a location, including latitude, longitude, and bounding box.

This function uses the Nominatim geocoding service to obtain geographic details about a given location. It returns a dictionary with the original location name, the name used in the query, the type of address, center coordinates, and bounding box coordinates.

Parameters:

location: str

The name of the location for which information is to be retrieved.

Returns:

dict or None

A dictionary containing the location’s name, address type, center coordinates, and bounding box. Returns None if the location cannot be found.

diva.tools.get_list_diva_functions()
diva.tools.get_month_names()
diva.tools.image_to_base64(image_path)
diva.tools.init_state()

Initializes the state for the Streamlit application.

This function sets up the initial state variables for the Streamlit application if they are not already present. It sets the current tab to a default value and initializes the index for the tab.

Returns:

None

diva.tools.interpolate_color(start_color, end_color, factor: float)
diva.tools.is_time(text: str = '') -> bool

Takes a text as input and determines whether it contains a time expression. Specifically, is_time searches the text for month or season names, or for digits. If either is found, the text is assumed to be a time expression and True is returned. This function is not designed to be highly accurate; its goal is to exclude probable time expressions from retrieved parameters that should not be times (e.g. locations, which should contain neither months, nor seasons, nor digits).

Parameters

text: str

text to assess

Returns

bool

True if a time expression, False otherwise
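
A minimal sketch of this heuristic, assuming English month and season names only (the word list used by the real function, and the languages it covers, are not documented here):

```python
import re

# Assumed vocabulary: English month and season names.
TIME_WORDS = {
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
    "spring", "summer", "autumn", "fall", "winter",
}


def is_time(text: str = "") -> bool:
    """Return True if the text contains a digit, a month name, or a season name (sketch)."""
    if any(ch.isdigit() for ch in text):
        return True
    words = re.findall(r"[a-z]+", text.lower())
    return any(word in TIME_WORDS for word in words)


for candidate in ["summer 2020", "in March", "Paris"]:
    print(candidate, "->", is_time(candidate))
```

As the description notes, the check is deliberately coarse: a place name that happens to contain a digit or a word such as "may" would also be flagged.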

diva.tools.langage_to_iso(language)

Converts language names to their corresponding ISO 639-1 codes.

This function translates a given language name into its ISO 639-1 language code. It supports a selection of common languages.

Parameters:

language: str

The name of the language.

Returns:

str or None

The ISO 639-1 code of the language, or None if the language is not recognized.
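
Such a lookup is typically a small dictionary, as in the sketch below. The set of languages supported by the real function is not listed on this page, so the five entries here are an illustrative subset, and the case-insensitive matching is an assumption.

```python
from typing import Optional

# Illustrative subset only; the module's actual language list is not documented here.
LANGUAGE_TO_ISO = {
    "english": "en",
    "french": "fr",
    "german": "de",
    "spanish": "es",
    "italian": "it",
}


def language_to_iso(language: str) -> Optional[str]:
    """Map a language name to its ISO 639-1 code, or None if unknown (sketch)."""
    return LANGUAGE_TO_ISO.get(language.strip().lower())


print(language_to_iso("French"))
print(language_to_iso("Klingon"))
```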

diva.tools.last_day_of_month(month: int, year: int) -> int

Takes a month and a year as integers and returns an integer representing the last day of this month.
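
One standard-library way to compute this is calendar.monthrange, which handles leap years. The sketch below is an illustration and not necessarily how the module does it; note that the argument order (month, then year) follows the documented signature.

```python
import calendar


def last_day_of_month(month: int, year: int) -> int:
    """Return the number of the last day of the given month (sketch)."""
    # monthrange returns (weekday of the first day, number of days in the month).
    return calendar.monthrange(year, month)[1]


print(last_day_of_month(2, 2024))  # leap year
print(last_day_of_month(2, 2023))
```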

diva.tools.lemmatize(text: str | list[str]) -> str | list[str]

Lemmatizes text (e.g. removes plural markers). It uses nltk word_tokenize. The first call takes about 2 s while the underlying models load; subsequent calls have a negligible run time. Capitalized words are not lemmatized.

Parameters

text: str | list[str]

text to lemmatize. Can be a str or a list of str. If a str, the str sequence is split into words, and the lemmatization is applied on each word individually.

Returns

str | list[str]

str if text is of type str, list[str] if text is of type list[str]

diva.tools.load_data(dir: str, type_: str)
diva.tools.load_shapefile_cities()

Loads a shapefile of European cities and a world CSV file.

This function uses the GeoPandas library to read a shapefile of European cities and loads a world map from a CSV file containing geographic data. The shapefile contains geometrical data for European cities, while the world CSV provides additional geographic information.

Returns:

tuple: A tuple containing two elements:
  • GeoDataFrame: The GeoDataFrame containing the European cities shapefile.

  • DataFrame: The DataFrame containing the world map data from the CSV.

diva.tools.log_out_keycloak(keycloak_openid)
diva.tools.no_truncated_sentence(text: str) -> str

Removes unterminated sentences at the end of a text. Uses nltk sent_tokenize to split the text in sentences. If there is a single unterminated sentence in the text, it is not removed.

Parameters

text: str

the text to verify

Returns

str

the text minus the unterminated sentence, if any and if the number of sentences in text > 1
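
The trimming rule can be illustrated without nltk. The sketch below swaps in a simple regular-expression sentence splitter in place of sent_tokenize, so it will split less accurately on abbreviations and similar cases; the set of sentence-ending characters is an assumption.

```python
import re


def no_truncated_sentence(text: str) -> str:
    """Drop a trailing unterminated sentence, unless it is the only sentence (sketch)."""
    # Naive splitter used instead of nltk.sent_tokenize: split after ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    if len(sentences) > 1 and not sentences[-1].endswith((".", "!", "?")):
        sentences = sentences[:-1]
    return " ".join(sentences)


trimmed = no_truncated_sentence("The graph shows rain. Values peak in May. After that the")
single = no_truncated_sentence("Only a fragment")
print(trimmed)
print(single)
```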

diva.tools.put_in_order_dates(start, end)

Orders two dates such that the earlier date comes first.

This function takes two dates and returns them in ascending order. If the dates are the same, they are returned in the same order they were provided.

Parameters:

start: datetime or str

The start date.

end: datetime or str

The end date.

Returns:

tuple

A tuple containing the two dates in ascending order.
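
A short sketch of the ordering rule follows. It assumes both arguments have the same type and, for strings, a format such as YYYY-MM-DD whose lexicographic order matches chronological order; how the real function compares mixed or differently formatted inputs is not documented here.

```python
from datetime import datetime


def put_in_order_dates(start, end):
    """Return the two dates as (earlier, later); equal dates keep their order (sketch)."""
    if end < start:
        return end, start
    return start, end


print(put_in_order_dates("2021-05-01", "2020-01-01"))
print(put_in_order_dates(datetime(2020, 1, 1), datetime(2021, 5, 1)))
```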

diva.tools.remove_accents(text: str) -> str

Removes accents in a text passed as argument. Returns a text without accents.

diva.tools.send_user_event(event_type)
diva.tools.set_max_new_tokens(prompt: str) -> int

Estimates the max number of new tokens needed to answer a prompt. Returns an integer between 128 and 1024.

diva.tools.show_energy_consumption(nrg)
diva.tools.str_from_datetime(date: datetime) -> str

Takes a date in datetime format as input and returns it in str format %Y-%m-%d.
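
Together with datetime_from_str, this gives a round trip between the two representations. The sketch below re-implements both with the standard library for illustration; it is not copied from the module.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"


def datetime_from_str(date: str) -> datetime:
    """Parse a YYYY-MM-DD string into a datetime (sketch)."""
    return datetime.strptime(date, DATE_FORMAT)


def str_from_datetime(date: datetime) -> str:
    """Format a datetime as a YYYY-MM-DD string (sketch)."""
    return date.strftime(DATE_FORMAT)


parsed = datetime_from_str("2021-07-14")
print(parsed, str_from_datetime(parsed))
```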

diva.tools.str_to_list(text)
diva.tools.translate_chat_msgs(role, msg, language)
diva.tools.translate_from_en(text, source_lang)
diva.tools.translate_to_en(text, source_lang)

Translates a given text to English, with additional handling for graph type terminology.

This function first checks if the source language is not English. If the source language is not English, it loads a mapping file to translate specific graph type terms to their English equivalents. The mapping is applied to the text before translating the entire text to English. If the source language is already English, the original text is returned without modification.

Parameters:

text (str): The text to be translated to English.

source_lang (str): The language of the input text. If it is “en”, the text is returned as is.

Returns:

str: The translated text in English, with graph type terms mapped if applicable.

diva.tools.update_logging_history(logging_history, logging, prompt)
diva.tools.update_logs(UserPrompt, RephrasedPrompt, LLM_Response)

Updates the log file with message history including user prompts, rephrased prompts, and LLM responses.

This function records information about user prompts, their rephrased versions, and the responses generated by the LLM (Language Model). It logs this information along with the current timestamp and session ID. If the log file already exists, the information is appended. Otherwise, a new file is created.

Parameters:

UserPrompt: str

The original prompt from the user.

RephrasedPrompt: str

The rephrased version of the user’s prompt.

LLM_Response: str

The response generated by the LLM.

Returns:

None

diva.tools.verify_authentification_keycloak()

Verifies user authentication with Keycloak.

This function initiates the authentication process using Keycloak’s OpenID Connect protocol. It constructs the necessary URL for user login and redirects the user to Keycloak’s login page if they are not already authenticated. If the user provides valid credentials, an access token is obtained and stored in the session state. The function checks if the access token is successfully stored in the session to confirm the user’s authentication status.

Parameters:

None

Returns:

bool: Returns True if the user is successfully authenticated and the access token is stored in the session. Returns False if authentication fails or if no valid access token is found.

diva.tools.write_like_chatGPT(text)

Simulates typing of the provided text, one word at a time.

This function yields words from the text one at a time with a delay between each word, simulating a typing effect similar to that of ChatGPT.

Parameters:

text: str

The text to be “typed out”.

Yields:

str

Words from the text with a space appended, one at a time.
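
A word-by-word generator of this kind can be sketched as below. The delay parameter and its default value are assumptions (the documented signature takes only text), and the function is renamed write_like_chatgpt here to mark it as an illustration.

```python
import time
from typing import Iterator


def write_like_chatgpt(text: str, delay: float = 0.02) -> Iterator[str]:
    """Yield each word followed by a space, pausing between words (sketch)."""
    for word in text.split():
        yield word + " "
        time.sleep(delay)  # assumed pause; the real delay is not documented


# Collecting the generator shows the chunks a streaming display would receive.
chunks = list(write_like_chatgpt("Here is your graph", delay=0))
print(chunks)
```

In a Streamlit application, a generator like this is the kind of object that can be handed to a streaming display call so that the text appears progressively.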