diva package¶
Subpackages¶
- diva.chat package
- Submodules
- diva.chat.interface_chatbot_scopes module
- diva.chat.interface_classification module
- diva.chat.interface_disclaimer module
- diva.chat.interface_gen_text_answer module
- diva.chat.interface_memory_rephrasing module
- diva.chat.module_chat module
ModuleChat
ModuleChat.add_disclaimer()
ModuleChat.chat
ModuleChat.chatbot_scope_strategy
ModuleChat.clear_history()
ModuleChat.create_user_prompt()
ModuleChat.generate_text_answer()
ModuleChat.is_memory_needed()
ModuleChat.is_prompt_in_scope()
ModuleChat.memory_needed
ModuleChat.prompt
ModuleChat.prompt_classification()
ModuleChat.prompt_history
ModuleChat.prompt_rephrasing()
ModuleChat.set_config_creation_strategy()
ModuleChat.set_demand_of_info()
- diva.chat.service_chatbot_scopes module
- diva.chat.service_disclaimer module
- diva.chat.service_gen_text_answer module
- diva.chat.service_is_memory_needed module
- diva.chat.service_prompt_classification module
- diva.chat.service_prompt_rephrasing module
- Module contents
- diva.config package
- Subpackages
- Submodules
- diva.config.config_object module
Config
Config.add_not_in_shp()
Config.aggregation_operator
Config.aggregation_type
Config.anomaly
Config.climate_variable
Config.creation_time
Config.end_time
Config.graph_type
Config.last
Config.location
Config.not_in_shp
Config.set_agg_operator()
Config.set_aggregation_type()
Config.set_anomaly()
Config.set_climate_variable()
Config.set_creation_time()
Config.set_end_time()
Config.set_graph_type()
Config.set_last()
Config.set_location()
Config.set_missings()
Config.set_start_time()
Config.set_time_expression()
Config.set_user_prompt()
Config.start_time
Config.time_expression
- diva.config.interface_ask_missings module
- diva.config.interface_completion module
- diva.config.interface_creation module
- diva.config.interface_get_missings module
- diva.config.interface_instruction module
- diva.config.module_config module
- diva.config.service_ask_missings module
- diva.config.service_completion module
- diva.config.service_get_missings module
- Module contents
- diva.data package
- diva.finetuning package
- diva.graphs package
- Submodules
- diva.graphs.graph_generator module
GraphTexts
IGraphGenerator
IGraphGenerator.NB_PARAM_ELM
IGraphGenerator.aggreg_type
IGraphGenerator.climate_variable
IGraphGenerator.comparison_loc
IGraphGenerator.comparison_time
IGraphGenerator.context
IGraphGenerator.cross_tenses
IGraphGenerator.data
IGraphGenerator.delta_time
IGraphGenerator.delta_time_multi
IGraphGenerator.endtime
IGraphGenerator.generate()
IGraphGenerator.get_collection()
IGraphGenerator.graph_context
IGraphGenerator.graph_type
IGraphGenerator.heatmap_not_supported
IGraphGenerator.initialise_data_collection()
IGraphGenerator.langage
IGraphGenerator.list_not_found_loc
IGraphGenerator.location
IGraphGenerator.out_of_bounds
IGraphGenerator.params
IGraphGenerator.params_after_processing
IGraphGenerator.process_params()
IGraphGenerator.shapefile_europe
IGraphGenerator.shapefile_europe_cities
IGraphGenerator.starttime
IGraphGenerator.time_out_of_bounds
- diva.graphs.service_graph_generation module
ServiceGeneratePlotlyGraph
ServiceGeneratePlotlyGraph.add_tense_data()
ServiceGeneratePlotlyGraph.checks()
ServiceGeneratePlotlyGraph.colors
ServiceGeneratePlotlyGraph.colors_gray
ServiceGeneratePlotlyGraph.compute_diff_ref()
ServiceGeneratePlotlyGraph.generate()
ServiceGeneratePlotlyGraph.generate_barplot()
ServiceGeneratePlotlyGraph.generate_boxplot()
ServiceGeneratePlotlyGraph.generate_distribution()
ServiceGeneratePlotlyGraph.generate_lineplot()
ServiceGeneratePlotlyGraph.generate_lineplot_mean()
ServiceGeneratePlotlyGraph.generate_map()
ServiceGeneratePlotlyGraph.generate_rainbow()
ServiceGeneratePlotlyGraph.generate_warming_stripes()
ServiceGeneratePlotlyGraph.get_data_source()
ServiceGeneratePlotlyGraph.get_interp_colors_anomaly()
ServiceGeneratePlotlyGraph.get_mean_val_on_ref_period()
ServiceGeneratePlotlyGraph.get_xy_pos_note_under_graph()
ServiceGeneratePlotlyGraph.graph_elm_for_translate()
ServiceGeneratePlotlyGraph.matching_case()
- Module contents
- diva.gui package
- diva.llm package
Submodules¶
diva.parameters module¶
diva.tools module¶
- diva.tools.add_download_feedback_button(prompt, config, fig, key='out_prompt')[source]¶
Adds download and feedback buttons alongside a generated graph for user interaction.
This function adds buttons for downloading the graph data, providing positive feedback, or providing negative feedback. The feedback is logged to a file based on the user’s input.
Parameters:¶
- config: dict
A dictionary containing the configuration information for the graph, which may include visual properties and settings.
- fig: dict
A Plotly figure object containing the graph data to be displayed and interacted with.
- key: str
A unique key for the Streamlit button elements to avoid conflicts when rendering multiple buttons.
Returns:¶
None
- diva.tools.add_feedback(prompt, answer, feedback)[source]¶
Logs positive feedback for a generated graph.
This function records positive feedback for a graph in a CSV log file. It appends the current timestamp, session ID, graph configuration, and feedback type (“Good”) to the log file. If the log file does not exist, it creates a new one.
Parameters:¶
- config: dict
A dictionary containing the configuration information for the graph, which is logged along with feedback.
Returns:¶
None
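The append-or-create logging described above can be sketched with the standard csv module. This is a minimal illustration, not the package's implementation: the function name log_feedback_sketch, the column names, and the explicit log_path and session_id arguments are all assumptions made for the example.

```python
import csv
from datetime import datetime
from pathlib import Path


def log_feedback_sketch(log_path, session_id, config, feedback="Good"):
    """Append one feedback row to a CSV log, creating the file if needed.

    Hypothetical helper: the column names below are assumptions.
    """
    path = Path(log_path)
    is_new = not path.exists()
    # Mode "a" appends to an existing file and creates a missing one.
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            # Write the header only once, when the log is first created.
            writer.writerow(["timestamp", "session_id", "config", "feedback"])
        timestamp = datetime.now().isoformat(timespec="seconds")
        writer.writerow([timestamp, session_id, str(config), feedback])
```

Checking for the file before opening it is what lets the header be written exactly once.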
- diva.tools.add_feedback_msg_button(prompt, response)[source]¶
Adds feedback buttons for a given prompt and response pair.
This function adds two feedback buttons for users to provide feedback on the given prompt and response: one for positive feedback (“👍”) and one for negative feedback (“👎”). Clicking these buttons will log the feedback.
Parameters:¶
- prompt: str
The prompt that was used to generate the response.
- response: str
The response generated for the prompt.
Returns:¶
None
- diva.tools.add_zipcode_to_city_name(location)[source]¶
Adds the country code to the city name if it’s not already included.
This function first retrieves detailed information about a location, including the city name. It then checks if the location is already in the list of countries. If the location is not in the list, it appends the country code to the city name in the format city (country_code).
Parameters:¶
location (str): The name of the location (city or country) to which the country code will be added.
Returns:¶
str: The location name, possibly updated to include the country code if it’s not already in the list of countries.
- diva.tools.append_if_not_in(list_: list, elem)[source]¶
Appends an element (elem) to a list (list_) if the element is not already in the list.
- diva.tools.change_date_str_format(date: str, new_format: str)[source]¶
Takes a date as a str and a new format as inputs, and tries to convert the date into the new format.
Parameters¶
- date: str
a date in a str format
- new_format: str
a new format given as str. The format should be supported by datetime.strftime
Returns¶
- str
The date in the new str format if the conversion succeeds. Otherwise, the date is returned unchanged, as passed in.
- diva.tools.clean_answer(text: str, characters: str | list[str] | tuple, remove_accent=True) str [source]¶
Removes characters from a text.
Parameters¶
- text: str | list[str]
Text from which to remove characters.
- characters: iterable
An iterable containing the characters (str) to remove.
- remove_accent:
Defaults to True. Also replaces the accented characters in the text with unaccented characters.
Returns¶
- str
The text stripped of the specified characters (and of accents if remove_accent is True).
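The behaviour documented for clean_answer can be approximated as follows. This is only a sketch under assumptions: the names clean_answer_sketch and remove_accents_sketch are hypothetical, and the Unicode-decomposition approach to accent removal is a common technique, not necessarily the one the package uses.

```python
import unicodedata


def remove_accents_sketch(text: str) -> str:
    """Strip accents via Unicode decomposition (assumed technique)."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_answer_sketch(text: str, characters, remove_accent: bool = True) -> str:
    """Remove every entry of `characters` from `text`, then optionally accents."""
    for chars in characters:
        text = text.replace(chars, "")
    if remove_accent:
        text = remove_accents_sketch(text)
    return text


print(clean_answer_sketch("'Orléans'.", ["'", "."]))  # Orleans
```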
- diva.tools.convert_to_dict_config(config)[source]¶
Converts a configuration object to a dictionary of parameters.
This function extracts parameters from the configuration object and converts them into a dictionary format. The parameters include start times, end times, locations, the element of interest, graph type, and aggregation type.
Parameters:¶
- config: object
The configuration object containing various settings.
Returns:¶
- dict
A dictionary containing the configuration parameters.
- diva.tools.datetime_from_str(date: str) datetime [source]¶
Takes a date in str format %Y-%m-%d as input and returns it as a datetime object
- diva.tools.display_config(dict_params) str [source]¶
Generates a descriptive text about the graph configuration to be displayed before the graph.
This function creates a text description of the graph based on the provided parameters. The description includes details such as the element of interest, start and end times, location, and additional time indications if multiple time ranges are provided. The text aims to provide context about what the graph represents and encourages user interaction with the graph.
Parameters:¶
- dict_params: dict
A dictionary containing configuration parameters, including ‘climate_variable’, ‘starttime’, ‘endtime’, and ‘location’.
Returns:¶
- str
A randomly selected descriptive sentence about the graph configuration.
- diva.tools.enumeration(list_: list, end='and') str [source]¶
Takes a list of elements, for instance strings, and builds a text enumeration of the elements, e.g. [‘cats’, ‘dogs’] -> ‘cats and dogs’ and [‘cats’, ‘dogs’, ‘birds’] -> ‘cats, dogs and birds’. By default, elements are separated by commas, with ‘and’ before the last element.
Parameters¶
- list_: list
a list of elements
- end:
Defaults to ‘and’. Separator placed between the penultimate and last elements of the list.
Returns¶
- str:
A text enumeration
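A minimal sketch of the enumeration logic described above. The name enumeration_sketch is hypothetical, and the handling of empty and single-element lists is an assumption, since the documentation does not cover those cases.

```python
def enumeration_sketch(list_: list, end: str = "and") -> str:
    """Join elements with commas, placing `end` before the last element."""
    items = [str(elem) for elem in list_]
    if not items:
        return ""  # assumption: empty input gives an empty string
    if len(items) == 1:
        return items[0]  # assumption: a single element is returned as-is
    return f"{', '.join(items[:-1])} {end} {items[-1]}"


print(enumeration_sketch(["cats", "dogs"]))           # cats and dogs
print(enumeration_sketch(["cats", "dogs", "birds"]))  # cats, dogs and birds
```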
- diva.tools.extract_data_from_plotlyFig(fig, location)[source]¶
Extracts data from a Plotly figure and converts it to a CSV format.
This function extracts the x and y data points from each trace in a Plotly figure and compiles them into a DataFrame. The DataFrame is then converted into CSV format for easy export.
Args:¶
- fig: dict
A Plotly figure represented as a dictionary containing ‘data’ (traces) and ‘layout’ (layout information) attributes.
Returns:¶
- tuple(str, bytes)
A tuple containing:
title (str): The filename for the CSV, derived from the figure’s title and the current timestamp.
data_csv (bytes): The CSV data encoded as bytes, suitable for download or storage.
- diva.tools.find_best_match(text: str, list_: list | Series | ndarray, verbose=False, extra=0, prefilter=True, remove_accents=True) tuple[str | None, int | None] [source]¶
This function finds the best match of a text in a list/array/series. It is optimized to find, quickly and accurately, the best match between the retrieved location parameters (text) and the locations listed in the shapefiles (list_). The function assigns a score to every element in list_, the highest score meaning the best match with the text. To save time, a filtering step is applied at an early stage to exclude the list_ elements whose first letters do not match the first letters of the text. Run time on a table of 86 000 rows: < 0.01 s.
Parameters¶
- text: str
element for which the function searches the best match in list_
- list_: list | pd.Series | np.ndarray,
where the best match is searched
- verbose: bool
whether to print details. Mainly for debugging
- extra: int
Any integer, added to the score as a bonus (positive) or a malus (negative). A malus makes it more difficult to get a positive score, ensuring that the returned value is either None or very close to the text passed as argument.
- prefilter: bool
Defaults to True. The prefilter excludes the list_ elements that show no correspondence with the first letters of text.
- remove_accents: bool
Defaults to True. Removes the accents in the list_ elements.
Returns¶
- tuple[str, int] or tuple[None, None]
tuple[None, None] if the best score <= 0. tuple[str, int] if the best score > 0, where str is the best match and int is the index of the best match in the list/array/series.
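The general shape of this matching (prefilter, per-element score, best positive score wins) can be sketched as below. This is not the package's scoring algorithm, which is not documented here: the similarity measure (difflib's SequenceMatcher), the offset of 70, the one-letter prefilter, and the name find_best_match_sketch are all assumptions chosen for illustration.

```python
import unicodedata
from difflib import SequenceMatcher


def _strip_accents(text: str) -> str:
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_best_match_sketch(text, list_, extra=0, prefilter=True, remove_accents=True):
    """Return (best_match, index), or (None, None) if no score is positive."""
    # Assumption: accents are stripped from the query as well as from list_.
    target = _strip_accents(text).lower() if remove_accents else text.lower()
    best_score, best_elem, best_idx = 0, None, None
    for idx, elem in enumerate(list_):
        cand = _strip_accents(elem).lower() if remove_accents else elem.lower()
        # Prefilter: skip candidates whose first letter differs from the text's.
        if prefilter and cand[:1] != target[:1]:
            continue
        # Hypothetical score: similarity in percent, minus an arbitrary
        # offset of 70, plus the caller's bonus/malus.
        score = int(100 * SequenceMatcher(None, target, cand).ratio()) - 70 + extra
        if score > best_score:
            best_score, best_elem, best_idx = score, elem, idx
    return best_elem, best_idx


print(find_best_match_sketch("Pariss", ["Berlin", "Paris", "Madrid"]))  # ('Paris', 1)
```

Passing a negative extra tightens the match: with extra=-50 the misspelled "Pariss" no longer reaches a positive score in this sketch and (None, None) is returned.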
- diva.tools.format_date(year: int, month: int, day: int) str [source]¶
Formats a date, given as integers for year, month and day, as a str in YYYY-MM-DD format.
- diva.tools.full_pipeline_config(module_config: ModuleConfig, module_chat: ModuleChat, user_prompt: str, clear: bool = True, config: bool = True)[source]¶
Runs pipeline to get graph parameters (config) from the user_prompt.
Parameters¶
- module_config: instance of ModuleConfig
- module_chat: instance of ModuleChat
- user_prompt: str
- clear: bool
Defaults to True. Whether to clear the config history (and chat history) (True) or not (False)
- config: bool
Defaults to True. Whether to search for the config parameters in the user prompt (True) or not (False)
- diva.tools.generate_map()[source]¶
Generates and displays an interactive map using Pydeck.
This function creates a map visualization using the Pydeck library, with the following properties:
Map style: Road
Initial view settings:
Latitude: 47
Longitude: 12.5
Zoom level: 4
Pitch: 50
This map visualization is intended to be displayed within a Streamlit application.
Returns: None
- diva.tools.generate_session_id()[source]¶
Generate a unique ID for the ongoing session.
This function checks if a session ID already exists in the Streamlit session state. If it does not exist, a new unique session ID is generated using a combination of the process ID, a random number, and the current timestamp, hashed with SHA-256. The generated session ID is then stored in the Streamlit session state for later use.
Returns:¶
- str
The generated session ID, either newly created or retrieved from the session state.
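The ID generation described above can be sketched with the standard library. To keep the example self-contained, a plain dict stands in for the Streamlit session state; the function name generate_session_id_sketch, the "session_id" key, and the exact way the three ingredients are concatenated are assumptions.

```python
import hashlib
import os
import random
import time


def generate_session_id_sketch(session_state: dict) -> str:
    """Return the stored session ID, creating it on first call."""
    if "session_id" not in session_state:
        # Combine process ID, a random number and the current timestamp,
        # then hash the result with SHA-256.
        seed = f"{os.getpid()}-{random.random()}-{time.time()}"
        session_state["session_id"] = hashlib.sha256(seed.encode("utf-8")).hexdigest()
    return session_state["session_id"]


state = {}
first = generate_session_id_sketch(state)
second = generate_session_id_sketch(state)  # reuses the stored ID
```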
- diva.tools.generate_waiting_for_graph_msg()[source]¶
Generates a random message indicating that a graph is being generated.
This function selects a random message from a predefined list that informs the user that the graph is in the process of being created. It is useful for providing feedback during long-running operations like graph generation.
Returns:¶
str: A randomly chosen message from the list indicating that the graph is being generated.
- diva.tools.get_city_zipcode(city)[source]¶
Retrieves the country and LAU ID (Local Administrative Unit) for a given city.
This function searches for the provided city name in a shapefile of European cities. If the city is found, it retrieves the corresponding country code and LAU ID from the shapefile, then uses a world map CSV to retrieve the full country name based on the country code.
Parameters:¶
city (str): The name of the city for which the country and LAU ID are to be retrieved.
Returns:¶
- tuple: A tuple containing:
str or None: The name of the country corresponding to the city. Returns None if the city is not found.
str or None: The LAU ID of the city. Returns None if the city is not found.
- diva.tools.get_description_of_graph(llm, context)[source]¶
Generates a concise description of a graph based on the provided context.
This function utilizes a language model (LLM) to generate a brief and direct description of the graph, focusing solely on the essential elements of the context. The description avoids introductory phrases and is tailored to be succinct.
Parameters:¶
- llm: object
The language model instance used to generate the description of the graph.
- context: str
The context information regarding the graph, which serves as input for generating the description.
Returns:¶
- str
The generated description of the graph, concise and based on the provided context.
- diva.tools.get_infos_from_location(location)[source]¶
Retrieves detailed information about a location, including latitude, longitude, and bounding box.
This function uses the Nominatim geocoding service to obtain geographic details about a given location. It returns a dictionary with the original location name, the name used in the query, the type of address, center coordinates, and bounding box coordinates.
Parameters:¶
- location: str
The name of the location for which information is to be retrieved.
Returns:¶
- dict or None
A dictionary containing the location’s name, address type, center coordinates, and bounding box. Returns None if the location cannot be found.
- diva.tools.init_state()[source]¶
Initializes the state for the Streamlit application.
This function sets up the initial state variables for the Streamlit application if they are not already present. It sets the current tab to a default value and initializes the index for the tab.
Returns:¶
None
- diva.tools.is_time(text: str = '') bool [source]¶
Takes a text as input and determines whether this text contains a time expression. Specifically, is_time searches the text for month and season names, or for digits. If either is found, the text is assumed to be a time expression and True is returned. This function is not designed to be very accurate; its goal is to exclude probable time expressions from the list of retrieved parameters that should not be times (e.g. locations, which should contain neither months, nor seasons, nor digits).
Parameters¶
- text: str
text to assess
Returns¶
- bool
True if a time expression, False otherwise
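The heuristic described above (month names, season names, or any digit) can be sketched with a regular expression. The word lists, the English-only vocabulary, and the name is_time_sketch are assumptions; the real function may recognise other spellings or languages.

```python
import re

# Assumed vocabulary: English month and season names only.
MONTHS = (
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
)
SEASONS = ("spring", "summer", "autumn", "fall", "winter")
_TIME_WORDS = re.compile(r"\b(" + "|".join(MONTHS + SEASONS) + r")\b", re.IGNORECASE)


def is_time_sketch(text: str = "") -> bool:
    """True if the text contains a digit, a month name or a season name."""
    return bool(re.search(r"\d", text) or _TIME_WORDS.search(text))


print(is_time_sketch("summer 2020"))  # True
print(is_time_sketch("Paris"))        # False
```

As in the documented function, false positives are accepted by design: words such as "may" or "fall" count as time expressions here even when used in another sense.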
- diva.tools.langage_to_iso(language)[source]¶
Converts language names to their corresponding ISO 639-1 codes.
This function translates a given language name into its ISO 639-1 language code. It supports a selection of common languages.
Parameters:¶
- language: str
The name of the language.
Returns:¶
- str
The ISO 639-1 code of the language. Returns None if the language is not recognized.
- diva.tools.last_day_of_month(month: int, year: int) int [source]¶
Takes a year and a month as integers and returns an integer representing the last day of this month
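One way to obtain this value is the standard library's calendar.monthrange, as in the sketch below. The name last_day_of_month_sketch is hypothetical and the package may compute the value differently; note the (month, year) argument order taken from the signature above.

```python
import calendar


def last_day_of_month_sketch(month: int, year: int) -> int:
    """Return the number of the last day of the given month."""
    # monthrange returns (weekday of the first day, number of days in month).
    return calendar.monthrange(year, month)[1]


print(last_day_of_month_sketch(2, 2024))  # 29
print(last_day_of_month_sketch(2, 2023))  # 28
```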
- diva.tools.lemmatize(text: str | list[str]) str | list[str] [source]¶
Lemmatizes text (e.g. removes plural markers). It uses nltk word_tokenize. The first call takes ~2s (the time for the inner models to load). Subsequent calls have a negligible run time. Capitalized words are not lemmatized.
Parameters¶
- text: str | list[str]
text to lemmatize. Can be a str or a list of str. If a str, the str sequence is split into words, and the lemmatization is applied on each word individually.
Returns¶
- str | list[str]
str if text is of type str, list[str] if text is of type list[str]
- diva.tools.load_shapefile_cities()[source]¶
Loads a shapefile of European cities and a world CSV file.
This function uses the GeoPandas library to read a shapefile of European cities and loads a world map from a CSV file containing geographic data. The shapefile contains geometrical data for European cities, while the world CSV provides additional geographic information.
Returns:¶
- tuple: A tuple containing two elements:
GeoDataFrame: The GeoDataFrame containing the European cities shapefile.
DataFrame: The DataFrame containing the world map data from the CSV.
- diva.tools.no_truncated_sentence(text: str) str [source]¶
Removes an unterminated sentence at the end of a text. Uses nltk sent_tokenize to split the text into sentences. If the text consists of a single unterminated sentence, it is not removed.
Parameters¶
- text: str
the text to verify
Returns¶
- str
the text minus the unterminated sentence, if any and if the number of sentences in text > 1
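The trimming rule can be illustrated with the sketch below. It deliberately swaps nltk's sent_tokenize for a much cruder regular-expression split so that the example has no dependencies; the name no_truncated_sentence_sketch and the set of terminating characters are assumptions.

```python
import re


def no_truncated_sentence_sketch(text: str) -> str:
    """Drop a trailing unterminated sentence, unless it is the only one."""
    # Crude stand-in for nltk.sent_tokenize: split after ., ! or ?
    # followed by whitespace. Abbreviations such as "e.g." will confuse it.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    if len(sentences) > 1 and not sentences[-1].endswith((".", "!", "?")):
        sentences = sentences[:-1]
    # Rejoining with single spaces normalises the whitespace between sentences.
    return " ".join(sentences)


print(no_truncated_sentence_sketch("The graph shows temperature. It rises in"))
# The graph shows temperature.
```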
- diva.tools.put_in_order_dates(start, end)[source]¶
Orders two dates such that the earlier date comes first.
This function takes two dates and returns them in ascending order. If the dates are the same, they are returned in the same order they were provided.
Parameters:¶
- start: datetime or str
The start date.
- end: datetime or str
The end date.
Returns:¶
- tuple
A tuple containing the two dates in ascending order.
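A minimal sketch of this ordering, under the assumption that both arguments have the same type and that str dates use a sortable format such as %Y-%m-%d (where lexicographic order equals chronological order). The name put_in_order_dates_sketch is hypothetical.

```python
from datetime import datetime


def put_in_order_dates_sketch(start, end):
    """Return (earlier, later); equal dates keep their original order."""
    # Assumption: both values are datetimes, or both are ISO-style strings.
    if end < start:
        return end, start
    return start, end


print(put_in_order_dates_sketch("2021-05-01", "2020-01-01"))
# ('2020-01-01', '2021-05-01')
```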
- diva.tools.remove_accents(text: str) str [source]¶
Removes accents in a text passed as argument. Returns a text without accents.
- diva.tools.set_max_new_tokens(prompt: str) int [source]¶
Estimates the max number of new tokens needed to answer a prompt. Returns an integer between 128 and 1024
- diva.tools.str_from_datetime(date: datetime) str [source]¶
Takes a date in datetime format as input and returns it in str format %Y-%m-%d
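The two conversion helpers in this module, str_from_datetime and datetime_from_str, are inverses over the %Y-%m-%d format. A sketch of the pair using datetime.strptime and strftime follows; the _sketch names and the DATE_FORMAT constant are hypothetical, and the package may implement the conversions differently.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"  # format stated in the docstrings


def datetime_from_str_sketch(date: str) -> datetime:
    """Parse a %Y-%m-%d string into a datetime."""
    return datetime.strptime(date, DATE_FORMAT)


def str_from_datetime_sketch(date: datetime) -> str:
    """Format a datetime as a %Y-%m-%d string."""
    return date.strftime(DATE_FORMAT)


print(str_from_datetime_sketch(datetime(2020, 3, 7)))  # 2020-03-07
```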
- diva.tools.translate_to_en(text, source_lang)[source]¶
Translates a given text to English, with additional handling for graph type terminology.
This function first checks if the source language is not English. If the source language is not English, it loads a mapping file to translate specific graph type terms to their English equivalents. The mapping is applied to the text before translating the entire text to English. If the source language is already English, the original text is returned without modification.
Parameters:¶
text (str): The text to be translated to English.
source_lang (str): The language of the input text. If it is “en”, the text is returned as is.
Returns:¶
str: The translated text in English, with graph type terms mapped if applicable.
- diva.tools.update_logs(UserPrompt, RephrasedPrompt, LLM_Response)[source]¶
Updates the log file with message history including user prompts, rephrased prompts, and LLM responses.
This function records information about user prompts, their rephrased versions, and the responses generated by the LLM (large language model). It logs this information along with the current timestamp and session ID. If the log file already exists, the information is appended. Otherwise, a new file is created.
Parameters:¶
- UserPrompt: str
The original prompt from the user.
- RephrasedPrompt: str
The rephrased version of the user’s prompt.
- LLM_Response: str
The response generated by the LLM.
Returns:¶
None
- diva.tools.verify_authentification_keycloak()[source]¶
Verifies user authentication with Keycloak.
This function initiates the authentication process using Keycloak’s OpenID Connect protocol. It constructs the necessary URL for user login and redirects the user to Keycloak’s login page if they are not already authenticated. If the user provides valid credentials, an access token is obtained and stored in the session state. The function checks if the access token is successfully stored in the session to confirm the user’s authentication status.
Parameters:¶
None
Returns:¶
- bool: Returns True if the user is successfully authenticated and the access token is stored in the session.
Returns False if the authentication fails or if no valid access token is found.
- diva.tools.write_like_chatGPT(text)[source]¶
Simulates typing of the provided text, one word at a time.
This function yields words from the text one at a time with a delay between each word, simulating a typing effect similar to that of ChatGPT.
Parameters:¶
- text: str
The text to be “typed out”.
Yields:¶
- str
Words from the text with a space appended, one at a time.
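The word-by-word generator described above might look like the sketch below. The delay parameter and its default value are assumptions (the documented signature takes only text), as is the name write_like_chatgpt_sketch.

```python
import time


def write_like_chatgpt_sketch(text: str, delay: float = 0.05):
    """Yield the words of `text` one at a time, each followed by a space."""
    for word in text.split():
        yield word + " "
        # Pause between words to simulate typing; `delay` is hypothetical.
        time.sleep(delay)


for chunk in write_like_chatgpt_sketch("Hello climate world", delay=0):
    print(chunk, end="")
```

A generator of this shape can be consumed by any interface that accepts a stream of text chunks.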