Event detection

class EventDetection

class factfinder.src.event_detection.EventDetection

Bases: object

This class generates events and the connections between them. It applies a semantic clustering method (BERTopic) to texts in the context of an urban spatial model.

_get_roads(city_name, city_crs) -> GeoDataFrame

Get the road network of a city as road links and roads.

Parameters:

city_name (string): The name of the city.

city_crs (int): The spatial reference code (CRS) of the city.

Returns:

GeoDataFrame with the city’s road links and roads.

Return type:

links (GeoDataFrame)

_get_buildings() -> GeoDataFrame

Get the buildings of a city as a GeoDataFrame.

Parameters:

links (GeoDataFrame): GeoDataFrame with the city’s road links and roads.

filepath (string): The path to the GeoJSON file with building data. The default is ‘population.geojson’.

Returns:

GeoDataFrame with the city’s buildings.

Return type:

buildings (GeoDataFrame)

_collect_population() -> dict

Collect population data for each object (building, street, link).
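The per-object aggregation can be illustrated with a minimal sketch. It is not the library's implementation: the input layout (a list of building records) and the keys `building_id`, `street_name`, `link_id`, and `population` are hypothetical and chosen only to show how one population figure per building might roll up to streets and road links.

```python
from collections import defaultdict


def collect_population(buildings):
    """Sum population per building, street, and road link.

    `buildings` is a list of dicts with the hypothetical keys
    'building_id', 'street_name', 'link_id', and 'population'.
    """
    population = {
        "building": {},
        "street": defaultdict(int),
        "link": defaultdict(int),
    }
    for record in buildings:
        residents = record["population"]
        # A building keeps its own value; streets and links accumulate.
        population["building"][record["building_id"]] = residents
        population["street"][record["street_name"]] += residents
        population["link"][record["link_id"]] += residents
    # Convert the defaultdicts to plain dicts for a predictable return type.
    return {level: dict(values) for level, values in population.items()}


# Illustrative data only.
sample = [
    {"building_id": 1, "street_name": "Main St", "link_id": 10, "population": 120},
    {"building_id": 2, "street_name": "Main St", "link_id": 11, "population": 80},
    {"building_id": 3, "street_name": "Oak Ave", "link_id": 11, "population": 50},
]
result = collect_population(sample)
```

Keeping one sub-dictionary per level means a later step can look up the population behind an event by level name and object identifier.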

_preprocess() -> GeoDataFrame

Preprocess the data.

_create_model(min_event_size)

Create a topic model that combines UMAP, HDBSCAN, and BERTopic.
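As a rough sketch of how such a model is commonly assembled with the public BERTopic API, the function below passes explicit UMAP and HDBSCAN instances to `BERTopic`. The function name `build_topic_model` and all hyperparameter values are assumptions for illustration. In particular, mapping `min_event_size` to HDBSCAN's `min_cluster_size` is a guess, not something this page confirms.

```python
def build_topic_model(min_event_size: int):
    """Assemble a BERTopic model from explicit UMAP and HDBSCAN components."""
    # Imported lazily so this module loads without the heavy dependencies.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from umap import UMAP

    # Dimensionality reduction of the text embeddings (illustrative values).
    umap_model = UMAP(
        n_neighbors=15,
        n_components=5,
        min_dist=0.0,
        metric="cosine",
        random_state=42,
    )
    # Density-based clustering; the smallest allowed cluster is assumed
    # here to correspond to the smallest event size.
    hdbscan_model = HDBSCAN(
        min_cluster_size=min_event_size,
        metric="euclidean",
        cluster_selection_method="eom",
        prediction_data=True,
    )
    return BERTopic(umap_model=umap_model, hdbscan_model=hdbscan_model)
```

With the `bertopic`, `umap-learn`, and `hdbscan` packages installed, `build_topic_model(5).fit_transform(texts)` would return topic assignments for a list of strings.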

_event_from_object(messages, topic_model, target_column: str, population: dict, object_id: float, event_level: str)

Create a list of events for a given object (building, street, link, total).

_get_events(min_event_size) -> GeoDataFrame

Create a list of events for all levels.

_get_event_connections() -> GeoDataFrame

Create a list of connections between events.

_rebalance(connections, events, levels, event_population: int, event_id: str)

Rebalance the population of an event.

_rebalance_events() -> GeoDataFrame

Rebalance the population of events.

_filter_outliers()

Filter outliers.

_prepare_messages()

Prepare messages for export.

run(target_texts: GeoDataFrame, filepath_to_population: str, city_name: str, city_crs: int, min_event_size: int)

Returns a GeoDataFrame of events, a GeoDataFrame of connections between events, and a GeoDataFrame of messages.
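A hedged usage sketch follows. The import path and the keyword names come from the signatures on this page. The wrapper name `detect_events` and the no-argument `EventDetection()` constructor are assumptions, and the order of the returned values is taken from the description above.

```python
def detect_events(target_texts, filepath_to_population, city_name,
                  city_crs, min_event_size):
    """Run event detection and return (events, connections, messages)."""
    # Imported lazily so this sketch loads without factfinder installed.
    from factfinder.src.event_detection import EventDetection

    # Assumption: the class can be constructed without arguments.
    detector = EventDetection()
    events, connections, messages = detector.run(
        target_texts=target_texts,
        filepath_to_population=filepath_to_population,
        city_name=city_name,
        city_crs=city_crs,
        min_event_size=min_event_size,
    )
    return events, connections, messages
```

Here `target_texts` would be a GeoDataFrame of texts and `city_crs` an integer CRS code for the city, as described for `_get_roads`.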