Development of extreme value modeling for large sized clustered data

Reference No. 2024a004
Type/Category Grant for General Research - Short-term Visiting Researcher
Title of Research Project Development of extreme value modeling for large sized clustered data
Principal Investigator Takuma Yoshida (Kagoshima University・Associate Professor)
Research Period May 13, 2024 ~ May 17, 2024
Keyword(s) of Research Fields Extreme value theory; Sparse modeling; Large sized clustered data
Abstract for Research Report The purpose of this study is to develop a new extreme value statistical method for clustered data. Clustered data are data in which each observation belongs to a particular group (cluster), such as a region or an affiliation. Extreme value analysis is a method to predict the probability of occurrence of a large value, such as a maximum. For example, there are about 1,300 rainfall stations in Japan, and each station can be regarded as a cluster. For risk assessment, an extreme value model of precipitation is usually built separately for each station. However, it is natural to assume that the data obtained at each station are influenced by, or related to, information from nearby stations. This research aims to develop extreme value statistical modeling that takes the relationships between clusters into account and, in particular, to establish a method for grouping clusters when the number of clusters is large.

As a grouping method, we will use the sparse penalization technique.
In extreme value modeling, we build the model for each cluster using the maximum likelihood method.
However, in our study, we apply the penalized maximum likelihood method with a fused lasso type penalty on each pair of clusters in the model, which leads to grouping. The penalties are imposed where a pair of clusters has high dependence.
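
To make the penalized likelihood idea concrete, the following is a minimal sketch and not the project's implementation. The report does not state which extreme value distribution or which penalty weights are used, so the sketch assumes a Gumbel model for each cluster (chosen only for brevity) and a hypothetical `weights` dictionary that is positive for pairs of clusters considered highly dependent. The function names `gumbel_nll` and `penalized_nll` and the tuning parameter `lam` are illustrative.

```python
import math
from itertools import combinations


def gumbel_nll(x, mu, sigma):
    """Negative log-likelihood of a sample x under Gumbel(mu, sigma).

    Assumption: Gumbel is used here only as a simple stand-in for an
    extreme value model; the report does not name the distribution.
    """
    if sigma <= 0:
        return math.inf
    total = 0.0
    for xi in x:
        z = (xi - mu) / sigma
        total += math.log(sigma) + z + math.exp(-z)
    return total


def penalized_nll(data, params, weights, lam):
    """Sum of per-cluster negative log-likelihoods plus a fused lasso type penalty.

    data:    dict mapping cluster name -> list of observed maxima
    params:  dict mapping cluster name -> (mu, sigma)
    weights: dict mapping (j, k) with j < k -> nonnegative weight
             (hypothetical; pairs that are absent receive weight 0)
    lam:     nonnegative tuning parameter controlling the amount of fusion
    """
    fit = sum(gumbel_nll(data[j], *params[j]) for j in data)
    penalty = 0.0
    for j, k in combinations(sorted(data), 2):
        w = weights.get((j, k), 0.0)
        # L1 distance between the parameter vectors of the two clusters
        penalty += w * sum(abs(a - b) for a, b in zip(params[j], params[k]))
    return fit + lam * penalty
```

Minimizing such an objective over `params` pulls the parameters of penalized pairs toward each other, and for a sufficiently large `lam` some pairs become exactly equal, which is what produces groups. The L1 form of the penalty is what allows exact equality; a squared penalty would only shrink the differences.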

The advantage of this method is that the number of groups after grouping and the clusters belonging to each group are determined automatically, which distinguishes it from existing methods. In addition, the data in grouped clusters are represented by a common model, which improves the interpretability and is also expected to provide high-performance prediction. This study is concerned with the assessment of heavy rainfall disaster risk by station (cluster). The grouped clusters in this study can be used to develop unified disaster countermeasures, which will be useful for considering efficient disaster prevention in each group.
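
The automatic determination of groups can be illustrated with a short sketch, given here under stated assumptions and not as the authors' procedure. It assumes that a fit has already produced one parameter tuple per cluster and that two clusters belong to the same group when they are linked by a chain of penalized pairs whose estimates coincide up to a tolerance. The names `groups_from_fit`, `pairs`, and `tol` are hypothetical.

```python
def groups_from_fit(params, pairs, tol=1e-6):
    """Group clusters whose fitted parameters were fused.

    params: dict mapping cluster name -> tuple of fitted parameters
    pairs:  iterable of (j, k) pairs that carried a penalty (assumption)
    tol:    numerical tolerance for treating two estimates as equal
    Returns a sorted list of sorted groups (lists of cluster names).
    """
    parent = {j: j for j in params}

    def find(j):
        # Union-find root lookup with path halving
        while parent[j] != j:
            parent[j] = parent[parent[j]]
            j = parent[j]
        return j

    for j, k in pairs:
        fused = all(abs(a - b) <= tol for a, b in zip(params[j], params[k]))
        if fused:
            parent[find(j)] = find(k)

    groups = {}
    for j in params:
        groups.setdefault(find(j), []).append(j)
    return sorted(sorted(g) for g in groups.values())
```

Neither the number of groups nor their membership is supplied as an input; both follow from which pairwise differences the penalty has shrunk to zero. In this sketch, two clusters with equal estimates but no chain of fused penalized pairs between them remain in separate groups.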

We will make the source code of our proposed method available on GitHub.
Organizing Committee Members (Workshop)
Participants (Short-term Joint Usage)
Takuma Yoshida (Graduate School of Science and Engineering, Kagoshima University・Associate Professor)
Shuichi Kawano (Faculty of Mathematics, Kyushu University・Professor)