WSUI: An Efficient Weighted Sequential Pattern Mining with Index-Based Structures and Frequency-Based Weighting

Authors

  • Quang Nguyễn, Đại học Công nghiệp Thành phố Hồ Chí Minh (Industrial University of Ho Chi Minh City)
  • Thiet Pham

Keywords:

Sequential pattern mining, weighted patterns, pseudo-IDList, upper bounds, frequency-based weighting

Abstract

Sequential pattern mining discovers frequently occurring ordered patterns, but it treats all items as equally important. Weighted sequential pattern mining (WSPM) addresses this drawback by assigning each item a weight that reflects its importance. However, existing approaches suffer from high memory and computational costs. Motivated by this, we propose WSUI, which combines memory-efficient pseudo-IDList structures with strict upper bounds and automatic frequency-based weight assignment. Our method guarantees pattern completeness by keeping track of all possible end positions for i-extensions and by exploiting the pseudo-IDList in s-extensions. Experiments on three real-world datasets show that WSUI consistently outperforms the state-of-the-art EWSPM, providing substantial speed-ups while preserving exhaustive pattern discovery and competitive memory consumption.

Published

09-12-2025

Issue

Section

Computer & Data Science