DeliciousCluster - solution for large tag sets
A growing number of tags can quickly become a nightmare for active users. Within just one year of heavy use of a system like del.icio.us, a user can accumulate over 1000 bookmarks with more than 300 tags.
DeliciousCluster is a smart, lightweight, extensible library for clustering a user's tags from tagging sources such as del.icio.us. It takes a flat list of tags (more precisely, a tag multiset) and attempts to deliver a hierarchical tag multiset, thereby limiting the number of tags rendered at the top level.
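To make the flat-versus-hierarchical distinction concrete, here is a minimal Python sketch. It does not show the DeliciousCluster API: `TagNode`, `build_hierarchy`, the sample tags and counts, and the hand-written `groups` mapping are all hypothetical and serve only to illustrate the data shapes.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TagNode:
    """One node of a hierarchical tag multiset (hypothetical structure)."""
    name: str
    count: int = 0  # number of bookmarks carrying this tag directly
    children: list["TagNode"] = field(default_factory=list)

    def total(self) -> int:
        """Own count plus the counts of all descendants."""
        return self.count + sum(child.total() for child in self.children)


def build_hierarchy(flat: Counter, groups: dict[str, list[str]]) -> list[TagNode]:
    """Nest the tags listed in `groups` under their parent; keep the rest at top level."""
    grouped = {tag for members in groups.values() for tag in members}
    roots = [
        TagNode(parent, flat.get(parent, 0), [TagNode(m, flat[m]) for m in members])
        for parent, members in groups.items()
    ]
    for tag, count in flat.items():
        if tag not in grouped and tag not in groups:
            roots.append(TagNode(tag, count))
    return roots


# Flat tag multiset: tag -> number of bookmarks (made-up sample data).
flat = Counter({"python": 40, "python3": 12, "pythonlibs": 5,
                "css": 9, "html": 14, "java": 30})

# Grouping written by hand here; a clustering step would normally produce it.
groups = {"python": ["python3", "pythonlibs"], "web": ["css", "html"]}

roots = build_hierarchy(flat, groups)
for node in roots:
    print(node.name, node.total(), [child.name for child in node.children])
```

In this sample, six flat tags collapse into three top-level entries while the bookmark counts are preserved in the nested nodes.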
As an example, DeliciousCluster uses simple text similarity metrics to perform clustering. However, the library makes it easy to replace this naive approach with more thorough similarity measures and clustering algorithms.
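The idea of a replaceable similarity measure can be sketched in Python as follows. This is an illustration under assumptions, not the library's actual code: `cluster_tags`, `text_similarity`, `prefix_similarity`, the threshold of 0.7, and the greedy single-pass strategy are all hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for "simple text similarity".

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import Callable

# Any function mapping two tags to a score in [0, 1] can be plugged in.
Similarity = Callable[[str, str], float]


def text_similarity(a: str, b: str) -> float:
    """Naive string similarity based on matching character runs."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def prefix_similarity(a: str, b: str) -> float:
    """Alternative measure: shared prefix length over the longer tag's length."""
    a, b = a.lower(), b.lower()
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b), 1)


def cluster_tags(tags: Counter,
                 similarity: Similarity = text_similarity,
                 threshold: float = 0.7) -> dict[str, list[str]]:
    """Greedy one-pass clustering: the most used tags become cluster heads."""
    clusters: dict[str, list[str]] = {}
    for tag, _count in tags.most_common():
        for head in clusters:
            if similarity(head, tag) >= threshold:
                clusters[head].append(tag)
                break
        else:
            # No existing head is similar enough, so this tag starts a cluster.
            clusters[tag] = []
    return clusters


tags = Counter({"python": 40, "java": 30, "python3": 12, "design": 8})
by_text = cluster_tags(tags)
by_prefix = cluster_tags(tags, similarity=prefix_similarity)
print(by_text)
print(by_prefix)
```

Because the measure is passed in as a plain function, swapping the naive metric for a more thorough one (for example, one based on tag co-occurrence across bookmarks) would not require changes to the clustering loop itself.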
This project is based on code contributed within the TagsTreeMaps project under the Corrib license owned by DERI, NUI Galway.
The TagsTreeMaps project delivered a component for more efficient filtering of the tagging space, using del.icio.us as a proof-of-concept service. However, the clustering algorithm was tightly integrated into the TagsTreeMaps prototype, making it virtually impossible to reuse in other projects. We have extracted and reworked parts of the algorithm to improve the importing and clustering procedures and to simplify the use of these algorithms in external projects.
