Tagging

From P2P-Fusion


A tag is a relevant keyword or term associated with some object or piece of information. Tagging is the process of assigning tags. The resulting structure is called a folksonomy.


Description

Tagging is a form of classification where users assign tags to entities. Unlike controlled vocabularies, tagging is unconstrained: users can tag an entity with whatever they feel is relevant. This flexibility is both the strength and the weakness of tagging. Folksonomies are very easy to use and can be used in many innovative ways, but they have little or no internal structure and cannot be processed automatically the way classical taxonomies can.

While taxonomies are built top-down and typically offer few paths (often only one) to an object, folksonomies are built bottom-up (if they are built at all), and there are many ways to reach an object. Tagging typically has high relevance, but lacks both precision and recall. It is therefore an ideal tool for discovering new, interesting content, but poorly suited to finding one specific, known item, which is one of the main goals of categorization.

On a small scale, however, the deficiencies in finding are not serious, so tagging can be used for navigation and search in a user's own content (files, bookmarks, favourite books, music, etc.). This is often used to encourage users to tag, and many social software applications are built on this dual purpose of tagging: the user's primary objective in tagging is to help themselves navigate, but the tags are made public and can be used by others for discovery. Social bookmarking sites are an example of this.

Folksonomies created by such systems are usually divided into two groups: broad folksonomies, where multiple users can "own" the same object (for example, when tagging URLs or favourite music), and narrow folksonomies, where users tag different sets of objects (for example, images they made, or blog posts they wrote).

While categories and ontologies map concepts onto graphs, tagging maps them onto sets. Folksonomies reflect our mental models more accurately than category systems.

Because of its simplicity, low costs and ease of use, tagging is more efficient - even if less effective - in most real life situations than more rigid classification methods, especially when those situations involve untrained users. Tagging also lends itself better to collaboration. Therefore, the mass amateurisation of publishing led to tagging overtaking more formal classifications on the internet (except for very simple, predefined category systems, which are still widely used alongside tagging).

Data

Input

The input is the name of the tag (or a set of tags). Some systems only allow one-word tags, others allow spaces too. Tags are almost invariably created by the users, though the software might offer suggestions or autocompletion (which help avoid typos and make tagging more consistent). A newer approach, used by the Google Image Labeler, is to have two users assign tags to the same image within a relatively short time frame, and to keep only the tags that both of them used. This favours common and more obvious tags.
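The agreement rule described above can be sketched as a set intersection. This is only an illustration of the idea, not the Image Labeler's actual implementation; the function name `agreed_tags` and the case-insensitive comparison are assumptions made for the example.

```python
def agreed_tags(tags_a, tags_b):
    """Return the tags that both users entered.

    Assumption: tags are compared case-insensitively after trimming
    surrounding whitespace. A real system may normalise differently.
    """
    norm_a = {t.strip().lower() for t in tags_a}
    norm_b = {t.strip().lower() for t in tags_b}
    return norm_a & norm_b


if __name__ == "__main__":
    user_1 = ["Dog", "puppy", "grass"]
    user_2 = ["dog", "animal", "Grass "]
    print(sorted(agreed_tags(user_1, user_2)))  # ['dog', 'grass']
```

Only "dog" and "grass" survive, which shows why this scheme tends to favour obvious tags over idiosyncratic ones.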

Storage

The data to be stored is simply the tag name or list of tag names. In P2P environments, where the entities to be tagged are usually files, and the size of the tag data is negligible compared to the file size, tags can be stored together with the files they describe.

Tags themselves might be handled as objects that can be tagged; this results in hierarchical folksonomies, similar to categories.
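As a rough sketch of what tagging tags could look like, the snippet below expands the tags placed directly on an entity with the tags of those tags, transitively. The mapping `tags_of_tags` and the function `expand_tags` are hypothetical names chosen for the example, and the place names are illustrative data only.

```python
def expand_tags(direct_tags, tags_of_tags):
    """Return the direct tags plus every tag reachable through tagged tags.

    tags_of_tags maps a tag name to the set of tags assigned to it.
    The 'seen' set also protects against cycles (tag A tagged with B,
    B tagged with A), which an unconstrained system cannot rule out.
    """
    seen = set()
    stack = list(direct_tags)
    while stack:
        tag = stack.pop()
        if tag in seen:
            continue
        seen.add(tag)
        stack.extend(tags_of_tags.get(tag, ()))
    return seen


if __name__ == "__main__":
    hierarchy = {
        "castle district": {"budapest"},
        "budapest": {"hungary"},
    }
    print(sorted(expand_tags({"castle district"}, hierarchy)))
    # ['budapest', 'castle district', 'hungary']
```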

Output

The output of the tool is the set of tags that belong to the entity the user is viewing. These tags are displayed automatically, without any explicit request or input from the user. In the case of broad folksonomies, there are two sets: tags assigned by the user, and tags assigned by others.
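A minimal sketch of the two-set output in a broad folksonomy, assuming tag assignments are kept as (user, entity, tag) triples. That representation and the name `split_tags` are assumptions for illustration; the text does not prescribe a data model.

```python
def split_tags(assignments, entity, viewer):
    """Split the tags on one entity into the viewer's own and everyone else's.

    assignments is an iterable of (user, entity, tag) triples.
    A tag can appear in both sets if the viewer and another user chose it.
    """
    own, others = set(), set()
    for user, ent, tag in assignments:
        if ent != entity:
            continue
        if user == viewer:
            own.add(tag)
        else:
            others.add(tag)
    return own, others


if __name__ == "__main__":
    data = [
        ("alice", "http://example.org", "reference"),
        ("bob", "http://example.org", "reference"),
        ("bob", "http://example.org", "todo"),
        ("bob", "http://example.net", "music"),
    ]
    print(split_tags(data, "http://example.org", "alice"))
```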

Another kind of output is when the user requests information about a tag, and the entities labeled with that tag are shown. Some systems can also show intersections of tags.
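Both kinds of output, and the intersection query, can be served by a simple in-memory index that maps tags to entities and entities to tags. The class below is a sketch under that assumption; `TagIndex` and its method names are made up for the example.

```python
from collections import defaultdict


class TagIndex:
    """Two-way index between entities and their tags."""

    def __init__(self):
        self._by_tag = defaultdict(set)     # tag -> entities
        self._by_entity = defaultdict(set)  # entity -> tags

    def tag(self, entity, *tags):
        """Assign one or more tags to an entity."""
        for t in tags:
            self._by_tag[t].add(entity)
            self._by_entity[entity].add(t)

    def tags_of(self, entity):
        """Tags shown when the user views an entity."""
        return set(self._by_entity.get(entity, ()))

    def entities_with(self, *tags):
        """Entities labelled with ALL of the given tags (intersection)."""
        if not tags:
            return set()
        sets = [self._by_tag.get(t, set()) for t in tags]
        return set.intersection(*sets)


if __name__ == "__main__":
    index = TagIndex()
    index.tag("song.ogg", "jazz", "live")
    index.tag("talk.ogv", "live", "conference")
    print(index.entities_with("live"))          # both files
    print(index.entities_with("live", "jazz"))  # only song.ogg
```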

Tags can be used by recommendation systems to compute semantic distance.
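The text does not say which distance measure is meant. One common choice for comparing two sets is the Jaccard distance, used here purely as an example: it is 0 when two entities carry identical tags and 1 when they share none.

```python
def jaccard_distance(tags_a, tags_b):
    """Jaccard distance between two tag collections.

    Defined as 1 - |A intersect B| / |A union B|.
    Assumption: two untagged entities are treated as distance 0.
    """
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


if __name__ == "__main__":
    film_1 = {"jazz", "live", "piano"}
    film_2 = {"jazz", "piano", "trio"}
    print(jaccard_distance(film_1, film_2))  # 0.5
```

Because tags are uncontrolled, such a measure inherits the synonym and homonym problems of the tags themselves, so it is best read as a rough signal.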

Dependencies

Tools used

Tagging does not depend on any other tools, though recommendation can be used to suggest tags to the user.

Tools using this tool

Tags can be used in recommendation systems to compute semantic distance between entities.

Groups can also be based on users tagging themselves.

Management

In narrow folksonomies, users can control who can tag entities belonging to them.

Interface

Tags of a certain object are typically displayed as a simple list. Clicking on a tag shows all the entities labeled with that tag.

A frequently used visualisation method for tags is the tag cloud: a weighted list of tags, where more frequently used tags are written with larger fonts. Tag clouds provide an overview of all the tags used (or, in large-scale systems where there are too many tags to display all, a top-list of the most frequently or most recently used tags), where users can see what tags are currently popular.
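A tag cloud's weighting can be sketched as a linear mapping from usage counts to font sizes. The point sizes, the linear scale and the `tag_cloud` name are assumptions; real sites often use logarithmic scaling or fixed size buckets instead.

```python
from collections import Counter


def tag_cloud(tag_uses, min_pt=10, max_pt=30, top_n=None):
    """Map each tag to a font size proportional to how often it was used.

    tag_uses is a flat iterable with one item per tag assignment.
    top_n limits the cloud to the most frequently used tags.
    """
    items = Counter(tag_uses).most_common(top_n)
    if not items:
        return {}
    counts = [count for _, count in items]
    lo, hi = min(counts), max(counts)
    span = hi - lo
    sizes = {}
    for tag, count in items:
        # If every tag is equally frequent, show them all at min_pt.
        scale = (count - lo) / span if span else 0.0
        sizes[tag] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


if __name__ == "__main__":
    uses = ["music"] * 5 + ["video"] * 3 + ["jazz"]
    print(tag_cloud(uses))  # {'music': 30, 'video': 20, 'jazz': 10}
```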

Prevalence

Technical aspects

Implementing tagging is straightforward. As with most other social processing tools, spamming is a potential threat, but presently spammers seem to favor other targets.

Social aspects

The freedom and flexibility of free text make tagging a very effective and user-friendly tool, but it also has several drawbacks. Because the vocabulary is not controlled, synonyms (different words with the same meaning) and homonyms (different meanings for the same word) arise. A term can be written as one word or with spaces, in plural or in singular, or in different languages. Synonyms harm recall; homonyms harm precision.

The damage might be reduced by using stemming algorithms to make the phrasing of tags more uniform, using correlation to identify synonyms, and using clustering methods to differentiate between multiple meanings of a word; however, none of these solutions tends to be effective. Allowing users to identify synonyms and homonyms may work in a closed, active community, but is hard to sustain in large-scale systems.
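To make the first of these mitigations concrete, here is a deliberately crude normaliser that folds case, removes separators and strips a trailing plural "s". It is a stand-in for a real stemming algorithm, not one itself, and the rules and the name `normalize_tag` are assumptions. It also shows the limits noted above: it handles only simple English plurals and does nothing about true synonyms or other languages.

```python
import re


def normalize_tag(tag):
    """Reduce spelling variants of a tag to one canonical form.

    Steps (all assumptions for this sketch):
    1. trim and lowercase,
    2. drop spaces, underscores and hyphens ("social bookmark" == "social_bookmark"),
    3. strip a single trailing "s" from longer words, except after "ss".
    """
    t = tag.strip().lower()
    t = re.sub(r"[\s_\-]+", "", t)
    if len(t) > 3 and t.endswith("s") and not t.endswith("ss"):
        t = t[:-1]
    return t


if __name__ == "__main__":
    for raw in ["Social Bookmarks", "social_bookmark", "Photos ", "glass"]:
        print(repr(raw), "->", normalize_tag(raw))
```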

There is also the problem of deciding how specific a tag should be. For example, a video made in a certain place can be tagged with the name of the country, or the city, or the district, or the street. In an ontology, all could be deduced from the last one; in a folksonomy, this is usually not possible. Community rules and hierarchical tagging can help with this problem, but both require lively and dedicated communities.

Community

A side effect of tagging is that it can spontaneously create communities: people using the same keywords tend to have a common interest. This effect can be amplified by allowing users to tag themselves, thereby using tags to create groups.

Norms and rules

Usually, tagging is not constrained in any way, but there are some counterexamples. At Slashdot, the software recognizes some tags (like "dupe" and "typo"), and tagging is moderated to enforce norms. Wikipedia uses several norms and rules to turn tagging into a category system.

Incentives and sanctions

There are two different cases from an incentive-centered point of view (though there is no clear demarcation line between the two): when users tag things primarily for themselves (e.g. social bookmarking), and when they tag for others (e.g. blogs). In the first case, the main incentive for users is to create navigation aids for themselves. In the second case, they might use tagging as a tool to popularize their work, or they may work as a member of a community.

When users are tagging for others, they also have an incentive to make their system of tags somewhat coherent. This can be assisted and exploited in several ways.

Existing examples

  • Blogs use tags to classify their content thematically so that visitors can easily find the subjects they are interested in. These tags can then be reused by many higher-level services: blog farms might aggregate tag-related data to evaluate or recommend blogs; Technorati uses them for search.
  • Social bookmarking sites like del.icio.us and Furl use tagging to classify URLs. They offer users bookmarking services that are available from everywhere, easy to publish and easy to navigate. For others, they provide an efficient browsing tool.
  • Image repositories like Flickr use tags to help users navigate images.
  • E-mail clients, pioneered by Gmail, use them as a generalization of folders to order messages.
  • Book (LibraryThing) and article (CiteULike) collections.
  • One of the inventive uses of tagging can be seen at 43things.
  • Slashdot is experimenting with moderated tagging. [1]
  • MediaWiki (the software behind Wikipedia) uses tagging to create and maintain categories.

P2P file sharing examples

Application in Fusion

For audio/video files

Tagging has almost limitless possibilities: it can be used to annotate content, genre, theme, language of a film, and also more specific things such as the main actors. Users might also be able to tag tags and create a hierarchy, though such things are difficult to maintain.

In P2P systems

Beyond what is written above, users can tag themselves to form simple groups.

Related tools

  • annotation is a wider set of tools that includes tagging.

External links
