research-article

Evidence of quality of textual features on the web 2.0

Authors:
Flavio Figueiredo, Federal University of Minas Gerais, Belo Horizonte - Minas Gerais, Brazil
Fabiano Belém, Federal University of Minas Gerais, Belo Horizonte - Minas Gerais, Brazil
Henrique Pinto, Federal University of Minas Gerais, Belo Horizonte - Minas Gerais, Brazil
Jussara Almeida, Federal University of Minas Gerais, Belo Horizonte - Minas Gerais, Brazil
Marcos Gonçalves, Federal University of Minas Gerais, Belo Horizonte - Minas Gerais, Brazil
David Fernandes, Federal University of Minas Gerais, Belo Horizonte - Minas Gerais, Brazil
Edleno Moura, Federal University of Amazonas, Manaus - Amazonas, Brazil
Marco Cristo, FUCAPI, Manaus - Amazonas, Brazil

CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management, November 2009, Pages 909–918
https://doi.org/10.1145/1645953.1646070

Published: 02 November 2009

ABSTRACT

The growing popularity of Web 2.0 applications has greatly increased the amount of social media content available on the Internet. However, the unsupervised, user-oriented nature of this source of information, and thus its potential lack of quality, poses a challenge to information retrieval (IR) services. Previous work focuses mostly on tags, although no consensus has yet been reached about their effectiveness as supporting information for IR services. Moreover, other textual features of the Web 2.0 are generally overlooked by previous research.

In this context, this work aims to assess the relative quality of distinct textual features available on the Web 2.0. Toward this goal, we analyzed four features (title, tags, description, and comments) in four popular applications (CiteULike, Last.FM, Yahoo! Video, and YouTube). First, we characterized data from these applications in order to extract evidence of the quality of each feature with respect to usage, amount of content, and descriptive and discriminative power, as well as evidence of content diversity across features. Afterwards, a series of classification experiments was conducted as a case study for quality evaluation. Characterization and classification results indicate that: 1) when considered separately, tags are the most promising feature, achieving the best classification results, although their absence in a non-negligible fraction of objects may affect their potential use; and 2) each feature may bring different pieces of information, and combining their contents can improve classification.
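The two findings can be illustrated with a minimal sketch. It is not the experimental setup of this work: the toy objects, the class labels, and the nearest-centroid classifier with cosine similarity are all assumptions chosen for brevity. It only shows how a tags-only classifier has nothing to work with when an object carries no tags, and how pooling the contents of all four features can still yield a prediction.

```python
from collections import Counter
import math

# The four textual features named in the abstract.
FIELDS = ("title", "tags", "description", "comments")


def bag_of_words(obj, fields):
    """Pool the terms of the chosen fields into one term-count vector."""
    bag = Counter()
    for field in fields:
        bag.update(obj.get(field, "").lower().split())
    return bag


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labeled, fields):
    """Sum the term counts of all training objects of each class."""
    centroids = {}
    for obj, label in labeled:
        centroids.setdefault(label, Counter()).update(bag_of_words(obj, fields))
    return centroids


def classify(obj, centroids, fields):
    """Return the most similar class, or None if the object has no terms."""
    bag = bag_of_words(obj, fields)
    if not bag:
        return None
    return max(sorted(centroids), key=lambda label: cosine(bag, centroids[label]))


# Hypothetical training objects and labels, invented for this sketch.
TRAIN = [
    ({"title": "live concert", "tags": "rock guitar",
      "description": "band on stage"}, "music"),
    ({"title": "match highlights", "tags": "football goal",
      "description": "best plays of season"}, "sports"),
]

# A hypothetical object whose uploader supplied no tags.
untagged = {"title": "amazing video", "tags": "",
            "description": "guitar solo from a rock band"}

tags_only = train_centroids(TRAIN, ("tags",))
all_fields = train_centroids(TRAIN, FIELDS)

print(classify(untagged, tags_only, ("tags",)))
print(classify(untagged, all_fields, FIELDS))
```

On this toy data the tags-only classifier returns no prediction for the untagged object, while the classifier over the pooled fields still finds overlapping terms. A real evaluation would need a proper term-weighting scheme, a trained classifier, and held-out data.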

Index Terms

1. Information systems
  1. World Wide Web
    1. Web applications
    2. Web services

Published in
CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
November 2009
2162 pages
ISBN: 9781605585123
DOI: 10.1145/1645953
General Chairs:
David Cheung, University of Hong Kong, Hong Kong
Il-Yeol Song, Drexel University, USA
Program Chairs:
Wesley Chu, UCLA, USA
Xiaohua Hu, Drexel University, USA
Jimmy Lin, University of Maryland, USA
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
social media
textual features
web 2.0