Analyzing used-car web listings via text mining

Authors:
Ayhan Demiriz, Gebze Technical University, Gebze, Kocaeli, Turkey
Fatma Cantaş, Sakarya University, Sakarya, Turkey

IML '17: Proceedings of the 1st International Conference on Internet of Things and Machine Learning, October 2017, Article No.: 21, Pages 1–7, https://doi.org/10.1145/3109761.3109782

Published: 17 October 2017

ABSTRACT

The used-car trade is a major component of economies worldwide. Selling a car through an internet advertisement is now common regardless of geography. A typical advertisement consists of two parts: structured data and free text. The structured data may include the asking price, make, model, year, and mileage of the car, along with the seller's contact information. In most cases, the seller gives important clues about the car's current condition in the free text, which can also include the title (head) of the advertisement. This paper reports preliminary results from a text mining study of 75K used-car internet listings collected from two major car listing websites in Turkey. As expected, words and phrases that describe the car are frequent. The leading concepts in the free text concern the current condition of the car, for example "no crash history".
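To make the idea of frequent words and phrases concrete, the following is a minimal sketch of word n-gram counting over listing free text. It is not the authors' pipeline, which the abstract does not describe. The function names `tokenize` and `ngram_counts` are hypothetical, and the three sample listings are invented English stand-ins for the Turkish listings in the study.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode letters and digits)."""
    return re.findall(r"\w+", text.lower())


def ngram_counts(texts, n):
    """Count word n-grams over a collection of free-text fields."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # Slide a window of n tokens over each listing separately,
        # so that no n-gram spans two different listings.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical listing texts, invented for illustration only.
listings = [
    "No crash history, single owner, regularly serviced",
    "Clean car with no crash history",
    "Single owner, no paint, no crash history",
]

bigrams = ngram_counts(listings, 2)
trigrams = ngram_counts(listings, 3)
print(bigrams.most_common(3))
print(trigrams.most_common(1))
```

In these sample texts the trigram "no crash history" occurs three times and is the most frequent one. Real Turkish listings would likely need language-aware lowercasing and morphological normalization before counting, which this sketch leaves out.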

Index Terms

Analyzing used-car web listings via text mining
1. Applied computing
  1. Document management and text processing
    1. Document capture
      1. Document analysis
2. Information systems
  1. Data management systems
    1. Information integration
      1. Data cleaning


Published in
IML '17: Proceedings of the 1st International Conference on Internet of Things and Machine Learning
October 2017
581 pages
ISBN: 9781450352437
DOI: 10.1145/3109761
General Chairs: Hani Hamdan (University of Paris-Saclay, Paris, France), Djallel Eddine Boubiche (University of Batna, Algeria)
Program Chair: Fanny Klett (German Workforce ADL Partnership Laboratory, Germany)
Copyright © 2017 ACM
© 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
n-gram analysis
text mining
used car listings
Qualifiers
- research-article
