Automatic approaches to clustering occupational description data for prediction of probability of workplace exposure to beryllium | IEEE Conference Publication | IEEE Xplore

Abstract:

We investigated automatic approaches for clustering data that describe occupations associated with a hazardous airborne exposure (beryllium). The regulatory compliance data from the Occupational Safety and Health Administration include records containing short free-text job descriptions and associated numerical exposure levels. Researchers in the public health domain need to map job descriptions to the Standard Occupational Classification (SOC) nomenclature in order to estimate occupational health risks. The previous manual process was time-consuming and never progressed as far as linkage to SOC. We investigated alternative automatic approaches for clustering the job descriptions. The clustering results are the first essential step towards discovering the corresponding SOC terms. Our study indicated that the Tolerance Rough Set combined with Jaccard similarity was the better combination overall. The utility of the algorithm was further verified by applying logistic regression: the predictive power of the automatically generated classifications, measured as the association of "job" with the probability of exposure to beryllium above a certain threshold, closely approached that of the manually assembled classification of the same 12,148 records.
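
To make the similarity measure named in the abstract concrete, the following minimal Python sketch computes Jaccard similarity between tokenized job descriptions and groups descriptions into tolerance classes. It is an illustration under stated assumptions, not the authors' implementation: the whitespace tokenization, the 0.5 threshold, the function names, and the sample job strings are all hypothetical, and the tolerance relation is defined here directly over descriptions, which may differ from the formulation used in the paper.

```python
def tokenize(text):
    """Lowercase a job description and split it into a set of word tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def tolerance_classes(descriptions, threshold):
    """For each description, return the indices of all descriptions whose
    Jaccard similarity to it is at least `threshold`.

    The relation is reflexive and symmetric but not necessarily transitive,
    so the resulting classes may overlap.
    """
    token_sets = [tokenize(d) for d in descriptions]
    return [
        {j for j, other in enumerate(token_sets) if jaccard(ts, other) >= threshold}
        for ts in token_sets
    ]


# Hypothetical job descriptions, invented for illustration only.
jobs = ["welder", "welder helper", "helper", "dental technician"]
classes = tolerance_classes(jobs, threshold=0.5)
```

In this toy input, "welder" and "helper" are each similar to "welder helper" but not to each other, so the classes overlap rather than forming a partition. That overlap is the property that distinguishes a tolerance relation from an equivalence relation.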
Date of Conference: 08-10 November 2011
Date Added to IEEE Xplore: 05 January 2012
Conference Location: Kaohsiung, Taiwan
