Quantifying the genericness of trademarks using natural language processing: an introduction with suggested metrics

  • Original Research
  • Published in Artificial Intelligence and Law

Abstract

If a trademark (“mark”) becomes a generic term, it may be cancelled under trademark law, a process known as genericide. Typically, in genericide cases, consumer surveys are brought into evidence to establish a mark’s semantic status as generic or distinctive. Drawbacks of surveys include cost, delay, small sample size, lack of reproducibility, and observer bias. Today, however, much discourse involving marks is online. As a potential complement to consumer surveys, therefore, we explore an artificial intelligence approach based chiefly on word embeddings: mathematical models of meaning, grounded in distributional semantics, that can be trained on texts selected for jurisdictional and temporal relevance. After identifying two main factors in mark genericness, we first offer a simple screening metric based on the ngram frequency of uncapitalized variants of a mark. We then add two word embedding metrics: one addressing contextual similarity of uncapitalized variants, and one comparing the neighborhood density of marks and known generic terms in a category. For clarity and validation, we illustrate our metrics with examples of genericized, somewhat generic, and distinctive marks such as, respectively, DUMPSTER, DOBRO, and ROLEX.
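As a rough illustration of the frequency-based screening idea (a sketch only, not the authors' exact formula), the share of uncapitalized occurrences of a mark among all occurrences of its orthographic variants can serve as a first-pass genericness signal. The counts below are invented for illustration, not real corpus data.

```python
def lowercase_share(counts):
    """Fraction of occurrences of a mark written without an initial
    capital (e.g. 'rolex' vs 'Rolex').

    counts: dict mapping orthographic form -> ngram frequency.
    A share near 1.0 suggests predominantly generic usage;
    a share near 0.0 suggests the mark retains distinctiveness.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    lower = sum(n for form, n in counts.items() if form[:1].islower())
    return lower / total

# Invented illustrative counts (not real corpus frequencies):
print(lowercase_share({"dumpster": 9_400, "Dumpster": 600}))   # mostly generic usage
print(lowercase_share({"rolex": 120, "Rolex": 11_880}))        # mostly distinctive usage
```

In practice the counts would come from a corpus chosen for jurisdictional and temporal relevance, such as an ngram dataset restricted to the relevant years and region.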

Figs. 1–7 appear in the full article.

Data availability

All data used in the paper are publicly available.

Code availability

Code for reproducing the results can be obtained by contacting the corresponding author.

Notes

  1. Throughout the paper, the usual convention of capitalization is used to refer to a mark generally (e.g., ASPIRIN). However, when a mark is used as an object of analysis it is shown in bold type to emphasize the orthographic form referred to (e.g., Aspirin or aspirin). Terms used in analysis that are not trademarks are rendered in italics (e.g., dog).

  2. We believe, however, that the methods outlined here are relevant to other jurisdictions around the world and, indeed, to many other aspects of monitoring trademark landscapes.

  3. Orthography more generally refers to all the conventions for writing a language, including, for example, hyphenation, punctuation, and spelling. These are all relevant to trademark genericness, but our focus in this paper is on the presence or absence of an initial capital letter, as in Rolex versus rolex. We refer to the lower-case form as an “orthographic variant” or the “regularized form”. These general terms leave the door open to analysis of other cases, such as loss of an internal capital letter, as in iPhone and iphone.

  4. It is important to note that we use regularize and regularization in this orthographic sense and not in the sense used in natural language processing or other statistical methods.

  5. DUMPSTER was cancelled in 2015.

  6. In fact, the PEAVEY score may be due to a rare homonym, an alternative spelling of “peavy” (a timber-handling tool), which may account for the wide separation of Peavey and peavey in Fig. 5. In a real-world analysis this would have to be disambiguated in the corpora used.

  7. In the word embedding model used in this paper, for example, “Friday” and “friday” have a relatively low similarity at 0.43.

  8. The field of explainable artificial intelligence (XAI) will no doubt be relevant to many legal proceedings in coming years. A useful starting point for those interested is Ribeiro et al. (2016).
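The contextual-similarity comparisons in the notes above (for example, the 0.43 similarity between “Friday” and “friday” in note 7) rest on the cosine similarity between embedding vectors. A minimal sketch, using toy four-dimensional vectors that merely stand in for embeddings trained on a jurisdictionally relevant corpus:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors; values
    near 1.0 mean the two forms occur in very similar contexts,
    values near 0.0 mean their contexts diverge."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy vectors for illustration only; real vectors would come from a
# model such as word2vec trained on relevant, time-sliced text.
vec_cap = np.array([0.9, 0.1, 0.0, 0.2])   # capitalized form, e.g. "Rolex"
vec_low = np.array([0.1, 0.8, 0.5, 0.0])   # regularized form, e.g. "rolex"

sim = cosine_similarity(vec_cap, vec_low)
print(round(sim, 2))
```

A low similarity between a mark and its regularized form suggests the two spellings are used in different contexts (distinctive versus generic usage), while a high similarity suggests the capitalization distinction carries little semantic weight.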

References

  • Abood A, Feltenberger D (2018) Automated patent landscaping. Artif Intell Law 26:103–125. https://doi.org/10.1007/s10506-018-9222-4

  • Bayer Co. v. United Drug Co., 272 F. 505 (S.D.N.Y. 1921)

  • Chalkidis I, Kampas D (2019) Deep learning in law: early adaptation and legal word embeddings trained on large corpora. Artif Intell Law 27:171–198. https://doi.org/10.1007/s10506-018-9238-9

  • Devlin J, Chang MW, Lee K, Toutanova K (2018). BERT: pre-training of deep bidirectional transformers for language understanding. http://arxiv.org/abs/1810.04805

  • Elliott v. Google, Inc., 860 F.3d 1151 (9th Cir. 2017)

  • Fechter GH, Slavin E (2011) Practical tips on avoiding genericide. International Trademark Association (INTA) Bulletin 66(20)

  • Firth JR (1957) A synopsis of linguistic theory 1930–1955. In Studies in Linguistic Analysis, pp 1–32. Oxford: Philological Society. Reprinted in F.R. Palmer (ed) Selected Papers of J.R. Firth 1952–1959, Longman, London

  • Fu R, Guo J, Qin B, Che W, Wang H, Liu T (2014) Learning semantic hierarchies via word embeddings. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, Baltimore, Maryland

  • Geffet M, Dagan I (2005) The distributional inclusion hypotheses and lexical entailment. In: Proceedings of the 43rd annual meeting of the association for computational linguistics (ACL’05)

  • He R, McAuley J (2016) Ups and downs: modeling the visual evolution of fashion trends with one-class collaborative filtering. In: Proceedings of the 25th international conference on world wide web (WWW ’16)

  • Landes W, Posner R (1987) Trademark law: an economic perspective. J Law Econ 30(2):265–309

  • Linford J (2015) A linguistic justification for protecting generic trademarks. Yale JL Tech 17:110–145

  • List of generic and genericized trademarks (2020) Wikipedia. https://en.wikipedia.org/wiki/List_of_generic_and_genericized_trademarks. Accessed 12 June 2020

  • Michel JB, Shen YK, Aiden AP, Veres A, Gray MK, Pickett JP, Aiden EL (2011) Quantitative analysis of culture using millions of digitized books. Science 331(6014):176. https://doi.org/10.1126/science.1199644

  • Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. http://arxiv.org/abs/1301.3781

  • Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems (NIPS 2013)

  • Pannitto L, Salicchi L, Lenci A (2018) Refining the distributional inclusion hypothesis for unsupervised hypernym identification. Ital J Comput Linguist 4(2):45–56

  • Pechenick EA, Danforth CM, Dodds PS (2015) Characterizing the google books corpus: strong limits to inferences of socio-cultural and linguistic evolution. PLoS ONE 10(10):e0137041. https://doi.org/10.1371/journal.pone.0137041

  • Řehůřek R, Sojka P (2010) Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 workshop on new challenges for NLP frameworks

  • Ribeiro MT, Singh S, Guestrin C (2016) Why should I trust you? Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1135–1144

  • Shwartz V, Goldberg Y, Dagan I (2016) Improving hypernymy detection with an integrated path-based and distributional method. In: Proceedings of the 54th annual meeting of the association for computational linguistics (Volume 1: Long Papers), Berlin, Germany

  • Stuhlbarg Int’l Sales Co. v. John D. Brush & Co., 240 F.3d 832 (9th Cir. 2001)

  • Walsh MG (2013) Protecting your brand against the heartbreak of genericide. Bus Horiz 56(2):159–166

  • Weeds J, Clarke D, Reffin J, Weir D, Keller B (2014) Learning to Distinguish Hypernyms and Co-Hyponyms. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers, Dublin, Ireland

  • Weeds J, Weir D, McCarthy D (2004) Characterising measures of lexical distributional similarity. In: Proceedings of the 20th international conference on computational linguistics (COLING 2004)

  • Younes N, Reips UD (2019) Guideline for improving the reliability of google ngram studies: evidence from religious terms. PLoS ONE 14(3):e0213554. https://doi.org/10.1371/journal.pone.0213554

Funding

There are no funding sources to declare.

Author information

Authors and Affiliations

Authors

Contributions

The research is the sole original work of the two listed authors, Dr CS and Dr LDV.

Corresponding author

Correspondence to Cameron Shackell.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Shackell, C., De Vine, L. Quantifying the genericness of trademarks using natural language processing: an introduction with suggested metrics. Artif Intell Law 30, 199–220 (2022). https://doi.org/10.1007/s10506-021-09291-7
