Word Comparison through Semantic Projection (Finnish version)
Welcome to the word comparison experiment, based on the work of Grand et al. [2]. Describe a dimension by thinking of words that create a contrast. Make sure to enter them "pairwise", meaning that the first negative word is the antonym of the first positive word. Finally, enter some words to be compared along the dimension.
Examples
Here are some example dimensions to try:
- age: old ancient elderly - young youth child
- arousal: interesting exciting fun - boring unexciting dull
- cost: expensive costly fancy - inexpensive cheap budget
- danger: dangerous deadly threatening - safe harmless calm
- gender: male masculine man - female feminine woman
- intelligence: intelligent smart wise - stupid dumb idiotic
- location: indoor indoors inside - outdoor outdoors outside
- loudness: loud deafening noisy - soft silent quiet
- political: democrat liberal progressive - republican conservative redneck
- religiosity: religious spiritual orthodox - atheist secular agnostic
- size: large big huge - small little tiny
- speed: fast speedy quick - slow sluggish gradual
- temperature: hot warm tropical - cold cool frigid
- valence: good great happy - bad awful sad
- wealth: rich wealthy privileged - poor poverty underprivileged
- weight: heavy fat thick - light skinny thin
- wetness: wet water ocean - dry country land
Where does the data come from?
The data was collected by the Turku NLP group. The main publication describing the dataset is here:
[1]
J. Luotolahti, J. Kanerva, V. Laippala, S. Pyysalo, and F. Ginter. Towards Universal Web Parsebanks. Proceedings of the International Conference on Dependency Linguistics (Depling’15). 2015
More information about the comparison algorithm
To be clear, Aalto University's CMHC lab has nothing to do with the comparison algorithm; we just showcase it on our website. The comparison algorithm was published here:
[2]
Gabriel Grand, Idan Asher Blank, Francisco Pereira, and Evelina Fedorenko. Semantic projection recovers rich human knowledge of multiple object features from word embeddings. Nature Human Behaviour, 2022.
The algorithm works by first creating a difference vector between the word2vec vectors of two words (labeled "positive" and "negative" in the interface above). Then, the word2vec vector of each of the words to compare is projected onto the difference vector. To obtain a more reliable difference vector, we can average across multiple difference vectors, which is why you are able to specify multiple "positive" and "negative" words in the interface.
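The projection step above can be sketched in a few lines of NumPy. This is a minimal illustration, not the site's actual implementation: the function name and the toy 3-dimensional vectors are made up for the example, standing in for real word2vec embeddings.

```python
import numpy as np

def semantic_projection(positives, negatives, targets):
    """Project target vectors onto the averaged positive-minus-negative axis.

    positives, negatives: lists of paired vectors defining the dimension
    targets: vectors of the words to compare
    Returns one scalar per target; higher means closer to the positive pole.
    """
    # Average the pairwise difference vectors for a more reliable axis.
    diffs = [p - n for p, n in zip(positives, negatives)]
    axis = np.mean(diffs, axis=0)
    axis = axis / np.linalg.norm(axis)  # normalize to unit length
    # Scalar projection of each target vector onto the axis.
    return [float(np.dot(t, axis)) for t in targets]

# Toy "size" dimension built from made-up embeddings.
big = np.array([1.0, 0.2, 0.0])
huge = np.array([1.2, 0.0, 0.1])
small = np.array([-1.0, 0.1, 0.0])
tiny = np.array([-0.9, 0.3, 0.2])
elephant = np.array([0.8, 0.5, 0.1])
mouse = np.array([-0.7, 0.4, 0.3])

scores = semantic_projection([big, huge], [small, tiny], [elephant, mouse])
# "elephant" scores higher than "mouse" along this toy size axis.
```

With real embeddings, the same routine ranks the comparison words along any dimension you can describe with antonym pairs.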
More information about the word2vec algorithm
[3]
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of Workshop at ICLR, 2013.
[4]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013.
[5]
Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic Regularities in Continuous Space Word Representations. In Proceedings of NAACL HLT, 2013.