The Weltmodell

Welcome to the Weltmodell, a data-driven commonsense knowledge base automatically extracted from a publicly available corpus of n-grams from over 3.5 million English-language books (Akbik and Michael, 2014).

We organize the world into concept-statement pairs: concepts such as coffee are associated with statements, such as things a waiter might bring. We extract these pairs from very large corpora and count how often we observe each concept with each statement. These counts let us estimate how typical a concept is for a given statement. For instance, we can determine what a waiter will most typically bring (spoiler: waiters bring coffee, but also the menu, various foods and drinks, and finally the check).
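
To make the counting idea concrete, here is a minimal Python sketch of ranking concepts by typicality for one statement. It is only an illustration under stated assumptions: the counts are invented toy numbers, the statement strings and function names (`build_counts`, `most_typical`) are hypothetical, and typicality is approximated as a simple relative frequency, which may differ from the measure used in the original paper.

```python
from collections import Counter

# Toy observations: (concept, statement, count).
# These numbers are invented for illustration, not real Weltmodell counts.
observations = [
    ("coffee", "waiter brings X", 40),
    ("menu", "waiter brings X", 25),
    ("check", "waiter brings X", 20),
    ("water", "waiter brings X", 15),
    ("coffee", "someone drinks X", 60),
]


def build_counts(observations):
    """Aggregate counts per (concept, statement) pair and per statement."""
    pair_counts = Counter()
    statement_totals = Counter()
    for concept, statement, count in observations:
        pair_counts[(concept, statement)] += count
        statement_totals[statement] += count
    return pair_counts, statement_totals


def most_typical(statement, pair_counts, statement_totals, top_k=3):
    """Rank concepts for a statement by relative frequency (an assumed score)."""
    total = statement_totals[statement]
    if total == 0:
        return []
    scored = [
        (concept, count / total)
        for (concept, stmt), count in pair_counts.items()
        if stmt == statement
    ]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


pair_counts, statement_totals = build_counts(observations)
ranking = most_typical("waiter brings X", pair_counts, statement_totals)
for concept, score in ranking:
    print(f"{concept}: {score:.2f}")
```

With the toy counts above, coffee ranks first for the waiter statement, followed by the menu and the check. A real system would likely normalize against how frequent each concept is overall, so that very common concepts do not dominate every statement.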

Try it out!

What can you travel by? What does a book provide? What cures diseases? What may someone regain?

What may be green?
- ... or blue or steep or scarce or challenging?

What things may float in the air?
- ... or hover in the air or hang in the air or rise in the air?

What may someone have for dinner?
- ... or for breakfast?

Also check all knowledge about a specific concept:

What do we know about water or clouds or tea or medicine?

Note: This is an old project from 2014 that was dusted off in 2020 for no specific reason. Check out our original LREC paper. 2014 is an eternity ago in NLP, so today one would do this very differently (in fact, with today's methods one could probably build a much more impressive Weltmodell).

The demo is now maintained by the machine learning group at Humboldt-Universität zu Berlin. Contact us if you have feedback.
