Detect the language of a string


You can use the whatlanguage gem to detect the language of a Ruby string.
Note that the gem has not been updated in quite a while and that there might be alternatives. However, it still works.

It has problems with short strings, but works quite well on longer texts.

Use it like this:

>> WhatLanguage.new(:all).language('Half the price of a hotel for twice the space')
=> :english

There is also a convenience method on Strings (you may need to require 'whatlanguage/string').

>> 'Wir entwickeln und betreiben anspruchsvolle Webanwendungen'.language
=> :german

Depending on your users' input, consider using fewer languages for better accuracy:

>> WhatLanguage.new(:all).language('Hello')
=> :russian # nope
>> WhatLanguage.new(:german, :english).language('Hello')
=> :english
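
Since short strings are detected unreliably, one option is to skip detection below a minimum length and treat the result as unknown. The following is only a minimal sketch: detect_language and MIN_LENGTH are hypothetical names that are not part of whatlanguage, and the threshold of 40 characters is an assumption you would need to tune against your own data. The detector is passed in, so any object responding to language (such as a WhatLanguage instance restricted to a few languages) will do.

```ruby
# Hypothetical helper, not part of the whatlanguage gem.
# `detector` is any object that responds to #language(text),
# e.g. WhatLanguage.new(:german, :english).
# MIN_LENGTH is an assumed threshold, not a value recommended by the gem.
MIN_LENGTH = 40

def detect_language(text, detector:, min_length: MIN_LENGTH)
  stripped = text.to_s.strip
  # Short strings are detected unreliably, so report "unknown" (nil) instead.
  return nil if stripped.length < min_length

  detector.language(stripped)
end

# Usage (assumes the whatlanguage gem is installed):
#   require 'whatlanguage'
#   detector = WhatLanguage.new(:german, :english)
#   detect_language('Hello', detector: detector) # nil, the string is too short
```

Returning nil forces callers to handle the "unknown" case explicitly instead of trusting a guess.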

WARNING

whatlanguage's detection can be really bad, even when it is restricted to a few languages:

LANGUAGES = WhatLanguage.new(:english, :german, :french, :italian, :spanish)

LANGUAGES.language("Updated: ElasticSearch - a database alternative?")
=> :french
LANGUAGES.language("Updated a database alternative?")
=> :german
LANGUAGES.language("a database alternative?")
=> :french
LANGUAGES.language("a database")
=> :french
LANGUAGES.language("database")
=> :italian

An alternative could be Compact Language Detection, but it contains native extensions.

Arne Hartherz