
Crowdsourcing

Although a lot of relevant linguistic data already exist for the fields investigated by VerbaAlpina (especially in atlases and dictionaries), the project also aims to collect new data. This new collection is meant to (1) even out inconsistencies between the existing sources, (2) eliminate gaps and inaccuracies, and (3) mark antiquated designations or devices as such. The new data, however, will not be collected by the traditional methods of field research but by the means that social media now offer. The corresponding methods are often subsumed under the term crowdsourcing. The reference to the crowd is open to misunderstanding in some respects, not least because many associate the term with arbitrariness, amateurishness and insufficient reliability. These reservations are not completely unjustified, since the methods are indeed directed at a vague and anonymous crowd of potentially interested persons. Fundamental problems arise on two sides: (1) on the side of the scientific provider of the project, and (2) on the side of the target group, which may, but need not, consist of linguistic laypersons. The offer has to be sufficiently 'visible' and attractive, and the target group has to have sufficient linguistic competence and sufficient knowledge of the specific subject. There are different strategies for handling this. One can try, for example, to make the offer more attractive by designing it in an entertaining way, with playful interfaces (cf. the project alliance play4science). Competence can be assessed by specific knowledge questions, but it is unquestionably more reliable to have the provided data confirmed and validated by other speakers from the same places.

(auct. Thomas Krefeld – trad. Susanne Oberholzer)

Tags: Crowdsourcing

Digital humanities

The VerbaAlpina project has been designed for the web from the outset, as it wants to contribute to transferring established traditions of the humanities (more precisely, of geolinguistics) to the digital humanities.
This means the following:
(1) The empirical basis of the research consists of data (cf. Schöch 2013), i.e. of digitally encoded and structured units, or at least of units that can be structured. The data the project deals with are partly already published data that are digitised secondarily (e.g. the older material from atlases), and partly new data that still have to be collected. With regard to the relevant concepts, the new data should be as comprehensive as possible. The method is therefore quantitative and to a great extent inductive.
(2) Research communication takes place under the media conditions of the internet. This makes it possible to interlink different media (text, image, video and sound) hypertextually. Furthermore, the persons participating in the project as researchers (especially as project partners) and/or as informants can communicate and cooperate with each other continuously.
(3) Interested researchers are invited to help develop the project into a collaborative research platform. This perspective is useful and advances the project in at least two respects: it makes it possible to integrate different sites, and to advance the combination of information technology and linguistic geography using public resources, i.e. without having to fall back on the (legally and economically difficult) support of private IT companies.
(4) The knowledge relevant to the project can be continuously accumulated and modified over a fairly long period, although guaranteeing lasting availability is still technically difficult (cf. the important research infrastructure of CLARIN-D). In any case, publishing the results on physical media (books, CDs, DVDs) is no longer a fundamental requirement. Nevertheless, a secondary print option is provided, a solution that online lexicography occasionally offers, e.g. the exemplary Tesoro della Lingua Italiana delle Origini.

(auct. Thomas Krefeld – trad. Susanne Oberholzer)

Tags: Kooperation Crowdsourcing

Geocoding

Geocoding is a fundamental ordering criterion for the data administered by VerbaAlpina; degrees of latitude and longitude are used for it. The precision of this coding varies with the data type; VerbaAlpina aims for coding that is as exact as possible, to within a metre. For linguistic data from atlases and dictionaries, generally only approximate coding according to the place name is possible. For archaeological data, by contrast, geocoding to within a metre is actually possible. Points, lines (such as streets, rivers etc.) and surfaces can be stored. Geocoding essentially uses the so-called WKT format, which is converted into a MySQL-specific format in the VA database by means of the function geomfromtext() and stored in that form. Output in WKT format is produced by means of the MySQL function astext().
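The WKT round trip described here can be sketched as follows. This is a minimal illustration, not VerbaAlpina's actual code: the table name `geo_data`, its columns `id` and `geo`, and all helper functions are hypothetical, the `%s` placeholders assume a DB-API driver with that parameter style, and the coordinate order (longitude first) is an assumption as well. The SQL uses the function names given in the text; more recent MySQL versions provide the same functionality under the ST_-prefixed names ST_GeomFromText() and ST_AsText().

```python
from typing import Sequence, Tuple

# (longitude, latitude) in degrees; this order is an assumption of the sketch
Coord = Tuple[float, float]


def _pairs(coords: Sequence[Coord]) -> str:
    """Format coordinate pairs the way WKT expects: 'x y, x y, ...'."""
    return ", ".join(f"{x} {y}" for x, y in coords)


def point_wkt(coord: Coord) -> str:
    """WKT for a single point, e.g. a find spot."""
    return f"POINT({_pairs([coord])})"


def line_wkt(coords: Sequence[Coord]) -> str:
    """WKT for a line, e.g. a street or a river."""
    return f"LINESTRING({_pairs(coords)})"


def polygon_wkt(ring: Sequence[Coord]) -> str:
    """WKT for a surface with one outer ring; the ring is closed if needed."""
    closed = list(ring)
    if closed[0] != closed[-1]:
        closed.append(closed[0])
    return f"POLYGON(({_pairs(closed)}))"


# Hypothetical table and column names; function names as given in the text.
INSERT_SQL = "INSERT INTO geo_data (id, geo) VALUES (%s, geomfromtext(%s))"
SELECT_SQL = "SELECT id, astext(geo) FROM geo_data WHERE id = %s"


def insert_statement(record_id: int, wkt: str) -> Tuple[str, Tuple[int, str]]:
    """Return the parametrised INSERT statement and its parameters."""
    return INSERT_SQL, (record_id, wkt)
```

With a database cursor, the pair returned by `insert_statement(1, point_wkt((11.5, 48.25)))` could be passed to `cursor.execute(*...)`; reading the value back with `SELECT_SQL` would again yield a WKT string.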
The reference grid of the geocoding is the network of municipalities in the Alpine region, which can be output as surfaces or as points, as required. It is based on the municipal boundaries as of circa 2014, which VerbaAlpina received from its partner "Alpine Convention". Constant updating of these data (which can change frequently due to administrative reforms) is unnecessary because they merely form a geographical reference frame. The point representation of the municipality grid is derived algorithmically from the municipal boundaries and is therefore secondary. The calculated municipality points represent the geometric midpoints of the municipality surfaces and mark the municipality's centre only by coincidence. If necessary, all data can be projected, individually or cumulatively, onto the calculated municipality point. This is the case for linguistic data from atlases and dictionaries.
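The "geometric midpoint" of a municipality surface can be read as the area centroid of its boundary polygon. The source does not say which algorithm is used, so the following is only a sketch under that assumption: it applies the standard shoelace-based centroid formula to a single ring without holes and treats the coordinates as planar, which is a simplification for degrees of latitude and longitude. The function name is hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_centroid(ring: Sequence[Point]) -> Point:
    """Area centroid of a simple polygon given as a ring of (x, y) vertices."""
    pts = list(ring)
    # Drop a repeated closing vertex so that every edge is visited exactly once.
    if len(pts) >= 2 and pts[0] == pts[-1]:
        pts = pts[:-1]
    if len(pts) < 3:
        raise ValueError("a polygon needs at least three distinct vertices")

    area2 = 0.0  # twice the signed area (shoelace sum)
    cx = 0.0
    cy = 0.0
    for i, (x0, y0) in enumerate(pts):
        x1, y1 = pts[(i + 1) % len(pts)]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if area2 == 0.0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area) = sum / (3 * area2)
    return cx / (3.0 * area2), cy / (3.0 * area2)
```

For a concave boundary the centroid can lie far from any settlement, or even outside the surface, which illustrates why the calculated point is only a reference position.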
Additionally, there will be a honeycomb grid that is quasi-geocoded: it reflects the approximate position of the municipalities relative to each other, but at the same time assigns each municipal territory an idealised surface of identical shape and size. Users are thus offered two alternative mapping methods. Both have advantages and disadvantages, and both carry a certain suggestive potential because of their pictorial character. Thanks to its precision, the topographic representation gives better insight into the concrete spatial setting (with its very particular terrain profile, individual crossings, valley courses, inaccessible valley exits etc.). The honeycomb map, by comparison, allows more abstract visualisations of the data, as it evens out differences in the size of municipal surfaces and between agglomerations and scattered settlements. This is especially useful for quantitative maps, because the perceived size of a surface instinctively creates an impression of quantitative weight.
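The source does not describe how the honeycomb grid is constructed. One simple possibility, shown here purely as an illustration, is to snap each calculated municipality point to the nearest cell of a regular hexagonal grid, so that every municipality is drawn as a hexagon of the same shape and size. The sketch assumes planar (already projected) coordinates, pointy-top hexagons and axial cell coordinates; all names are hypothetical, and it does not resolve the case where two municipalities fall into the same cell.

```python
import math
from typing import Tuple


def _round_axial(q: float, r: float) -> Tuple[int, int]:
    """Round fractional axial coordinates to the nearest hexagon cell."""
    s = -q - r  # third cube coordinate; q + r + s is always 0
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the coordinate with the largest rounding error from the others.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return int(rq), int(rr)


def point_to_hex(x: float, y: float, size: float) -> Tuple[int, int]:
    """Axial (q, r) cell containing the planar point (x, y).

    `size` is the distance from a hexagon's centre to one of its corners.
    """
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _round_axial(q, r)


def hex_center(q: int, r: int, size: float) -> Tuple[float, float]:
    """Planar centre of the hexagon with axial coordinates (q, r)."""
    return size * math.sqrt(3) * (q + r / 2), size * 1.5 * r
```

Because every cell has the same area, a large, sparsely populated municipality and a small, dense one carry the same visual weight on such a map.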

(auct. Thomas Krefeld | Stephan Lücke – trad. Susanne Oberholzer)

Tags: Dokumentation Kooperation Crowdsourcing

Informant

VerbaAlpina uses the expression informant in a technical sense: it covers two different things depending on the source. In the linguistic atlases, as a rule, all linguistic data can be traced back to the individual speaker. In the database, these informants can be identified by an individual number (ID). They are furthermore chronocoded by the year of data collection and geocoded by the place of data collection. In geocodable dictionaries, by contrast, it is as a rule impossible to identify concrete speakers. For database-related reasons, however, VerbaAlpina assigns fictitious informants to this kind of source too. Each informant is assigned to a language family. This language assignment is passed on from the informant to all linguistic data deriving from that informant.
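The informant model described here can be sketched as a small data structure. The class and field names are hypothetical and do not reflect the actual VerbaAlpina database schema, and the values in the usage example are invented placeholders. The sketch only shows an informant carrying an ID, a collection year, a place and a language family, a flag for the fictitious informants assigned to dictionary sources, and the way the language family is passed on to every record attributed to that informant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Informant:
    informant_id: int         # individual number (ID) in the database
    year: int                 # chronocoding: year of data collection
    place: str                # geocoding: place of data collection
    language_family: str      # assigned once per informant
    fictitious: bool = False  # True for informants created for dictionary sources


@dataclass(frozen=True)
class Record:
    form: str                 # the attested linguistic form
    informant: Informant

    @property
    def language_family(self) -> str:
        """The language assignment is inherited from the informant."""
        return self.informant.language_family


# Invented placeholder values, for illustration only.
atlas_speaker = Informant(1, 1930, "Example village", "Romance")
dictionary_stub = Informant(2, 1950, "Example valley", "Germanic", fictitious=True)
records = [Record("example form A", atlas_speaker),
           Record("example form B", dictionary_stub)]
```

Storing the language family only on the informant keeps the assignment consistent: changing it in one place changes it for all data deriving from that informant.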

(auct. Thomas Krefeld – trad. Susanne Oberholzer)

Tags: Kooperation Crowdsourcing

Language contact

Phenomena of language contact (to which variety contact also belongs) fall into two completely different types, depending on how far they are integrated into the linguistic system. On the level of the linguistic system, they can be fixed, integrated elements of the language that are independent of the speaker ('loan words'); on the level of the speaker, they can be individual phenomena. The latter may be habitual or occasional uses, so-called switchings. This reservation also has to be taken into account when interpreting older atlas materials in which an informant provides a form close to the standard language or, in bilingual areas, a form from the respective second language. In view of the linguistic data, this theoretically fundamental difference may appear more or less likely, but it is never actually evident. Only an increase in the number of informants, which becomes a quite realistic option with social media, promises reliable information on this point.

(auct. Thomas Krefeld – trad. Susanne Oberholzer)

Tags: Publikation Crowdsourcing

Sources

VerbaAlpina brings together very different sources. On the one hand, there are already published sources (atlases, dictionaries, monographs on individual places); on the other hand, there are new sources that the project itself has tapped for the first time. Some of these new data are collected by staff members, e.g. by Beatrice Colcuc; others are contributed by the crowd, i.e. by individual speakers who are not personally known. VerbaAlpina considers only sources that deliver linguistic data which are already geocoded or at least geocodable. With regard to typification, however, these data have to be treated in systematically different ways. Utterances that are transcribed with phonetic precision are marked as "single instance" by VerbaAlpina. It makes sense to group these single instances according to certain criteria ('to type' them). Data that the source offers in orthographic form are regarded as already typed: this form of notation

Tags: Dokumentation Kooperation Crowdsourcing