Methodology


Beta Code



(auct. Thomas Krefeld | Stephan Lücke)

Tags: Linguistics Information technology



Code Page



(auct. Stephan Lücke)

Tags: Linguistics Information technology



Concept Description



(auct. Giorgia Grimaldi | Thomas Krefeld)

Tags: Information technology



Digital Humanities

From the outset, the VerbaAlpina project has been designed for the web, since it aims to help transfer established humanities traditions (more precisely, those of geolinguistics) to the digital humanities.
This means the following:
(1) The empirical basis of the research consists of data (cf. Schöch 2013), i.e. of units that are digitally encoded and structured, or that can at least be structured. Some of the data the project deals with have already been published and are digitised secondarily (e.g. the older material from atlases); others are new data that still have to be collected. With regard to the relevant concepts, the new data should be as extensive as possible. The method is therefore quantitative and to a great extent inductive.
(2) Research communication takes place under the media conditions of the internet. This makes it possible to interlink different media (writing, pictures, video and sound) hypertextually. Furthermore, the people participating in the project as researchers (especially as project partners) and/or as informants can communicate and cooperate with each other continuously.
(3) Interested researchers are invited to collaborate on developing the project into a collaborative research platform. This perspective is useful and advances the project in at least two respects: it makes it possible to integrate different sites, and to make progress in combining information technology and linguistic geography by using public resources, i.e. without having to fall back on the (legally and economically difficult) support of private IT companies.
(4) The knowledge relevant to the project can be accumulated and modified continuously over a fairly long period, although guaranteeing lasting availability is still technically difficult (cf. the important research infrastructure of CLARIN-D, http://www.clarin-d.de/en/). In any case, publishing the results on physical media (books, CDs, DVDs) is no longer a fundamental requirement. Nevertheless, a secondary print option is provided, a solution that online lexicography occasionally offers, as e.g. in the exemplary Tesoro della Lingua Italiana delle Origini.

(auct. Thomas Krefeld – trad. Susanne Oberholzer)

Tags: Information technology



Digitisation

Within the context of VerbaAlpina, the term digitisation does not simply denote the use of computers for electronic data processing. It essentially describes the in-depth digital indexing of the material by *structuring* it systematically and transparently and by categorising it.





VerbaAlpina works almost exclusively with the relational data model, which in principle organises the data in the form of tables. The tables consist of rows (= data sets, tuples) and columns (= attributes, fields, properties). Every table can be extended by additional rows and columns as needed. Logical relations between the tables allow two or more tables to be linked coherently and displayed synoptically (the so-called "joins"). At the moment, VerbaAlpina uses the database management system MySQL to manage the tables. However, the tables are not bound to this system and can be exported at any time, for example as text with separators that have to be defined unambiguously for both field and data set boundaries; they are exported together with the column names and the documentation of the logical relations (entity-relationship model). The currently widespread XML format is not used in the day-to-day operation of VerbaAlpina, but XML is anchored as an export format within the interface concept.
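The export as delimited text described above can be sketched as follows. This is a minimal illustration, not the project's actual export routine: the function name `export_table`, the choice of tab and newline as separators, and the sample rows are all assumptions made for the example.

```python
import csv
import io

def export_table(column_names, rows, field_sep="\t", record_sep="\n"):
    """Serialise a table as delimited text; the first record holds the column names."""
    buffer = io.StringIO()
    # Field and record separators are set explicitly so that they are unambiguous.
    writer = csv.writer(buffer, delimiter=field_sep, lineterminator=record_sep)
    writer.writerow(column_names)
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical sample data, not taken from the VerbaAlpina database.
columns = ["id_place", "place_name", "state"]
rows = [
    (1, "Trento", "Italy"),
    (2, "Innsbruck", "Austria"),
    (3, "Lucerne", "Switzerland"),
]

text = export_table(columns, rows)
print(text)
```

Because the separators are parameters, the same routine can write any pair of delimiters that does not occur in the data; the `csv` module quotes fields that do contain the field separator.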

Besides the logical structuring of data, character encoding is the second important concept connected with the term "digitisation". Handling this topic correctly is of fundamental importance for the long-term archiving of the data. As far as possible, VerbaAlpina follows the code table and the guidelines of the Unicode Consortium. Characters that have not yet been included in the Unicode table are primarily captured by serialisation: a single character is represented by a sequence of characters from the Unicode code range x21 to x7E (within the ASCII range). The corresponding allocations are documented in special tables; this procedure allows a later conversion to Unicode values that may become available in the future.
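The idea of serialising a character as an ASCII sequence and documenting the allocation in a table can be sketched as follows. The table contents, the sequence notation and the function names are hypothetical and serve only to illustrate the principle; they are not VerbaAlpina's actual allocations.

```python
# Hypothetical allocation table: ASCII sequence -> Unicode character.
# None means that no Unicode code point is available (yet) for this character.
ALLOCATIONS = {
    "a:1": None,      # invented sequence for a character not yet in Unicode
    "e/": "\u00e9",   # invented sequence for a character that exists in Unicode
}

def is_valid_sequence(seq):
    """True if the sequence is non-empty and uses only code points x21 to x7E."""
    return len(seq) > 0 and all(0x21 <= ord(ch) <= 0x7E for ch in seq)

def to_unicode(seq, table=ALLOCATIONS):
    """Return the allocated Unicode character, or the sequence itself if none exists yet."""
    if not is_valid_sequence(seq):
        raise ValueError(f"sequence outside the range x21 to x7E: {seq!r}")
    value = table.get(seq)
    return value if value is not None else seq

print(to_unicode("e/"))
print(to_unicode("a:1"))
```

Once a code point becomes available, only the table entry has to be updated; the stored sequences themselves remain unchanged.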


(auct. Stephan Lücke – trad. Susanne Oberholzer)

Tags: Linguistics Information technology



Entity-Relationship

In principle, data can be grouped into so-called "entities". These are classes of data, each characterised by a particular kind and number of specific features. For example, the cities Trento, Innsbruck and Lucerne can form a class "places", which is characterised by the features "place name", "degree of longitude", "degree of latitude", "state" and "number of inhabitants". The individual members of such a class differ from each other in the values of the features that characterise the class.
In a relational database, each entity is ideally stored in a table of its own, with the values of one specific feature in each table column. The table rows contain the individual members of the data class (entity). In most cases, and also in VerbaAlpina, a relational database represents a collection of different entities (and hence tables) between which there are logical relations. For example, the entity "informant", defined by the features "age", "sex", "birthplace" and "place of residence", is logically linked to the entity "places" in that the values of the features "birthplace" and "place of residence" have a counterpart in the entity "places". Relations between members of these two entities result from matching values of features that are of the same kind in both entities. In this case, an assignment could in theory result from identical values of the features "birthplace" and "place name", by which the geographical coordinates of the birthplace could be assigned indirectly to an informant. This specific example readily shows that problems could arise from homonyms. To avoid such problems, integers are usually used as identifiers ("ID" for short) that mark the members of an entity unambiguously.
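The homonym problem and its solution by integer IDs can be illustrated with a small sketch. It uses an in-memory SQLite database as a stand-in for MySQL; the table and column names, the place name "Exampleville" and all values are invented for the example and do not reflect the VerbaAlpina schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the project's MySQL database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE places (
        id_place   INTEGER PRIMARY KEY,
        place_name TEXT,
        longitude  REAL,
        latitude   REAL
    );
    CREATE TABLE informants (
        id_informant    INTEGER PRIMARY KEY,
        age             INTEGER,
        birthplace_name TEXT,
        id_birthplace   INTEGER REFERENCES places(id_place)
    );
""")

# Two invented places that share a name (homonyms), with invented coordinates.
conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?, ?)",
    [(1, "Exampleville", 10.0, 46.0), (2, "Exampleville", 12.0, 47.0)],
)
conn.execute("INSERT INTO informants VALUES (1, 54, 'Exampleville', 2)")

# Join on the place name: ambiguous, because both homonymous places match.
rows_by_name = conn.execute("""
    SELECT p.longitude, p.latitude
    FROM informants AS i
    JOIN places AS p ON p.place_name = i.birthplace_name
    ORDER BY p.id_place
""").fetchall()

# Join on the integer ID: exactly one place matches.
rows_by_id = conn.execute("""
    SELECT p.longitude, p.latitude
    FROM informants AS i
    JOIN places AS p ON p.id_place = i.id_birthplace
""").fetchall()

print(rows_by_name)  # [(10.0, 46.0), (12.0, 47.0)]
print(rows_by_id)    # [(12.0, 47.0)]
```

The join on names returns two candidate coordinate pairs for a single informant, while the join on IDs returns exactly one.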
The system of entities and their logical relations sketched above is called an entity-relationship model. The data stored in a relational database can hardly be understood and used without an explanation of the dependencies between the data. The entity-relationship model is usually illustrated in the form of a diagram.
The entity-relationship model is subject to continual adaptation (and hence change) during the cyclic development phases of VerbaAlpina (cf. version control). Each filed version of VerbaAlpina will be stored together with the corresponding entity-relationship model of the underlying database version in the form of an ER diagram. This diagram is created with the program yEd and saved as GraphML and as a PDF document. The following chart is based on the entities and links of the database VA_XXX as it was on 20/03/125, but it does not reproduce it completely and is to be understood as an illustrative example:





(auct. Stephan Lücke – trad. Susanne Oberholzer)

Tags: Information technology



Geocoding

Geocoding is a fundamental ordering criterion for the data managed by VerbaAlpina; degrees of latitude and longitude are used for it. The precision of this coding varies with the data type; VerbaAlpina aims for coding that is as exact as possible, to within a metre. For linguistic data from atlases and dictionaries, generally only an approximate coding according to the place name is possible. For archaeological data, by contrast, geocoding to within a metre is actually possible. Points, lines (such as streets, rivers etc.) and surfaces can be stored. Geocoding is essentially done in the so-called WKT format (https://en.wikipedia.org/wiki/Well-known_text), which is converted to a MySQL-specific format in the VA database by means of the function geomfromtext() (https://dev.mysql.com/doc/refman/5.7/en/gis-wkt-functions.html) and stored in that form. Output in WKT format is produced by means of the MySQL function astext().
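To give an impression of what WKT strings for points and lines look like, the following sketch builds and parses them in plain Python. The helper names are hypothetical, the coordinate order (longitude first, then latitude) is an assumption following the usual x/y convention, and the sample coordinates are only a rough approximation of Innsbruck's position; the conversion inside the database itself is done by the MySQL functions named above, not by code like this.

```python
import re

def point_to_wkt(lon, lat):
    """Format a single position as a WKT POINT (x = longitude, y = latitude)."""
    return f"POINT({lon} {lat})"

def linestring_to_wkt(coords):
    """Format a sequence of (lon, lat) pairs as a WKT LINESTRING."""
    return "LINESTRING(" + ", ".join(f"{lon} {lat}" for lon, lat in coords) + ")"

_NUMBER = r"(-?\d+(?:\.\d+)?)"
_POINT_RE = re.compile(
    rf"^\s*POINT\s*\(\s*{_NUMBER}\s+{_NUMBER}\s*\)\s*$", re.IGNORECASE
)

def wkt_to_point(wkt):
    """Parse a WKT POINT back into a (lon, lat) tuple of floats."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))

wkt = point_to_wkt(11.4, 47.27)   # roughly Innsbruck, rounded
print(wkt)                         # POINT(11.4 47.27)
print(wkt_to_point(wkt))           # (11.4, 47.27)
print(linestring_to_wkt([(0, 0), (1, 2)]))
```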
The reference grid of the geocoding is the network of municipalities in the Alpine region, which can be output as surfaces or as points, as required. It is based on the municipal boundaries as of circa 2014, which VerbaAlpina received from its partner "Alpine Convention". A constant update of these data (which can change frequently due to administrative reforms) is unnecessary because they merely form a geographical reference frame. The point representation of the municipality grid is derived algorithmically from the municipal boundaries and is therefore secondary. The calculated municipality points represent the geometric midpoints of the municipality surfaces and coincide with the actual centre of a municipality only by chance. If necessary, all data can be projected onto the calculated municipality point, either individually or in aggregated form. This is the case for linguistic data from atlases and dictionaries.
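How a point can be derived algorithmically from a boundary polygon is sketched below. The text does not say which algorithm VerbaAlpina uses; the example assumes the area centroid of a simple polygon (computed with the shoelace formula) and treats coordinates as planar, which is only an approximation for longitude and latitude. The function name and the sample polygons are invented.

```python
def polygon_centroid(vertices):
    """Area centroid of a simple polygon given as [(x, y), ...].

    The closing vertex must not be repeated at the end of the list.
    """
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area) = sum / (3 * twice_area)
    return cx / (3 * twice_area), cy / (3 * twice_area)

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(polygon_centroid(square))    # (1.0, 1.0)

# An L-shaped surface: the centroid falls outside the polygon itself.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
print(polygon_centroid(l_shape))
```

The L-shaped example shows why such a calculated point is secondary: for irregular surfaces it need not lie on the settlement, or even inside the territory.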
In addition, there will be a honeycomb grid that is quasi-geocoded: it does reflect the approximate position of the municipalities relative to each other, but at the same time it assigns an idealised surface of identical shape and size to each municipal territory. Users are thus offered two alternative methods of mapping. Both have their advantages and disadvantages, and both have a certain suggestive potential because of their pictorial nature. Because of its precision, the topographic representation gives a better insight into the concrete spatial setting (with its very specific terrain profile, individual crossings, valley courses, inaccessible valley exits etc.). The honeycomb map, by comparison, allows more abstract visualisations of the data because it evens out the differences in the size of municipality surfaces and between agglomerations and scattered settlements. This is especially useful for quantitative maps, because the perceived size of a surface instinctively creates an impression of quantitative weight.


(auct. Thomas Krefeld | Stephan Lücke – trad. Susanne Oberholzer)

Tags: Linguistics Information technology Extralinguistic context



Long-term Archiving



(auct. Stephan Lücke)

Tags: Information technology



Modules

Cf. version control

Tags: Information technology



Transcription

The linguistic material is represented graphically in two ways in order to meet the opposing demands of fidelity to the sources and easy comparability:

(1) Input version in original transcription
The VA portal brings together sources that come from different disciplinary traditions (Romance studies, German studies, Slavonic studies) and that represent different historical stages of dialectological research. Some of the dictionary data were collected at the beginning of the last century (GPSR), others only a few years ago (ALD). For reasons of the history of science, it is therefore necessary to respect the original transcription to the greatest possible extent. For technical reasons, however, certain conventions cannot be kept unchanged. This is especially true of the vertical combination of base characters ('letters') and diacritical marks, e.g. when a symbol for stress accent is positioned over a symbol for length over a vowel over a symbol for closure (cf. Beta code). These conventions are converted into linear sequences of characters in technical transcriptions defined for each case, which use ASCII characters exclusively (so-called Beta code). For the Beta encoding, one can make the most of graphic resemblances between the original diacritic and its ASCII equivalent; these are intuitively understandable to a certain degree and serve as a mnemonic aid.

(2) Output version in IPA
Output of the data in a uniform transcription is desirable for comparability and user-friendliness. Therefore, all Beta codes are converted to IPA characters using specific substitution routines. A few incompatibilities are inevitable in cases where two different base characters in IPA correspond to one base character that is specified by diacritics in the input transcription. This is especially the case for the degrees of vowel height: in the palatal series, the two base characters <i> and <e>, combined with the diacritic closure dot and one or two opening ticks, allow six degrees of vowel height to be distinguished. In Beta encoding these vowels are the following: i – i( – i(( – e? – e – e( – e((. In IPA, there are only four base characters for these vowels: i – ɪ – e – ɛ.
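A substitution routine of this kind can be sketched as a longest-match replacement. The mapping below is purely illustrative: the text lists the Beta codes and the four IPA characters but does not state which code is assigned to which character, so the assignments, like the function name, are assumptions and not VerbaAlpina's actual rules.

```python
import re

# Illustrative mapping only; the real project allocations may differ.
BETA_TO_IPA = {
    "i": "i",
    "i(": "ɪ",
    "i((": "ɪ",
    "e?": "e",
    "e": "e",
    "e(": "ɛ",
    "e((": "ɛ",
}

# Longer codes come first so that "i((" is not consumed as "i" followed by "((".
_PATTERN = re.compile(
    "|".join(re.escape(code) for code in sorted(BETA_TO_IPA, key=len, reverse=True))
)

def beta_to_ipa(text):
    """Replace every known Beta code in the text with its IPA character."""
    return _PATTERN.sub(lambda match: BETA_TO_IPA[match.group(0)], text)

print(beta_to_ipa("i(("))
print(beta_to_ipa("e("))

# Several Beta codes share one IPA character, so the conversion loses information.
print(len(BETA_TO_IPA), "codes ->", len(set(BETA_TO_IPA.values())), "IPA characters")
```

Because several input codes collapse into one output character, the IPA output cannot be converted back to the original transcription, which is one reason for keeping the input version alongside it.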

(auct. Thomas Krefeld – trad. Susanne Oberholzer)

Tags: Linguistics Information technology



Version Control

VerbaAlpina is composed of the following modules:

- VA_DB: data stock in the (MySQL) project database (va_xxx)
- VA_WEB: programme code of the project portal's web interface www.verba-alpina.gwi.uni-muenchen.de along with the accompanying WordPress database (va_wp)
- VA_MT: media files (photographs, films, text documents, sound recordings) that are stored in the media library of the web interface

All three modules form a consistent whole with mutual links and dependencies and therefore cannot be separated from each other. During the project term, the current status of the modules VA_DB and VA_WEB will be "frozen" simultaneously at regular intervals in the form of an electronic copy. These frozen copies are given a version number according to the scheme [calendar year]/[serial number] (e.g. 15/1). The productive version of VA is marked XXX.

Producing copies of the VA media center (VA_MT) is impractical because of the generally enormous size of media files. For this reason, no copy of this module is created during version control. Consequently, elements that have once been stored in the media center cannot be removed from it as long as even a single VA version is linked to them.

In the project portal, it is possible to switch between the "productive" VA version (subject to constant change) and the filed ("frozen") versions. In the portal itself, an appropriate colouring of the background or of certain interface elements indicates whether the productive version or one of the filed versions of VA is currently active. *Only* the filed versions of VA are citable.
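The version scheme [calendar year]/[serial number] lends itself to simple parsing and chronological sorting. The following sketch is an illustration only; the names `VAVersion`, `parse_version` and `is_citable` are hypothetical and not part of the VerbaAlpina code base.

```python
from typing import NamedTuple

class VAVersion(NamedTuple):
    year: int     # two-digit calendar year, e.g. 15
    serial: int   # serial number within that year, e.g. 1

def parse_version(label):
    """Parse a label such as '15/1'; raises ValueError for anything else."""
    year_text, serial_text = label.split("/")
    return VAVersion(int(year_text), int(serial_text))

def is_citable(label):
    """Only filed versions are citable; the productive marker 'XXX' is not."""
    try:
        parse_version(label)
    except ValueError:
        return False
    return True

labels = ["16/2", "15/1", "19/1", "15/2"]
ordered = sorted(labels, key=parse_version)
print(ordered)              # ['15/1', '15/2', '16/2', '19/1']
print(is_citable("15/1"))   # True
print(is_citable("XXX"))    # False
```

Sorting on the parsed tuple rather than on the raw string keeps the order correct even if a serial number ever reaches two digits.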


Cover pictures of previous versions of VerbaAlpina

Barn at Fex Platta, in Val Fex near Sils Maria, Upper Engadine (Picture: Thomas Krefeld)

Chalet on Roßsteinalm, above Lenggries (Picture: Thomas Krefeld)

15/1

Autumn in South Tyrol, near Passeier Valley (Picture: Susanne Oberholzer)

15/2

Treatment of Mascherpa cheese, Lombardy (Picture: Formaggio Bitto)

16/1

Alpsee, Immenstadt in Allgäu (Picture: Christina Mutter)

16/2

Hay harvest in Chiemgau (Picture: Groth-Schmachtenberger collection, open-air museum Glentleiten)

17/1

Hay harvest (Picture: Groth-Schmachtenberger collection, open-air museum Glentleiten)

17/2

Hay harvest (Picture: Groth-Schmachtenberger collection, open-air museum Glentleiten)

18/1

Winter landscape on the Plose above Brixen (I) (Picture: Stephan Lücke)

18/2

View across Seiser Alm to the Odle Peaks (Picture: Stephan Lücke)

19/1

Zillertal Alps (Picture: Thomas Krefeld)

19/2



(auct. Stephan Lücke – trad. Susanne Oberholzer)

Tags: Information technology