
PolGem


POLGEM - the recorded corpus of Polish geminate consonants

Arkadiusz Rojczyk
Andrzej Porzuczek

Speech Processing Laboratory, University of Silesia in Katowice
(With the acknowledged assistance of Monika Piwowarska)

 

The project was supported by the National Science Centre, Poland, grant "Acoustic properties of Polish geminate consonants", no. 2017/25/B/HS2/02548 (Principal Investigator: Arkadiusz Rojczyk).

 

1. Introduction

POLGEM is a recorded corpus of Polish orthographic geminates that we make available to all researchers investigating geminates in the world's languages. It contains recorded productions of 111 Polish words by 54 native speakers of Polish (5994 tokens altogether). The recordings are segmented into individual words, each saved as a separate WAV file. They may be used for both acoustic analyses and perception experiments.
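
As a rough sketch of how the segmented recordings could be inspected before an acoustic analysis, the Python snippet below reads the duration of every WAV file in a folder using only the standard library. The folder name `Speaker_1` and the one-folder-per-speaker layout are assumptions made for illustration; they are not the documented structure of the corpus download.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def durations_in_folder(folder):
    """Map each WAV file name in `folder` to its duration in seconds."""
    return {
        p.name: wav_duration_seconds(p)
        for p in sorted(Path(folder).glob("*.wav"))
    }


if __name__ == "__main__":
    # "Speaker_1" is a hypothetical folder name; adjust it to the actual download.
    for name, seconds in durations_in_folder("Speaker_1").items():
        print(f"{name}\t{seconds * 1000:.0f} ms")
```

Whole-word duration is only a starting point; measuring the geminate itself would require segment boundaries, which this sketch does not attempt to locate.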

2. Structure

The Excel file below contains detailed information on the recorded words. They are annotated using the following categories:
Word
Transcription
English translation
Consonant
Consonant class
Voicing
Word position
Lexical/Morphological
Borrowing YES/NO
Borrowing (language)

| Word | Transcription | English translation | Consonant | Consonant class | Voicing | Word position | Lexical/Morphological | Borrowing YES/NO | Borrowing (language)[1] |
|---|---|---|---|---|---|---|---|---|---|
| Abba | 'abba | Abba | b | plo | + | MED | L | Y | SWE |
| Addis Abeba | a(d)dis a'bɛba | Addis Abeba | d | plo | + | MED | L | Y | AMH |
| Aleppo | a'lɛppɔ | Aleppo | p | plo | - | MED | L | Y | ITA[2] |
| allegro | al'lɛgrɔ | allegro | l | app | + | MED | L | Y | ITA |
| alleluja | allɛ'luja | hallelujah | l | app | + | MED | L | Y | HEB |
| annały | an'nawɨ | annals | n | nas | + | MED | L | Y | LAT |
| Annasz | 'annaʂ | Annas | n | nas | + | MED | L | Y | HEB |
| attyka | at'tɨka | attic | t | plo | - | MED | L | Y | GRE |
| ballada | bal'lada | ballad | l | app | + | MED | L | Y | FRA |
| bessa | 'bɛssa | slump | s | fri | - | MED | L | Y | FRA |
| bezzębny | bɛz'zɛmbnɨ | toothless | z | fri | + | MED | M | N | |
| biennale | bjɛn'nalɛ | biennial | n | nas | + | MED | L | Y | ITA |
| Budda | 'budda | Buddha | d | plo | + | MED | L | Y | SKR |
| bulla | 'bulla | (pope's) bull | l | app | + | MED | L | Y | LAT |
| confetti | kɔn'fɛtti | confetti | t | plo | - | MED | L | Y | ITA |
| corrida | kɔr'rida | bullfight | r | tap | + | MED | L | Y | ESP |
| czczo | tʂtʂɔ | empty (ADV) | | aff | - | INI | L | N | |
| czczy | tʂtʂɨ | empty | | aff | - | INI | L | N | |
| czynny | 'tʂɨnnɨ | active | n | nas | + | MED | M | N | |
| denny | 'dɛnnɨ | demersal | n | nas | + | MED | M | N | |
| dziecinny | dʑɛ'tɕinnɨ | childish | n | nas | + | MED | M | N | |
| dziennik | 'dʑɛɲɲik | diary | n | nas | + | MED | M | N | |
| dziewanna | dʑɛ'vanna | mullein | n | nas | + | MED | L | N | |
| dżdżownica | dʐdʐɔv'nitsa | earthworm | | aff | + | INI | L | N | |
| dżdżysty | 'dʐdʐɨstɨ | drizzly | | aff | + | INI | L | N | |
| errata | ɛr'rata | errata | r | tap | + | MED | L | Y | LAT |
| falliczny | fal'litʂnɨ | phallic | l | app | + | MED | L | Y | GRE |
| flotylla | flɔ'tɨlla | flotilla | l | app | + | MED | L | Y | ESP |
| gamma | 'gamma | gamma | m | nas | + | MED | L | Y | GRE |
| gehenna | gɛ'hɛnna | Gehenna | n | nas | + | MED | L | Y | HEB |
| getto | 'gɛttɔ | ghetto | t | plo | - | MED | L | Y | ITA |
| glissando | gli(s)'sandɔ | glissando | s | fri | - | MED | L | Y | FRA |
| gminny | 'gminnɨ | borough (ADJ) | n | nas | + | MED | M | Y | GER |
| greccy | 'grɛtstsɨ | Greek (PL) | ts | aff | - | MED | M | N | |
| helleński | hɛl'lɛɲski | Hellenic | l | app | + | MED | L | Y | GRE |
| hi(p)pis | 'hi(p)pis | hippy | p | plo | - | MED | L | Y | ENG |
| hippika | hip'pika | equestrianism | p | plo | - | MED | L | Y | GRE |
| hobby | 'hɔbbɨ | hobby | b | plo | + | MED | L | Y | ENG |
| horror | 'hɔrrɔr | horror | r | tap | + | MED | L | Y | ENG |
| hossa | 'hɔssa | boom | s | fri | - | MED | L | Y | FRA |
| hostessa | hɔs'tɛssa | hostess | s | fri | - | MED | L | Y | FRA |
| idylla | i'dɨlla | idyll | l | app | + | MED | L | Y | GRE |
| immanentny | imma'nɛntnɨ | immanent | m | nas | + | MED | M | Y | LAT |
| immatrykulacja | immatrɨku'latsja | matriculation | m | nas | + | MED | M | Y | LAT |
| immunitet | immu'ɲitɛt | immunity | m | nas | + | MED | M | Y | LAT |
| immunologia | immunɔ'lɔgja | immunology | m | nas | + | MED | M | Y | LAT/GRE |
| inny | 'innɨ | other | n | nas | + | MED | M | N | |
| Ja(f)fa | 'ja(f)fa | Jaffa | f | fri | - | MED | L | Y | HEB |
| jedźcie | 'jɛtɕtɕɛ | ride (IMP, PL) | | aff | - | MED | M | N | |
| konny | 'kɔnnɨ | equestrian | n | nas | + | MED | M | N | |
| lasso | 'lassɔ | lasso | s | fri | - | MED | L | Y | ESP |
| Lavazza | la'vatstsa | Lavazza | ts | aff | - | MED | L | Y | ITA |
| lekki | 'lɛkki | light | k | plo | - | MED | L | N | |
| lenno | 'lɛnnɔ | fief | n | nas | + | MED | L | Y | GER |
| lobby | 'lɔbbɨ | lobby | b | plo | + | MED | L | Y | ENG |
| lotto | 'lɔttɔ | lotto | t | plo | - | MED | L | Y | ITA |
| Lozanna | lɔ'zanna | Lausanne | n | nas | + | MED | L | Y | FRA |
| madonna | ma'dɔnna | madonna | n | nas | + | MED | L | Y | ITA |
| maggi | 'magi | stock cube | g | plo | + | MED | L | Y | ITA |
| mammografia | mammɔ'grafja | mammography | m | nas | + | MED | L | Y | LAT/GRE |
| manna | 'manna | manna, semolina | n | nas | + | MED | L | Y | GRE |
| Marzanna | ma'ʐanna | a female name | n | nas | + | MED | L | N | |
| Marzenna | ma'ʐɛnna | a female name | n | nas | + | MED | L | N | |
| Mekka | 'mɛkka | Mecca | k | plo | - | MED | L | Y | ARB |
| mełła | 'mɛwwa | (She) milled | w | app | + | MED | | N | |
| mennica | mɛn'ɲitsa | mint (a place) | n | nas | + | MED | L | Y | GER |
| messa | 'mɛssa | mess | s | fri | - | MED | L | Y | GER |
| miękki | 'mjɛŋ(k)ki | soft | k | plo | - | MED | L | N | |
| mułła | 'muwwa | mullah | w | app | + | MED | L | Y | TUR |
| najjaśniejszy | najjaɕ'ɲɛjʂɨ | the brightest | j | app | + | MED | M | N | |
| narracja | nar'ratsja | narration | r | tap | + | MED | L | Y | LAT |
| niewinnie | ɲɛ'viɲɲɛ | innocent (ADV) | ɲ | nas | + | MED | M | N | |
| obronna | ɔ'brɔnna | defensive | n | nas | + | MED | M | N | |
| oddać | 'ɔddatɕ | give back | d | plo | + | MED | M | N | |
| oddech | 'ɔddɛx | breath | d | plo | + | MED | M | N | |
| panna | 'panna | maiden | n | nas | + | MED | M | N | |
| passa | 'passa | streak | s | fri | - | MED | L | Y | FRA |
| pełła | 'pɛwwa | (She) weeded | w | app | + | MED | | N | |
| piccolo | 'pi(k)kɔlɔ | piccolo | k | plo | - | MED | L | Y | ITA |
| pizza | 'pitstsa | pizza | ts | aff | - | MED | L | Y | ITA |
| pizzeria | pits'tsɛrja | pizzeria | ts | aff | - | MED | L | Y | ITA |
| płonna | 'pwɔnna | vain | n | nas | + | MED | M | N | |
| płynny | 'pwɨnnɨ | liquid | n | nas | + | MED | M | N | |
| poddać | 'pɔddatɕ | submit | d | plo | + | MED | M | N | |
| Pyrrus | 'pɨrrus | Pyrrhus | r | tap | + | MED | L | Y | GRE |
| rattan | 'ra(t)tan | rattan | t | plo | - | MED | L | Y | ENG |
| rozzuć | 'rɔzzutɕ | take off (sb's shoes) | z | fri | + | MED | M | N | |
| salmonella | salmɔ'nɛlla | salmonella | l | app | + | MED | L | Y | ENG |
| sawanna | sa'vanna | savanna | n | nas | + | MED | L | Y | ESP |
| senna | 'sɛnna | sleepy (FEM) | n | nas | + | MED | M | N | |
| Sewilla | sɛ'villa | Seville | l | app | + | MED | L | Y | ESP |
| skłonny | 'skwɔnnɨ | prone | n | nas | + | MED | M | N | |
| spaghetti | spa'gɛtti | spaghetti | t | plo | - | MED | L | Y | ITA |
| ssać | ssatɕ | suck | s | fri | - | INI | L | N | |
| ssak | ssak | mammal | s | fri | - | INI | L | N | |
| surrealizm | surrɛ'alizm | surrealism | r | tap | + | MED | M | Y | FRA |
| sutanna | su'tanna | cassock | n | nas | + | MED | L | Y | ITA |
| świeccy | 'ɕfjɛtstsɨ | laymen | ts | aff | - | MED | M | N | |
| terror | 'tɛrrɔr | terror | r | tap | + | MED | L | Y | LAT |
| to(r)reador | tɔ(r)rɛ'adɔr | toreador | r | tap | + | MED | L | Y | ESP |
| Toro Rosso | tɔrɔ 'rɔssɔ | Toro Rosso/Red Bull | s | fri | - | MED | L | Y | ITA |
| tutti-frutti | tutti 'frutti | tutti-frutti | t | plo | - | MED | L | Y | ITA |
| uczczą | 'utʂtʂõ | (They) will celebrate | | aff | - | MED | | N | |
| vendetta | vɛn'dɛtta | vendetta | t | plo | - | MED | L | Y | |
| wanna | 'vanna | bathtub | n | nas | + | MED | M | Y | GER |
| willa | 'villa | villa | l | app | + | MED | L | Y | LAT |
| winnica | viɲ'ɲitsa | vineyard | ɲ | nas | + | MED | M | N | |
| wwozić | 'vvɔʑitɕ | bring in | v | fri | + | INI | M | N | |
| zmienna | 'zmjɛnna | variable | n | nas | + | MED | M | N | |
| zzuć | zzutɕ | take off (shoes) | z | fri | + | INI | M | N | |

 

[1] We used Tokarski (1980) to verify word origins.

[2] Where possible, we indicate the direct loan language rather than the word origin, which, especially in the case of geographical names, may be disputable.

Structure Excel file
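
The annotation categories lend themselves to simple filtering and counting. The sketch below assumes the Excel sheet has been exported to CSV with the ten column names listed above; that export step and the helper names `load_rows`, `count_by` and `select` are illustrative assumptions, not part of the corpus. The four sample rows are copied from the table, and the category codes (plo, fri, MED, INI and so on) are used exactly as they appear there.

```python
import csv
import io
from collections import Counter

# Four rows copied from the table; a full CSV export is assumed (hypothetically)
# to use the same ten column names.
SAMPLE_CSV = """\
Word,Transcription,English translation,Consonant,Consonant class,Voicing,Word position,Lexical/Morphological,Borrowing YES/NO,Borrowing (language)
Abba,'abba,Abba,b,plo,+,MED,L,Y,SWE
Aleppo,a'lɛppɔ,Aleppo,p,plo,-,MED,L,Y,ITA
bezzębny,bɛz'zɛmbnɨ,toothless,z,fri,+,MED,M,N,
ssak,ssak,mammal,s,fri,-,INI,L,N,
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def count_by(rows, column):
    """Count how many rows carry each value of `column`."""
    return Counter(row[column] for row in rows)


def select(rows, criteria):
    """Return rows whose cells equal every value in `criteria` (column -> value)."""
    return [r for r in rows if all(r[c] == v for c, v in criteria.items())]


rows = load_rows(SAMPLE_CSV)
print(count_by(rows, "Consonant class"))

# Non-borrowed words with a word-initial geminate.
native_initial = select(rows, {"Word position": "INI", "Borrowing YES/NO": "N"})
print([r["Word"] for r in native_initial])
```

Empty cells, such as the borrowing language of non-borrowed words, come through as empty strings, so they can be filtered in the same way as any other value.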

3. Speakers

The following Excel file contains speaker data with gender and age:

Speaker Excel file
 

4. Analysis and coding

All recordings were analysed to determine three aspects of each produced token:

  1. Degemination – in contemporary Polish, geminates in some words are reduced to singletons; this may be word-specific or speaker-specific. To facilitate future work with the database, degeminated tokens are coded by appending the label _S to the file name. Of the 5994 recorded words, 638 (11%) were degeminated and realised with a singleton. The most frequently degeminated words were piccolo ‘piccolo’ (83%), hippis ‘hippy’ (76%), maggi ‘stock cube’ (74%) and glissando ‘glissando’ (61%). All these words are borrowings (from Italian, English, Italian and French, respectively). All speakers produced at least one degeminated token, but some were more prone to degemination than others: Speaker 9 produced only one degeminated word, whereas Speaker 53 produced as many as 34.
  2. Rearticulation – all recorded words were analysed for rearticulation of the geminate consonant. Rearticulation occurred in 1231 of the 5994 words (approximately 20.5%). As in Rojczyk and Porzuczek (2019), the rearticulation rate varied considerably between speakers, ranging from 3% (Speaker 28) to 54% (Speaker 52).
  3. Mispronunciation – words with audible mispronunciations are coded by appending _MIS to the file name. Mispronunciations were very rare: only 15 words were affected, approximately 0.25% of the total.

Analysis Excel file
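
The file-name coding described above can be turned into a simple tally. The following is a minimal sketch: only the _S and _MIS labels come from the description; the example file names, the function names, and the choice to let _MIS take precedence if a name carried both labels are assumptions. The last part recomputes the reported proportions from the stated counts.

```python
from collections import Counter
from pathlib import Path


def token_status(file_name):
    """Classify a token by the label appended to its file name."""
    stem = Path(file_name).stem
    if stem.endswith("_MIS"):
        return "mispronounced"
    if stem.endswith("_S"):
        return "degeminated"
    return "geminate"


def status_counts(file_names):
    """Count tokens per status across a collection of file names."""
    return Counter(token_status(name) for name in file_names)


def percentage(part, total):
    """Express `part` as a percentage of `total`."""
    return 100 * part / total


# Hypothetical file names; only the _S and _MIS labels come from the text.
example = ["piccolo_S.wav", "hobby.wav", "terror_MIS.wav", "pizza.wav"]
print(status_counts(example))

# Proportions recomputed from the counts reported in this section.
TOTAL_TOKENS = 111 * 54  # 111 words x 54 speakers
for label, count in [("degeminated", 638), ("rearticulated", 1231), ("mispronounced", 15)]:
    print(f"{label}: {percentage(count, TOTAL_TOKENS):.2f}%")
```

Rearticulation is not described as being coded in the file name, so it appears here only as a reported count and not as a status returned by `token_status`.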

 

5. Recordings

Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Speaker 11
Speaker 12
Speaker 13
Speaker 14
Speaker 15
Speaker 16
Speaker 17
Speaker 18
Speaker 19
Speaker 20
Speaker 21
Speaker 22
Speaker 23
Speaker 24
Speaker 25
Speaker 26
Speaker 27
Speaker 28
Speaker 29
Speaker 30
Speaker 31
Speaker 32
Speaker 33
Speaker 34
Speaker 35
Speaker 36
Speaker 37
Speaker 38
Speaker 39
Speaker 40
Speaker 41
Speaker 42
Speaker 43
Speaker 44
Speaker 45
Speaker 46
Speaker 47
Speaker 48
Speaker 49
Speaker 50
Speaker 51
Speaker 52
Speaker 53
Speaker 54
 

References:

Rojczyk, A. and A. Porzuczek. 2019. “Durational properties of Polish geminate consonants”. Journal of the Acoustical Society of America 146(6). 4171-4182.

Tokarski, J. (ed.). 1980. Słownik wyrazów obcych PWN. Warszawa: Państwowe Wydawnictwo Naukowe.

 
