From 283be02579c1b506ab936190df5e985b1e953c4c Mon Sep 17 00:00:00 2001
From: Lilian Gasser <gasserli@ethz.ch>
Date: Tue, 29 Jan 2019 19:08:23 +0100
Subject: [PATCH] worked on notnames and additoinalinfo 1951, 1971

---
 data/lists/not_names.txt                   | 20 +++++++++++
 data/lists/wrongly_identified_speakers.txt | 39 ++++++++++++++++++++--
 data/politicians/MPs_additionalInfo.csv    | 13 ++++++++
 3 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/data/lists/not_names.txt b/data/lists/not_names.txt
index 42ed3b55..044045dc 100644
--- a/data/lists/not_names.txt
+++ b/data/lists/not_names.txt
@@ -2,28 +2,39 @@ Alinea
 Alter
 Ari
 Art
+ausser
 besser
 bietet
+darin
 drehen
+eher
+ess
 Fällen
 fasse
 Ferner
 ferner
 findet
 Gallen
+gehen
 Gründe
 hausen
 Herren
 Herr
+Hunger
 immer
 Kasse
 Kollege
 Kollega
 komme
 Kosten
+lassen
 Leider
 leider
 lieber
+liegen
+Masse
+Minister
+neben
 nehmen
 neu
 nicht
@@ -34,6 +45,7 @@ Seite
 selber
 Sinne
 später
+Ständer
 Steuer
 StGallen
 Stimmen
@@ -44,14 +56,20 @@ Tunnel
 Ueber
 Hans
 Walter
+Wegen
 Werner
+Weitere
 weiterer
+Welt
 Wer
 wissen
 Wort
 Worten
 Ziel
+Zuerst
+allemand
 autre
+Berne
 Biffer
 biffer
 cerner
@@ -61,6 +79,7 @@ cause
 dernier
 durant
 grande
+Nicolas
 ouvert
 peu
 pilier
@@ -71,3 +90,4 @@ rédiger
 tirer
 vote
 delle
+tener
diff --git a/data/lists/wrongly_identified_speakers.txt b/data/lists/wrongly_identified_speakers.txt
index b2bd1b77..db6bb1a1 100644
--- a/data/lists/wrongly_identified_speakers.txt
+++ b/data/lists/wrongly_identified_speakers.txt
@@ -4,16 +4,27 @@ speaker not identifiable:
 1891/20026465: Zweifel one time not identified --> is it a different one (not the Landammann) or was he already mentioned before?
 1925/20029836,37,87: Seiler (CANTON MISSING) Berichterstatter [4810, 4815] 1925-03-28 00:00 9
 1925/20029943: Welti (CANTON MISSING) [5655, 5656] 1925-09-29 00:00 6
+1971/20000592: ['M', 'Simon', 'Kohler', 'rassorteur', 'de', 'la', 'majorité'] 7
+1951/20034994: found a name: Studer ['Studer'] 0 Studer (CANTON MISSING) [5130, 5141]
+ found a name: Kunz ['Kunz'] 0 Kunz (CANTON MISSING) [3017, 3019]
+1951/20034996: found a name: Studer ['Studer'] 0 Studer (CANTON MISSING) [5130, 5141]
+1951/20035991: Dietschi (CANTON MISSING) Berichterstatter [1350, 1351] 1951-10-02 00:00 9 ['Dietschi', 'Berichterstatter']
+1951/20035171: Perrin (CANTON MISSING) rapporteur [3935, 3939] 1951-12-07 00:00 0 ['Perrin', 'rapporteur']
 
 speaker not uniquely identified when he spoke the second time:
 --------------------------------------------------------------
 1925/20029924: Keller-Aargau Berichterstatter (first time), Keller Berichterstatter (after)
 1925/20029928,29: Keller Berichterstatter (also first time), maybe check title of document...
+1951/20034982: Perrin-Corcelles rapporteur (first time), after: found a name: M. Pétrin, rapporteur ['Pétrin', 'rapporteur'] 0 Perrin (CANTON MISSING) rapporteur [3935, 3939]
+1951/20034995: after: Kunz (CANTON MISSING) [3017, 3019] 1951-04-03 00:00 21 ['Kunz']
+1951/20034996: Studer
 
 
 identifier is split into two words
 ----------------------------------
-1925/20029945: found a name: Schmid-Oberentf elden ['Schmid', 'Oberentf', 'elden'] 0 Schmid (CANTON MISSING) [4639, 4660]
+1925/20029945, 1951/20035173: found a name: Schmid-Oberentf elden ['Schmid', 'Oberentf', 'elden'] 0 Schmid (CANTON MISSING) [4639, 4660]
+1971/20000498: ['M', 'Muf', 'ny', 'rapporteur', 'de', 'la', 'majorité'] 7 --> finds Muff but is Mugny
+1951/20034978,79,94: found a name: Bringolf- Schaff hausen ['Bringolf', 'Schaff', 'hausen'] 0 Bringolf (CANTON MISSING) [707, 706]
 
 
 identified as speech start but is in text:
@@ -36,10 +47,22 @@ look for typical terms such as gestellt, gesagt, etc.
 1971/20000007: La seconde réaction qu'a suscité chez moi l'intervention de M. Weber est le doute:
 1971/20000007: Herr Kollega Gut hat es gesagt:
 1971/20000007: Noch eine Antwort an Kollege Clottu
-1971/20000010: Nun noch etwas zu Richard Müller. Erstens
+1971/20000010,11: Nun noch etwas zu Richard Müller. Erstens
 1971/20000024: Noch ein Wort zu Herrn Ständerat Wenk
 1971/20000024: Herr Kollege Heimann stellt sich schliesslich gegen einen Finanzausgleich mit dem Hinweis
+1971/20000093: Meine erste Frage an den Bundesrat lautet
+1971/20000093: found a name: In zwei wesentlichen Punkten bin ich mit Herrn Kollega Biel absolut einverstanden ['zwei', 'wesentlichen', 'Punkten', 'Kollega', 'Biel', 'absolut', 'einverstanden'] 1 Biel Walter (Zürich ZH) [426]
+1971/20000614: Zu Herrn Fischer
+1951/20035112: Schmid (CANTON MISSING) [4639, 4646, 4660] 1951-09-26 00:00 27 ['Antrag', 'Schmid']
 
+Bundesrat not found:
+--------------------
+1951/20035017,26: Petitpierre (CANTON MISSING) [3955, 3956] 1951-04-03 00:00 8 ['Petitpierre', 'conseiller', 'fédéral']
+1951/20035018,20,77,83: Rubattel (CANTON MISSING) [4381, 4382] 1951-04-03 00:00 6 ['Rubattel', 'conseiller', 'fédéral']
+
+weird layout:
+-------------
+1971/20000663: de MM. Knüsel et Leu (there must be more speech starts, this is from a list of cantons and people inside a speech, !!! Layout)
 
 term very similar to one name is actually another name
 ------------------------------------------------------
@@ -59,6 +82,18 @@ person not yet in council
 1971/20000055: Debétaz
 
 
+person has entry date 29.11.71 but is not yet active (presumably):
+------------------------------------------------------------------
+1971/20000587: Tanner Paul starts officiall on 29.11.71, discussion is on 30.11.71 --> finds two!
+1971/20000588: one Kohler starts 29.11.71, discussion is on 30.11.71 --> finds two!
+1971/20000726: one Muheim starts 29.11.71, discussion is on 8.12.71 --> finds two!
+
+
+two people with same last name and same citizenship
+---------------------------------------------------
+1951/20034993: Eggenberger Grabs
+
+
 Appenzeller
 -----------
 1894/20026597: Sonderegger
diff --git a/data/politicians/MPs_additionalInfo.csv b/data/politicians/MPs_additionalInfo.csv
index 20e10248..4e055601 100644
--- a/data/politicians/MPs_additionalInfo.csv
+++ b/data/politicians/MPs_additionalInfo.csv
@@ -19,3 +19,16 @@ Bühler,Peter Theophil,,Bünden
 Welti,Franz,,Basel
 Schmid,Arthur,,Oberentfelden
 Schmid,Jacques,,Olten
+Müller,Alfred,,Amriswil
+Müller,Hans Gottfried,,Aarberg
+Müller,Alban,,Olten
+Perrin,Tell,,Chaux
+Perrin,Paul,,Corcelles
+Schmid-Ruedin,Philipp,,Philip
+Studer,Ernst,,Burgdorf
+Meier,Christian,,Netstal
+Kunz,Alois,,Hergiswil
+Cottier,Henry,,Lausanne
+Bringolf,Walther,,Schaff
+Bringolf,Richard,,Peilz
+Roth,August,,Frauenfeld
-- 
GitLab