From b8a5295671a6d3fce6e59007e24dbfade39a98b7 Mon Sep 17 00:00:00 2001
From: Mirko Birbaumer <mirko.birbaumer@hslu.ch>
Date: Tue, 11 Mar 2025 22:30:40 +0000
Subject: [PATCH] adapted BF computations

---
 notebooks/Model Comparison/MC_2_1.ipynb | 92 ++++++++++++++++++-------
 1 file changed, 68 insertions(+), 24 deletions(-)

diff --git a/notebooks/Model Comparison/MC_2_1.ipynb b/notebooks/Model Comparison/MC_2_1.ipynb
index 2275e72..18139ac 100644
--- a/notebooks/Model Comparison/MC_2_1.ipynb	
+++ b/notebooks/Model Comparison/MC_2_1.ipynb	
@@ -48,6 +48,30 @@
     "plt.rcParams['figure.figsize'] = [10, 3]"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Zuerst werden wir den Bayes-Faktor analytisch berechnen. Wir können das Modell wie folgt schreiben:\n",
+    "\\begin{align*}\n",
+    "\\theta & \\sim \\text{Beta}(\\alpha, \\beta)\\\\\n",
+    "y & \\sim \\text{Bernoulli}(p=\\theta)\n",
+    "\\end{align*}\n",
+    "Dann gilt für die marginale Wahrscheinlichkeit\n",
+    "\\begin{equation*}\n",
+    "p(y)=\\binom{n}{h}\\frac{B(\\alpha+h, \\beta+n-h)}{B(\\alpha, \\beta)}\n",
+    "\\end{equation*}\n",
+    "wobei $B$ die Betafunktion ist (nicht zu verwechseln mit der Beta-Verteilung), $n$ die Anzahl Versuche und $h$ die Anzahl Erfolge.\n",
+    "\n",
+    "Da wir uns nur für den relativen Wert der marginalen Wahrscheinlichkeit für zwei unterschiedliche Modelle interessieren, können wir den Binomialkoeffizienten $\\binom{n}{h}$ weglassen:\n",
+    "\n",
+    "\\begin{equation*}\n",
+    "p(y)\\propto \\binom{n}{h}\\frac{B(\\alpha+h, \\beta+n-h)}{B(\\alpha, \\beta)}\n",
+    "\\end{equation*}\n",
+    "\n",
+    "Dieser Ausdruck ist im folgenden Code-Block implementiert:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 10,
@@ -71,6 +95,21 @@
     "    return p_y"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "`betaln` ist der natürliche Logarithmus der Beta-Funktion."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Unser Datensatz beinhaltet 30 Würfe, wobei neunmal $K$ geworfen wurde. Wir vergleichen nun zwei Modelle, eine Beta-Verteilung mit $\\alpha_1=4$ und $\\beta_1=8$ und dann eine Beta-Verteilung mit $\\alpha_2=8$ und $\\beta_2=4$.\n",
+    "\n"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 12,
@@ -104,23 +143,27 @@
     "print(round(BF))"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Für den Bayes-Faktor ergibt sich also 11, d.h., das Modell mit $\\text{Beta}(\\alpha=4, \\beta=8)$ hat eine elfmal höhere Plausibilität als das Modell mit $\\text{Beta}(\\alpha=8, \\beta=4)$, woraus wir schliessen, dass die erste Prior-Verteilung besser zu den Daten passt, was hier keine \\\"Uberraschung ist. "
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 18,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Collecting preliz\n",
-      "  Downloading preliz-0.11.0-py3-none-any.whl.metadata (6.1 kB)\n",
+      "Requirement already satisfied: preliz in /opt/conda/lib/python3.10/site-packages (0.11.0)\n",
       "Requirement already satisfied: matplotlib>=3.5 in /opt/conda/lib/python3.10/site-packages (from preliz) (3.9.1.post1)\n",
-      "Collecting numba>=0.59 (from preliz)\n",
-      "  Downloading numba-0.61.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (2.8 kB)\n",
+      "Requirement already satisfied: numba>=0.59 in /opt/conda/lib/python3.10/site-packages (from preliz) (0.61.0)\n",
       "Requirement already satisfied: numpy>=1.22 in /opt/conda/lib/python3.10/site-packages (from preliz) (1.26.4)\n",
-      "Collecting scipy<1.13,>=1.9.1 (from preliz)\n",
-      "  Downloading scipy-1.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)\n",
+      "Requirement already satisfied: scipy<1.13,>=1.9.1 in /opt/conda/lib/python3.10/site-packages (from preliz) (1.12.0)\n",
       "Requirement already satisfied: contourpy>=1.0.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib>=3.5->preliz) (1.2.1)\n",
       "Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.10/site-packages (from matplotlib>=3.5->preliz) (0.12.1)\n",
       "Requirement already satisfied: fonttools>=4.22.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib>=3.5->preliz) (4.56.0)\n",
@@ -129,22 +172,15 @@
       "Requirement already satisfied: pillow>=8 in /opt/conda/lib/python3.10/site-packages (from matplotlib>=3.5->preliz) (8.3.2)\n",
       "Requirement already satisfied: pyparsing>=2.3.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib>=3.5->preliz) (3.2.1)\n",
       "Requirement already satisfied: python-dateutil>=2.7 in /opt/conda/lib/python3.10/site-packages (from matplotlib>=3.5->preliz) (2.9.0.post0)\n",
-      "Collecting llvmlite<0.45,>=0.44.0dev0 (from numba>=0.59->preliz)\n",
-      "  Downloading llvmlite-0.44.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.8 kB)\n",
-      "Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib>=3.5->preliz) (1.17.0)\n",
-      "Downloading preliz-0.11.0-py3-none-any.whl (514 kB)\n",
-      "Downloading numba-0.61.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.8 MB)\n",
-      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.8/3.8 MB\u001b[0m \u001b[31m31.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
-      "Downloading scipy-1.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (38.4 MB)\n",
-      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m38.4/38.4 MB\u001b[0m \u001b[31m66.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n",
-      "Downloading llvmlite-0.44.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (42.4 MB)\n",
-      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m42.4/42.4 MB\u001b[0m \u001b[31m51.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n",
-      "Installing collected packages: scipy, llvmlite, numba, preliz\n",
-      "  Attempting uninstall: scipy\n",
-      "    Found existing installation: scipy 1.13.1\n",
-      "    Uninstalling scipy-1.13.1:\n",
-      "      Successfully uninstalled scipy-1.13.1\n",
-      "Successfully installed llvmlite-0.44.0 numba-0.61.0 preliz-0.11.0 scipy-1.12.0\n"
+      "Requirement already satisfied: llvmlite<0.45,>=0.44.0dev0 in /opt/conda/lib/python3.10/site-packages (from numba>=0.59->preliz) (0.44.0)\n",
+      "Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib>=3.5->preliz) (1.17.0)\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.\n"
      ]
     },
     {
@@ -164,7 +200,15 @@
     "priors = ((alpha_1, beta_1),(alpha_2, beta_2))\n",
     "with plt.rc_context({\"lines.linewidth\": 3}):\n",
     "    for a, b in priors:\n",
-    "        ax = pz.Beta(a, b).plot_pdf()"
+    "        ax = pz.Beta(a, b).plot_pdf()\n",
+    "plt.savefig(\"BF_priors_beta.eps\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Nun berechnen wir den Bayes-Faktor mit Hilfe von _Sequential Monte Carlo_ (SMC). Ein Nebenprodukt vom SMC Sampler ist die Schätzung der Marginal Likelihood."
    ]
   },
   {
-- 
GitLab