
Commit

update alignment
souzatharsis committed Dec 19, 2024
1 parent 9252b8e commit 95bfad4
Showing 31 changed files with 1,220 additions and 529 deletions.
Binary file modified tamingllms/_build/.doctrees/environment.pickle
Binary file modified tamingllms/_build/.doctrees/markdown/intro.doctree
Binary file modified tamingllms/_build/.doctrees/markdown/preface.doctree
Binary file modified tamingllms/_build/.doctrees/markdown/toc.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/alignment.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/evals.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/output_size_limit.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/safety.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/structured_output.doctree
Binary file added tamingllms/_build/html/_images/fakealign.png
143 changes: 140 additions & 3 deletions tamingllms/_build/html/_sources/notebooks/alignment.ipynb

Large diffs are not rendered by default.

12 changes: 12 additions & 0 deletions tamingllms/_build/html/_sources/notebooks/safety.ipynb
@@ -2461,6 +2461,18 @@
"Having said that, I want to be clear that further investigation is needed before one could claim that the dataset is unsafe. Here, we only show anecdotal evidence that the dataset contains unsafe content for our particular case study. We do not claim that the dataset is unsafe per se. Instead, a superior experiment would have constructed a proper dataset that more closely matches what safe conversations look like in the application domain we are studying."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Takeaways\n",
"\n",
"- Safety is a complex problem and there is no one-size-fits-all solution.\n",
"- Starting with a well-aligned policy is key to developing a robust data and evaluation framework.\n",
"- Domain experts are key to this process and should be involved in the development of the evaluation framework from the start.\n",
"- While custom safety filters can be effective, carefully evaluate pre-built solutions that may offer better performance and cost trade-offs for your specific use case\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
6 changes: 3 additions & 3 deletions tamingllms/_build/html/markdown/intro.html
@@ -226,7 +226,7 @@
<hr>
<div class="content" role="main" v-pre>

<section class="tex2jax_ignore mathjax_ignore" id="introduction">
<section id="introduction">
<span id="intro"></span><h1><a class="toc-backref" href="#id1" role="doc-backlink"><span class="section-number">2. </span>Introduction</a><a class="headerlink" href="#introduction" title="Permalink to this heading"></a></h1>
<blockquote class="epigraph">
<div><p>I am always doing that which I cannot do, in order that I may learn how to do it.</p>
Expand Down Expand Up @@ -304,7 +304,7 @@ <h2><a class="toc-backref" href="#id5" role="doc-backlink"><span class="section-
<li><p>Share their own experiences and solutions with the community</p></li>
<li><p>Propose new chapters or sections that address emerging challenges</p></li>
</ul>
<p>The repository can be found at <a class="reference external" href="https://github.com/souzatharsis/tamingllms">https://github.com/souzatharsis/tamingllms</a>. Whether you’ve found a typo, have a better solution to share, or want to contribute an entirely new section, your contributions are welcome.</p>
<p>The repository can be found at https://github.com/souzatharsis/tamingllms. Whether you’ve found a typo, have a better solution to share, or want to contribute an entirely new section, your contributions are welcome.</p>
</section>
<section id="a-note-on-perspective">
<h2><a class="toc-backref" href="#id6" role="doc-backlink"><span class="section-number">2.5. </span>A Note on Perspective</a><a class="headerlink" href="#a-note-on-perspective" title="Permalink to this heading"></a></h2>
Expand Down Expand Up @@ -416,7 +416,7 @@ <h3><a class="toc-backref" href="#id14" role="doc-backlink"><span class="section
<h2><a class="toc-backref" href="#id15" role="doc-backlink"><span class="section-number">2.10. </span>About the Author(s)</a><a class="headerlink" href="#about-the-author-s" title="Permalink to this heading"></a></h2>
<p>Dr. Tharsis Souza is a computer scientist and product leader specializing in AI-based products. He is a Lecturer at Columbia University’s Master of Science program in Applied Analytics, (<em>incoming</em>) Head of Product, Equities at Citadel, and former Senior VP at Two Sigma Investments. He also enjoys mentoring under-represented students &amp; working professionals to help create a more diverse global AI ecosystem.</p>
<p>With over 15 years of experience delivering technology products across startups and Fortune 500 companies, Dr. Souza is also an author of numerous scholarly publications and is a frequent speaker at academic and business conferences. Grounded on academic background and drawing from practical experience building and scaling up products powered by language models at early-stage startups, major institutions as well as advising non-profit organizations, and contributing to open source projects, he brings a unique perspective on bridging the gap between LLMs promised potential and their practical implementation challenges to enable the next generation of AI-powered products.</p>
<p>Dr. Tharsis holds a Ph.D. in Computer Science from UCL, University of London following an M.Phil. and <a class="reference external" href="http://M.Sc">M.Sc</a>. in Computer Science and a <a class="reference external" href="http://B.Sc">B.Sc</a>. in Computer Engineering.</p>
<p>Dr. Tharsis holds a Ph.D. in Computer Science from UCL, University of London following an M.Phil. and M.Sc. in Computer Science and a B.Sc. in Computer Engineering.</p>
</section>
</section>

6 changes: 3 additions & 3 deletions tamingllms/_build/html/markdown/preface.html
@@ -208,13 +208,13 @@
<hr>
<div class="content" role="main" v-pre>

<section class="tex2jax_ignore mathjax_ignore" id="preface">
<section id="preface">
<h1><span class="section-number">1. </span>Preface<a class="headerlink" href="#preface" title="Permalink to this heading"></a></h1>
<blockquote class="epigraph">
<div><p>Models tell you merely what something is like, not what something is.</p>
<p class="attribution">—Emanuel Derman</p>
</div></blockquote>
<p>An alternative title of this book could have been “Language Models Behaving Badly”. If you are coming from a background in financial modeling, you may have noticed the parallel with Emanuel Derman’s seminal work “Models.Behaving.Badly” <span id="id1">[<a class="reference internal" href="#id118" title="E. Derman. Models.Behaving.Badly.: Why Confusing Illusion with Reality Can Lead to Disaster, on Wall Street and in Life. Free Press, 2011. ISBN 9781439165010. URL: https://books.google.co.uk/books?id=lke_cwM4wm8C.">Derman, 2011</a>]</span>. This parallel is not coincidental. Just as Derman cautioned against treating financial models as perfect representations of reality, this book aims to highlight the limitations and pitfalls of Large Language Models (LLMs) in practical applications (of course baring the fact Derman is an actual physicist and legendary author, professor and quant; I am not).</p>
<p>An alternative title of this book could have been “Language Models Behaving Badly”. If you are coming from a background in financial modeling, you may have noticed the parallel with Emanuel Derman’s seminal work “Models.Behaving.Badly” <span id="id1">[<a class="reference internal" href="#id124" title="E. Derman. Models.Behaving.Badly.: Why Confusing Illusion with Reality Can Lead to Disaster, on Wall Street and in Life. Free Press, 2011. ISBN 9781439165010. URL: https://books.google.co.uk/books?id=lke_cwM4wm8C.">Derman, 2011</a>]</span>. This parallel is not coincidental. Just as Derman cautioned against treating financial models as perfect representations of reality, this book aims to highlight the limitations and pitfalls of Large Language Models (LLMs) in practical applications (of course baring the fact Derman is an actual physicist and legendary author, professor and quant; I am not).</p>
<p>The book “Models.Behaving.Badly” by Emanuel Derman, a former physicist and Goldman Sachs quant, explores how financial and scientific models can fail when we mistake them for reality rather than treating them as approximations full of assumptions.
The core premise of his work is that while models can be useful tools for understanding aspects of the world, they inherently involve simplification and assumptions. Derman argues that many financial crises, including the 2008 crash, occurred partly because people put too much faith in mathematical models without recognizing their limitations.</p>
<p>Like financial models that failed to capture the complexity of human behavior and market dynamics, LLMs have inherent constraints. They can hallucinate facts, struggle with logical reasoning, and fail to maintain consistency across long outputs. Their responses, while often convincing, are probabilistic approximations based on training data rather than true understanding even though humans insist on treating them as “machines that can reason”.</p>
@@ -224,7 +224,7 @@ <h1><span class="section-number">1. </span>Preface<a class="headerlink" href="#p
<section id="references">
<h2><span class="section-number">1.1. </span>References<a class="headerlink" href="#references" title="Permalink to this heading"></a></h2>
<div class="docutils container" id="id2">
<div class="citation" id="id118" role="doc-biblioentry">
<div class="citation" id="id124" role="doc-biblioentry">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id1">Der11</a><span class="fn-bracket">]</span></span>
<p>E. Derman. <em>Models.Behaving.Badly.: Why Confusing Illusion with Reality Can Lead to Disaster, on Wall Street and in Life</em>. Free Press, 2011. ISBN 9781439165010. URL: <a class="reference external" href="https://books.google.co.uk/books?id=lke_cwM4wm8C">https://books.google.co.uk/books?id=lke_cwM4wm8C</a>.</p>
</div>
2 changes: 1 addition & 1 deletion tamingllms/_build/html/markdown/toc.html
@@ -202,7 +202,7 @@
<img src="../_static/tamingcoverv1.jpg" style="background-color:white; width:50%;" alt="Taming LLMs Cover" />
</a>
<hr class="docutils" />
<section class="tex2jax_ignore mathjax_ignore" id="taming-llms">
<section id="taming-llms">
<h1><a class="reference external" href="https://www.souzatharsis.com/tamingLLMs">Taming LLMs</a><a class="headerlink" href="#taming-llms" title="Permalink to this heading"></a></h1>
<section id="a-practical-guide-to-llm-pitfalls-with-open-source-software">
<h2><em>A Practical Guide to LLM Pitfalls with Open Source Software</em><a class="headerlink" href="#a-practical-guide-to-llm-pitfalls-with-open-source-software" title="Permalink to this heading"></a></h2>
