Documentation

Cslib.MachineLearning.PACLearning.Defs

PAC Learning #

This file defines the Probably Approximately Correct (PAC) learning model introduced by Valiant [Val84], generalized to an arbitrary label type β and parameterized by a family of distributions 𝒟 on α × β.

A concept class C over domain α with labels in β is a collection of functions α → β. A learning algorithm receives a labeled sample drawn i.i.d. from an unknown joint distribution D on α × β and must produce a hypothesis whose 0-1 error is within ε of the best concept in C, with probability at least 1 - δ.

The single definition IsPACLearnerFor captures the realizable, agnostic, and noise-tolerant settings by varying the distribution family 𝒟:

The accuracy and confidence parameters ε and δ are elements of the subtype Set.Ioo (0 : ℝ≥0) 1, which bundles the value together with the proof that it lies in the open interval (0, 1), ensuring the learning condition is non-vacuous.

All declarations live under the Cslib.MachineLearning.PACLearning namespace so that generic names like error and optimalError do not pollute the parent namespace.

Main definitions #

Binary classification #

When β = Bool, concepts correspond to subsets of α. The section Binary Classification provides:

Main statements #

References #

Core Definitions #

@[reducible, inline]
abbrev Cslib.MachineLearning.PACLearning.ConceptClass (α : Type u_1) (β : Type u_2) :
Type (max u_1 u_2)

A concept class over domain α with label type β is a set of functions α → β. For binary classification (β = Bool), this is equivalent to a collection of subsets of α via the characteristic function.

Equations
Instances For
    @[reducible, inline]
    abbrev Cslib.MachineLearning.PACLearning.LabeledSample (α : Type u_1) (β : Type u_2) (m : ) :
    Type (max u_2 u_1)

    A labeled sample of size m over domain α with label type β is a finite sequence of (point, label) pairs.

    Equations
    Instances For
      @[reducible, inline]
      abbrev Cslib.MachineLearning.PACLearning.Learner (α : Type u_1) (β : Type u_2) (m : ) :
      Type (max u_1 u_2)

      A learner using m samples is a function that takes a labeled sample and produces a hypothesis (a function from the domain to the label type).

      Equations
      Instances For
        noncomputable def Cslib.MachineLearning.PACLearning.error {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] (D : MeasureTheory.Measure (α × β)) (h : αβ) :

        The prediction error (0-1 loss) of a hypothesis h under a joint distribution D on α × β, defined as the probability that the prediction disagrees with the label: D({(x, y) | h(x) ≠ y}).

        Equations
        Instances For
          noncomputable def Cslib.MachineLearning.PACLearning.optimalError {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] (D : MeasureTheory.Measure (α × β)) (C : ConceptClass α β) :

          The optimal error of a concept class C under a joint distribution D, defined as the infimum of error D c over all concepts c ∈ C. When C is empty this is , making the PAC learning condition vacuously true.

          Equations
          Instances For

            PAC Learners #

            def Cslib.MachineLearning.PACLearning.IsPACLearnerFor {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] (m : ) (ε δ : (Set.Ioo 0 1)) (C : ConceptClass α β) (𝒟 : Set (MeasureTheory.Measure (α × β))) :

            IsPACLearnerFor m ε δ C 𝒟 asserts that there exists a learner using m samples that is (ε, δ)-correct for the concept class C over the distribution family 𝒟: for every probability measure D ∈ 𝒟 on α × β, the probability (over i.i.d. samples from D) that the learner's hypothesis has error exceeding opt_C(D) + ε is at most δ.

            The parameters ε and δ are elements of Set.Ioo (0 : ℝ≥0) 1, bundling the value with the proof that it lies in (0, 1). This ensures the condition is non-vacuous: ε < 1 prevents the error threshold from exceeding the maximum possible error under a probability measure, and δ < 1 prevents the confidence bound from being trivially satisfied.

            Equations
            • One or more equations did not get rendered due to their size.
            Instances For
              def Cslib.MachineLearning.PACLearning.IsRPACLearnerFor {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] (m : ) (ε δ : (Set.Ioo 0 1)) (C : ConceptClass α β) (𝒟 : Set (MeasureTheory.Measure (α × β))) :

              IsRPACLearnerFor m ε δ C 𝒟 asserts that there exists a randomized learner using m samples that is (ε, δ)-correct for the concept class C over the distribution family 𝒟. A randomized learner draws internal randomness ω from a probability space (Ω, Q) and acts as the deterministic learner A(ω).

              For every probability measure D ∈ 𝒟, the failure probability function ω ↦ D^m{S | error(A(ω)(S)) > opt_C(D) + ε} must be Q-a.e. measurable, and its expectation over ω must be at most δ.

              The randomness space Ω : Type* is universe-polymorphic; the universe is an implicit parameter of IsRPACLearnerFor, and downstream statements reference it via the pattern IsRPACLearnerFor.{_, _, u}. Fix u := 0 for the usual case of a standard randomness space.

              A deterministic learner (IsPACLearnerFor) is the special case Ω = PUnit; see IsPACLearnerFor.toIsRPACLearnerFor.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For
                theorem Cslib.MachineLearning.PACLearning.IsPACLearnerFor.toIsRPACLearnerFor {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {m : } {ε δ : (Set.Ioo 0 1)} {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : IsPACLearnerFor m ε δ C 𝒟) :
                IsRPACLearnerFor m ε δ C 𝒟

                Every deterministic PAC learner is in particular a randomized PAC learner (with the trivial one-point randomness space PUnit).

                theorem Cslib.MachineLearning.PACLearning.IsPACLearnerFor.antitone_family {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {m : } {ε δ : (Set.Ioo 0 1)} {C : ConceptClass α β} {𝒟 𝒟' : Set (MeasureTheory.Measure (α × β))} (h𝒟 : 𝒟 𝒟') (h : IsPACLearnerFor m ε δ C 𝒟') :
                IsPACLearnerFor m ε δ C 𝒟

                The deterministic PAC learner predicate is antitone in the distribution family: a learner for a larger family 𝒟' is also a learner for any subfamily 𝒟 ⊆ 𝒟'.

                theorem Cslib.MachineLearning.PACLearning.IsPACLearnerFor.mono_δ {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {m : } {ε δ₁ δ₂ : (Set.Ioo 0 1)} ( : δ₁ δ₂) {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : IsPACLearnerFor m ε δ₁ C 𝒟) :
                IsPACLearnerFor m ε δ₂ C 𝒟

                A PAC learner with confidence δ₁ is also a PAC learner with any weaker confidence δ₂ ≥ δ₁: the failure-probability bound only gets looser.

                theorem Cslib.MachineLearning.PACLearning.IsPACLearnerFor.mono_ε {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {m : } {δ ε₁ ε₂ : (Set.Ioo 0 1)} ( : ε₁ ε₂) {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : IsPACLearnerFor m ε₁ δ C 𝒟) :
                IsPACLearnerFor m ε₂ δ C 𝒟

                A PAC learner with accuracy ε₁ is also a PAC learner with any weaker accuracy ε₂ ≥ ε₁: the bad event {error > opt + ε} only shrinks.

                theorem Cslib.MachineLearning.PACLearning.IsPACLearnerFor.antitone_C {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {m : } {ε δ : (Set.Ioo 0 1)} {C C' : ConceptClass α β} (hC : C C') {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : IsPACLearnerFor m ε δ C' 𝒟) :
                IsPACLearnerFor m ε δ C 𝒟

                The deterministic PAC learner predicate is antitone in the concept class: a learner for a larger class C' is also a learner for any subclass C ⊆ C', since the agnostic benchmark optimalError _ C ≥ optimalError _ C' makes the error requirement easier.

                theorem Cslib.MachineLearning.PACLearning.IsRPACLearnerFor.antitone_family {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {m : } {ε δ : (Set.Ioo 0 1)} {C : ConceptClass α β} {𝒟 𝒟' : Set (MeasureTheory.Measure (α × β))} (h𝒟 : 𝒟 𝒟') (h : IsRPACLearnerFor m ε δ C 𝒟') :
                IsRPACLearnerFor m ε δ C 𝒟

                The randomized PAC learner predicate is antitone in the distribution family. The universe of the randomness space Ω is pinned so the hypothesis and conclusion share it.

                theorem Cslib.MachineLearning.PACLearning.IsRPACLearnerFor.mono_δ {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {m : } {ε δ₁ δ₂ : (Set.Ioo 0 1)} ( : δ₁ δ₂) {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : IsRPACLearnerFor m ε δ₁ C 𝒟) :
                IsRPACLearnerFor m ε δ₂ C 𝒟

                A randomized PAC learner with confidence δ₁ is also a randomized PAC learner with any weaker confidence δ₂ ≥ δ₁. Unlike mono_ε or antitone_C, this does not touch the integrand, so it carries the AEMeasurable part through unchanged.

                PAC Learnability #

                A concept class C is PAC learnable over the distribution family 𝒟 if for every accuracy ε ∈ (0, 1) and confidence δ ∈ (0, 1), there exists a sample size m admitting a deterministic (ε, δ)-PAC learner for C. Here ε and δ are elements of the subtype Set.Ioo (0 : ℝ≥0) 1.

                Equations
                Instances For

                  A concept class C is randomized PAC learnable over the distribution family 𝒟 if for every accuracy ε ∈ (0, 1) and confidence δ ∈ (0, 1), there exists a sample size m admitting a randomized (ε, δ)-PAC learner for C. The randomness space is pinned to Type 0 at the learnability level; IsRPACLearnerFor itself remains universe-polymorphic.

                  Equations
                  Instances For

                    Deterministic PAC learnability implies randomized PAC learnability.

                    theorem Cslib.MachineLearning.PACLearning.IsPACLearnable.antitone_family {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {C : ConceptClass α β} {𝒟 𝒟' : Set (MeasureTheory.Measure (α × β))} (h𝒟 : 𝒟 𝒟') (h : IsPACLearnable C 𝒟') :

                    PAC learnability is antitone in the distribution family: a subfamily of a learnable family is learnable.

                    theorem Cslib.MachineLearning.PACLearning.IsPACLearnable.antitone_C {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {C C' : ConceptClass α β} (hC : C C') {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : IsPACLearnable C' 𝒟) :

                    PAC learnability is antitone in the concept class: a subclass of a learnable class is learnable.

                    theorem Cslib.MachineLearning.PACLearning.IsRPACLearnable.antitone_family {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {C : ConceptClass α β} {𝒟 𝒟' : Set (MeasureTheory.Measure (α × β))} (h𝒟 : 𝒟 𝒟') (h : IsRPACLearnable C 𝒟') :

                    Randomized PAC learnability is antitone in the distribution family.

                    Sample Complexity #

                    @[reducible, inline]
                    abbrev Cslib.MachineLearning.PACLearning.LearnerModel (α : Type u_3) (β : Type u_4) [MeasurableSpace α] [MeasurableSpace β] :
                    Type (max (max u_4 u_3) u_4 u_3)

                    A learner model is a predicate on (sample size, accuracy, confidence, concept class, distribution family) that classifies which sample sizes admit a learner of the given kind. Instantiating with IsPACLearnerFor gives the deterministic model; with IsRPACLearnerFor gives the randomized one.

                    Equations
                    • One or more equations did not get rendered due to their size.
                    Instances For
                      noncomputable def Cslib.MachineLearning.PACLearning.sampleComplexity {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] (L : LearnerModel α β) (C : ConceptClass α β) (ε δ : (Set.Ioo 0 1)) (𝒟 : Set (MeasureTheory.Measure (α × β))) :

                      The sample complexity of a concept class C under a learner model L, at accuracy ε ∈ (0, 1) and confidence δ ∈ (0, 1) over distribution family 𝒟, is the smallest sample size m with L m ε δ C 𝒟. Specialize with L := IsPACLearnerFor for the deterministic model and L := IsRPACLearnerFor for the randomized one.

                      Caveat: because sInf on returns 0 for the empty set, this definition returns 0 when no learner exists (e.g., a concept class of infinite VC dimension). It is only meaningful when the defining set {m | L m ε δ C 𝒟} is nonempty. The IsPACLearnable.sampleComplexity_* variants below discharge this nonemptiness from a learnability hypothesis.

                      Equations
                      Instances For
                        noncomputable def Cslib.MachineLearning.PACLearning.rsampleComplexity {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] (C : ConceptClass α β) (ε δ : (Set.Ioo 0 1)) (𝒟 : Set (MeasureTheory.Measure (α × β))) :

                        The randomized sample complexity of C, i.e. sampleComplexity instantiated at the randomized learner model IsRPACLearnerFor. The randomness space is pinned to Type 0.

                        Equations
                        Instances For

                          Monotonicity of Sample Complexity #

                          These lemmas are all special cases of the following observation: if {m | L₁ m ε₁ δ₁ C₁ 𝒟₁} ⊆ {m | L₂ m ε₂ δ₂ C₂ 𝒟₂} and the first set is nonempty, then the sample complexity under (L₂, ε₂, δ₂, C₂, 𝒟₂) is at most the sample complexity under (L₁, ε₁, δ₁, C₁, 𝒟₁). The nonemptiness hypothesis is essential: sInf on returns 0 for an empty set, so without it the inequality can fail at the degenerate boundary. The IsPACLearnable-flavoured variants at the end of this section discharge that witness from a learnability hypothesis.

                          theorem Cslib.MachineLearning.PACLearning.sampleComplexity_le_of_forall {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {L₁ L₂ : LearnerModel α β} {ε₁ δ₁ ε₂ δ₂ : (Set.Ioo 0 1)} {C₁ C₂ : ConceptClass α β} {𝒟₁ 𝒟₂ : Set (MeasureTheory.Measure (α × β))} (hL : ∀ {m : }, L₁ m ε₁ δ₁ C₁ 𝒟₁L₂ m ε₂ δ₂ C₂ 𝒟₂) (h : ∃ (m : ), L₁ m ε₁ δ₁ C₁ 𝒟₁) :
                          sampleComplexity L₂ C₂ ε₂ δ₂ 𝒟₂ sampleComplexity L₁ C₁ ε₁ δ₁ 𝒟₁

                          General pointwise monotonicity of sampleComplexity: if every witness sample size for (L₁, ε₁, δ₁, C₁, 𝒟₁) is also a witness for (L₂, ε₂, δ₂, C₂, 𝒟₂), then the latter's sample complexity is at most the former's (provided the former is attained).

                          theorem Cslib.MachineLearning.PACLearning.sampleComplexity_antitone_δ {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {ε δ₁ δ₂ : (Set.Ioo 0 1)} ( : δ₁ δ₂) {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : ∃ (m : ), IsPACLearnerFor m ε δ₁ C 𝒟) :

                          Deterministic sample complexity is antitone in the confidence parameter δ: weaker confidence requires no more samples.

                          theorem Cslib.MachineLearning.PACLearning.sampleComplexity_antitone_ε {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {ε₁ ε₂ δ : (Set.Ioo 0 1)} ( : ε₁ ε₂) {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : ∃ (m : ), IsPACLearnerFor m ε₁ δ C 𝒟) :

                          Deterministic sample complexity is antitone in the accuracy parameter ε: weaker accuracy requires no more samples.

                          theorem Cslib.MachineLearning.PACLearning.sampleComplexity_mono_family {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {ε δ : (Set.Ioo 0 1)} {C : ConceptClass α β} {𝒟 𝒟' : Set (MeasureTheory.Measure (α × β))} (h𝒟 : 𝒟 𝒟') (h : ∃ (m : ), IsPACLearnerFor m ε δ C 𝒟') :

                          Deterministic sample complexity is monotone in the distribution family under : a smaller family (fewer distributions to cover) requires no more samples.

                          theorem Cslib.MachineLearning.PACLearning.sampleComplexity_mono_C {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {ε δ : (Set.Ioo 0 1)} {C C' : ConceptClass α β} (hC : C C') {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : ∃ (m : ), IsPACLearnerFor m ε δ C' 𝒟) :

                          Deterministic sample complexity is monotone in the concept class under : a smaller class (weaker agnostic benchmark) requires no more samples.

                          theorem Cslib.MachineLearning.PACLearning.rsampleComplexity_antitone_δ {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {ε δ₁ δ₂ : (Set.Ioo 0 1)} ( : δ₁ δ₂) {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (h : ∃ (m : ), IsRPACLearnerFor m ε δ₁ C 𝒟) :
                          rsampleComplexity C ε δ₂ 𝒟 rsampleComplexity C ε δ₁ 𝒟

                          Randomized sample complexity is antitone in the confidence parameter δ.

                          theorem Cslib.MachineLearning.PACLearning.rsampleComplexity_mono_family {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {ε δ : (Set.Ioo 0 1)} {C : ConceptClass α β} {𝒟 𝒟' : Set (MeasureTheory.Measure (α × β))} (h𝒟 : 𝒟 𝒟') (h : ∃ (m : ), IsRPACLearnerFor m ε δ C 𝒟') :
                          rsampleComplexity C ε δ 𝒟 rsampleComplexity C ε δ 𝒟'

                          Randomized sample complexity is monotone in the distribution family under .

                          Convenience variants conditional on learnability, which discharge the nonemptiness hypothesis (∃ m, IsPACLearnerFor m …) from an IsPACLearnable / IsRPACLearnable witness. Bodies go through sampleComplexity_le_of_forall directly rather than the top-level sampleComplexity_* lemmas, whose unqualified names would resolve as self-recursion inside these theorems' IsPACLearnable.* / IsRPACLearnable.* namespaces.

                          theorem Cslib.MachineLearning.PACLearning.IsPACLearnable.sampleComplexity_antitone_δ {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (hL : IsPACLearnable C 𝒟) {ε δ₁ δ₂ : (Set.Ioo 0 1)} ( : δ₁ δ₂) :

                          sampleComplexity_antitone_δ for a learnable class: the nonemptiness hypothesis comes for free from IsPACLearnable.

                          theorem Cslib.MachineLearning.PACLearning.IsPACLearnable.sampleComplexity_antitone_ε {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (hL : IsPACLearnable C 𝒟) {ε₁ ε₂ δ : (Set.Ioo 0 1)} ( : ε₁ ε₂) :

                          sampleComplexity_antitone_ε for a learnable class.

                          theorem Cslib.MachineLearning.PACLearning.IsPACLearnable.sampleComplexity_mono_family {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {C : ConceptClass α β} {𝒟 𝒟' : Set (MeasureTheory.Measure (α × β))} (hL : IsPACLearnable C 𝒟') (h𝒟 : 𝒟 𝒟') {ε δ : (Set.Ioo 0 1)} :

                          sampleComplexity_mono_family for a learnable class (learnability at the larger family 𝒟' is the hypothesis).

                          theorem Cslib.MachineLearning.PACLearning.IsPACLearnable.sampleComplexity_mono_C {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {C C' : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (hL : IsPACLearnable C' 𝒟) (hC : C C') {ε δ : (Set.Ioo 0 1)} :

                          sampleComplexity_mono_C for a learnable class (learnability at the larger class C' is the hypothesis).

                          theorem Cslib.MachineLearning.PACLearning.IsRPACLearnable.rsampleComplexity_antitone_δ {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {C : ConceptClass α β} {𝒟 : Set (MeasureTheory.Measure (α × β))} (hL : IsRPACLearnable C 𝒟) {ε δ₁ δ₂ : (Set.Ioo 0 1)} ( : δ₁ δ₂) :
                          rsampleComplexity C ε δ₂ 𝒟 rsampleComplexity C ε δ₁ 𝒟

                          rsampleComplexity_antitone_δ for a randomized-learnable class.

                          theorem Cslib.MachineLearning.PACLearning.IsRPACLearnable.rsampleComplexity_mono_family {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] {C : ConceptClass α β} {𝒟 𝒟' : Set (MeasureTheory.Measure (α × β))} (hL : IsRPACLearnable C 𝒟') (h𝒟 : 𝒟 𝒟') {ε δ : (Set.Ioo 0 1)} :
                          rsampleComplexity C ε δ 𝒟 rsampleComplexity C ε δ 𝒟'

                          rsampleComplexity_mono_family for a randomized-learnable class.

                          Binary Classification #

                          When β = Bool, concepts correspond to subsets of α via the characteristic function. The symmetric-difference error P(h ∆ c) is the natural error metric, and it decomposes into false positive and false negative components.

                          The bridge lemma error_map_eq_hypothesisError connects the general error on α × Bool to the binary hypothesisError on α, showing they coincide for realizable distributions.

                          The symmetric-difference error of a hypothesis h with respect to a target concept c (both viewed as subsets of α) under distribution P, defined as P(h ∆ c).

                          Equations
                          Instances For

                            The false positive error P(h \ c) — points classified positive but not in the concept.

                            Equations
                            Instances For

                              The false negative error P(c \ h) — points in the concept but classified negative.

                              Equations
                              Instances For

                                The total hypothesis error decomposes as the sum of false positive and false negative errors, since h ∆ c = (h \ c) ∪ (c \ h) is a disjoint union.

                                theorem Cslib.MachineLearning.PACLearning.error_map_eq_hypothesisError {α : Type u_1} [MeasurableSpace α] (P : MeasureTheory.Measure α) (h c : Set α) (hh : MeasurableSet h) (hc : MeasurableSet c) :
                                (error (MeasureTheory.Measure.map (fun (x : α) => (x, decide (x c))) P) fun (x : α) => decide (x h)) = hypothesisError P h c

                                Under a realizable distribution P.map (x ↦ (x, c(x))), the general 0-1 error coincides with the binary hypothesisError P h c, where h and c are viewed as subsets of α via the characteristic function decide (· ∈ ·).