Documentation

Cslib.MachineLearning.PACLearning.VCDimension

VC Dimension for Concept Classes #

This file defines shattering and the Vapnik-Chervonenkis dimension for binary concept classes C : ConceptClass α Bool, i.e. sets of α → Bool classifiers. Each Boolean classifier c is identified with the subset c ⁻¹' {true} ⊆ α (the "positive set"), and C shatters a set W if every subset of W can be obtained as the positive set of some c ∈ C intersected with W. See also the Finset-based definitions in Mathlib.Combinatorics.SetFamily.Shatter.

Main definitions #

Main statements #

References #

A binary concept class C shatters a set W if for every subset W' ⊆ W, there exists a concept c ∈ C whose positive set c ⁻¹' {true} intersects W in exactly W'.

Equations
Instances For
    theorem Cslib.MachineLearning.PACLearning.SetShatters.subset {α : Type u_1} {C : ConceptClass α Bool} {W V : Set α} (hW : SetShatters C W) (hVW : V W) :

    Shattering is anti-monotone in the shattered set: if C shatters W and V ⊆ W, then C shatters V.

    theorem Cslib.MachineLearning.PACLearning.SetShatters.superset {α : Type u_1} {C C' : ConceptClass α Bool} {W : Set α} (hW : SetShatters C W) (hCC' : C C') :

    Shattering is monotone in the concept class: if C shatters W and C ⊆ C', then C' shatters W.

    theorem Finset.Shatters.toSetShatters {α : Type u_1} {𝒜 : Finset (Finset α)} {s : Finset α} (h : 𝒜.Shatters s) :
    Cslib.MachineLearning.PACLearning.SetShatters {c : αBool | t𝒜, ∀ (x : α), c x = decide (x t)} s

    If a finite set family 𝒜 shatters a finite set s in the sense of Mathlib's Finset.Shatters, then the concept class of characteristic functions of sets in 𝒜 shatters ↑s in the sense of SetShatters. This bridges Mathlib's finset-based shattering to the predicate used by the PAC learning lower bounds.

    noncomputable def Cslib.MachineLearning.PACLearning.vcDim {α : Type u_1} (C : ConceptClass α Bool) :

    The Vapnik-Chervonenkis dimension of a binary concept class C is the supremum of the cardinalities of finite sets shattered by C. Returns 0 when no finite set is shattered (i.e. the defining set is empty).

    Caveat: because sSup on returns 0 for unbounded sets, this definition is only meaningful when the VC dimension is finite — see HasFiniteVCDim.

    Equations
    Instances For

      A binary concept class C has finite VC dimension if there is a uniform upper bound on the cardinalities of finite sets it shatters. This is the hypothesis under which vcDim C is mathematically meaningful (otherwise vcDim returns 0 for unbounded shattered families via sSup on ).

      Equations
      Instances For
        theorem Cslib.MachineLearning.PACLearning.hasFiniteVCDim_iff {α : Type u_1} {C : ConceptClass α Bool} :
        HasFiniteVCDim C ∃ (N : ), ∀ (W : Finset α), SetShatters C WW.card N

        A class has finite VC dimension iff there is a uniform bound on the cardinality of every shattered finite set.