Crates.io | advancedresearch-hypo |
lib.rs | advancedresearch-hypo |
version | 0.2.0 |
source | src |
created_at | 2019-11-19 19:41:27.065711 |
updated_at | 2023-11-14 17:29:30.565526 |
description | Automatic hypothesis testing |
homepage | https://github.com/advancedresearch/hypo |
repository | https://github.com/advancedresearch/hypo.git |
id | 182524 |
size | 10,216 |
This library consists of various algorithms for automatic hypothesis testing.
In the scientific method, one formulates a falsifiable hypothesis which is tested against experiments.
Instead of designing one experiment to test a single hypothesis, one can design an experiment to eliminate as many hypotheses as possible. This method is efficient when hypotheses are easy to generate and their predictions are cheap to compute.
Automatic hypothesis testing can be used when hypotheses are generated in advance and the experiments can be configured and executed automatically.
An optimal guess is an experiment which is picked based on its capability to eliminate potentially false hypotheses.
Assume that an experiment simply returns a conclusion, true or false.
Before performing the experiment, the conclusion is unknown.
To maximize the number of eliminated false hypotheses, one exploits the fact that the conclusion of the experiment is not yet known. The optimal guess is picked such that competing hypotheses disagree with each other about the outcome, so that whichever conclusion is observed, some of them are eliminated.
The ideal experiment is one that eliminates half the hypotheses
no matter what the conclusion of the experiment will be.
This works in a similar way to binary search,
except that hypotheses do not need to be sorted.
In practice one can choose the experiment for which the number of hypotheses
that predict true is as close as possible to half the total number of hypotheses.
In path semantical notation, the following measure is minimized over e:
abs(|h : (prediction e)| - |Hypothesis| / 2)
h : Hypothesis
e : Experiment
prediction : Hypothesis x Experiment -> bool
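The measure above can be sketched in plain Rust. This is a minimal illustration, not the API of this crate: the function name `optimal_guess`, its generic parameters, and the closure-based `prediction` are all assumptions made for the example. It counts, for each experiment, how many hypotheses predict true and returns the index of the experiment whose count is closest to half the total.

```rust
/// Hypothetical helper (not part of this crate's API): returns the index of
/// the experiment whose number of `true` predictions is closest to half the
/// number of hypotheses. Returns `None` when there are no experiments.
pub fn optimal_guess<H, E>(
    hypotheses: &[H],
    experiments: &[E],
    prediction: impl Fn(&H, &E) -> bool,
) -> Option<usize> {
    let total = hypotheses.len();
    experiments
        .iter()
        .enumerate()
        .min_by_key(|&(_, e)| {
            // |h : (prediction e)|, the hypotheses that predict `true` for `e`.
            let trues = hypotheses.iter().filter(|&h| prediction(h, e)).count();
            // abs(trues - total / 2), scaled by 2 to stay in integer arithmetic.
            (2 * trues).abs_diff(total)
        })
        .map(|(i, _)| i)
}
```

For example, with eight threshold hypotheses `1..=8`, where hypothesis `h` predicts true for experiment `e` when `e >= h`, the experiment `e = 4` splits the hypotheses four against four and is selected. When several experiments are equally balanced, `min_by_key` returns the first one.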
By repeating such experiments, one is more likely to minimize the number of experiments required to eliminate as many false hypotheses as possible.
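The repeated procedure can be sketched as a loop. Again this is only an illustration under stated assumptions: `eliminate`, `prediction`, and `run` are hypothetical names, and `run` stands in for whatever executes an experiment and reports its conclusion. Each round picks the most balanced experiment among those on which the surviving hypotheses still disagree, runs it, and keeps only the hypotheses that predicted the observed conclusion.

```rust
/// Hypothetical sketch (not part of this crate's API): repeatedly pick the most
/// balanced experiment, run it, and discard hypotheses that predicted wrongly.
/// Returns the surviving hypotheses and the number of experiments that were run.
pub fn eliminate<H: Clone, E>(
    hypotheses: &[H],
    experiments: &[E],
    prediction: impl Fn(&H, &E) -> bool,
    run: impl Fn(&E) -> bool,
) -> (Vec<H>, usize) {
    let mut alive: Vec<H> = hypotheses.to_vec();
    let mut runs = 0;
    loop {
        let best = experiments
            .iter()
            .map(|e| {
                let trues = alive.iter().filter(|&h| prediction(h, e)).count();
                (e, trues)
            })
            // Skip experiments on which all surviving hypotheses agree:
            // they cannot eliminate anything.
            .filter(|&(_, trues)| trues > 0 && trues < alive.len())
            // Prefer the split closest to half of the surviving hypotheses.
            .min_by_key(|&(_, trues)| (2 * trues).abs_diff(alive.len()));
        let (e, _) = match best {
            Some(b) => b,
            None => break, // no experiment can distinguish the survivors
        };
        let outcome = run(e);
        // Keep only hypotheses whose prediction matches the observed conclusion.
        alive.retain(|h| prediction(h, e) == outcome);
        runs += 1;
    }
    (alive, runs)
}
```

Because every selected experiment splits the survivors, each round removes at least one hypothesis and the loop terminates. With eight threshold hypotheses and experiments that can halve them each time, three rounds are enough to isolate a single survivor, which mirrors the binary search analogy.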