API reference
hsmm module

class hsmmlearn.hsmm.HSMMModel(emissions, durations, tmat, startprob=None, support_cutoff=100)
HSMM model base class.
This class provides the basic functionality to work with hidden semi-Markov models:
- Sampling from a HSMM via the sample() method;
- Fitting the parameters of a HSMM to a sequence of observations via the fit() method.
These methods are agnostic as to what HSMM is being used: the specifics of the model are all encapsulated in the properties tmat, emissions, and durations, which control the transition matrix, the emission distribution, and the duration distribution. In particular, the emission distribution can be discrete or continuous.

__init__(emissions, durations, tmat, startprob=None, support_cutoff=100)
Create a new HSMM instance.
Parameters:
- emissions (hsmmlearn.emissions.AbstractEmissions) – Emissions distribution to use.
- durations (numpy.ndarray or list of random variables) – Durations matrix with shape (n_states, n_durations). If a list of random variables is passed in, all RVs must be subclasses of scipy.stats.rv_discrete. In this case, the duration probabilities are obtained from the support of the RVs, from 1 to support_cutoff.
- tmat (numpy.ndarray, shape=(n_states, n_states)) – Transition matrix.
- startprob (numpy.ndarray, shape=(n_states, )) – Initial probabilities for the Markov chain. If this is None, the uniform distribution is assumed (all states are equally likely).
- support_cutoff (int) – Maximal duration to take into account. This is used when passing in a list of random variables for the durations, whose probabilities are then evaluated at each duration from 1 to support_cutoff.
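The way a list of random variables turns into a durations matrix can be illustrated with a small standalone sketch. It does not use hsmmlearn or SciPy: `geometric_pmf`, `durations_matrix`, and `uniform_startprob` are hypothetical helpers written for this illustration, and whether the library renormalizes the truncated probabilities is not stated here, so the sketch leaves them as they are.

```python
def geometric_pmf(p):
    """Return a pmf d -> P(duration = d) for a geometric law on 1, 2, ..."""
    return lambda d: p * (1.0 - p) ** (d - 1)


def durations_matrix(pmfs, support_cutoff=100):
    """Tabulate each state's pmf at durations 1..support_cutoff.

    Hypothetical helper, not part of hsmmlearn. Row s, column d - 1 holds
    P(duration = d | state s). Probability mass beyond the cutoff is
    dropped and the rows are NOT renormalized (an assumption of this sketch).
    """
    return [[pmf(d) for d in range(1, support_cutoff + 1)] for pmf in pmfs]


def uniform_startprob(n_states):
    """Uniform initial distribution, the documented default for startprob=None."""
    return [1.0 / n_states] * n_states


pmfs = [geometric_pmf(0.5), geometric_pmf(0.1)]
durations = durations_matrix(pmfs, support_cutoff=100)
startprob = uniform_startprob(len(pmfs))
print(len(durations), len(durations[0]))  # n_states, n_durations -> 2 100
```

The second state has a heavy tail, so its row sums to slightly less than 1. That is the practical effect of choosing support_cutoff too small for a given duration distribution.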

decode(obs)
Find most likely internal states for a sequence of observations.
Given a sequence of observations, this method runs the Viterbi algorithm to find the most likely sequence of corresponding internal states. “Most likely” should be interpreted here (as in the classical Viterbi algorithm) as the sequence of states that maximizes the joint probability P(observations, states).
Parameters: obs (numpy.ndarray, shape=(n_obs, )) – Observations.
Returns: states – Reconstructed internal states.
Return type: numpy.ndarray, shape=(n_obs, )
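To make "maximizes the joint probability P(observations, states)" concrete, here is a minimal sketch of the classical Viterbi algorithm for an ordinary hidden Markov model with discrete emissions. It is not the semi-Markov variant that decode() runs, which also has to account for state durations. The function name `viterbi` and the toy parameters are assumptions made for this illustration.

```python
import math


def viterbi(obs, startprob, tmat, emit):
    """Classical HMM Viterbi in log space (all probabilities must be > 0).

    emit[s][o] is P(observation o | state s). Returns the state path that
    maximizes the joint probability P(observations, states).
    """
    n_states = len(startprob)
    # score[s]: best log-probability of any path ending in state s so far.
    score = [math.log(startprob[s]) + math.log(emit[s][obs[0]])
             for s in range(n_states)]
    backptr = []
    for o in obs[1:]:
        prev, score, step_ptr = score, [], []
        for s in range(n_states):
            # Best predecessor r for landing in state s at this step.
            best = max(range(n_states),
                       key=lambda r: prev[r] + math.log(tmat[r][s]))
            step_ptr.append(best)
            score.append(prev[best] + math.log(tmat[best][s])
                         + math.log(emit[s][o]))
        backptr.append(step_ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for step_ptr in reversed(backptr):
        state = step_ptr[state]
        path.append(state)
    path.reverse()
    return path


startprob = [0.6, 0.4]
tmat = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
path = viterbi([0, 1, 2], startprob, tmat, emit)
print(path)  # [0, 0, 1]
```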

fit(obs, max_iter=20, atol=1e-05, censoring=True)
Fit the parameters of a HSMM to a given sequence of observations.
This method runs the expectation-maximization algorithm to adjust the parameters of the HSMM to fit a given sequence of observations as well as possible. The HSMM will be updated in place, unless an error occurs, in which case the original HSMM is restored.
Parameters:
- obs (numpy.ndarray, shape=(n_samples, )) – Sequence of observations.
- max_iter (int) – Maximum number of EM steps to do before terminating.
- atol (float) – Absolute tolerance to decide whether the EM algorithm has converged.
- censoring (bool) – Whether to apply right-censoring.
Raises: NoConvergenceError – If the algorithm terminated abnormally before converging.
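The interplay of max_iter and atol can be sketched as a generic iteration loop. This reference does not say which quantity atol is compared against, nor how fit() behaves when max_iter is reached, so the sketch assumes that the monitored quantity is a score such as the log-likelihood and simply reports whether the tolerance was met. `iterate_until_converged` and `step` are hypothetical names.

```python
def iterate_until_converged(step, max_iter=20, atol=1e-5):
    """Call step() until two successive scores differ by less than atol.

    step is a hypothetical callable that performs one update and returns a
    score (assumed here to be a log-likelihood). Returns a tuple
    (last_score, n_iterations, converged).
    """
    previous = None
    for n_iter in range(1, max_iter + 1):
        current = step()
        if previous is not None and abs(current - previous) < atol:
            return current, n_iter, True
        previous = current
    return previous, max_iter, False


# Toy score sequence that halves its distance to 0 at every step.
scores = iter([-1.0, -0.5, -0.25, -0.125, -0.0625, -0.03125])
result = iterate_until_converged(lambda: next(scores), max_iter=20, atol=0.1)
print(result)  # (-0.0625, 5, True)
```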

n_durations
The number of durations for this HSMM.

n_states
The number of hidden states for this HSMM.

sample(n_samples=1)
Generate a random sample from the HSMM.
Parameters: n_samples (int) – Number of samples to generate.
Returns: observations, states – Random sample of observations and internal states.
Return type: numpy.ndarray, shape=(n_samples, )
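The generative process behind a hidden semi-Markov model can be sketched in a few lines: pick a state, draw how long to stay in it, emit one observation per time step, then move on according to the transition matrix. The code below is an illustration of that textbook procedure, not the library's implementation. `sample_hsmm` and the `emit` callable are hypothetical, and truncating the final segment to reach exactly n_samples is an assumption.

```python
import random


def sample_hsmm(n_samples, startprob, tmat, durations, emit, rng=random):
    """Draw (observations, states) of length n_samples from a toy HSMM.

    durations[s][d - 1] is P(duration = d | state s); emit(state, rng) is a
    hypothetical callable returning one observation for the given state.
    """
    n_states = len(startprob)
    observations, states = [], []
    state = rng.choices(range(n_states), weights=startprob)[0]
    while len(states) < n_samples:
        n_durations = len(durations[state])
        d = rng.choices(range(1, n_durations + 1), weights=durations[state])[0]
        d = min(d, n_samples - len(states))  # truncate the last segment
        for _ in range(d):
            states.append(state)
            observations.append(emit(state, rng))
        state = rng.choices(range(n_states), weights=tmat[state])[0]
    return observations, states


means, scales = [0.0, 10.0], [1.0, 1.0]
startprob = [1.0, 0.0]
tmat = [[0.0, 1.0], [1.0, 0.0]]                   # no self-transitions
durations = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]    # state 0 lasts 2, state 1 lasts 3
observations, states = sample_hsmm(
    7, startprob, tmat, durations,
    emit=lambda s, rng: rng.gauss(means[s], scales[s]),
    rng=random.Random(0),
)
print(states)  # [0, 0, 1, 1, 1, 0, 0]
```

The deterministic durations make the state sequence predictable, which shows the defining feature of a semi-Markov model: the time spent in a state is governed by the durations distribution rather than by self-transitions.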
emissions module

class hsmmlearn.emissions.AbstractEmissions
Base class for emissions distributions.
To create a HSMM with a custom emission distribution, write a derived class that implements some (or all) of the abstract methods. If you don’t need all of the HSMM functionality, you can get by with implementing only some of the methods.

copy()
Make a copy of this object.
This method is called by hsmmlearn.hsmm.HSMMModel.fit() to make a copy of the emissions object before modifying it.

likelihood(obs)
Compute the likelihood of a sequence of observations.
This method is called by hsmmlearn.hsmm.HSMMModel.fit() and hsmmlearn.hsmm.HSMMModel.decode().
Parameters: obs (numpy.ndarray, shape=(n_obs, )) – Sequence of observations.
Returns: likelihood
Return type: float

reestimate(gamma, obs)
Estimate the distribution parameters given sequences of smoothed probabilities and observations.
The parameter gamma is an array of smoothed probabilities, with the entry gamma[s, i] giving the probability of finding the system in state s given all of the observations up to index i:
\[\gamma_{s, i} = P(s \mid o_1, \ldots, o_i).\]
This method is called by hsmmlearn.hsmm.HSMMModel.fit().
Parameters:
- gamma (numpy.ndarray, shape=(n_obs, )) – Smoothed probabilities.
- obs (numpy.ndarray, shape=(n_obs, )) – Observations.
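For a Gaussian emission model, a re-estimation step of this kind is commonly written as a weighted mean and a weighted standard deviation, with gamma[s][i] as the weight of observation i for state s. The sketch below shows that standard update under the assumption that gamma is indexed as gamma[s][i]. `reestimate_gaussian` is a hypothetical function, and the library's own update may differ in detail.

```python
import math


def reestimate_gaussian(gamma, obs):
    """Weighted mean and standard deviation per state.

    gamma[s][i] is the weight of observation obs[i] for state s. Each row
    must have a positive total weight.
    """
    means, scales = [], []
    for weights in gamma:  # one row of weights per state
        total = sum(weights)
        mean = sum(w * o for w, o in zip(weights, obs)) / total
        var = sum(w * (o - mean) ** 2 for w, o in zip(weights, obs)) / total
        means.append(mean)
        scales.append(math.sqrt(var))
    return means, scales


gamma = [[1.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 1.0]]
obs = [1.0, 3.0, 10.0, 14.0]
means, scales = reestimate_gaussian(gamma, obs)
print(means, scales)  # [2.0, 12.0] [1.0, 2.0]
```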

sample_for_state(state, size=None)
Return a random emission given a state.
This method is called by hsmmlearn.hsmm.HSMMModel.sample().
Parameters:
- state (int) – The internal state.
- size (int) – The number of random samples to generate.
Returns: observations – Random emissions.
Return type: numpy.ndarray, shape=(size, )
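As a rough illustration of a partial custom emissions class, the sketch below implements only sample_for_state() and copy(), the two methods a sampling-only use case would plausibly need. To stay runnable without hsmmlearn it is a standalone class. In real use it would derive from hsmmlearn.emissions.AbstractEmissions and return NumPy arrays. `ExponentialEmissions`, its `rates` argument, and treating size=None as a single draw are all assumptions of this sketch.

```python
import copy
import random


class ExponentialEmissions:
    """Hypothetical emissions: state s emits Exp(rates[s]) distributed values.

    Standalone stand-in for a subclass of AbstractEmissions. It returns plain
    lists rather than numpy arrays to avoid extra dependencies.
    """

    def __init__(self, rates, seed=None):
        self.rates = list(rates)
        self._rng = random.Random(seed)

    def sample_for_state(self, state, size=None):
        # Assumption: size=None means "draw a single value".
        n = 1 if size is None else size
        return [self._rng.expovariate(self.rates[state]) for _ in range(n)]

    def copy(self):
        # Deep copy, so that changes to the copy leave the original untouched.
        return copy.deepcopy(self)


emissions = ExponentialEmissions([1.0, 100.0], seed=0)
draws = emissions.sample_for_state(1, size=5)
backup = emissions.copy()
backup.rates[0] = 9.0
print(len(draws), emissions.rates[0])  # 5 1.0
```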

class hsmmlearn.emissions.GaussianEmissions(means, scales)
An emissions class for Gaussian emissions.
This emissions class models the case where emissions are real-valued and continuous, and the probability of observing an emission given the state is modeled by a Gaussian. The means and standard deviations for each Gaussian (one for each state) are stored as state on the class.
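What "the probability of observing an emission given the state is modeled by a Gaussian" means numerically can be shown with a short standalone sketch that evaluates one Gaussian density per state at each observation. The helper names and the (n_states, n_obs) layout of the result are assumptions made for this illustration and do not describe the library's internals.

```python
import math


def gaussian_pdf(x, mean, scale):
    """Density of a normal distribution with the given mean and std. dev."""
    z = (x - mean) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))


def gaussian_likelihoods(obs, means, scales):
    """table[s][i] = density of obs[i] under the Gaussian of state s."""
    return [[gaussian_pdf(o, m, s) for o in obs]
            for m, s in zip(means, scales)]


means, scales = [0.0, 5.0], [1.0, 2.0]
table = gaussian_likelihoods([0.0, 5.0], means, scales)
# The observation 5.0 is far more plausible under state 1 than under state 0.
print(table[0][1] < table[1][1])  # True
```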

class hsmmlearn.emissions.MultinomialEmissions(probabilities)
An emissions class for multinomial emissions.
This emissions class models the case where the emissions are categorical variables, assuming values from 0 to some value k, and the probability of observing an emission given a state is modeled by a multinomial distribution.
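A categorical emission model amounts to a table with one row of symbol probabilities per state. The sketch below assumes a (n_states, n_symbols) layout for such a table and uses only the standard library. `emission_probability` and `sample_symbols` are hypothetical helpers, not methods of MultinomialEmissions.

```python
import random

# Assumed layout: probabilities[state][symbol], each row summing to 1.
probabilities = [
    [0.8, 0.1, 0.1],  # state 0 mostly emits symbol 0
    [0.1, 0.1, 0.8],  # state 1 mostly emits symbol 2
]


def emission_probability(state, symbol):
    """P(symbol | state), read directly from the table."""
    return probabilities[state][symbol]


def sample_symbols(state, size, rng):
    """Draw `size` symbols from the row of the given state."""
    symbols = range(len(probabilities[state]))
    return rng.choices(symbols, weights=probabilities[state], k=size)


rng = random.Random(0)
symbols = sample_symbols(1, 10, rng)
print(emission_probability(1, 2))  # 0.8
```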
properties module

class hsmmlearn.properties.Durations
Data descriptor for a durations distribution.

__get__(obj, type=None)
Return the durations distribution.
Returns: durations – Durations matrix.
Return type: numpy.ndarray, shape=(n_states, n_durations)

__set__(obj, durations)
Update the durations distribution with new durations.
Parameters:
- obj (hsmmlearn.hsmm.HSMMModel) – The underlying HSMMModel.
- durations (numpy.ndarray or list of random variables) – This can be either a numpy array, in which case the number of rows must be equal to the number of hidden states, or a list of scipy.stats discrete random variables. In the latter case, the duration probabilities are obtained directly from the random variables.

class hsmmlearn.properties.Emissions
Data descriptor for an emissions distribution.

__get__(obj, type=None)
Return the emissions distribution.
Returns: emissions – The emissions distribution.
Return type: hsmmlearn.emissions.AbstractEmissions

__set__(obj, emissions)
Set the emissions distribution.
Parameters:
- obj (hsmmlearn.hsmm.HSMMModel) – The underlying HSMMModel.
- emissions (hsmmlearn.emissions.AbstractEmissions) – The emissions distribution.

class hsmmlearn.properties.TransitionMatrix
Data descriptor for a transition matrix.

__get__(obj, type=None)
Return the transition matrix.
Returns: tmat – A transition matrix.
Return type: numpy.ndarray, shape=(n_states, n_states)

__set__(obj, value)
Set a new transition matrix.
Parameters:
- obj (hsmmlearn.hsmm.HSMMModel) – The underlying HSMMModel.
- value (numpy.ndarray, shape=(n_states, n_states)) – The new transition matrix. This must be a square matrix. If a transition matrix was previously assigned to the HSMM, the new transition matrix must have the same number of rows.
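The two constraints above (square matrix, unchanged number of rows on reassignment) are the kind of check a Python data descriptor is well suited for. The following standalone sketch shows the general pattern with nested lists. `SquareMatrix` and `Model` are hypothetical classes for illustration and are not the library's TransitionMatrix implementation.

```python
class SquareMatrix:
    """Hypothetical data descriptor that validates a matrix on assignment."""

    def __set_name__(self, owner, name):
        # Store the value on the instance under a private attribute name.
        self._attr = "_" + name

    def __get__(self, obj, type=None):
        if obj is None:  # accessed on the class, not on an instance
            return self
        return getattr(obj, self._attr)

    def __set__(self, obj, value):
        rows = [list(row) for row in value]
        if any(len(row) != len(rows) for row in rows):
            raise ValueError("transition matrix must be square")
        old = getattr(obj, self._attr, None)
        if old is not None and len(old) != len(rows):
            raise ValueError("number of rows must not change")
        setattr(obj, self._attr, rows)


class Model:
    tmat = SquareMatrix()

    def __init__(self, tmat):
        self.tmat = tmat  # goes through SquareMatrix.__set__


model = Model([[0.0, 1.0], [1.0, 0.0]])
model.tmat = [[0.5, 0.5], [0.5, 0.5]]  # same size: accepted
print(model.tmat)  # [[0.5, 0.5], [0.5, 0.5]]
```

Because the descriptor defines both __get__ and __set__, every assignment to the attribute is validated, including the one made in the constructor.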