Chapter 3: the cluster prediction issue #367

Open
uammy opened this issue Dec 2, 2017 · 1 comment
uammy commented Dec 2, 2017

Should the print statement "Probability of belonging to cluster 1:" be "Probability of belonging to cluster 0:"?
Or am I misunderstanding something?

@Alescontrela

I think this is the case as well. Earlier in the chapter he states:

A priori, we do not know what the probability of assignment to cluster 1 is, so we form a uniform variable on (0, 1). We call this p1, so the probability of belonging to cluster 2 is therefore p2 = 1 - p1.

and defines the variables as:

with pm.Model() as model:
    p1 = pm.Uniform('p', 0, 1)
    p2 = 1 - p1
    p = T.stack([p1, p2])
    assignment = pm.Categorical("assignment", p, 
                                shape=data.shape[0],
                                testval=np.random.randint(0, 2, data.shape[0]))

where p1 is the probability of belonging to the lower-mean (~120) cluster and p2 is the probability of belonging to the higher-mean (~190) cluster.
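The index convention behind this can be checked with a small sketch. It does not call PyMC3; it only mimics, with the standard library, how a categorical draw over the stacked vector [p1, p2] assigns label 0 with probability p1 and label 1 with probability p2. The value 0.7 for p1 and the name frac_label_0 are made up for illustration.

```python
import random

# Hypothetical value, chosen only for illustration (not taken from the book's trace).
p1 = 0.7
p2 = 1 - p1
p = [p1, p2]  # same ordering as T.stack([p1, p2])

# Draw categorical labels: label 0 has weight p[0] = p1, label 1 has weight p[1] = p2.
rng = random.Random(0)
draws = rng.choices([0, 1], weights=p, k=10_000)

# Fraction of draws assigned to label 0; this should be close to p1, not p2.
frac_label_0 = draws.count(0) / len(draws)
print("Fraction with label 0:", frac_label_0)
```

So the book's "cluster 1" (probability p1) is the assignment labeled 0 in the code, and the book's "cluster 2" (probability p2) is the assignment labeled 1.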

But later on he refers to the clusters as cluster 0 and cluster 1:

we are interested in asking "Is the probability that x is in cluster 1 greater than the probability it is in cluster 0?", where the probability is dependent on the chosen parameters.

Here, cluster 0 is the lower-mean cluster and cluster 1 is the higher-mean cluster:

[Screenshot: Screen-Shot-2019-04-24-at-8-21-30-PM]

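But the comparison then puts p_trace (p1, the probability of belonging to cluster 0) on the left-hand side and 1 - p_trace (p2) on the right, so v is true when cluster 0 is the more likely one, yet the result is printed as the probability of cluster 1: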

v = p_trace * norm_pdf(x, loc=center_trace[:, 0], scale=std_trace[:, 0]) > \
    (1 - p_trace) * norm_pdf(x, loc=center_trace[:, 1], scale=std_trace[:, 1])

print("Probability of belonging to cluster 1:", v.mean())
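A minimal sketch of what the comparison measures, assuming a handful of made-up posterior samples rather than the book's actual trace. The helper cluster_probabilities and all sample values are hypothetical; norm_pdf is reimplemented with the standard library so the block runs on its own.

```python
import math

def norm_pdf(x, loc, scale):
    """Density of a Normal(loc, scale) distribution evaluated at x."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))

# Hypothetical posterior samples, made up for illustration.
# Index 0 is the lower-mean cluster (~120), index 1 the higher-mean cluster (~190).
p_trace = [0.60, 0.62, 0.58, 0.61]  # p1, the probability of cluster 0
center_trace = [(120.0, 190.0), (121.0, 189.0), (119.0, 191.0), (120.5, 190.5)]
std_trace = [(30.0, 25.0), (29.0, 26.0), (31.0, 24.0), (30.0, 25.0)]

def cluster_probabilities(x):
    """Return (P(cluster 0 is more likely), P(cluster 1 is more likely)) for x."""
    in_cluster_0 = 0
    for p, (c0, c1), (s0, s1) in zip(p_trace, center_trace, std_trace):
        weight_0 = p * norm_pdf(x, c0, s0)        # left-hand side of the comparison
        weight_1 = (1 - p) * norm_pdf(x, c1, s1)  # right-hand side of the comparison
        if weight_0 > weight_1:
            in_cluster_0 += 1
    prob_0 = in_cluster_0 / len(p_trace)
    # Exact ties are counted toward cluster 1; they do not occur for these samples.
    return prob_0, 1 - prob_0

prob_0, prob_1 = cluster_probabilities(120.0)
print("Probability of belonging to cluster 0:", prob_0)
print("Probability of belonging to cluster 1:", prob_1)
```

With these samples, a point at 120 (near the lower mean) gives a left-greater-than-right fraction of 1.0, which is the cluster 0 probability. The cluster 1 probability is its complement, so either the print label or the direction of the comparison would need to change.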

I think this typo stems from the switch in naming from "cluster 1 and cluster 2" to "cluster 0 and cluster 1".

Alescontrela added a commit to Alescontrela/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers that referenced this issue Apr 25, 2019