pasterdg.blogg.se - When to use pdf vs cdf

WHEN TO USE PDF VS CDF CODE

The reason may be that the cdf is computed with polynomial approximations.

So I checked the C implementation of the cdf again, and I saw that the constants and the coefficients of the polynomials that evaluate the special functions are not computed at run time but stored in arrays and variables. For example, 1/sqrt(2) is stored in NPY_SQRT1_2. This could be the reason why cdf is faster than pdf!

Thus I tried to calculate the pdf with the constants already initialized:

    import scipy.stats as st

Notice that norm.pdf also pre-initializes the denominator of the pdf, but in the for loop you are calling the method every time, which slows things down.

P.S.: if you get rid of the loops in your original code and simply use x = np.arange(x_lower, x_upper, (x_upper - x_lower) / (num_iter - 1)), cdf is again faster.
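The idea of evaluating the pdf with its constant already initialized, and on a whole array instead of inside a loop, might look like the minimal sketch below. It assumes a standard normal distribution (mean 0, standard deviation 1). The names INV_SQRT_2PI and fast_norm_pdf are made up for illustration and are not part of SciPy, and np.linspace is used here in place of the np.arange call quoted above.

```python
import math

import numpy as np

# Hypothetical constant: 1 / sqrt(2 * pi), computed once instead of per call.
INV_SQRT_2PI = 1.0 / math.sqrt(2.0 * math.pi)


def fast_norm_pdf(x):
    """Standard normal pdf using the precomputed normalization constant."""
    x = np.asarray(x, dtype=float)
    return INV_SQRT_2PI * np.exp(-0.5 * x * x)


# Evaluate the whole grid in one vectorized call rather than point by point.
x = np.linspace(-4.0, 4.0, 1001)
values = fast_norm_pdf(x)
print(values.max())
```

Whether this is actually faster than norm.pdf or norm.cdf depends on the array size and the machine, so it is worth timing before relying on it.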

According to a related question, both the numerical values involved and the hardware architecture can affect the timing.
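Since the timing can depend on the inputs and on the machine, the simplest check is to measure it yourself. The following is only a sketch: the grid, the array size and the repeat counts are arbitrary choices, not the values from the original test.

```python
import timeit

import numpy as np
import scipy.stats as st

# Arbitrary test grid; change the range and size to match your use case.
x = np.linspace(-5.0, 5.0, 100_000)

# Take the best of several repeats to reduce noise from other processes.
pdf_time = min(timeit.repeat(lambda: st.norm.pdf(x), number=20, repeat=5))
cdf_time = min(timeit.repeat(lambda: st.norm.cdf(x), number=20, repeat=5))

print(f"norm.pdf: {pdf_time:.4f} s for 20 calls")
print(f"norm.cdf: {cdf_time:.4f} s for 20 calls")
```

The absolute numbers mean little on their own; what matters is the ratio between the two on the same machine and the same input.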

A steep slope of the CDF, or equivalently a narrow peak of the PDF, implies high confidence in the forecast. In the upper frames of Fig8.1.4.1.2 the peak of the forecast PDF (red) is to the right of the peak of the M-climate PDF (blue), indicating that the forecast predicts warmer than normal conditions, and the sharpness of the peak indicates fairly high probability.

I'm now using scipy for some norm.pdf and norm.cdf calculations. I'm wondering why the cdf is faster than the pdf? I know that there are some asymptotic approaches for norm.cdf, while it seems that in scipy the integration of norm.pdf is used. That's why I cannot imagine that cdf is faster than pdf. If integration is used, cdf should be much slower than pdf (maybe parallel computing can help a lot?); if the asymptotic approach is applied, I still think cdf might be a little slower than pdf.

Below are some simple samples:

    import scipy.stats as st
    for x in np.arange(x_lower, x_upper, (x_upper - x_lower) / (num_iter - 1)):

And here are the running results: 0:00:05.736985

A quick look at the source code shows that norm.pdf simply returns the value of the pdf at x using NumPy:

    def _norm_pdf(x):

For the cdf, since we are talking about a normal distribution, special functions are used (the normal cdf is closely related to the error function). SciPy implements special functions directly in C. In particular, the cumulative distribution function is computed in ndtr.c. So, even if NumPy is really fast, C is still faster in this case, I assume.

Sorry, I just realized my answer does not completely answer your question. First of all, NumPy also implements mathematical operations in C. Therefore, to understand the difference in the timings, one has to understand what happens in C.
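The relation between the normal cdf and the special functions mentioned above is the identity cdf(x) = 0.5 * (1 + erf(x / sqrt(2))). The sketch below only illustrates that identity with the standard library; it is not how ndtr.c is written, and norm_cdf_via_erf is a made-up name.

```python
import math

# 1/sqrt(2), computed once; this is the value the answer says the C code
# keeps in NPY_SQRT1_2.
SQRT1_2 = 1.0 / math.sqrt(2.0)


def norm_cdf_via_erf(x):
    """Standard normal cdf expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x * SQRT1_2))


for x in (-1.96, 0.0, 1.96):
    print(f"cdf({x:+.2f}) = {norm_cdf_via_erf(x):.6f}")
```

So no numerical integration of the pdf is needed: once erf is available as a fast approximation, the cdf costs one function call, one multiplication and one addition per point.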